CHAPTER 2 DESCRIBING PATTERNS IN DATA

2.5 NUMERICAL SUMMARIES OF DATA DISTRIBUTIONS

The data displays and summary numbers we have considered to this point are helpful in organizing large sets of numbers, and they are particularly appropriate for measurements of quantitative variables.

EXERCISES

2.18 Given the four observations 3 1 4 calculate the sample variance, sample standard deviation, range, and interquartile range.

2.19 Calculate the sample mean, variance, and standard deviation for each of the following data sets:
a. 6, 9, 7, 9, 14
b. 23, 29, 22, 26
c. 1 1, .8, 2, 1.6, 2.9

2.20 Annual salaries in thousands of dollars for ten of the top-ranking officers in a large corporation are given here:

175 150 210 650 425 230 190 260 300 250

Calculate the sample mean and sample median. Comment on the appropriateness of these numbers as summary measures of top executive salaries.

2.21 Refer to Exercise 2.20. Obtain the sample quartiles of top executive salaries.

2.22 Sketch three density histograms: a symmetric histogram, one with a long right-hand tail, and one with a long left-hand tail. Keeping in mind that the mean is the center of gravity, or balancing point, of the data distribution and that the median divides the data in half, indicate the relative positions of the sample mean and the sample median on each of your density histograms.

2.23 Vendors doing business with a particular state were sampled to determine the economic impact of state business on their gross sales. A sample of 15 firms that provide services to the state had the following percentages of total annual sales as a result of sales to the state:

27 0 12 0 14 9 1 2 1 1 0 1 5 3 7 6 5 0 1 0 1 0 3 2 3 0 7 0

a. Find the sample median, first quartile, and third quartile.
b. Find the range and interquartile range.
c. Find the sample 90th percentile.

2.24 The mean and median salaries of machinists employed by two competing companies, A and B, are as follows:

Company            A         B
Mean salary        27,000    23,500
Median salary      22,000    25,000

Assume that the salaries are set in accordance with job competence and that the overall quality of workers is about the same in the two companies.
a. Which company offers a better prospect to a machinist having superior ability? Explain.
b. Where can a medium-quality machinist expect to earn more? Explain.

2.25 Consider the data on workers per vehicle for the 10 most productive vehicle assembly plants listed in Table 2.4 (see Example 2.8).
a. Plot these data as a dot diagram. Calculate the sample mean, $\bar{x}$, and indicate the mean on the dot diagram.
b. Calculate the 5% trimmed mean. [Hint: Round 10(.05) up to the next integer when determining the number of observations to delete from the ordered data.]
c. Calculate the sample median, $M$, and indicate the median on the dot diagram in part a. Compare the mean and median. What does the discrepancy (if any) tell you about the symmetry of this data set?

2.26 Consider the R&D expenditures as a percentage of sales given in Example 2.2.
a. Calculate the 5% trimmed mean. [Hint: Round 12(.05) up to the next integer when determining the number of observations to delete from the ordered data.]
b. Compare the 5% trimmed mean with the sample mean (see Exercise 1.37 of Chapter 1).
c. Calculate the sample median.
d. Discuss the relative effects of outliers on the 5% trimmed mean, the sample mean, and the median for this particular example.

2.27 (Minitab or similar program recommended) [Data file: DeathClm.dat] Consider the death claim amounts given in Example 2.10.
a. Calculate the sample mean, $\bar{x}$, and the 5% trimmed mean. [Hint: See Example 2.10 for the ordered claims. Round 31(.05) up to the next integer when determining the number of observations to delete from the ordered set.]
b. Calculate the first and third quartiles, $Q_1$ and $Q_3$, and determine the interquartile range, IR.
c. Using the data and the median from Example 2.10 and the results in part b, display the boxplot or modified boxplot for the death claim amounts.

2.28 The sample variance may be written

$$ s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right] $$

This formula is known as the computing formula for the sample variance. It leads to faster calculations because it uses the basic quantities, $x_i$ and $x_i^2$, directly and does not require the intermediate quantities $(x_i - \bar{x})$. You are given the four observations

1,000,000 1,000,001 1,000,000 1,000,000

a. Calculate the sample variance using the definitional formula,

$$ s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 $$

b. Calculate the sample variance using the computing formula, assuming your hand-held calculator can keep only 8 digits in any number.
c. Compare the results. Consider the statement, “Although the computing formula leads, in general, to faster calculations, it may lead to inaccurate results because of round-off error for a set of uniformly large numbers.” Do you agree?

2.29 Side-by-side boxplots of GRE Quantitative scores for students admitted to graduate study in departments classified as Natural Sciences, Engineering, and so forth are shown here. The boxplots are based on GRE scores accumulated over a five-year period.

[Figure: side-by-side boxplots of GRE quantitative scores, on a scale from 200 to 800, for All departments, Natural sciences, Engineering, Social sciences, Humanities and arts, and Education.]

a. Using the vertical scale in the diagram, interpret the boxplot for Education.
b. Which departments tend to have the highest Quantitative scores? Which departments have the most highly concentrated Quantitative scores about the median score as measured by the interquartile range? Which departments have the largest range of Quantitative scores?
c. Looking at the boxplot for all departments, would you say the distribution of Quantitative scores is symmetric or skewed? Justify your choice. The scores of which departments are the most heavily skewed? Are these scores skewed to the right or left?

2.30 Select the appropriate phrase to make the sentence correct.
a. The mean of a data set with the outliers eliminated will be (smaller than; larger than; smaller or larger than; equal to) the average of the data set with the outliers included.
b. The standard deviation of a data set with the outliers eliminated will be (smaller than; larger than; smaller or larger than; equal to) the standard deviation of the data set with the outliers included.
c. The median of a data set with the outliers eliminated will be (smaller than; larger than; smaller or larger than; equal to) the median of the data set with the outliers included.

2.31 (Minitab or similar program recommended) [Data file: AgePres.dat] The following table shows the age at inauguration of each U.S. president.
a. Make a stem-and-leaf diagram of the age at inauguration. Let the leaf unit = 1.
b. Find the median, $M$, and the first and third quartiles, $Q_1$ and $Q_3$.

                      Age at                              Age at                              Age at
Name                  Inauguration  Name                  Inauguration  Name                  Inauguration
 1. Washington        57            15. Buchanan          65            29. Harding           55
 2. J. Adams          61            16. Lincoln           52            30. Coolidge          51
 3. Jefferson         57            17. A. Johnson        56            31. Hoover            54
 4. Madison           57            18. Grant             46            32. F. D. Roosevelt   51
 5. Monroe            58            19. Hayes             54            33. Truman            60
 6. J. Q. Adams       57            20. Garfield          49            34. Eisenhower        62
 7. Jackson           61            21. Arthur            50            35. Kennedy           43
 8. Van Buren         54            22. Cleveland         47            36. L. Johnson        55
 9. W. H. Harrison    68            23. B. Harrison       55            37. Nixon             56
10. Tyler             51            24. Cleveland         55            38. Ford              61
11. Polk              49            25. McKinley          54            39. Carter            52
12. Taylor            64            26. T. Roosevelt      42            40. Reagan            69
13. Fillmore          50            27. Taft              51            41. Bush              64
14. Pierce            48            28. Wilson            56            42. Clinton           46

2.32 (Minitab or similar program recommended) [Data file: CitGrad.dat] The article “The Best Cities for Knowledge Workers” (Fortune, Nov. 15, 1993) states that one measure of the brainpower that employers need is the number of workers 25 years old and older who hold a postbaccalaureate graduate degree. Consider the following table.

                  Graduate                    Graduate                    Graduate
City              Degree    City              Degree    City              Degree
Raleigh Durham    11.6      Dayton            6.9       Norfolk           6.5
New York          10.5      Denver            9.4       Oakland           10.8
Boston            11.2      Detroit           6.4       Oklahoma City     7.3
Seattle           8.8       Ft. Lauderdale    6.4       Orlando           6.1
Austin            10.3      Fort Worth        6.2       Phoenix           6.9
Chicago           8.7       Grand Rapids      5.6       Pittsburgh        6.8
Houston           7.9       Greensboro        5.1       Portland          7.6
San Jose          12.0      Hartford          10.3      Richmond          7.7
Philadelphia      8.3       Honolulu          7.9       Rochester         8.7
Minneapolis       7.7       Indianapolis      7.4       Sacramento        6.9
Albany            10.2      Jacksonville      5.6       St. Louis         3.8
Atlanta           8.1       Kansas City       7.5       Salt Lake City    7.0
Baltimore         9.2       Las Vegas         4.7       San Antonio       6.8
Birmingham        6.6       Los Angeles       7.8       San Diego         8.8
Buffalo           7.5       Louisville        6.7       San Francisco     12.8
Charlotte         5.2       Memphis           6.4       Scranton          5.1
Cincinnati        7.1       Miami             7.6       Tampa             5.6
Cleveland         6.5       Milwaukee         6.5       Tulsa             6.0
Columbus          8.0       Nashville         7.1       Washington, D.C.  15.8
Dallas            8.2       New Orleans       6.9       West Palm Beach   7.6
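The round-off question raised in Exercise 2.28 can be explored with a short Python sketch. It is not part of the text. It assumes the 8-digit calculator can be modeled by rounding every intermediate result to 8 significant digits with Python's decimal module (a real calculator may truncate instead), and the function names `variance_definitional` and `variance_computing` are illustrative only. The observations are the four values listed in the exercise.

```python
from decimal import Decimal, localcontext


def variance_definitional(xs, digits=8):
    """Sample variance as sum((x - mean)^2) / (n - 1), keeping `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits  # assumed calculator model: round each result to `digits` digits
        xs = [Decimal(x) for x in xs]
        n = len(xs)
        mean = sum(xs, Decimal(0)) / n
        squared_devs = sum(((x - mean) ** 2 for x in xs), Decimal(0))
        return squared_devs / (n - 1)


def variance_computing(xs, digits=8):
    """Sample variance as (sum(x^2) - (sum(x))^2 / n) / (n - 1), keeping `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits  # same assumed calculator model
        xs = [Decimal(x) for x in xs]
        n = len(xs)
        sum_x = sum(xs, Decimal(0))
        sum_x2 = sum((x * x for x in xs), Decimal(0))
        return (sum_x2 - sum_x * sum_x / n) / (n - 1)


# The four observations as listed in Exercise 2.28.
observations = [1_000_000, 1_000_001, 1_000_000, 1_000_000]

for digits in (8, 28):
    print(digits, "digits:",
          "definitional =", variance_definitional(observations, digits),
          "computing =", variance_computing(observations, digits))
```

Under this rounding model both formulas agree on 0.25 when 28 digits are kept. With 8 digits the two large sums in the computing formula become indistinguishable and their difference is 0, while the definitional formula, which works with small deviations, stays near 0.25.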

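The trimmed-mean hint in Exercises 2.25 to 2.27 can be sketched in Python as follows. This is an illustration under stated assumptions, not the text's own procedure: the function name `trimmed_mean` is hypothetical, and the sketch assumes that the rounded-up count is deleted from each end of the ordered data. The salary figures from Exercise 2.20 are used only as convenient sample data.

```python
import statistics


def trimmed_mean(values, percent=5):
    """Mean after deleting ceil(n * percent / 100) observations from each end (assumed rule)."""
    n = len(values)
    k = (n * percent + 99) // 100  # integer ceiling of n * percent / 100, e.g. 10(.05) -> 1
    ordered = sorted(values)
    kept = ordered[k:n - k]        # drop the k smallest and the k largest values
    return sum(kept) / len(kept)


# Executive salaries (thousands of dollars) from Exercise 2.20.
salaries = [175, 150, 210, 650, 425, 230, 190, 260, 300, 250]

print("mean         =", statistics.mean(salaries))
print("median       =", statistics.median(salaries))
print("trimmed mean =", trimmed_mean(salaries))
```

Using integer arithmetic for the ceiling avoids floating-point surprises when n times the trimming fraction is a whole number.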
2.6 THE NORMAL DENSITY FUNCTION