Summary and Key Formulas

We extended the concept of data description to summarize the relations between two qualitative variables. Here cross-tabulations were used to develop percentage comparisons. We examined plots for summarizing the relations between quantitative and qualitative variables and between two quantitative variables. Material presented here (namely, summarizing relations among variables) will be discussed and expanded in later chapters on chi-square methods, on the analysis of variance, and on regression.

Key Formulas

1. Median, grouped data
   $\text{Median} = L + \dfrac{w}{f_m}\,(.5n - cf_b)$

2. Sample mean
   $\bar{y} = \dfrac{\sum_i y_i}{n}$

3. Sample mean, grouped data
   $\bar{y} \cong \dfrac{\sum_i f_i y_i}{n}$

4. Sample variance
   $s^2 = \dfrac{1}{n-1} \sum_i (y_i - \bar{y})^2$

5. Sample variance, grouped data
   $s^2 \cong \dfrac{1}{n-1} \sum_i f_i (y_i - \bar{y})^2$

6. Sample standard deviation
   $s = \sqrt{s^2}$

7. Sample coefficient of variation
   $CV = \dfrac{s}{|\bar{y}|}$

8. Correlation coefficient
   $r = \dfrac{1}{n-1} \sum_{i=1}^{n} \left( \dfrac{x_i - \bar{x}}{s_x} \right) \left( \dfrac{y_i - \bar{y}}{s_y} \right)$
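The key formulas can be checked numerically with a short script. The following is a minimal sketch in plain Python, not a definitive implementation. The function names and the small data set at the end are hypothetical and chosen only for illustration. The grouped-data functions assume that each y_i is a class midpoint with frequency f_i, and the median function reads L as the lower limit of the class containing the median, w as the class width, f_m as the frequency of that class, and cf_b as the cumulative frequency of the classes below it. The correlation coefficient is computed with the 1/(n - 1) factor, which is required when s_x and s_y are sample standard deviations.

```python
from math import sqrt


def sample_mean(y):
    """Sample mean: sum of the observations divided by n."""
    return sum(y) / len(y)


def sample_variance(y):
    """Sample variance with the n - 1 divisor."""
    ybar = sample_mean(y)
    return sum((yi - ybar) ** 2 for yi in y) / (len(y) - 1)


def sample_sd(y):
    """Sample standard deviation: square root of the sample variance."""
    return sqrt(sample_variance(y))


def coefficient_of_variation(y):
    """CV = s / |ybar|; undefined when the mean is zero."""
    return sample_sd(y) / abs(sample_mean(y))


def correlation(x, y):
    """Correlation coefficient as the average product of z-scores,
    using the n - 1 divisor to match the sample standard deviations."""
    n = len(x)
    xbar, ybar = sample_mean(x), sample_mean(y)
    sx, sy = sample_sd(x), sample_sd(y)
    products = (((xi - xbar) / sx) * ((yi - ybar) / sy) for xi, yi in zip(x, y))
    return sum(products) / (n - 1)


def grouped_mean(midpoints, freqs):
    """Approximate mean from class midpoints and class frequencies."""
    n = sum(freqs)
    return sum(f * m for m, f in zip(midpoints, freqs)) / n


def grouped_variance(midpoints, freqs):
    """Approximate variance from class midpoints and class frequencies."""
    n = sum(freqs)
    ybar = grouped_mean(midpoints, freqs)
    return sum(f * (m - ybar) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)


def grouped_median(L, w, f_m, cf_b, n):
    """Median for grouped data: L + (w / f_m) * (.5n - cf_b)."""
    return L + (w / f_m) * (0.5 * n - cf_b)


# Hypothetical illustrative data (not taken from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

print("mean of y:", sample_mean(y))
print("variance of y:", sample_variance(y))
print("standard deviation of y:", sample_sd(y))
print("CV of y:", coefficient_of_variation(y))
print("correlation of x and y:", correlation(x, y))
```

Writing the correlation as an average product of z-scores mirrors formula 8 directly; an algebraically equivalent form divides the sum of cross-products of deviations by the square root of the product of the two sums of squared deviations.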

3.10 Exercises

3.3 Describing Data on a Single Variable: Graphical Methods

Gov. 3.1 The U.S. government spent more than 2.5 trillion dollars in the 2006 fiscal year. How is this incredible sum of money spent? The following table provides broad categories that show the expenditures of the federal government for domestic and defense programs.

Federal Program              2006 Expenditures (Billions of Dollars)
National Defense             525
Social Security              500
Medicare and Medicaid        500
National Debt Interest       300
Major Social-Aid Programs    200
Other                        475

a. Construct a pie chart for these data.
b. Construct a bar chart for these data.
c. Construct a pie chart and bar chart using percentages in place of dollars.
d. Which of the four charts is most informative to the tax-paying public?

Bus. 3.2 A major change appears to be taking place with respect to the type of vehicle the U.S. public is purchasing. The U.S. Bureau of Economic Analysis, in its publication Survey of Current Business (February 2002), provides the data given in the following table. The numbers reported are in thousands of units; that is, 9,436 represents 9,436,000 vehicles sold in 1990.

Type of Vehicle    1990   1995   1997   1998   1999   2000   2001   2002
Passenger Car      9,436  8,687  8,273  8,142  8,697  8,852  8,422  8,082
SUV/Light Truck    4,733  6,517  7,226  7,821  8,717  8,965  9,050  9,036

a. Would pie charts be appropriate graphical summaries for displaying these data? Why or why not?
b. Construct a bar chart that would display the changes across the 12 years in the public's choice in vehicle.
c. Do you observe a trend in the type of vehicles purchased? Do you feel this trend would continue if there were a substantial rise in gasoline prices?

Med. 3.3 It has been reported that there has been a change in the type of practice physicians are selecting for their career. In particular, there is concern that there will be a shortage of family practice physicians in future years. The following table contains data on the total number of office-based physicians and the number of those physicians declaring themselves to be family practice physicians. The numbers in the table are given in thousands of physicians.

Year                             1980   1990   1995   1998   1999   2000   2001
Family Practice                  47.8   57.6   59.9   64.6   66.2   67.5   70.0
Total Office-Based Physicians    271.3  359.9  427.3  468.8  473.2  490.4  514.0

Source: Statistical Abstract of the United States: 2003

a. Use a bar chart to display the increase in the number of family practice physicians from 1990 to 2002.
b. Calculate the percent of office-based physicians who are family practice physicians and then display these data in a bar chart.
c. Is there a major difference in the trend displayed by the two bar charts?

Env. 3.4 The regulations of the board of health in a particular state specify that the fluoride level must not exceed 1.5 parts per million (ppm). The 25 measurements given here represent the fluoride levels for a sample of 25 days. Although fluoride levels are measured more than once per day, these data represent the early morning readings for the 25 days sampled.

.75   .86   .84   .85   .97
.94   .89   .84   .83   .89
.88   .78   .77   .76   .82
.72   .92   1.05  .94   .83
.81   .85   .97   .93   .79

a. Determine the range of the measurements.
b. Dividing the range by 7, the number of subintervals selected, and rounding, we have a class interval width of .05. Using .705 as the lower limit of the first interval, construct a frequency histogram.
c. Compute relative frequencies for each class interval and construct a relative frequency histogram. Note that the frequency and relative frequency histograms for these data have the same shape.
d. If one of these 25 days were selected at random, what would be the chance (probability) that the fluoride reading would be greater than .90 ppm? Guess (predict) what proportion of days in the coming year will have a fluoride reading greater than .90 ppm.
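The class-interval bookkeeping described in parts (b) and (c) of Exercise 3.4 can be sketched in a few lines of Python. This is only one possible way to organize the tally, offered as a check on hand calculations. The names `class_counts`, `LOWER`, `WIDTH`, and `NUM_CLASSES` are hypothetical, and the choice to work in integer thousandths of a ppm is an assumption made here so that class boundaries such as .705 are represented exactly.

```python
# Fluoride readings (ppm) from Exercise 3.4, in the order given.
readings = [
    .75, .86, .84, .85, .97, .94, .89, .84, .83, .89, .88, .78, .77,
    .76, .82, .72, .92, 1.05, .94, .83, .81, .85, .97, .93, .79,
]

NUM_CLASSES = 7   # number of subintervals stated in part (b)
LOWER = 705       # lower limit of the first class, in thousandths of a ppm
WIDTH = 50        # class width .05 ppm, in thousandths of a ppm

# Range divided by the number of classes, rounded to two decimals.
suggested_width = round((max(readings) - min(readings)) / NUM_CLASSES, 2)


def class_counts(values, lower=LOWER, width=WIDTH, k=NUM_CLASSES):
    """Tally how many values fall in each of k equal-width classes."""
    counts = [0] * k
    for v in values:
        thousandths = round(v * 1000)          # e.g. .75 ppm -> 750
        idx = (thousandths - lower) // width   # index of the class
        if not 0 <= idx < k:
            raise ValueError(f"{v} lies outside the class intervals")
        counts[idx] += 1
    return counts


counts = class_counts(readings)
rel = [c / len(readings) for c in counts]  # relative frequencies

print("suggested class width:", suggested_width)
for i, (c, r) in enumerate(zip(counts, rel)):
    lo = (LOWER + i * WIDTH) / 1000
    hi = (LOWER + (i + 1) * WIDTH) / 1000
    print(f"{lo:.3f} to {hi:.3f}   frequency {c:2d}   relative frequency {r:.2f}")
```

Because the readings are recorded to two decimal places and the class limits to three, no reading can fall exactly on a boundary, so each reading belongs to exactly one class.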