6.17 6.19 Directory UMM :wiley:Public:college:statistics:Chase:

CHAPTER 6 a b 6.14 Category Frequency 6.15 In Exercises 6.16–6.19, determine whether each is a probability distribution. Give reasons for your answer.

6.16 6.17

6.18 6.19

6.20 1 2 1 2 2 5 3 3 1 1 1 2 4 3 6 5 1 1 1 2 2 3 6 5 1 5 1 36 3 36 5 36 7 36 9 36 11 36 4 4 4 4 4 Probability Distributions for Discrete Random Variables x P x x P x x P x x P x x P x 2 2 2 x x x x x x x x x Find the probability distribution of in parts a and b: A name is to be randomly selected from those who voted. Let 0 if a male is selected. Otherwise, let 1. A name is to be randomly selected from the female voting-age population. Let 0 if the selected female voted. Otherwise, let 1. For adults without known heart disease, a desirable total cholesterol level is less than 200. Between 200 and 239 is categorized as borderline, and above 239 is too high. The following data are from a sample of 825 participants in the Framingham Heart Study with no prior heart disease: Desirable 234 Borderline 297 Too high 294 Let equal 1, 2, or 3 for desirable, borderline, or too high, respectively. Find the probability distribution for . In a population of 3000 people, 1470 were classified as blood type O, 1140 as type A, 300 as type B, and the remaining as type AB. Let 0, 1, 2, or 3 if a person’s blood type is O, A, B, or AB, respectively. Find the probability distribution for . 2 3 3 4 1 1 9 6 5 2 11 13 The probability distribution for the number of electronic instruments produced per hour by an assembly line is as follows: 1 2 3 4 5 6 10 6.4 6.4 MEAN AND VARIANCE 6.21 Number Correct Frequency 1 4 1 2 1 4 4 4 x x N , x P x 4 4 Mean and Variance x P x x In Example 6.3, we discussed an experiment that consisted of a student randomly guessing at the answers to a two-question true–false quiz. The random variable was number of correct guesses 0, 1, or 2. The probability distribution was given by The population for this random variable would be the string of values of that would occur if this experiment were performed over and over by many students. The probability distribution provides a theoretical description of this population, and we ought to be able to use it to predict the mean and variance of all these values. We will try to simulate this population by considering what the data values should theoretically look like if the experiment is performed by a large number of students, say 40 000 students, and the number of correct guesses is recorded for each student. Examining the probability distribution , we see that in an ideal situation we should expect 0 to occur about of the time. Thus we expect about 10,000 zeros. Similarly, we expect about 20,000 ones and 10,000 twos. Therefore, we expect the mean of all these values to be 1 4 x Construct a probability histogram. A business executive was sensitive to the possibility of being accused of religious discrimination in his hiring practices. He found that of 600 employees, 204 were Catholic, 228 were Protestant, 72 were Jewish, with the rest classified as Other. Let 0, 1, 2, or 3 if a person’s religion is Catholic, Protestant, Jewish, or Other, respectively. Find the probability distribution and construct the probability histogram. 1 2 10,000 1 20,000 2 10,000 11 CHAPTER 6 4 ` ` 4 4 ` ` 4 ` ` 4 4 4 4 4 ` ` 4 4 4 4 x N , , , , , , , , , , P P P P P P x P x x x P x N N x P x mean x x P x Note: P x P x x x x P x x P x x P x variance x x P x standard deviation P x Probability Distributions for Discrete Random Variables Definition Definitions sum of all values 010 000 120 000 210 000 40 000 10 000 20 000 10 000 1 2 40 000 40 000 40 000 1 2 1 Notice that 1 1 1 2 2 2 So we can write 1 1 2 2 1 This is the sum of the values of for each distinct value of . We write this as which works out to be 1 for the random variable under discussion. The equation for the mean does not involve the population size anywhere. In fact, no matter how large the value of , we would be led by the preceding reasoning to the same formula for predicting the mean. Therefore, we will agree that the mean will be given by this formula. We generalize this idea in the following definition. Given a discrete random variable with probability distribution , the of the random variable is defined as is also referred to as the mean of the probability distribution . We could go through a discussion very similar to the one just given to discover a suitable formula for population variance in terms of . However, this would be unnecessarily repetitious. Perhaps the simplest thing to do is to observe that whereas the population mean is an average of values, the population variance is an average of values of . Then the same discussion that led to a formula for the mean, , would lead to a similar formula for . The only difference is that instead of summing the values of , we would sum values of . This leads to the following definitions. Given a discrete random variable with probability distribution , the of the random variable is defined as The is the square root of the variance. We also use the phrase variance or standard deviation of the distribution . 1 1 1 4 2 4 1 1 1 4 2 4 2 2 2 2 2 1 2 1 2 1 2 12 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 2 ? 2 ? 2 ? m m m m m m m s m s m s s 6.4 1 2 1 2 4 4 4 ` 4 ` 4 4 4 ` 4 ` 4 4 4 4 P x x s x x P x P P x P x P P x P x x P x In future work, we recommend using the alternative formula unless is a whole number. Mean and Variance EXAMPLE 6.7 Solution alternative formula for x P x 2 We often do not know enough about the population to find . In this case, we might obtain a sample from the population and use the sample mean to estimate , and to estimate . Consider the experiment of randomly selecting a family with one child. Let the number of girls 0 or 1. For this random variable, calculate the mean, variance, and standard deviation. Assuming a girl and boy are equally likely, the probability distribution is given by Now 1 1 1 By definition, the variance is 1 1 There is an that is a little simpler computationally, but which gives the same result. The formula is [ ] This formula tells us to square each distinct value of , multiply this by , and then add all these products. We then subtract from this sum. 2 2 1 1 1 2 2 2 2 2 2 2 1 1 2 2 1 1 1 1 1 4 2 4 2 4 1 1 4 2 2 2 2 2 1 13 ? ? ? ? ? 2 ? 2 ? 2 ? ? ? ? 2 m s m s m s s m m m CHAPTER 6 Table 6.1 [ ] 4 10 4 10 1 10 1 10 4 4 4 10 10 10 4 8 16 10 10 10 1 3 9 10 10 10 1 4 16 10 10 10 19 45 10 10 2 4 4 4 ` 4 ` 4 4 4 x P x P P x 4 Probability Distributions for Discrete Random Variables EXAMPLE 6.8 Solution EXAMPLE 6.9 x P x x P x x P x x P x x x P x 2 ? ? ? ? ? For the random variable of Example 6.7 number of girls in a one-child family, calculate the variance and standard deviation using the alternative formula. We saw in Example 6.7 that . Hence [ ] 1 1 1 Sometimes it is helpful to organize our computations in the form of a table. This is especially recommended if there are more than, say, four or five values for . We illustrate how such a table is used in the following example. Consider the following probability distribution: We will use Table 6.1 to find the mean, variance, and standard deviation. 1 2 2 2 2 2 2 2 1 2 1 1 1 1 2 2 4 4 1 1 4 2 x P x x P x 1 2 3 4 1 2 3 4 Totals 14 ? 2 ? ? 2 ? ? 2 m s m s 8 6.4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 x P x . x P x . . . . . . x x x expected value E x E x x P x x x x x x Mean and Variance Remark Expected value Definition EXAMPLE 6.10 Solution Expected Value x P x 2 1 9 [ ] 1 9 4 5 3 61 89 89 94 Note that the mean of the distribution in Example 6.9 is 1.9. This is different from the mean of the individual values 1, 2, 3, 4, which is 2.5. Occasionally, the mean of the distribution is the same as the mean of the individual values. One example is when the distribution is symmetric. See the distribution discussed at the beginning of this section. is just another name for the mean of a random variable. Using this name for the mean is often helpful, because many problems are stated in terms of expectation when the mean is involved. For example, an insurance company might be interested in its expected earnings per customer. This is really just the average. For a random variable , we define the of the random variable, denoted by , to be the mean of the random variable. Therefore, for a discrete random variable, An insurance company sells a life insurance policy with a face value of 1000 and a yearly premium of 20. If .1 of the policyholders can be expected to die in the course of a year, what would be the company’s expected earnings per policyholder in any year? Let the amount of money earned by the company from an arbitrary randomly selected policyholder in a year. If the policyholder survives the year, then 20. If the policyholder dies, the company must pay out 1000. This minus the 20 premium means that the company loses 980. In other words, on this policyholder the company earns 980 i.e., 980. The probability that the policyholder dies i.e., 980 is .001. So the probability that the policyholder lives i.e., 20 is .999. Therefore, the probability distribution is The expected earnings per policyholder are 19 10 2 2 2 2 45 10 20 .999 980 .001 15 ? ? 2 2 2 ? 2 2 2 m s m s m CHAPTER 6 † p 4 4 ` 4 4 4 4 4 4 4 4 ` 4 ` 4 4 E x x P x . . . . x . . expected daily winnings, E x E x x P x . . . . . . . Probability Distributions for Discrete Random Variables For a discussion of this and other casino games, see Koshy, T., Santa Monica, Calif.: Goodyear, 1979, Section 4.2. This application was given by John Allen Paulos in New York: Vintage Books, 1990, p. 47. The same problem is discussed for various sample sizes by Richard J. Larsen and Morris L. Marx in Englewood Cliffs, N.J.: Prentice-Hall, 1986, p. 158. Finite Mathematics and Calculus with Applications, Innumeracy—Mathematical Illiteracy and Its Conse- quences, Introduction to Mathematical Statistics and Its Applications, EXAMPLE 6.11 THE NUMBERS GAME Solution EXAMPLE 6.12 x P x p 2 20 999 980 001 19 98 98 19 The company can expect to earn 19 per policyholder on the average. This game consists of selecting a three-digit number. If you guess the right number, you are paid 700 for each dollar you bet. Each day there is a new winning number. If a person bets 1 each day for 1 year, how much money can he expect to win or lose? Let the amount won on a given day. If the correct number is guessed, then he wins 700 1 699. Do not forget that it costs 1 to bet. Otherwise, he loses, so his winnings in this case would be 1. There are 1000 possible three-digit numbers: 000 to 999. The probability of winning selecting the correct three digits is 001. The probability of losing is 999. The probability distribution is Now we find the : 1 999 699 001 999 699 30 Thus he can expect to lose 30 cents per day on the average. For the year, he can expect to lose 30365 109 50. A clinic tests blood for a disease that occurs in 1 of the population. The blood samples arrive in batches of 50. The clinic director wondered whether a portion of the blood from each vial should be taken and one test performed on the pooled samples. Then if the test were negative, all samples would be negative. If the test were positive, then each sample would be tested to see which were positive. Under 1 999 1000 1000 † 1 .999 699 .001 16 ? 2 2 2 2 ? 2 2 2 6.4 For the probability distributions in Exercises 6.22–6.25, find a the mean, b the variance, and c the standard deviation. Also, d describe each of the distributions as skewed to the left, skewed to the right, or symmetric. To organize your computations, it may help to use a table like Table 6.1.

6.22 6.23