Normal Approximation to the Binomial

6.5 Normal Approximation to the Binomial

Probabilities associated with binomial experiments are readily obtainable from the formula b(x; n, p) of the binomial distribution or from Table A.1 when n is small. In addition, binomial probabilities are readily available in many computer software packages. However, it is instructive to learn the relationship between the binomial and the normal distribution. In Section 5.5, we illustrated how the Poisson distribution can be used to approximate binomial probabilities when n is quite large and p is very close to 0 or 1. Both the binomial and the Poisson distributions are discrete. The first application of a continuous probability distribution to approximate probabilities over a discrete sample space was demonstrated in Example 6.12, where the normal curve was used. The normal distribution is often a good approximation to a discrete distribution when the latter takes on a symmetric bell shape. From a theoretical point of view, some distributions converge to the normal as their parameters approach certain limits. The normal distribution is a convenient approximating distribution because the cumulative distribution function is so easily tabled. The binomial distribution is nicely approximated by the normal in practical problems when one works with the cumulative distribution function. We now state a theorem that allows us to use areas under the normal curve to approximate binomial properties when n is sufficiently large.

Theorem 6.3: If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of

$$Z = \frac{X - np}{\sqrt{npq}},$$

as n → ∞, is the standard normal distribution n(z; 0, 1).

It turns out that the normal distribution with μ = np and σ² = np(1 − p) not only provides a very accurate approximation to the binomial distribution when n is large and p is not extremely close to 0 or 1 but also provides a fairly good approximation even when n is small and p is reasonably close to 1/2.
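As a rough numerical illustration of Theorem 6.3, the following Python sketch measures how far the exact binomial cumulative probabilities are from the standard normal values of the standardized variable as n grows. The helper names (`binom_cdf`, `norm_cdf`, `max_cdf_gap`) and the choice p = 0.4 with n = 10, 40, 160 are assumptions made for this example; they do not come from the text.

```python
import math

def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ binomial(n, p)."""
    q = 1.0 - p
    return sum(math.comb(n, k) * p**k * q**(n - k) for k in range(x + 1))

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_cdf_gap(n, p):
    """Largest |P(X <= x) - P(Z <= (x - np)/sqrt(npq))| over x = 0, ..., n."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return max(abs(binom_cdf(x, n, p) - norm_cdf((x - mu) / sigma))
               for x in range(n + 1))

# Illustrative sample sizes (hypothetical choice); no continuity correction here.
sizes = (10, 40, 160)
gaps = [max_cdf_gap(n, 0.4) for n in sizes]
for n, gap in zip(sizes, gaps):
    print(f"n = {n:3d}: largest gap = {gap:.4f}")
```

The gap shrinks as n increases, which is consistent with the limiting statement of the theorem. No continuity correction is applied in this sketch, so the gaps are larger than they would be with one.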

To illustrate the normal approximation to the binomial distribution, we first draw the histogram for b(x; 15, 0.4) and then superimpose the particular normal curve having the same mean and variance as the binomial variable X. Hence, we draw a normal curve with

μ = np = (15)(0.4) = 6 and σ² = npq = (15)(0.4)(0.6) = 3.6.

The histogram of b(x; 15, 0.4) and the corresponding superimposed normal curve, which is completely determined by its mean and variance, are illustrated in Figure 6.22.

Figure 6.22: Normal approximation of b(x; 15, 0.4).

The exact probability that the binomial random variable X assumes a given value x is equal to the area of the bar whose base is centered at x. For example, the exact probability that X assumes the value 4 is equal to the area of the rectangle with base centered at x = 4. Using Table A.1, we find this area to be

P (X = 4) = b(4; 15, 0.4) = 0.1268,

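which is approximately equal to the area of the shaded region under the normal curve between the two ordinates x₁ = 3.5 and x₂ = 4.5 in Figure 6.23. Converting to z values, with σ = √3.6 = 1.897, we have

$$z_1 = \frac{3.5 - 6}{1.897} = -1.32 \quad\text{and}\quad z_2 = \frac{4.5 - 6}{1.897} = -0.79.$$

Figure 6.23: Normal approximation of b(x; 15, 0.4) and $\sum_{x=7}^{9} b(x; 15, 0.4)$.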

If X is a binomial random variable and Z a standard normal variable, then

P(X = 4) = b(4; 15, 0.4) ≈ P(−1.32 < Z < −0.79)
= P(Z < −0.79) − P(Z < −1.32) = 0.2148 − 0.0934 = 0.1214.

This agrees very closely with the exact value of 0.1268.
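The comparison for P(X = 4) can be checked numerically. The sketch below uses only the Python standard library. The helper names `binom_pmf` and `norm_cdf` are hypothetical, and the z values are left unrounded, so the approximate probability may differ from the table-based 0.1214 in the third decimal place.

```python
import math

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 15, 0.4
mu = n * p                              # mean np = 6
sigma = math.sqrt(n * p * (1.0 - p))    # sqrt(3.6)

exact = binom_pmf(4, n, p)

# Area under the normal curve between x1 = 3.5 and x2 = 4.5.
z1 = (3.5 - mu) / sigma
z2 = (4.5 - mu) / sigma
approx = norm_cdf(z2) - norm_cdf(z1)

print(f"exact  P(X = 4) = {exact:.4f}")
print(f"normal approx   = {approx:.4f}")
```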

The normal approximation is most useful in calculating binomial sums for large values of n. Referring to Figure 6.23, we might be interested in the probability that X assumes a value from 7 to 9 inclusive. The exact probability is given by

$$P(7 \le X \le 9) = \sum_{x=0}^{9} b(x; 15, 0.4) - \sum_{x=0}^{6} b(x; 15, 0.4) = 0.9662 - 0.6098 = 0.3564,$$

which is equal to the sum of the areas of the rectangles with bases centered at x = 7, 8, and 9. For the normal approximation, we find the area of the shaded region under the curve between the ordinates x₁ = 6.5 and x₂ = 9.5 in Figure 6.23. The corresponding z values are

$$z_1 = \frac{6.5 - 6}{1.897} = 0.26 \quad\text{and}\quad z_2 = \frac{9.5 - 6}{1.897} = 1.85.$$

Now,

$$P(7 \le X \le 9) \approx P(0.26 < Z < 1.85) = P(Z < 1.85) - P(Z < 0.26) = 0.9678 - 0.6026 = 0.3652.$$

Once again, the normal curve approximation provides a value that agrees very closely with the exact value of 0.3564. The degree of accuracy, which depends on how well the curve fits the histogram, will increase as n increases. This is particularly true when p is not very close to 1/2 and the histogram is no longer symmetric. Figures 6.24 and 6.25 show the histograms for b(x; 6, 0.2) and b(x; 15, 0.2), respectively. It is evident that a normal curve would fit the histogram considerably better when n = 15 than when n = 6.
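The binomial sum over x = 7, 8, 9 and its normal-curve counterpart can be reproduced with a short Python sketch. The function names are assumptions introduced for the example. Because the z values are not rounded to two decimals, the approximate value comes out slightly different from the table-based figure.

```python
import math

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 15, 0.4
mu = n * p
sigma = math.sqrt(n * p * (1.0 - p))

# Exact: sum of the bar areas centered at x = 7, 8, 9.
exact = sum(binom_pmf(x, n, p) for x in range(7, 10))

# Approximate: area under the normal curve from 6.5 to 9.5.
approx = norm_cdf((9.5 - mu) / sigma) - norm_cdf((6.5 - mu) / sigma)

print(f"exact  = {exact:.4f}")
print(f"approx = {approx:.4f}")
```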

Figure 6.24: Histogram for b(x; 6, 0.2). Figure 6.25: Histogram for b(x; 15, 0.2).

In our illustrations of the normal approximation to the binomial, it became apparent that if we seek the area under the normal curve to the left of, say, x, it is more accurate to use x + 0.5. This is a correction to accommodate the fact that a discrete distribution is being approximated by a continuous distribution. The correction +0.5 is called a continuity correction. The foregoing discussion leads to the following formal normal approximation to the binomial.

Normal Approximation to the Binomial Distribution: Let X be a binomial random variable with parameters n and p. For large n, X has approximately a normal distribution with μ = np and σ² = npq = np(1 − p) and

$$P(X \le x) = \sum_{k=0}^{x} b(k; n, p) \approx \text{area under normal curve to the left of } x + 0.5 = P\left(Z \le \frac{x + 0.5 - np}{\sqrt{npq}}\right),$$

and the approximation will be good if np and n(1 − p) are greater than or equal to 5.
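The rule just stated translates directly into a small function. The following is a minimal sketch, assuming hypothetical names `normal_approx_cdf` and `rule_of_thumb_ok`; the second function simply encodes the condition that np and n(1 − p) are both at least 5.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approx_cdf(x, n, p):
    """Approximate P(X <= x) for X ~ binomial(n, p).

    Uses the area under the normal curve to the left of x + 0.5
    (continuity correction), with mu = np and sigma^2 = np(1 - p).
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return norm_cdf((x + 0.5 - mu) / sigma)

def rule_of_thumb_ok(n, p):
    """True when both np and n(1 - p) are at least 5."""
    return n * p >= 5 and n * (1.0 - p) >= 5

print(rule_of_thumb_ok(100, 0.4))               # both np and n(1 - p) are large
print(rule_of_thumb_ok(10, 0.05))               # np = 0.5, too small
print(f"{normal_approx_cdf(9, 15, 0.4):.4f}")   # approximate P(X <= 9)
```

Keeping the check separate from the computation lets a caller decide whether to fall back on exact binomial sums when the condition fails.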

As we indicated earlier, the quality of the approximation is quite good for large n. If p is close to 1/2, a moderate or small sample size will be sufficient for a reasonable approximation. We offer Table 6.1 as an indication of the quality of the approximation. Both the normal approximation and the true binomial cumulative probabilities are given. Notice that at p = 0.05 and p = 0.10, the approximation is fairly crude for n = 10. However, even for n = 10, note the improvement for p = 0.50. On the other hand, when p is fixed at p = 0.05, note the improvement of the approximation as we go from n = 20 to n = 100.

Table 6.1: Normal Approximation and True Cumulative Binomial Probabilities

[Table body not recovered. Columns: r, Binomial, Normal. Panels: p = 0.05, n = 10; p = 0.10, n = 10; p = 0.50, n = 10; and p = 0.05 with n from 20 to 100.]
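A comparison of this kind can be regenerated numerically. The Python sketch below prints exact and approximate cumulative probabilities for n = 10 at the three values of p mentioned in the text. It is a recomputation under stated assumptions (continuity correction of +0.5, hypothetical helper names), not a transcription of the printed table.

```python
import math

def binom_cdf(r, n, p):
    """Exact P(X <= r) for X ~ binomial(n, p)."""
    q = 1.0 - p
    return sum(math.comb(n, k) * p**k * q**(n - k) for k in range(r + 1))

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approx_cdf(r, n, p):
    """Normal approximation to P(X <= r) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return norm_cdf((r + 0.5 - mu) / sigma)

def max_abs_error(n, p):
    """Largest absolute difference between exact and approximate CDF."""
    return max(abs(binom_cdf(r, n, p) - normal_approx_cdf(r, n, p))
               for r in range(n + 1))

for p in (0.05, 0.10, 0.50):
    print(f"p = {p:.2f}, n = 10")
    print("  r  Binomial  Normal")
    for r in range(11):
        print(f"{r:3d}  {binom_cdf(r, 10, p):8.4f}  "
              f"{normal_approx_cdf(r, 10, p):6.4f}")

# Effect of increasing n with p fixed at 0.05.
for n in (20, 100):
    print(f"p = 0.05, n = {n}: max abs error = {max_abs_error(n, 0.05):.4f}")
```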

Example 6.15: The probability that a patient recovers from a rare blood disease is 0.4. If 100 people are known to have contracted this disease, what is the probability that fewer than 30 survive?

Solution: Let the binomial variable X represent the number of patients who survive. Since n = 100, we should obtain fairly accurate results using the normal-curve approximation with

$$\mu = np = (100)(0.4) = 40 \quad\text{and}\quad \sigma = \sqrt{npq} = \sqrt{(100)(0.4)(0.6)} = 4.899.$$

To obtain the desired probability, we have to find the area to the left of x = 29.5. The z value corresponding to 29.5 is

$$z = \frac{29.5 - 40}{4.899} = -2.14,$$

and the probability of fewer than 30 of the 100 patients surviving is given by the shaded region in Figure 6.26. Hence,

P (X < 30) ≈ P (Z < −2.14) = 0.0162.
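The arithmetic of Example 6.15 can be verified with a few lines of Python. This is a sketch with hypothetical helper names. It also computes the exact binomial probability, which the example itself does not report, purely for comparison.

```python
import math

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 100, 0.4
mu = n * p
sigma = math.sqrt(n * p * (1.0 - p))

# "Fewer than 30" means X <= 29, so the boundary is x = 29.5.
z = (29.5 - mu) / sigma
approx = norm_cdf(z)

# Exact P(X < 30), added here only as a cross-check.
exact = sum(binom_pmf(x, n, p) for x in range(30))

print(f"sigma  = {sigma:.3f}")
print(f"z      = {z:.2f}")
print(f"approx = {approx:.4f}")
print(f"exact  = {exact:.4f}")
```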

Figure 6.26: Area for Example 6.15. Figure 6.27: Area for Example 6.16.

Example 6.16: A multiple-choice quiz has 200 questions, each with 4 possible answers of which only 1 is correct. What is the probability that sheer guesswork yields from 25 to 30 correct answers for the 80 of the 200 problems about which the student has no knowledge?

Solution: The probability of guessing a correct answer for each of the 80 questions is p = 1/4. If X represents the number of correct answers resulting from guesswork, then

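$$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, 1/4).$$

Using the normal curve approximation with

$$\mu = np = (80)(1/4) = 20 \quad\text{and}\quad \sigma = \sqrt{npq} = \sqrt{(80)(1/4)(3/4)} = 3.873,$$

we need the area between x₁ = 24.5 and x₂ = 30.5. The corresponding z values are

$$z_1 = \frac{24.5 - 20}{3.873} = 1.16 \quad\text{and}\quad z_2 = \frac{30.5 - 20}{3.873} = 2.71.$$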

The probability of correctly guessing from 25 to 30 questions is given by the shaded region in Figure 6.27. From Table A.3 we find that

$$P(25 \le X \le 30) = \sum_{x=25}^{30} b(x; 80, 0.25) \approx P(1.16 < Z < 2.71) = P(Z < 2.71) - P(Z < 1.16) = 0.9966 - 0.8770 = 0.1196.$$
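Example 6.16 involves an interval a ≤ X ≤ b, so the continuity correction is applied at both ends: a − 0.5 on the left and b + 0.5 on the right. The sketch below wraps that in a function; the name `normal_approx_interval` is an assumption for this example, and the exact binomial sum is included only as a cross-check.

```python
import math

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approx_interval(a, b, n, p):
    """Approximate P(a <= X <= b) and return (probability, z1, z2)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    z1 = (a - 0.5 - mu) / sigma   # left boundary, shifted down by 0.5
    z2 = (b + 0.5 - mu) / sigma   # right boundary, shifted up by 0.5
    return norm_cdf(z2) - norm_cdf(z1), z1, z2

approx, z1, z2 = normal_approx_interval(25, 30, 80, 0.25)
exact = sum(binom_pmf(x, 80, 0.25) for x in range(25, 31))

print(f"z1 = {z1:.2f}, z2 = {z2:.2f}")
print(f"approx = {approx:.4f}")
print(f"exact  = {exact:.4f}")
```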

Exercises

6.24 A coin is tossed 400 times. Use the normal curve approximation to find the probability of obtaining
(a) between 185 and 210 heads inclusive;
(b) exactly 205 heads;
(c) fewer than 176 or more than 227 heads.

6.25 A process for manufacturing an electronic component yields items of which 1% are defective. A quality control plan is to select 100 items from the process, and if none are defective, the process continues. Use the normal approximation to the binomial to find
(a) the probability that the process continues given the sampling plan described;
(b) the probability that the process continues even if the process has gone bad (i.e., if the frequency of defective components has shifted to 5.0% defective).

6.26 A process yields 10% defective items. If 100 items are randomly selected from the process, what is the probability that the number of defectives
(a) exceeds 13?
(b) is less than 8?

6.27 The probability that a patient recovers from a delicate heart operation is 0.9. Of the next 100 patients having this operation, what is the probability that
(a) between 84 and 95 inclusive survive?
(b) fewer than 86 survive?

6.28 Researchers at George Washington University and the National Institutes of Health claim that approximately 75% of people believe “tranquilizers work very well to make a person more calm and relaxed.” Of the next 80 people interviewed, what is the probability that
(a) at least 50 are of this opinion?
(b) at most 56 are of this opinion?

6.29 If 20% of the residents in a U.S. city prefer a white telephone over any other color available, what is the probability that among the next 1000 telephones installed in that city
(a) between 170 and 185 inclusive will be white?
(b) at least 210 but not more than 225 will be white?

6.30 A drug manufacturer claims that a certain drug cures a blood disease, on the average, 80% of the time. To check the claim, government testers use the drug on a sample of 100 individuals and decide to accept the claim if 75 or more are cured.
(a) What is the probability that the claim will be rejected when the cure probability is, in fact, 0.8?
(b) What is the probability that the claim will be accepted by the government when the cure probability is as low as 0.7?

6.31 One-sixth of the male freshmen entering a large state school are out-of-state students. If the students are assigned at random to dormitories, 180 to a building, what is the probability that in a given dormitory at least one-fifth of the students are from out of state?

6.32 A pharmaceutical company knows that approximately 5% of its birth-control pills have an ingredient that is below the minimum strength, thus rendering the pill ineffective. What is the probability that fewer than 10 in a sample of 200 pills will be ineffective?

6.33 Statistics released by the National Highway Traffic Safety Administration and the National Safety Council show that on an average weekend night, 1 out of every 10 drivers on the road is drunk. If 400 drivers are randomly checked next Saturday night, what is the probability that the number of drunk drivers will be
(a) less than 32?
(b) more than 49?
(c) at least 35 but less than 47?

6.34 A pair of dice is rolled 180 times. What is the probability that a total of 7 occurs
(a) at least 25 times?
(b) between 33 and 41 times inclusive?
(c) exactly 30 times?

6.35 A company produces component parts for an engine. Parts specifications suggest that 95% of items meet specifications. The parts are shipped to customers in lots of 100.
(a) What is the probability that more than 2 items in a given lot will be defective?
(b) What is the probability that more than 10 items in a lot will be defective?

6.36 A common practice of airline companies is to sell more tickets for a particular flight than there are seats on the plane, because customers who buy tickets do not always show up for the flight. Suppose that the percentage of no-shows at flight time is 2%. For a particular flight with 197 seats, a total of 200 tickets were sold. What is the probability that the airline overbooked this flight?

6.37 The serum cholesterol level X in 14-year-old boys has approximately a normal distribution with mean 170 and standard deviation 30.
(a) Find the probability that the serum cholesterol level of a randomly chosen 14-year-old boy exceeds 230.
(b) In a middle school there are 300 14-year-old boys. Find the probability that at least 8 boys have a serum cholesterol level that exceeds 230.

6.38 A telemarketing company has a special letter-opening machine that opens and removes the contents of an envelope. If the envelope is fed improperly into the machine, the contents of the envelope may not be removed or may be damaged. In this case, the machine is said to have “failed.”
(a) If the machine has a probability of failure of 0.01, what is the probability of more than 1 failure occurring in a batch of 20 envelopes?
(b) If the probability of failure of the machine is 0.01 and a batch of 500 envelopes is to be opened, what is the probability that more than 8 failures will occur?
