3.4 The Binomial Probability Distribution
There are many experiments that conform either exactly or approximately to the following list of requirements:
1. The experiment consists of a sequence of n smaller experiments called trials, where n is fixed in advance of the experiment.
2. Each trial can result in one of the same two possible outcomes (dichotomous trials), which we generically denote by success (S) and failure (F). The assignment of the S and F labels to the two sides of the dichotomy is arbitrary.
3. The trials are independent, so that the outcome on any particular trial does not influence the outcome on any other trial.
4. The probability of success P(S) is constant from trial to trial; we denote this probability by p.
118 Chapter 3 Discrete random Variables and probability Distributions
DEFINITION
An experiment for which Conditions 1–4 (a fixed number of dichotomous, independent, homogenous trials) are satisfied is called a binomial experiment.
Example 3.27 Consider each of the next n vehicles undergoing an emissions test, and let S denote a vehicle that passes the test and F denote one that fails to pass. Then this experiment satisfies Conditions 1–4. Tossing a thumbtack n times, with S = point up and F = point down, also results in a binomial experiment, as would the experiment in which the gender (S for female and F for male) is determined for each of the next n children born at a particular hospital.
■
Many experiments involve a sequence of independent trials for which there are more than two possible outcomes on any one trial. A binomial experiment can then be created by dividing the possible outcomes into two groups.
Example 3.28 The color of pea seeds is determined by a single genetic locus. If the two alleles at this locus are AA or Aa (the genotype), then the pea will be yellow (the phenotype), and if the allele is aa, the pea will be green. Suppose we pair off 20 Aa seeds and cross the two seeds in each of the ten pairs to obtain ten new genotypes. Call each new genotype a success S if it is aa and a failure otherwise. Then with this identification of S and F, the experiment is binomial with $n = 10$ and $p = P(aa \text{ genotype})$. If each member of the pair is equally likely to contribute a or A, then $p = P(a) \cdot P(a) = (.5)(.5) = .25$.
■
Example 3.29 The pool of prospective jurors for a certain case consists of 50 individuals, of whom 35 are employed. Suppose that 6 of these individuals are randomly selected one by one to sit in the jury box for initial questioning by lawyers for the defense and the prosecution. Label the ith person selected (the ith trial) as a success S if he or she is employed and a failure F otherwise. Then
$$P(S \text{ on first trial}) = \frac{35}{50} = .70$$
and
$$\begin{aligned} P(S \text{ on second trial}) &= P(SS) + P(FS) \\ &= P(\text{second } S \mid \text{first } S)P(\text{first } S) + P(\text{second } S \mid \text{first } F)P(\text{first } F) \\ &= \frac{34}{49} \cdot \frac{35}{50} + \frac{35}{49} \cdot \frac{15}{50} = \frac{35}{50}\left(\frac{34}{49} + \frac{15}{49}\right) = \frac{35}{50} = .70 \end{aligned}$$
Similarly, it can be shown that $P(S \text{ on } i\text{th trial}) = .70$ for $i = 3, 4, 5, 6$. However, if the first five individuals selected are all S, then only 30 S’s remain for the sixth selection. Thus,
$$P(S \text{ on sixth trial} \mid SSSSS) = 30/45 = .667$$
whereas
$$P(S \text{ on sixth trial} \mid FFFFF) = 35/45 = .778$$
The experiment is not binomial because the trials are not independent. In general, if sampling is without replacement, the experiment will not yield independent trials.
Now consider a large county that has 500,000 individuals in its jury pool, of whom 400,000 are employed. A sample of 10 individuals from the pool is chosen without replacement. Again the ith trial is regarded as a success S if the ith individual is employed. The important difference between this and the previous scenario is that the size of the population being sampled is very large relative to the sample size. In this case
$$P(S \text{ on trial 10} \mid S \text{ on first 9}) = \frac{399{,}991}{499{,}990} \approx .80000$$
and
$$P(S \text{ on trial 10} \mid F \text{ on first 9}) = \frac{400{,}000}{499{,}990} \approx .80002$$
These calculations suggest that although the trials are not exactly independent, the conditional probabilities differ so slightly from one another that for practical purposes the trials can be regarded as independent with constant $P(S) = .8$. Thus, to a very good approximation, the experiment is binomial with $n = 10$ and $p = .8$.
■
We will use the following rule of thumb in deciding whether a “without-replacement” experiment can be treated as being binomial.
RULE Consider sampling without replacement from a dichotomous population of size N. If the sample size (number of trials) n is at most 5% of the population size, the experiment can be analyzed as though it were a binomial experiment.
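One way to see the rule at work is to compare binomial probabilities with the exact without-replacement probabilities for the two jury-pool scenarios. The sketch below uses the standard counting formula for sampling without replacement (the hypergeometric pmf, which this section does not develop); the function names and the "largest gap" summary are illustrative assumptions.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """Exact P(X = x) when n items are drawn without replacement
    from a population of N that contains M successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """Binomial P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_abs_gap(n, M, N):
    """Largest difference between exact and binomial probabilities over x = 0..n."""
    p = M / N
    return max(abs(hypergeom_pmf(x, n, M, N) - binom_pmf(x, n, p))
               for x in range(n + 1))

gap_small_pool = max_abs_gap(6, 35, 50)              # n/N = .12, rule not satisfied
gap_large_pool = max_abs_gap(10, 400_000, 500_000)   # n/N far below .05

print(f"n/N = .12: largest gap = {gap_small_pool:.4f}")
print(f"n/N = 10/500,000: largest gap = {gap_large_pool:.6f}")
```

The gap for the small pool is visible in the second decimal place, while the gap for the large pool is negligible.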
By “analyzed,” we mean that probabilities based on the binomial experiment assumptions will be quite close to the actual “without-replacement” probabilities, which are typically more difficult to calculate. In the first scenario of Example 3.29, $n/N = 6/50 = .12 > .05$, so the binomial experiment is not a good approximation, but in the second scenario, $n/N = 10/500{,}000 \ll .05$.
The Binomial Random Variable and Distribution
In most binomial experiments, it is the total number of S’s, rather than knowledge of exactly which trials yielded S’s, that is of interest.
DEFINITION
The binomial random variable X associated with a binomial experiment consisting of n trials is defined as
X = the number of S’s among the n trials
Suppose, for example, that $n = 3$. Then there are eight possible outcomes for the experiment:
SSS SSF SFS SFF FSS FSF FFS FFF
From the definition of X, $X(SSF) = 2$, $X(SFF) = 1$, and so on. Possible values for X in an n-trial experiment are $x = 0, 1, 2, \ldots, n$. We will often write $X \sim \text{Bin}(n, p)$ to indicate that X is a binomial rv based on n trials with success probability p.
NOTATION
Because the pmf of a binomial rv X depends on the two parameters n and p, we denote the pmf by b(x; n, p).
Consider first the case $n = 4$ for which each outcome, its probability, and corresponding x value are displayed in Table 3.1. For example,
$$\begin{aligned} P(SSFS) &= P(S) \cdot P(S) \cdot P(F) \cdot P(S) && \text{(independent trials)} \\ &= p \cdot p \cdot (1 - p) \cdot p && \text{(constant } P(S)\text{)} \\ &= p^3 \cdot (1 - p) \end{aligned}$$
Table 3.1 Outcomes and Probabilities for a Binomial Experiment with Four Trials
Outcome  x  Probability       Outcome  x  Probability
SSSS     4  $p^4$             FSSS     3  $p^3(1-p)$
SSSF     3  $p^3(1-p)$        FSSF     2  $p^2(1-p)^2$
SSFS     3  $p^3(1-p)$        FSFS     2  $p^2(1-p)^2$
SSFF     2  $p^2(1-p)^2$      FSFF     1  $p(1-p)^3$
SFSS     3  $p^3(1-p)$        FFSS     2  $p^2(1-p)^2$
SFSF     2  $p^2(1-p)^2$      FFSF     1  $p(1-p)^3$
SFFS     2  $p^2(1-p)^2$      FFFS     1  $p(1-p)^3$
SFFF     1  $p(1-p)^3$        FFFF     0  $(1-p)^4$
In this special case, we wish b(x; 4, p) for $x = 0, 1, 2, 3,$ and $4$. For b(3; 4, p), let’s identify which of the 16 outcomes yield an x value of 3 and sum the probabilities associated with each such outcome:
$$b(3; 4, p) = P(FSSS) + P(SFSS) + P(SSFS) + P(SSSF) = 4p^3(1 - p)$$
There are four outcomes with $X = 3$ and each has probability $p^3(1 - p)$ (the order of S’s and F’s is not important, only the number of S’s), so
$$b(3; 4, p) = \left\{ \text{number of outcomes with } X = 3 \right\} \cdot \left\{ \text{probability of any particular outcome with } X = 3 \right\}$$
Similarly, $b(2; 4, p) = 6p^2(1 - p)^2$, which is also the product of the number of outcomes with $X = 2$ and the probability of any such outcome. In general,
$$b(x; n, p) = \left\{ \text{number of sequences of length } n \text{ consisting of } x \text{ S's} \right\} \cdot \left\{ \text{probability of any particular such sequence} \right\}$$
Since the ordering of S’s and F’s is not important, the second factor in the previous equation is $p^x(1 - p)^{n-x}$ (e.g., the first x trials resulting in S and the last $n - x$ resulting in F). The first factor is the number of ways of choosing x of the n trials to be S’s, that is, the number of combinations of size x that can be constructed from n distinct objects (trials here).
THEOREM
$$b(x; n, p) = \begin{cases} \dbinom{n}{x} p^x (1 - p)^{n-x} & x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$
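The counting argument behind the formula can be checked by brute force: list every sequence of S’s and F’s, keep those with exactly x S’s, and add up their probabilities. The sketch below does this for small n and compares the result with the closed form; the function names are illustrative assumptions, and p = .3 is an arbitrary test value.

```python
from itertools import product
from math import comb, isclose

def binom_pmf(x, n, p):
    """b(x; n, p) from the closed-form expression; 0 outside x = 0, 1, ..., n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

def pmf_by_enumeration(x, n, p):
    """Sum the probabilities of all length-n S/F sequences containing exactly x S's."""
    total = 0.0
    for outcome in product("SF", repeat=n):
        if outcome.count("S") == x:
            # Independent trials with constant P(S) = p: every such sequence
            # has the same probability, regardless of the order of S's and F's.
            total += p**x * (1 - p)**(n - x)
    return total

n, p = 4, 0.3
for x in range(n + 1):
    print(x, round(binom_pmf(x, n, p), 4), round(pmf_by_enumeration(x, n, p), 4))
```

Enumeration visits all $2^n$ sequences, so it is only practical for small n; the closed form avoids that cost.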
Example 3.30 Each of six randomly selected cola drinkers is given a glass containing cola S and one containing cola F. The glasses are identical in appearance except for a code on the bottom to identify the cola. Suppose there is actually no tendency among cola drinkers to prefer one cola to the other. Then p = P(a selected individual prefers S) = .5, so with X = the number among the six who prefer S, $X \sim \text{Bin}(6, .5)$.
Thus
$$P(X = 3) = b(3; 6, .5) = \binom{6}{3}(.5)^3(.5)^3 = 20(.5)^6 = .313$$
The probability that at least three prefer S is
$$P(3 \le X) = \sum_{x=3}^{6} b(x; 6, .5) = \sum_{x=3}^{6} \binom{6}{x}(.5)^x(.5)^{6-x} = .656$$
and the probability that at most one prefers S is
$$P(X \le 1) = \sum_{x=0}^{1} b(x; 6, .5) = .109$$
■
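The three probabilities in Example 3.30 can be reproduced directly from the pmf formula. This is a minimal sketch using only the Python standard library; the function and variable names are illustrative assumptions.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 6, 0.5  # six cola drinkers, no preference for either cola

p_exactly_3 = binom_pmf(3, n, p)
p_at_least_3 = sum(binom_pmf(x, n, p) for x in range(3, n + 1))
p_at_most_1 = sum(binom_pmf(x, n, p) for x in range(0, 2))

print(p_exactly_3, p_at_least_3, p_at_most_1)
```

With p = .5 every sequence of six preferences has probability 1/64, so the three results are exactly 20/64, 42/64, and 7/64, which agree with the rounded figures in the example.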
Using Binomial Tables
Even for a relatively small value of n, the computation of binomial probabilities can be tedious. Appendix Table A.1 tabulates the cdf $F(x) = P(X \le x)$ for $n = 5, 10, 15, 20, 25$ in combination with selected values of p corresponding to different columns of the table. Various other probabilities can then be calculated using the proposition on cdf’s from Section 3.2. A table entry of 0 signifies only that the probability is 0 to three significant digits since all table entries are actually positive.
NOTATION
For $X \sim \text{Bin}(n, p)$, the cdf will be denoted by
$$B(x; n, p) = P(X \le x) = \sum_{y=0}^{x} b(y; n, p) \qquad x = 0, 1, \ldots, n$$
Example 3.31 Suppose that 20% of all copies of a particular textbook fail a certain binding strength test. Let X denote the number among 15 randomly selected copies that fail the test. Then X has a binomial distribution with $n = 15$ and $p = .2$.
1. The probability that at most 8 fail the test is
$$P(X \le 8) = \sum_{y=0}^{8} b(y; 15, .2) = B(8; 15, .2)$$
which is the entry in the $x = 8$ row and the $p = .2$ column of the $n = 15$ binomial table. From Appendix Table A.1, the probability is $B(8; 15, .2) = .999$.
2. The probability that exactly 8 fail is
$$P(X = 8) = P(X \le 8) - P(X \le 7) = B(8; 15, .2) - B(7; 15, .2)$$
which is the difference between two consecutive entries in the $p = .2$ column. The result is $.999 - .996 = .003$.
3. The probability that at least 8 fail is
$$P(X \ge 8) = 1 - P(X \le 7) = 1 - B(7; 15, .2) = 1 - \left(\text{entry in } x = 7 \text{ row of } p = .2 \text{ column}\right) = 1 - .996 = .004$$
4. Finally, the probability that between 4 and 7, inclusive, fail is
$$P(4 \le X \le 7) = P(X = 4, 5, 6, \text{ or } 7) = P(X \le 7) - P(X \le 3) = B(7; 15, .2) - B(3; 15, .2) = .996 - .648 = .348$$
Notice that this latter probability is the difference between entries in the $x = 7$ and $x = 3$ rows, not the $x = 7$ and $x = 4$ rows.
■
Statistical software packages such as Minitab and R will provide the pmf or cdf almost instantaneously upon request for any value of p and n ranging from 2 up into the millions. There is also an R command for calculating the probability that X lies in some interval.
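When a table is not at hand, the cdf can be computed by summing the pmf, and the four probabilities of Example 3.31 follow from differences of cdf values. The sketch below is one way to do this in plain Python; `binom_cdf` is an illustrative name, not a function from any particular statistics package.

```python
from math import comb

def binom_pmf(y, n, p):
    """b(y; n, p) for y = 0, 1, ..., n."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x), obtained by summing the pmf from y = 0 to x."""
    return sum(binom_pmf(y, n, p) for y in range(0, x + 1))

n, p = 15, 0.2  # 15 textbooks, each failing the binding test with probability .2

at_most_8 = binom_cdf(8, n, p)
exactly_8 = binom_cdf(8, n, p) - binom_cdf(7, n, p)
at_least_8 = 1 - binom_cdf(7, n, p)
# "Between 4 and 7, inclusive" subtracts the cdf at 3, not at 4.
between_4_and_7 = binom_cdf(7, n, p) - binom_cdf(3, n, p)

for label, value in [("P(X <= 8)", at_most_8), ("P(X = 8)", exactly_8),
                     ("P(X >= 8)", at_least_8), ("P(4 <= X <= 7)", between_4_and_7)]:
    print(f"{label} = {value:.4f}")
```

Unrounded values can differ from table-based answers in the last digit, because the table entries are themselves rounded to three places before being subtracted.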
Example 3.32 An electronics manufacturer claims that at most 10% of its power supply units need service during the warranty period. To investigate this claim, technicians at a testing laboratory purchase 20 units and subject each one to accelerated testing to simulate use during the warranty period. Let p denote the probability that a power supply unit needs repair during the period (the proportion of all such units that need repair). The laboratory technicians must decide whether the data resulting from the experiment supports the claim that $p \le .10$. Let X denote the number among the 20 sampled that need repair, so $X \sim \text{Bin}(20, p)$. Consider the decision rule:
Reject the claim that $p \le .10$ in favor of the conclusion that $p > .10$ if $x \ge 5$ (where x is the observed value of X), and consider the claim plausible if $x \le 4$.
The probability that the claim is rejected when $p = .10$ (an incorrect conclusion) is
$$P(X \ge 5 \text{ when } p = .10) = 1 - B(4; 20, .1) = 1 - .957 = .043$$
The probability that the claim is not rejected when $p = .20$ (a different type of incorrect conclusion) is
$$P(X \le 4 \text{ when } p = .2) = B(4; 20, .2) = .630$$
The first probability is rather small, but the second is intolerably large. When $p = .20$, so that the manufacturer has grossly understated the percentage of units that need service, and the stated decision rule is used, 63% of all samples will result in the manufacturer’s claim being judged plausible!
One might think that the probability of this second type of erroneous conclusion could be made smaller by changing the cutoff value 5 in the decision rule to something else. However, although replacing 5 by a smaller number would yield a probability smaller than .630, the other probability would then increase. The only way to make both “error probabilities” small is to base the decision rule on an experiment involving many more units.
■
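The trade-off between the two error probabilities in Example 3.32 can be tabulated for several cutoff values. The following is a sketch under the example's setup (n = 20, claimed p = .10, actual p = .20); the function `error_probabilities` and the range of cutoffs shown are illustrative assumptions.

```python
from math import comb

def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(0, x + 1))

def error_probabilities(cutoff, n=20, p_claimed=0.10, p_actual=0.20):
    """For the rule 'reject the claim if x >= cutoff', return
    (P(reject when p = p_claimed), P(do not reject when p = p_actual))."""
    reject_when_claim_true = 1 - binom_cdf(cutoff - 1, n, p_claimed)
    accept_when_claim_false = binom_cdf(cutoff - 1, n, p_actual)
    return reject_when_claim_true, accept_when_claim_false

for cutoff in (3, 4, 5, 6):
    wrongly_reject, wrongly_accept = error_probabilities(cutoff)
    print(f"cutoff {cutoff}: P(reject | p=.10) = {wrongly_reject:.3f}, "
          f"P(not reject | p=.20) = {wrongly_accept:.3f}")
```

Lowering the cutoff shrinks the second probability but inflates the first, which is the tension the example describes; with n fixed at 20 no cutoff makes both small.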
The Mean and Variance of X
For $n = 1$, the binomial distribution becomes the Bernoulli distribution. From Example 3.18, the mean value of a Bernoulli variable is $\mu = p$, so the expected number of S’s on any single trial is p. Since a binomial experiment consists of n trials, intuition suggests that for $X \sim \text{Bin}(n, p)$, $E(X) = np$, the product of the number of trials and the probability of success on a single trial. The expression for V(X) is not so intuitive.
PROPOSITION
If $X \sim \text{Bin}(n, p)$, then $E(X) = np$, $V(X) = np(1 - p) = npq$, and $\sigma_X = \sqrt{npq}$ (where $q = 1 - p$).
Thus, calculating the mean and variance of a binomial rv does not necessitate evaluating summations. The proof of the result for E(X) is sketched in Exercise 64.
Example 3.33 If 75% of all purchases at a certain store are made with a credit card and X is the number among ten randomly selected purchases made with a credit card, then $X \sim \text{Bin}(10, .75)$. Thus $E(X) = np = (10)(.75) = 7.5$, $V(X) = npq = 10(.75)(.25) = 1.875$, and $\sigma = \sqrt{1.875} = 1.37$. Again, even though X can take on only integer values, E(X) need not be an integer. If we perform a large number of independent binomial experiments, each with $n = 10$ trials and $p = .75$, then the average number of S’s per experiment will be close to 7.5.
The probability that X is within 1 standard deviation of its mean value is $P(7.5 - 1.37 \le X \le 7.5 + 1.37) = P(6.13 \le X \le 8.87) = P(X = 7 \text{ or } 8) = .532$.
■
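The shortcut formulas can be compared against the defining summations for the setting of Example 3.33. This is a minimal sketch, with variable names chosen here for illustration; it computes the mean and variance from the pmf and then the probability of falling within one standard deviation of the mean.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """b(x; n, p) for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.75
xs = range(n + 1)

# Definitions: E(X) = sum of x * b(x), V(X) = sum of (x - mean)^2 * b(x).
mean_by_sum = sum(x * binom_pmf(x, n, p) for x in xs)
var_by_sum = sum((x - mean_by_sum) ** 2 * binom_pmf(x, n, p) for x in xs)

# Shortcut formulas from the proposition.
mean_formula = n * p
var_formula = n * p * (1 - p)
sd = sqrt(var_formula)

# Probability that X lands within one standard deviation of its mean.
within_one_sd = sum(binom_pmf(x, n, p) for x in xs if abs(x - mean_formula) <= sd)

print(mean_by_sum, mean_formula)
print(var_by_sum, var_formula)
print(round(sd, 2), round(within_one_sd, 3))
```

The summations agree with np and np(1 − p) up to floating-point rounding, and only x = 7 and x = 8 fall within one standard deviation of 7.5.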
EXERCISES Section 3.4 (46–67)
46. Compute the following binomial probabilities directly from the formula for b(x; n, p):
a. b(3; 8, .35)
b. b(5; 8, .6)
c. $P(3 \le X \le 5)$ when $n = 7$ and $p = .6$
d. $P(1 \le X)$ when $n = 9$ and $p = .1$
47. The article “Should You Report That Fender-Bender?” (Consumer Reports, Sept. 2013: 15) reported that 7 in 10 auto accidents involve a single vehicle (the article recommended always reporting to the insurance company an accident involving multiple vehicles). Suppose 15 accidents are randomly selected. Use Appendix Table A.1 to answer each of the following questions.
a. What is the probability that at most 4 involve a single vehicle?
b. What is the probability that exactly 4 involve a single vehicle?
c. What is the probability that exactly 6 involve multiple vehicles?
d. What is the probability that between 2 and 4, inclusive, involve a single vehicle?
e. What is the probability that at least 2 involve a single vehicle?
f. What is the probability that exactly 4 involve a single vehicle and the other 11 involve multiple vehicles?
48. NBC News reported on May 2, 2013, that 1 in 20 children in the United States have a food allergy of some sort. Consider selecting a random sample of 25 children and let X be the number in the sample who have a food allergy. Then $X \sim \text{Bin}(25, .05)$.
a. Determine both $P(X \le 3)$ and $P(X < 3)$.
b. Determine $P(X \ge 4)$.
c. Determine $P(1 \le X \le 3)$.
d. What are $E(X)$ and $\sigma_X$?
e. In a sample of 50 children, what is the probability that none has a food allergy?
49. A company that produces fine crystal knows from experience that 10% of its goblets have cosmetic flaws and must be classified as “seconds.”
a. Among six randomly selected goblets, how likely is it that only one is a second?
b. Among six randomly selected goblets, what is the probability that at least two are seconds?
c. If goblets are examined one by one, what is the probability that at most five must be selected to find four that are not seconds?
50. A particular telephone number is used to receive both voice calls and fax messages. Suppose that 25% of the incoming calls involve fax messages, and consider a sample of 25 incoming calls. What is the probability that
a. At most 6 of the calls involve a fax message?
b. Exactly 6 of the calls involve a fax message?
c. At least 6 of the calls involve a fax message?
d. More than 6 of the calls involve a fax message?
51. Refer to the previous exercise.
a. What is the expected number of calls among the 25 that involve a fax message?
b. What is the standard deviation of the number among the 25 calls that involve a fax message?
c. What is the probability that the number of calls among the 25 that involve a fax transmission exceeds the expected number by more than 2 standard deviations?
52. Suppose that 30% of all students who have to buy a text for a particular course want a new copy (the successes!), whereas the other 70% want a used copy. Consider randomly selecting 25 purchasers.
a. What are the mean value and standard deviation of the number who want a new copy of the book?
b. What is the probability that the number who want new copies is more than two standard deviations away from the mean value?
c. The bookstore has 15 new copies and 15 used copies in stock. If 25 people come in one by one to purchase this text, what is the probability that all 25 will get the type of book they want from current stock? [Hint: Let X = the number who want a new copy. For what values of X will all 25 get what they want?]
d. Suppose that new copies cost \$100 and used copies cost \$70. Assume the bookstore currently has 50 new copies and 50 used copies. What is the expected value of total revenue from the sale of the next 25 copies purchased? Be sure to indicate what rule of expected value you are using. [Hint: Let h(X) = the revenue when X of the 25 purchasers want new copies. Express this as a linear function.]
53. Exercise 30 (Section 3.3) gave the pmf of Y, the number of traffic citations for a randomly selected individual insured by a particular company. What is the probability that among 15 randomly chosen such individuals
a. At least 10 have no citations?
b. Fewer than half have at least one citation?
c. The number that have at least one citation is between 5 and 10, inclusive?
(“Between a and b, inclusive” is equivalent to $a \le X \le b$.)
54. A particular type of tennis racket comes in a midsize version and an oversize version. Sixty percent of all customers at a certain store want the oversize version.
a. Among ten randomly selected customers who want this type of racket, what is the probability that at least six want the oversize version?
b. Among ten randomly selected customers, what is the probability that the number who want the oversize version is within 1 standard deviation of the mean value?
c. The store currently has seven rackets of each version. What is the probability that all of the next ten customers who want this racket can get the version they want from current stock?
55. Twenty percent of all telephones of a certain type are submitted for service while under warranty. Of these, 60% can be repaired, whereas the other 40% must be replaced with new units. If a company purchases ten of these telephones, what is the probability that exactly two will end up being replaced under warranty?
56. The College Board reports that 2% of the 2 million high school students who take the SAT each year receive special accommodations because of documented disabilities (Los Angeles Times, July 16, 2002). Consider a random sample of 25 students who have recently taken the test.
a. What is the probability that exactly 1 received a special accommodation?
b. What is the probability that at least 1 received a special accommodation?
c. What is the probability that at least 2 received a special accommodation?
d. What is the probability that the number among the 25 who received a special accommodation is within 2 standard deviations of the number you would expect to be accommodated?
e. Suppose that a student who does not receive a special accommodation is allowed 3 hours for the exam, whereas an accommodated student is allowed 4.5 hours. What would you expect the average time allowed the 25 selected students to be?
57. A certain type of flashlight requires two type-D batteries, and the flashlight will work only if both its batteries have acceptable voltages. Suppose that 90% of all batteries from a certain supplier have acceptable voltages. Among ten randomly selected flashlights, what is the probability that at least nine will work? What assumptions did you make in the course of answering the question posed?
58. A very large batch of components has arrived at a distributor. The batch can be characterized as acceptable only if the proportion of defective components is at most .10. The distributor decides to randomly select 10 components and to accept the batch only if the number of defective components in the sample is at most 2.
a. What is the probability that the batch will be accepted when the actual proportion of defectives is
b. Let p denote the actual proportion of defectives in the batch. A graph of P(batch is accepted) as a function of p, with p on the horizontal axis and P(batch
is accepted) on the vertical axis, is called the operating characteristic curve for the acceptance sampling plan. Use the results of part (a) to sketch this curve for $0 \le p \le 1$.
c. Repeat parts (a) and (b) with “1” replacing “2” in the acceptance sampling plan.
d. Repeat parts (a) and (b) with “15” replacing “10” in the acceptance sampling plan.
e. Which of the three sampling plans, that of part (a), (c), or (d), appears most satisfactory, and why?
59. An ordinance requiring that a smoke detector be installed in all previously constructed houses has been in effect in a particular city for 1 year. The fire department is concerned that many houses remain without detectors. Let p = the true proportion of such houses having detectors, and suppose that a random sample of 25 homes is inspected. If the sample strongly indicates that fewer than 80% of all houses have a detector, the fire department will campaign for a mandatory inspection program. Because of the costliness of the program, the department prefers not to call for such inspections unless sample evidence strongly argues for their necessity. Let X denote the number of homes with detectors among the 25 sampled. Consider rejecting the claim that $p \ge .8$ if $x \le 15$.
a. What is the probability that the claim is rejected when the actual value of p is .8?
b. What is the probability of not rejecting the claim when $p = .7$? When $p = .6$?
c. How do the “error probabilities” of parts (a) and (b) change if the value 15 in the decision rule is replaced by 14?
60. A toll bridge charges \$1.00 for passenger cars and \$2.50 for other vehicles. Suppose that during daytime hours, 60% of all vehicles are passenger cars. If 25 vehicles cross the bridge during a particular daytime period, what is the resulting expected toll revenue? [Hint: Let X = the number of passenger cars; then the toll revenue h(X) is a linear function of X.]
61. A student who is trying to write a paper for a course has a choice of two topics, A and B. If topic A is chosen, the student will order two books through interlibrary loan, whereas if topic B is chosen, the student will order four books. The student believes that a good paper necessitates receiving and using at least half the books ordered for either topic chosen. If the probability that a book ordered through interlibrary loan actually arrives in time is .9 and books arrive independently of one another, which topic should the student choose to maximize the probability of writing a good paper? What if the arrival probability is only .5 instead of .9?
62. a. For fixed n, are there values of p ($0 \le p \le 1$) for which $V(X) = 0$? Explain why this is so.
b. For what value of p is V(X) maximized? [Hint: Either graph V(X) as a function of p or else take a derivative.]
63. a. Show that $b(x; n, 1 - p) = b(n - x; n, p)$.
b. Show that $B(x; n, 1 - p) = 1 - B(n - x - 1; n, p)$. [Hint: At most x S’s is equivalent to at least (n − x) F’s.]
c. What do parts (a) and (b) imply about the necessity of including values of p greater than .5 in Appendix Table A.1?
64. Show that $E(X) = np$ when X is a binomial random variable. [Hint: First express E(X) as a sum with lower limit $x = 1$. Then factor out np, let $y = x - 1$ so that the sum is from $y = 0$ to $y = n - 1$, and show that the sum equals 1.]
65. Customers at a gas station pay with a credit card (A), debit card (B), or cash (C). Assume that successive customers make independent choices, with $P(A) = .5$, $P(B) = .2$, and $P(C) = .3$.
a. Among the next 100 customers, what are the mean and variance of the number who pay with a debit card? Explain your reasoning.
b. Answer part (a) for the number among the 100 who don’t pay with cash.
66. An airport limousine can accommodate up to four passengers on any one trip. The company will accept a maximum of six reservations for a trip, and a passenger must have a reservation. From previous records, 20% of all those making reservations do not appear for the trip. Answer the following questions, assuming independence wherever appropriate.
a. If six reservations are made, what is the probability that at least one individual with a reservation cannot be accommodated on the trip?
b. If six reservations are made, what is the expected number of available places when the limousine departs?
c. Suppose the probability distribution of the number of reservations made is given in the accompanying table.
Number of reservations
Probability
Let X denote the number of passengers on a randomly selected trip. Obtain the probability mass function of X.
67. Refer to Chebyshev’s inequality given in Exercise 44. Calculate $P(|X - \mu| \ge k\sigma)$ for $k = 2$ and $k = 3$ when $X \sim \text{Bin}(20, .5)$, and compare to the corresponding upper bound. Repeat for $X \sim \text{Bin}(20, .75)$.