
14.28 Chebyshev’s inequality

As mentioned earlier, a small value for the variance means that it is unlikely that a random variable X will deviate much from its expected value. To make this statement more precise we introduce the absolute value

$$|X - E(X)|,$$

which measures the actual distance between X and E(X). How likely is it that this distance is more than a given amount? To answer this question we must determine the probability

$$P[|X - E(X)| > c],$$

where c is a given positive number. In the continuous case we have

$$P[|X - E(X)| > c] = 1 - P[|X - E(X)| \le c] = 1 - P[E(X) - c \le X \le E(X) + c] = 1 - \int_{E(X)-c}^{E(X)+c} f(t)\,dt;$$

therefore, the calculation of this probability can be accomplished once the density function f is known. Of course, if f is unknown this method gives no information. However, if the variance is known, we can obtain an upper bound for this probability. This upper bound is provided by the following theorem of P. L. Chebyshev, a famous Russian mathematician who made many important contributions to probability theory and other branches of mathematics.
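For instance, the calculation can be carried out numerically once a density is specified. The sketch below assumes the exponential density f(t) = e^(-t) for t ≥ 0 (so that E(X) = 1) purely as an illustration, and approximates the integral with a midpoint rule.

```python
# A minimal sketch (an assumed example, not from the text): compute
# P[|X - E(X)| > c] as 1 minus the integral of the density f over [E(X) - c, E(X) + c],
# using the exponential density f(t) = e^(-t) for t >= 0, for which E(X) = 1.
import math

def f(t):
    return math.exp(-t) if t >= 0 else 0.0   # assumed density of X

EX, c = 1.0, 1.5                             # E(X) and the given positive number c
n = 100_000                                  # midpoint-rule subdivisions
a, b = EX - c, EX + c
h = (b - a) / n
inner = sum(f(a + (i + 0.5) * h) for i in range(n)) * h   # P[E(X)-c <= X <= E(X)+c]
print(1.0 - inner)                           # exact answer is e^(-2.5), about 0.0821
```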

THEOREM 14.11. CHEBYSHEV’S INEQUALITY. Let X be a one-dimensional random variable with finite expectation E(X) and variance Var(X). Then for every positive number c we have

$$(14.44)\qquad P[|X - E(X)| > c] \le \frac{\operatorname{Var}(X)}{c^2}.$$

Proof. In the continuous case we have

$$\operatorname{Var}(X) = \int_{-\infty}^{+\infty} [t - E(X)]^2 f(t)\,dt \ge \int_{|t-E(X)|>c} [t - E(X)]^2 f(t)\,dt \ge c^2 \int_{|t-E(X)|>c} f(t)\,dt,$$

since the coefficient of f(t) in the integrand satisfies [t - E(X)]² ≥ c² over the region of integration. The integral on the right is P[|X - E(X)| > c]. Therefore, when we divide by c² we obtain (14.44). This completes the proof for the continuous case; the discrete case may be similarly treated.
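The inequality can also be checked numerically for a particular distribution. The following sketch, again using the assumed exponential density with E(X) = Var(X) = 1, compares the exact tail probability with the bound Var(X)/c² for several values of c.

```python
# A quick numerical illustration of (14.44) with an assumed example (not from the
# text): for the exponential density f(t) = e^(-t), t >= 0, we have E(X) = Var(X) = 1,
# and P[|X - E(X)| > c] has a closed form that can be set against Var(X)/c**2.
import math

EX, var = 1.0, 1.0

def exact_tail(c):
    upper = math.exp(-(EX + c))                             # P[X > E(X) + c]
    lower = 1.0 - math.exp(-(EX - c)) if c < EX else 0.0    # P[X < E(X) - c]
    return upper + lower

for c in (0.5, 1.0, 2.0, 3.0):
    print(c, exact_tail(c), var / c**2)   # the exact value never exceeds the bound
```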

Chebyshev’s inequality tells us that the larger we make c the smaller the probability is that |X - E(X)| > c. In other words, it is unlikely that X will be very far from E(X); it is even more unlikely if the variance Var(X) is small. If we replace c by kσ, where k > 0 and σ denotes the standard deviation, σ = √Var(X), Chebyshev’s inequality becomes

$$P[|X - E(X)| > k\sigma] \le \frac{1}{k^2}.$$

That is, the probability that X will differ from its expected value by more than k standard deviations does not exceed 1/k². For example, when k = 10 this inequality tells us that the probability P[|X - E(X)| > 10σ] does not exceed 0.010. In other words, the probability is no more than 0.010 that an observed value of X will differ from the expected value by more than ten standard deviations. Similarly, when k = 3 we find that the probability is at most 1/9 = 0.111... that an observed value will differ from the mean by more than three standard deviations.
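These figures come straight from the bound 1/k²; a short loop reproduces them and shows that for k ≤ 1 the bound is at least 1, so the inequality gives no information in that range.

```python
# The Chebyshev bound 1/k**2 for a few values of k, reproducing the figures
# 0.111... (k = 3) and 0.010 (k = 10) quoted above; for k <= 1 the bound exceeds 1.
for k in (1, 2, 3, 5, 10):
    print(k, 1 / k**2)
```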

Chebyshev’s inequality is a general theorem that applies to all distributions. In many applications the inequality can be strengthened when more information is known about the particular distribution. For example, if X has a binomial distribution with parameters n and p it can be shown (by use of the normal approximation to the binomial distribution) that for large n the probability is about 0.003 that an observed value will differ from the mean by more than three standard deviations. (For this result, n ≥ 12 suffices.) This is much smaller than the probability 0.111 provided by Chebyshev’s inequality.
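The contrast can be made concrete with a small computation. The parameters n = 100 and p = 1/2 below are assumed for illustration (the text fixes no particular values here); the exact binomial tail probability is compared with the Chebyshev bound 1/9.

```python
# A rough check with assumed parameters (n = 100, p = 1/2, not taken from the text):
# compute the exact binomial probability of deviating from the mean by more than
# three standard deviations and set it against the Chebyshev bound 1/9.
from math import comb, sqrt

n, p = 100, 0.5
mean, sigma = n * p, sqrt(n * p * (1 - p))        # 50 and 5

def pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)   # binomial probability of k heads

tail = sum(pmf(k) for k in range(n + 1) if abs(k - mean) > 3 * sigma)
print(tail, 1 / 9)   # roughly 0.002 versus 0.111..., far below the Chebyshev bound
```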

EXAMPLE. Testing a coin for fairness. We want to decide whether or not a particular coin is fair by tossing it 10,000 times and recording the number of heads. For a fair coin the random variable X which counts the number of heads has a binomial distribution with parameters n = 10,000 and p = 1/2. The mean of X is np = 5,000 and the standard deviation is σ = √(np(1 - p)) = 50. (See Example 2 in Section 14.25.) As mentioned above, the probability that an observed value will differ from the mean by more than 3σ is about 0.003. Therefore, let us agree to say that a coin is not fair if the number of heads in 10,000 tosses differs from the mean by more than 3σ. Since E(X) = 5,000 and 3σ = 150, we would say the coin is unfair if the number of heads in 10,000 tosses is less than 4,850 or more than 5,150.
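As a check on this rule, the probability that a fair coin is nevertheless declared unfair can be estimated with the normal approximation; the short computation below is an illustrative sketch, not part of the worked example.

```python
# An assumed illustration (not from the text): using the normal approximation,
# estimate the probability that a fair coin nevertheless gives fewer than 4,850
# or more than 5,150 heads in 10,000 tosses.
from math import erfc, sqrt

mean, sigma = 5000, 50
z = (150 + 0.5) / sigma            # 3-sigma cutoff with a continuity correction
prob_outside = erfc(z / sqrt(2))   # two-sided normal tail P[|Z| > z]
print(prob_outside)                # about 0.0026, so a fair coin is rarely judged unfair
```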