14.29 Laws of large numbers

In connection with coin-tossing problems, it is often said that the probability of tossing heads with a perfectly balanced coin is $\frac{1}{2}$. This does not mean that if a coin is tossed twice it will necessarily come up heads exactly once. Nor does it mean that in 1000 tosses heads will appear exactly 500 times. Let us denote by $h(n)$ the number of heads that occur in $n$ tosses. Experience shows that even for very large $n$ the ratio $h(n)/n$ is not necessarily $\frac{1}{2}$. However, experience also shows that this ratio does seem to approach $\frac{1}{2}$ as $n$ increases, although it may oscillate considerably above and below $\frac{1}{2}$ in the process. This suggests that it might be possible to prove that

$$\lim_{n\to\infty} \frac{h(n)}{n} = \frac{1}{2}. \tag{14.45}$$

Unfortunately, this cannot be done. One difficulty is that the number $h(n)$ depends not only on $n$ but also on the particular experiment being performed. We have no way of knowing in advance how $h(n)$ will vary from one experiment to another. But the real trouble is that it is possible (although not very likely) that in some particular experiment the ratio $h(n)/n$ may not tend to $\frac{1}{2}$ at all. For example, there is no reason to exclude the possibility of getting heads on every toss of the coin, in which case $h(n) = n$ and $h(n)/n \to 1$. Therefore, instead of trying to prove the formula in (14.45), we shall find it more reasonable (and more profitable) to ask how likely it is that $h(n)/n$ will differ from $\frac{1}{2}$ by a certain amount. In other words, given some positive number $\epsilon$, we seek the probability

$$P\left(\left|\frac{h(n)}{n} - \frac{1}{2}\right| > \epsilon\right).$$

By introducing a suitable random variable and using Chebyshev’s inequality we can get a useful upper bound to this probability, a bound which does not require an explicit knowledge of $h(n)$. This leads to a new limit relation that serves as an appropriate substitute for (14.45).
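Before turning to that bound, the fluctuating behavior of $h(n)/n$ is easy to observe numerically. The following is a minimal simulation sketch (in Python; the variable names and checkpoints are our own choices, not from the text) that tosses a fair coin and prints the relative frequency of heads at a few values of $n$:

```python
import random

random.seed(1)  # fix the seed so the run is reproducible

heads = 0
checkpoints = {10, 100, 1000, 10000, 100000}
for n in range(1, 100001):
    heads += random.random() < 0.5  # one toss of a fair coin
    if n in checkpoints:
        print(f"n = {n:6d}   h(n)/n = {heads / n:.4f}")
```

A typical run shows the ratio drifting toward $\frac{1}{2}$ without ever being pinned to it, which is exactly the behavior the following paragraphs quantify.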

No extra effort is required to treat the more general case of a Bernoullian sequence of $n$ trials, in which the probability of “success” is $p$ and the probability of “failure” is $q = 1 - p$. (In coin tossing, “success” can mean “heads” and for $p$ we may take $\frac{1}{2}$.) Let $X$ denote the random variable which counts the number of successes in $n$ independent trials. Then $X$ has a binomial distribution with expectation $E(X) = np$ and variance $\operatorname{Var}(X) = npq$. Hence Chebyshev’s inequality is applicable; it states that

$$P(|X - np| > c) \le \frac{npq}{c^2}. \tag{14.46}$$
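As a numerical illustration (our own, with $n = 100$, $p = \frac{1}{2}$, $c = 10$ chosen for concreteness), the exact binomial probability can be placed beside the bound in (14.46); the bound holds but is far from tight:

```python
from math import comb

n, p, c = 100, 0.5, 10
npq = n * p * (1 - p)

# exact P(|X - np| > c) summed over the binomial distribution
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n + 1)
            if abs(k - n * p) > c)

print(f"exact probability: {exact:.4f}")       # roughly 0.035
print(f"Chebyshev bound:   {npq / c**2:.4f}")  # 0.25
```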

Since we are interested in the ratio $X/n$, which we may call the relative frequency of success, we divide the inequality $|X - np| > c$ by $n$ and rewrite (14.46) as

$$P\left(\left|\frac{X}{n} - p\right| > \frac{c}{n}\right) \le \frac{npq}{c^2}. \tag{14.47}$$

Since this is valid for every $c > 0$, we may let $c$ depend on $n$ and write $c = \epsilon n$, where $\epsilon$ is a fixed positive number. Then (14.47) becomes

$$P\left(\left|\frac{X}{n} - p\right| > \epsilon\right) \le \frac{pq}{n\epsilon^2}.$$

The appearance of $n$ in the denominator on the right suggests that we let $n \to \infty$. This leads to the limit formula

$$\lim_{n\to\infty} P\left(\left|\frac{X}{n} - p\right| > \epsilon\right) = 0 \qquad \text{for every fixed } \epsilon > 0, \tag{14.48}$$

called the law of large numbers for the Bernoulli distribution. It tells us that, given any $\epsilon > 0$ (no matter how small), the probability that the relative frequency of success differs from $p$ by more than $\epsilon$ is a function of $n$ which tends to 0 as $n \to \infty$. This limit relation gives a mathematical justification to the assignment of the probability $\frac{1}{2}$ for tossing heads with a perfectly balanced coin.
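The rate at which this probability falls can be seen by estimating $P(|X/n - p| > \epsilon)$ from repeated experiments and comparing it with the bound $pq/(n\epsilon^2)$ derived above. Here is a sketch (Python; the parameters and the number of repetitions are our own choices):

```python
import random

p, eps, repeats = 0.5, 0.05, 2000

for n in (100, 400, 1600, 6400):
    exceed = 0
    for _ in range(repeats):
        x = sum(random.random() < p for _ in range(n))  # successes in n trials
        if abs(x / n - p) > eps:
            exceed += 1
    bound = p * (1 - p) / (n * eps**2)
    print(f"n = {n:5d}   estimated P = {exceed / repeats:.4f}   bound = {bound:.4f}")
```

Both columns tend to 0 as $n$ grows, as (14.48) asserts; the estimated probability is typically much smaller than the bound, since Chebyshev’s inequality uses no information beyond the variance.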

The limit relation in (14.48) is a special case of a more general result in which the “relative frequency” $X/n$ is replaced by the arithmetic mean of $n$ independent random variables having the same expectation and variance. This more general theorem is usually referred to as the weak law of large numbers; it may be stated as follows:

THEOREM 14.12. WEAK LAW OF LARGE NUMBERS. Let $X_1, X_2, \ldots, X_n$ be $n$ independent random variables, each having the same expectation and the same variance, say

$$E(X_k) = m \quad \text{and} \quad \operatorname{Var}(X_k) = \sigma^2 \qquad \text{for } k = 1, 2, \ldots, n.$$

Define a new random variable $\bar{X}$ (called the arithmetic mean of $X_1, \ldots, X_n$) by the equation

$$\bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k.$$

Then, for every $\epsilon > 0$, we have

$$P(|\bar{X} - m| > \epsilon) \le \frac{\sigma^2}{n\epsilon^2}. \tag{14.49}$$

An equivalent statement is

$$\lim_{n\to\infty} P(|\bar{X} - m| > \epsilon) = 0. \tag{14.50}$$

Proof. We apply Chebyshev’s inequality to $\bar{X}$. For this we need to know the expectation and variance of $\bar{X}$. These are

$$E(\bar{X}) = m \quad \text{and} \quad \operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}.$$

(See Exercise 5 in Section 14.27.) Chebyshev’s inequality becomes

$$P(|\bar{X} - m| > c) \le \frac{\sigma^2}{nc^2}.$$

Replacing $c$ by $\epsilon$ we obtain (14.49), and letting $n \to \infty$ we obtain (14.50).

To show that the limit relation in (14.48) is a special case of Theorem 14.12, we assume each $X_k$ has the possible values 0 and 1, with probabilities $P(X_k = 1) = p$ and $P(X_k = 0) = 1 - p$. Then $\bar{X}$ is the relative frequency of success in $n$ independent trials, $E(\bar{X}) = p$, and (14.49) reduces to (14.48).
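To see the theorem at work on variables that are not 0–1 valued, here is a sketch (our own example) using uniform random variables on $(0, 1)$, for which $m = \frac{1}{2}$ and $\sigma^2 = \frac{1}{12}$; it estimates $P(|\bar{X} - m| > \epsilon)$ by simulation and prints the bound $\sigma^2/(n\epsilon^2)$ from (14.49):

```python
import random

m, var = 0.5, 1.0 / 12.0  # mean and variance of Uniform(0, 1)
eps, repeats = 0.05, 2000

for n in (50, 200, 800, 3200):
    exceed = 0
    for _ in range(repeats):
        xbar = sum(random.random() for _ in range(n)) / n  # arithmetic mean
        if abs(xbar - m) > eps:
            exceed += 1
    bound = var / (n * eps**2)
    print(f"n = {n:5d}   estimated P = {exceed / repeats:.4f}   bound = {bound:.4f}")
```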

Theorem 14.12 is called a weak law because there is also a strong law of large numbers which (under the same hypotheses) states that

$$P\left(\lim_{n\to\infty} \bar{X} = m\right) = 1. \tag{14.51}$$

The principal difference between (14.51) and (14.50) is that the operations “limit” and “probability” are interchanged. It can be shown that the strong law implies the weak law, but not conversely.

Notice that the strong law in (14.51) seems to be closer to formula (14.45) than (14.50) is. In fact, (14.51) says that we have $\lim_{n\to\infty} \bar{X} = m$ “almost always,” that is, with probability 1. When applied to coin tossing, in particular, it says that the failure of Equation (14.45) is no more likely than the chance of tossing a fair coin repeatedly and always getting heads. The strong law really shows why probability theory corresponds to experience and to our intuitive feeling of what probability “should be.”

The proof of the strong law is lengthy and will be omitted. Proofs appear in the books listed as References 1, 3, 8, and 10 at the end of this chapter.