Chapter 8 MODES OF CONVERGENCE
We have a statistic $T$ which is a measurable function of the data,
$$T_n = T(X_1, \ldots, X_n),$$
and we would like to know what happens to $T_n$ as $n \to \infty$. It turns out that the limit is easier to work with than $T_n$ itself; the plan is to use the limit as an approximation device. We think of a sequence $T_1, T_2, \ldots$ which have distribution functions $F_1, F_2, \ldots$.
Definition
A sequence of random variables $T_1, T_2, \ldots$ converges in probability to a random variable $T$ (denoted by $T_n \xrightarrow{p} T$) if, for every $\varepsilon > 0$,
$$\lim_{n \to \infty} P[\,|T_n - T| > \varepsilon\,] = 0.$$
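To make the definition concrete, here is a small Monte Carlo sketch (Python with NumPy; the Uniform(0,1) example, the seed, and all constants are arbitrary illustrative choices) that estimates $P[|T_n - T| > \varepsilon]$ for the sample mean $T_n = \bar{X}$, whose limit is $T = 1/2$:

import numpy as np

rng = np.random.default_rng(0)
eps = 0.1
reps = 1000  # Monte Carlo replications used to estimate the probability
for n in [10, 100, 1000, 10000]:
    # each row is one realization of (X_1, ..., X_n); T_n is its mean
    Tn = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(Tn - 0.5) > eps))

The estimated probabilities shrink toward zero as $n$ grows, which is exactly the displayed limit.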
Definition
A sequence of random variables $T_1, T_2, \ldots$ converges in mean square to a random variable $T$ (denoted by $T_n \xrightarrow{ms} T$) if
$$E\left[(T_n - T)^2\right] \to 0,$$
which, for a constant limit $\theta$, is equivalent to (a) $\mathrm{Var}(T_n) \to 0$ and (b) $E(T_n) - \theta \to 0$, by the bias-variance decomposition spelled out below.
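Indeed, expanding the mean squared error about $\theta$, the cross term vanishes because $E[T_n - E(T_n)] = 0$:
$$E\left[(T_n - \theta)^2\right] = E\left[\big(T_n - E(T_n) + E(T_n) - \theta\big)^2\right] = \mathrm{Var}(T_n) + \big(E(T_n) - \theta\big)^2,$$
so the left-hand side tends to zero if and only if both (a) and (b) hold.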
Theorem 19 Convergence in mean square implies convergence in probability.
Proof. By the Markov (Chebyshev) inequality,
$$P[\,|T_n - T| \ge \varepsilon\,] \le \frac{E\left[(T_n - T)^2\right]}{\varepsilon^2} \to 0.$$
Note that if $T_n > 0$, Markov's inequality gives
$$P[\,T_n \ge \varepsilon\,] \le \frac{E(T_n)}{\varepsilon},$$
so that if $E(T_n) \to 0$, this is sufficient for $T_n \xrightarrow{p} 0$. But the converse of the theorem is not necessarily true. To see this, consider the following random variable:
$$T_n = \begin{cases} n & \text{with probability } \frac{1}{n} \\ 0 & \text{with probability } 1 - \frac{1}{n}. \end{cases}$$
Then for any fixed $\varepsilon > 0$ (and $n \ge \varepsilon$), $P[\,T_n \ge \varepsilon\,] = \frac{1}{n} \to 0$, so $T_n \xrightarrow{p} 0$. But
$$E\left(T_n^2\right) = n^2 \cdot \frac{1}{n} = n \to \infty,$$
so $T_n$ does not converge to $0$ in mean square.
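The counterexample can also be simulated; in this Python/NumPy sketch (seed, replication count, and $\varepsilon$ are arbitrary illustrative choices) the deviation probability vanishes while the second moment blows up:

import numpy as np

rng = np.random.default_rng(1)
reps = 200000
eps = 0.01
for n in [10, 100, 1000]:
    # T_n = n with probability 1/n, else 0
    Tn = np.where(rng.random(reps) < 1.0 / n, float(n), 0.0)
    print(n, np.mean(Tn >= eps), np.mean(Tn ** 2))  # ~1/n and ~n respectively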
A famous consequence of the theorem is the (Weak) Law of Large Numbers:
Theorem 20 (WEAK LAW of LARGE NUMBERS) Let $X_1, \ldots, X_n$ be i.i.d. with $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2 < \infty$, and let $T_n = \bar{X}$. Then for all $\varepsilon > 0$,
$$\lim_{n \to \infty} P[\,|T_n - \mu| > \varepsilon\,] = 0, \quad \text{i.e., } T_n \xrightarrow{p} \mu.$$
The proof is easy because
$$E\left[(T_n - \mu)^2\right] = \frac{\sigma^2}{n} \to 0,$$
as we have shown. In fact, the result can be proved with only the hypothesis that $E|X| < \infty$ by using a truncation argument.
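The rate $\sigma^2/n$ is easy to verify numerically; in this Python/NumPy sketch the Exponential(1) distribution (for which $\mu = \sigma^2 = 1$) and all constants are arbitrary illustrative choices:

import numpy as np

rng = np.random.default_rng(2)
reps = 5000
for n in [10, 100, 1000]:
    xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    # empirical mean squared error of the sample mean vs theoretical sigma^2/n
    print(n, np.mean((xbar - 1.0) ** 2), 1.0 / n)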
Another application of the previous theorem is to the empirical distribution function: for each fixed $x$,
$$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(X_i \le x) \xrightarrow{p} F(x),$$
since $\hat{F}_n(x)$ is the sample mean of the i.i.d. Bernoulli variables $\mathbf{1}(X_i \le x)$, which have mean $F(x)$.
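Since $\hat{F}_n(x_0)$ is just a sample proportion, the convergence is easy to watch; in this Python/NumPy sketch Uniform(0,1) data are used so that the true value $F(x_0) = x_0$ is known (all constants are illustrative choices):

import numpy as np

rng = np.random.default_rng(3)
x0 = 0.3  # fixed evaluation point; F(x0) = 0.3 for Uniform(0,1)
for n in [10, 100, 1000, 10000]:
    x = rng.uniform(0.0, 1.0, size=n)
    print(n, np.mean(x <= x0))  # ECDF at x0: mean of Bernoulli indicators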
The next result is very important for applying the Law of Large Numbers beyond simple sums of i.i.d. random variables.
Theorem 21 (CONTINUOUS MAPPING THEOREM) If $T_n \xrightarrow{p} \mu$, where $\mu$ is a constant, and $g(\cdot)$ is a continuous function at $\mu$, then
$$g(T_n) \xrightarrow{p} g(\mu).$$
Proof. Let $\varepsilon > 0$. By the continuity of $g$ at $\mu$, there exists $\eta > 0$ such that
$$|x - \mu| < \eta \implies |g(x) - g(\mu)| < \varepsilon.$$
Let $A_n = \{|T_n - \mu| < \eta\}$ and $B_n = \{|g(T_n) - g(\mu)| < \varepsilon\}$. When $A_n$ occurs, so does $B_n$, i.e., $A_n \subseteq B_n$. Since $P(A_n) \to 1$ by convergence in probability, we must have that $P(B_n) \to 1$.
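A quick numerical illustration of the theorem (Python/NumPy; the choice $g = \exp$ and the Uniform(0,1) data are arbitrary illustrative choices): since $\bar{X} \xrightarrow{p} 1/2$, continuity gives $\exp(\bar{X}) \xrightarrow{p} \exp(1/2) \approx 1.6487$.

import numpy as np

rng = np.random.default_rng(4)
for n in [10, 100, 1000, 10000]:
    Tn = rng.uniform(0.0, 1.0, size=n).mean()  # T_n ->p mu = 0.5
    print(n, np.exp(Tn))                       # g(T_n) ->p g(mu) = exp(0.5)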
Now we look at the sample variance
$$s^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar{X}^2.$$
We know that $\frac{1}{n} \sum_{i=1}^{n} X_i^2 \xrightarrow{p} E(X^2)$ by the Law of Large Numbers, and that $\bar{X}^2 \xrightarrow{p} \mu^2$ by the continuous mapping theorem. Combining these two results we get
$$s^2 \xrightarrow{p} E(X^2) - \mu^2 = \sigma^2.$$
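The same two steps can be traced in a simulation (Python/NumPy; the Exponential distribution with scale 2, so that $\sigma^2 = 4$, is an arbitrary illustrative choice):

import numpy as np

rng = np.random.default_rng(5)
for n in [10, 100, 1000, 10000]:
    x = rng.exponential(2.0, size=n)          # Var(X) = 4 for this scale
    s2 = np.mean(x ** 2) - np.mean(x) ** 2    # s^2 = (1/n) sum X_i^2 - Xbar^2
    print(n, s2)                              # settles near sigma^2 = 4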
Finally, notice that when dealing with a vector $T_n = (T_{n1}, \ldots, T_{nk})'$, we have
$$\|T_n - T\| \xrightarrow{p} 0,$$
where $\|x\| = \sqrt{x'x}$ is the Euclidean norm, if and only if
$$|T_{nj} - T_j| \xrightarrow{p} 0$$
for all $j = 1, \ldots, k$. The if part is no surprise and follows from the continuous mapping theorem, since the norm is a continuous function of the $k$ components. The only if part follows because if $\|T_n - T\| < \varepsilon$, then $|T_{nj} - T_j| < \varepsilon$ for each $j$.
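Both directions are also visible in the elementary bounds relating the norm to the components:
$$|T_{nj} - T_j|^2 \;\le\; \sum_{l=1}^{k} |T_{nl} - T_l|^2 \;=\; \|T_n - T\|^2 \;\le\; k \max_{1 \le l \le k} |T_{nl} - T_l|^2.$$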
Definition
A sequence of random variables $T_1, T_2, \ldots$ converges almost surely to a random variable $T$ (denoted by $T_n \xrightarrow{as} T$) if, for every $\varepsilon > 0$,
$$P\left[\lim_{n \to \infty} |T_n - T| < \varepsilon\right] = 1.$$
This mode of convergence is generally harder to establish than convergence in probability, i.e., there are no simple sufficient conditions based on the mean and variance. Almost sure convergence implies convergence in probability, but not vice versa. Note again that vector convergence is equivalent to componentwise convergence. The continuous mapping theorem is immediate here: let
$$A = \{\omega : T_n(\omega) \to T(\omega)\}, \quad P(A) = 1.$$
On this set $A$, we have
$$g[T_n(\omega)] \to g[T(\omega)]$$
by ordinary continuity, so $g(T_n) \xrightarrow{as} g(T)$.
Theorem 22 (STRONG LAW of LARGE NUMBERS) If $E|X| < \infty$, then
$$T_n \xrightarrow{as} E(X).$$
We can apply the Strong Law of Large Numbers to empirical distribution functions and to sample variances (via the continuous mapping theorem), etc.
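Almost sure convergence concerns the behaviour of a single realization, so a natural picture is the running mean along one simulated path (Python/NumPy; the Exponential(1) distribution, for which $E(X) = 1$, and the path length are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(6)
x = rng.exponential(1.0, size=100000)              # E|X| < infinity, E(X) = 1
running = np.cumsum(x) / np.arange(1, x.size + 1)  # T_n along this one path
for n in [10, 100, 1000, 10000, 100000]:
    print(n, running[n - 1])  # the running mean settles near E(X) = 1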