getdoca70a. 117KB Jun 04 2011 12:04:48 AM
Elect. Comm. in Probab. 6 (2001) 55–65
ELECTRONIC
COMMUNICATIONS
in PROBABILITY
A NOTE ON TALAGRAND’S
CONCENTRATION INEQUALITY
DMITRIY PANCHENKO
Department of Mathematics and Statistics, The University of New Mexico,
Albuquerque, NM, 87131, USA
email: [email protected]
submitted February 7, 2001 Final version accepted April 24, 2001
AMS 2000 Subject classification: 60E15, 28A35
Concentration of measure, empirical processes
Abstract
In this paper we revisit Talagrand’s proof of concentration inequality for empirical processes.
We give a different proof of the main technical lemma that garantees the existence of a certain kernel. Moreover, we generalize the result of Talagrand to a family of kernels which in
one particular case allows us to produce the Poissonian bound without using the truncation
argument. In section 2 we give some examples of application of the abstract concentration
inequality to empirical processes that demonstrate some interesting properties of Talagrand’s
kernel method.
1
Introduction and the proof of main lemma
This paper was motivated by the Section 4 of the “New concentration inequalities in product
spaces” by Michel Talagrand. For the most part we will keep the same notations with possible
minor changes. We slightly weaken the definition of the distance m(A, x) introduced in [8],
but, essentially, this is what is used in the proof of the concentration inequality for empirical
processes. Moreover, we introduce a parametrized family of distances mα (A, x) for α > 0,
which will allow us to produce one example of interest in Section 2. The case of α = 1
essentially corresponds to the distance m(A, x) in [8], and will also be used in several examples
of Section 2. The Theorem 1 below is almost identical to Theorem 4.2 in [8] and we assume
that the reader is familiar with the proof. The main technical step, Proposition 4.2 in [8], is
proved differently and constitutes the statement of Lemma 1 below.
Let Ωn be a measurable product space with a product measure µn . Consider a probability
measure ν on Ωn and x ∈ Ωn . If Ci = {y ∈ Ωn : yi 6= xi }, we consider the image of the
restriction of ν to Ci by the map y → yi , and its Radon-Nikodym derivative di with respect to
µ. As in [8] we assume that Ω is finite and each point is measurable with a positive measure.
Let m be a number of atoms in Ω and p1 , . . . , pm be their probabilities. By the definition of
55
56
Electronic Communications in Probability
di we have
Z
g(yi )dν(y) =
Ci
Z
g(yi )di (yi )dµ(yi ).
Ω
For α > 0 we define a function ψα (x) by
2
x /(4α), when x ≤ 2α,
ψα (x) =
x − α,
when x ≥ 2α.
We set
mα (ν, x) =
XZ
ψα (di )dµ and mα (A, x) = inf{mα (ν, x) : ν(A) = 1}.
i≤n
For each α > 0 let Lα be any positive number satisfying the following inequality:
2Lα (e1/Lα − 1)
≤ α.
1 + 2Lα
(1.1)
The following theorem holds.
Theorem 1 Let α > 0 and Lα satisfy (1.1). Then for any n and A ⊆ Ωn we have
Z
1
1
.
mα (A, x)dµn (x) ≤ n
exp
Lα
µ (A)
(1.2)
Proof. As we mentioned above the proof is identical to the proof of Theorem 4.2 in [8] where
Proposition 4.2 is substituted by the Lemma 1 below, and the case of n = 1 must be adjusted
to the new definition of ψα . Namely, one has to prove that
p + (1 − p) exp{
1
1
1
pψα ( )} ≤
Lα
p
p
for all p ∈ [0, 1]. This can be rewritten as
exp{
1+p
1
1
pψα ( )} ≤
Lα
p
p
or
pψα ( p1 )
log(1 + p1 )
≤ Lα .
By the definition of ψα one has to consider two separate cases:
1. If p−1 ≥ 2α then (1 − αp)/ log(1 + p1 ) ≤ Lα ,
2. If p−1 ≤ 2α then 1/4αp log(1 + 1p ) ≤ Lα .
Taking into account (1.1) we must show that for all L > 0
1−p
1
1
4L(e1/L − 1)
2L(e1/L − 1)
≤ L log(1 + ), if
≥
,
1 + 2L
p
p
1 + 2L
1
4L(e1/L − 1)
8L2 (e1/L − 1)
1
,
if
≤
.
≤
1 + 2L
p
1 + 2L
p log(1 + p1 )
The proof of both of these inequalities constitutes a tedious exersize in calculus and is boring
enough not to include it in this paper.
Talagrand’s Concentration Inequality
57
Lemma 1 Let g1 ≥ g2 ≥ . . . ≥ gm > 0. For α > 0 and Lα satistying (1.1) there exist
{kji : 1 ≤ j < i ≤ m} such that
kji ≥ 0, and
X
kji pj ≤ 1 for all i ≤ m,
(1.3)
j
ELECTRONIC
COMMUNICATIONS
in PROBABILITY
A NOTE ON TALAGRAND’S
CONCENTRATION INEQUALITY
DMITRIY PANCHENKO
Department of Mathematics and Statistics, The University of New Mexico,
Albuquerque, NM, 87131, USA
email: [email protected]
submitted February 7, 2001 Final version accepted April 24, 2001
AMS 2000 Subject classification: 60E15, 28A35
Concentration of measure, empirical processes
Abstract
In this paper we revisit Talagrand’s proof of concentration inequality for empirical processes.
We give a different proof of the main technical lemma that garantees the existence of a certain kernel. Moreover, we generalize the result of Talagrand to a family of kernels which in
one particular case allows us to produce the Poissonian bound without using the truncation
argument. In section 2 we give some examples of application of the abstract concentration
inequality to empirical processes that demonstrate some interesting properties of Talagrand’s
kernel method.
1
Introduction and the proof of main lemma
This paper was motivated by the Section 4 of the “New concentration inequalities in product
spaces” by Michel Talagrand. For the most part we will keep the same notations with possible
minor changes. We slightly weaken the definition of the distance m(A, x) introduced in [8],
but, essentially, this is what is used in the proof of the concentration inequality for empirical
processes. Moreover, we introduce a parametrized family of distances mα (A, x) for α > 0,
which will allow us to produce one example of interest in Section 2. The case of α = 1
essentially corresponds to the distance m(A, x) in [8], and will also be used in several examples
of Section 2. The Theorem 1 below is almost identical to Theorem 4.2 in [8] and we assume
that the reader is familiar with the proof. The main technical step, Proposition 4.2 in [8], is
proved differently and constitutes the statement of Lemma 1 below.
Let Ωn be a measurable product space with a product measure µn . Consider a probability
measure ν on Ωn and x ∈ Ωn . If Ci = {y ∈ Ωn : yi 6= xi }, we consider the image of the
restriction of ν to Ci by the map y → yi , and its Radon-Nikodym derivative di with respect to
µ. As in [8] we assume that Ω is finite and each point is measurable with a positive measure.
Let m be a number of atoms in Ω and p1 , . . . , pm be their probabilities. By the definition of
55
56
Electronic Communications in Probability
di we have
Z
g(yi )dν(y) =
Ci
Z
g(yi )di (yi )dµ(yi ).
Ω
For α > 0 we define a function ψα (x) by
2
x /(4α), when x ≤ 2α,
ψα (x) =
x − α,
when x ≥ 2α.
We set
mα (ν, x) =
XZ
ψα (di )dµ and mα (A, x) = inf{mα (ν, x) : ν(A) = 1}.
i≤n
For each α > 0 let Lα be any positive number satisfying the following inequality:
2Lα (e1/Lα − 1)
≤ α.
1 + 2Lα
(1.1)
The following theorem holds.
Theorem 1 Let α > 0 and Lα satisfy (1.1). Then for any n and A ⊆ Ωn we have
Z
1
1
.
mα (A, x)dµn (x) ≤ n
exp
Lα
µ (A)
(1.2)
Proof. As we mentioned above the proof is identical to the proof of Theorem 4.2 in [8] where
Proposition 4.2 is substituted by the Lemma 1 below, and the case of n = 1 must be adjusted
to the new definition of ψα . Namely, one has to prove that
p + (1 − p) exp{
1
1
1
pψα ( )} ≤
Lα
p
p
for all p ∈ [0, 1]. This can be rewritten as
exp{
1+p
1
1
pψα ( )} ≤
Lα
p
p
or
pψα ( p1 )
log(1 + p1 )
≤ Lα .
By the definition of ψα one has to consider two separate cases:
1. If p−1 ≥ 2α then (1 − αp)/ log(1 + p1 ) ≤ Lα ,
2. If p−1 ≤ 2α then 1/4αp log(1 + 1p ) ≤ Lα .
Taking into account (1.1) we must show that for all L > 0
1−p
1
1
4L(e1/L − 1)
2L(e1/L − 1)
≤ L log(1 + ), if
≥
,
1 + 2L
p
p
1 + 2L
1
4L(e1/L − 1)
8L2 (e1/L − 1)
1
,
if
≤
.
≤
1 + 2L
p
1 + 2L
p log(1 + p1 )
The proof of both of these inequalities constitutes a tedious exersize in calculus and is boring
enough not to include it in this paper.
Talagrand’s Concentration Inequality
57
Lemma 1 Let g1 ≥ g2 ≥ . . . ≥ gm > 0. For α > 0 and Lα satistying (1.1) there exist
{kji : 1 ≤ j < i ≤ m} such that
kji ≥ 0, and
X
kji pj ≤ 1 for all i ≤ m,
(1.3)
j