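Proof. We set $S_i = X_{n_1} + X_{n_2} + \cdots + X_{n_i}$ for $i \le n$, and $S_0 = 0$. Equation (13) implies that $e^{X_n} \le 1 + X_n + X_n^2\phi(m)$, hence
\[
\begin{aligned}
E[e^{S}] &\le E\big[(1 + X_n + X_n^2\phi(m))\,e^{S_{n-1}}\big] \\
&= E\Big[\sum_{i=1}^{n-1} X_n\big(e^{S_i} - e^{S_{i-1}}\big)\Big] + E\big[(1 + X_n^2\phi(m))\,e^{S_{n-1}}\big] \\
&= E\Big[\sum_{i=1}^{n-1} X_n X_{n_i}\,\frac{\tanh(X_{n_i}/2)}{X_{n_i}}\big(e^{S_i} + e^{S_{i-1}}\big)\Big] + \phi(m)\,E\big[X_n^2 e^{S_{n-1}}\big] + E\big[e^{S_{n-1}}\big] \\
&\le E\Big[\sum_{i=1}^{n-1} \big\|E[X_n|\mathcal{F}^n_i]\,X_{n_i}\big\|_\infty\,\frac{e^{S_i} + e^{S_{i-1}}}{2}\Big] + \phi(m)\,E\big[E[X_n^2|\mathcal{F}^n_{n-1}]\,e^{S_{n-1}}\big] + E\big[e^{S_{n-1}}\big] \\
&\le (1 + q_n + \phi(m)v_n)\sup_{i\le n-1} E[e^{S_i}] \\
&\le e^{q_n + \phi(m)v_n}\sup_{i\le n-1} E[e^{S_i}]
\end{aligned}
\]
where $q_n$ and $v_n$ are the terms corresponding to $k = n$ in the definition of $q$ and $v$. The third line uses the identity $e^a - e^b = \tanh((a-b)/2)(e^a + e^b)$ and the fourth uses $|\tanh(x/2)/x| \le 1/2$. This proves the result by induction.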
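The three elementary facts used in this proof can be checked numerically. The sketch below is only an illustration: it assumes that $\phi(m) = (e^m - m - 1)/m^2$ and that Equation (13) is the bound $e^x \le 1 + x + x^2\phi(m)$ for $x \le m$, neither of which is restated in this section, and the value of `m` is hypothetical.

```python
import math

def phi(m):
    # phi(m) = (e^m - m - 1) / m^2 (assumed form of the function in Equation (13))
    return (math.exp(m) - m - 1.0) / (m * m)

def exp_difference_via_tanh(a, b):
    # e^a - e^b rewritten as tanh((a - b) / 2) * (e^a + e^b)
    return math.tanh((a - b) / 2.0) * (math.exp(a) + math.exp(b))

m = 1.5  # hypothetical upper bound on the variables
xs = [k / 100.0 for k in range(-600, 151) if k != 0]  # grid of x <= m, x != 0

# Largest violation of e^x <= 1 + x + x^2 * phi(m) on the grid (should not be positive)
worst_gap = max(math.exp(x) - (1.0 + x + x * x * phi(m)) for x in xs)

# Largest value of |tanh(x/2) / x| on the grid (should not exceed 1/2)
max_ratio = max(abs(math.tanh(x / 2.0) / x) for x in xs)

# Largest error of the tanh identity over a grid of pairs (a, b)
pairs = [(a / 4.0, b / 4.0) for a in range(-12, 13) for b in range(-12, 13)]
identity_error = max(
    abs((math.exp(a) - math.exp(b)) - exp_difference_via_tanh(a, b))
    for a, b in pairs
)

print(worst_gap <= 1e-12, max_ratio <= 0.5, identity_error < 1e-9)
```

The tolerances only absorb floating-point rounding; at $x = m$ the first bound holds with equality.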
2.2 Applications
2.2.1 Deviation bounds
In this section we give the deviation inequalities that can be deduced from the preceding exponential inequalities. We generalize the Bernstein inequality in Equations (18) and (22), and the Hoeffding inequality in Equation (21); Bennett inequalities could be obtained through a similar process, for which we refer to Appendix B of [?]. In the martingale case, Equations (19) and (20) do not assume that the variables are bounded, but sums of squares are involved.
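Theorem 4. With the notations of Theorem 1 we have for any $A, y$
\[
P\big(S \ge A,\ \langle X\rangle \le y\big) \le \exp\Big(-\frac{A^2}{2(y + 8q) + 2Am/3}\Big) \tag{18}
\]
\[
P\big(S \ge A,\ [X^+] + \langle X^-\rangle \le y\big) \le \exp\Big(-\frac{A^2}{2(y + 8q)}\Big) \tag{19}
\]
\[
P\big(S \ge A,\ [X] + 2\langle X\rangle \le 3y\big) \le \exp\Big(-\frac{A^2}{2(y + 8q)}\Big). \tag{20}
\]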
With the notations of Theorems 2 and 3 we have for any $A, y$
\[
P\Big(S \ge A,\ \sum_i b_i^2 \le 4y\Big) \le \exp\Big(-\frac{A^2}{2(y + 32q)}\Big) \tag{21}
\]
\[
P(S \ge A) \le \exp\Big(-\frac{A^2}{2(v + 2q) + 2Am/3}\Big). \tag{22}
\]
In the martingale case, (21) remains true if we allow $a_i$ and $b_i$ to be $\mathcal{F}_{i-1}$-measurable random variables.
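As a sanity check of the Hoeffding-type bound (21) in the simplest setting, the following sketch enumerates all outcomes of a weighted sum of independent signs and compares the exact tail with $\exp(-A^2/(2y))$. It assumes that $b_i$ is the length of the interval containing $X_i$ and that $q = 0$ for independent centered variables; the weights `c` are hypothetical.

```python
import itertools
import math

# Hypothetical example: X_i = c_i * eps_i with independent signs eps_i = +/-1,
# so X_i lies in an interval of length b_i = 2 * c_i (assumed meaning of b_i)
# and q = 0 because the X_i are martingale differences.
c = [1.0, 0.5, 2.0, 1.5, 1.0, 0.25, 0.75, 1.0, 0.5, 1.25]
b = [2.0 * ci for ci in c]
y = sum(bi * bi for bi in b) / 4.0  # smallest y such that sum_i b_i^2 <= 4y

# All 2^n equally likely sign patterns and the corresponding values of S
signs = list(itertools.product((-1, 1), repeat=len(c)))
sums = [sum(ci * si for ci, si in zip(c, s)) for s in signs]

def exact_tail(A):
    # P(S >= A), computed exactly by enumeration
    return sum(1 for s in sums if s >= A) / len(sums)

def hoeffding_bound(A):
    # Right-hand side of (21) when q = 0
    return math.exp(-A * A / (2.0 * y))

for A in (1.0, 2.0, 4.0, 6.0):
    print(A, exact_tail(A), hoeffding_bound(A))
```

Exact enumeration is used instead of simulation so that the comparison involves no sampling error.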
REMARK. Equation (21) is analogous to Corollary 3a of [6].

Proof. Applying the bound (9) to the variables $tX_i$ for some $t > 0$, we get
\[
\begin{aligned}
\log P\big(S \ge A,\ \langle X\rangle \le y\big) &\le \log E\Big[\exp\Big\{t(S - A) - \frac{t^2\langle X\rangle - t^2 y}{t^2 m^2}\big(e^{tm} - tm - 1\big)\Big\}\Big] \\
&\le 4t^2 q + \frac{y}{m^2}\big(e^{tm} - tm - 1\big) - tA \\
&\le \frac{y + 8q}{m^2}\big(e^{tm} - tm - 1\big) - tA.
\end{aligned}
\]
The optimization of this expression w.r.t. $t \ge 0$ is classical in the theory of Bennett and Bernstein inequalities and delivers (18); see for instance Appendix B of [?]. The second inequality is deduced from (10) with the same method: for $V = [X^+] + \langle X^-\rangle$ or $V = ([X] + 2\langle X\rangle)/3$ one has
\[
\log P\big(S \ge A,\ V \le y\big) \le \log E\big[e^{tS - tA - t^2(V - y)/2}\big] \le 4t^2 q + \frac{y t^2}{2} - tA
\]
and we take $t = A/(y + 8q)$.
Equations (21) and (22) are obtained similarly.
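The optimization step can be illustrated numerically. Writing $w = y + 8q$, the minimum of $\frac{w}{m^2}(e^{tm} - tm - 1) - tA$ over $t \ge 0$ is the Bennett exponent $-\frac{w}{m^2}h(Am/w)$ with $h(u) = (1+u)\log(1+u) - u$, and it is bounded above by the Bernstein exponent $-A^2/(2w + 2Am/3)$ of (18). The sketch below checks this, together with the choice $t = A/(y + 8q)$ in the quadratic case; all numerical values are hypothetical.

```python
import math

def objective(t, w, m, A):
    # (w / m^2) * (e^{tm} - tm - 1) - tA, with w standing for y + 8q
    return (w / (m * m)) * (math.exp(t * m) - t * m - 1.0) - t * A

def bennett_exponent(w, m, A):
    # Minimum over t >= 0, attained at t* = log(1 + Am/w) / m
    u = A * m / w
    return -(w / (m * m)) * ((1.0 + u) * math.log(1.0 + u) - u)

def bernstein_exponent(w, m, A):
    # Exponent on the right-hand side of (18): -A^2 / (2w + 2Am/3)
    return -A * A / (2.0 * w + 2.0 * A * m / 3.0)

w, m, A = 3.0, 1.0, 2.0  # hypothetical values of y + 8q, m and A

# Brute-force minimum on a grid of t in [0, 5], to compare with the closed form
grid_min = min(objective(k / 1000.0, w, m, A) for k in range(0, 5001))
bennett = bennett_exponent(w, m, A)
bernstein = bernstein_exponent(w, m, A)

# Quadratic case used for (19) and (20): 4 t^2 q + y t^2 / 2 - tA at t = A / (y + 8q)
y, q = 2.0, 0.125  # hypothetical values, chosen so that y + 8q = w
t = A / (y + 8.0 * q)
quadratic_value = 4.0 * t * t * q + y * t * t / 2.0 - t * A
quadratic_target = -A * A / (2.0 * (y + 8.0 * q))

print(grid_min, bennett, bernstein, quadratic_value, quadratic_target)
```

The Bennett exponent is slightly smaller than the Bernstein one, which is why (18) is stated in the simpler Bernstein form at a small loss.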
2.2.2 Bounded difference inequalities
The above results lead straightforwardly to bounded difference inequalities by using a classical martingale argument of Maurey [?]. Equation (26) is the McDiarmid inequality [?]. Equation (25) is a Bernstein inequality in the same context.
Theorem 5. Let $Y = (Y_1, \dots, Y_n)$ be a zero-mean sequence of independent variables with values in some measured space $E$. Let $f$ be a measurable function on $E^n$ with real values. Set
\[
\begin{aligned}
S &= f(Y) - E[f(Y)] \\
D_k(y, z) &= f(Y_1, \dots, Y_{k-1}, y, Y_{k+1}, \dots, Y_n) - f(Y_1, \dots, Y_{k-1}, z, Y_{k+1}, \dots, Y_n)
\end{aligned}
\]
\[
\Phi_k = \sup_{y,z} E[D_k(y, z)\,|\,Y_1, \dots, Y_{k-1}] \tag{23}
\]
\[
\begin{aligned}
\Delta_k &= f(Y) - E[f(Y)\,|\,Y_1, \dots, Y_{k-1}, Y_{k+1}, \dots, Y_n] = E[D_k(Y_k, Y'_k)\,|\,Y] \\
m &= \sup_k \operatorname{ess\,sup} \Delta_k
\end{aligned}
\]
where $Y'_k$ is an independent copy of $Y_k$. We assume the measurability of $\Phi_k$. Then for any $A, y$
\[
P\Big(S \ge A,\ \sum_k \Phi_k^2 \le 4y\Big) \le \exp\Big(-\frac{A^2}{2y}\Big) \tag{24}
\]
\[
P\Big(S \ge A,\ \sum_k E[\Delta_k^2\,|\,Y_1, \dots, Y_{k-1}] \le y\Big) \le \exp\Big(-\frac{A^2}{2y + 2Am/3}\Big). \tag{25}
\]
In particular
\[
P(S \ge A) \le \exp\Big(-\frac{2A^2}{\sum_k \delta_k^2}\Big), \qquad \delta_k = \|D_k(Y_k, Y'_k)\|_\infty. \tag{26}
\]

REMARK. Let us mention that if $f$ has the form $f(Y) = \sup_{g\in\Gamma} g(Y)$ for some finite class of functions $\Gamma$ then, with obvious notations,
\[
\begin{aligned}
D_k(y, z) &= \sup_{g\in\Gamma} g(Y_1, \dots, Y_{k-1}, y, Y_{k+1}, \dots, Y_n) - \sup_{g\in\Gamma} g(Y_1, \dots, Y_{k-1}, z, Y_{k+1}, \dots, Y_n) \\
&\le \sup_{g\in\Gamma} \big\{g(Y_1, \dots, Y_{k-1}, y, Y_{k+1}, \dots, Y_n) - g(Y_1, \dots, Y_{k-1}, z, Y_{k+1}, \dots, Y_n)\big\} \\
&= \sup_{g\in\Gamma} D^g_k(y, z);
\end{aligned}
\]
in particular $\delta_k \le \sup_{g\in\Gamma} \delta^g_k$. This is a classical argument in the theory of concentration inequalities.

Proof. We shall use (21) and (18) with
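\[
X_k = E[f(Y)\,|\,\mathcal{F}_k] - E[f(Y)\,|\,\mathcal{F}_{k-1}], \qquad \mathcal{F}_k = \sigma(Y_1, \dots, Y_k).
\]
We have already pointed out that $q = 0$ since $X_k$ is a martingale difference. Let us define the random variables
\[
L_k = \inf_y E[F_k(y)\,|\,Y_1, \dots, Y_{k-1}], \qquad U_k = \sup_y E[F_k(y)\,|\,Y_1, \dots, Y_{k-1}].
\]
The inequality
\[
L_k \le E[f(Y)\,|\,\mathcal{F}_k] \le U_k
\]
implies
\[
L_k - E[f(Y)\,|\,\mathcal{F}_{k-1}] \le X_k \le U_k - E[f(Y)\,|\,\mathcal{F}_{k-1}]
\]
and since $U_k - L_k = \Phi_k$ we can apply (21) with $b_k = \Phi_k$ and get (24). Clearly $X_k$ can be rewritten as $X_k = E[\Delta_k\,|\,\mathcal{F}_k]$, hence $E[X_k^2\,|\,\mathcal{F}_{k-1}] \le E[\Delta_k^2\,|\,\mathcal{F}_{k-1}]$ and $\langle X\rangle \le V$ with $V = \sum_k E[\Delta_k^2\,|\,\mathcal{F}_{k-1}]$, and (25) follows from (18).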
2.2.3 Inequalities for suprema of U-statistics