Electronic Journal of Probability
Vol. 13 (2008), Paper no. 5, pages 107–134.
Journal URL
http://www.math.washington.edu/~ejpecp/
Concentration of the Spectral Measure for Large
Random Matrices with Stable Entries
Christian Houdré∗
Hua Xu†
Abstract
We derive concentration inequalities for functions of the empirical measure of large random
matrices with infinitely divisible entries, in particular stable or heavy-tailed ones. We also
give concentration results for some other functionals of these random matrices, such as the
largest eigenvalue or the largest singular value.
Key words: Spectral Measure, Random Matrices, Infinite Divisibility, Stable Vector, Concentration.
AMS 2000 Subject Classification: Primary 60E07, 60F10, 15A42, 15A52.
Submitted to EJP on June 12, 2007, final version accepted January 18, 2008.
∗ Georgia Institute of Technology, School of Mathematics, Atlanta, Georgia, 30332, [email protected]
† Georgia Institute of Technology, School of Mathematics, Atlanta, Georgia, 30332, [email protected]
1 Introduction and Statements of Results
Large random matrices have recently attracted a lot of attention in fields such as statistics, mathematical physics or combinatorics (e.g., see Mehta [24], Bai and Silverstein [4], Johnstone [18],
Anderson, Guionnet and Zeitouni [2]). For various classes of matrix ensembles, the asymptotic
behavior of the properly centered and normalized spectral measure or of the largest eigenvalue
is understood. Many of these results hold true for matrices with independent entries satisfying
some moment conditions (Wigner [35], Tracy and Widom [33], Soshnikov [28], Girko [8], Pastur
[25], Bai [3], Götze and Tikhomirov [9]).
There is relatively little work outside the independent or finite second moment assumptions. Let
us mention Soshnikov [30] who, using ideas from perturbation theory, studied the distribution of
the largest eigenvalue of Wigner matrices with entries having heavy tails. (Recall that a real (or
complex) Wigner matrix is a symmetric (or Hermitian) matrix whose entries $M_{i,i}$, $1 \le i \le N$,
and $M_{i,j}$, $1 \le i < j \le N$, form two independent families of iid (complex valued in the Hermitian
case) random variables.) In particular (see [30]), for a properly normalized Wigner matrix
with entries belonging to the domain of attraction of an $\alpha$-stable law,
$\lim_{N\to\infty} P^N(\lambda_{\max} \le x) = \exp(-x^{-\alpha})$ (here $\lambda_{\max}$ is the largest eigenvalue of such a
normalized matrix). Further, Soshnikov and Fyodorov [32], using the method of determinants,
derived results for the largest singular value of $K \times N$ rectangular matrices with independent
Cauchy entries, showing that the largest singular value of such a matrix is of order $K^2N^2$ (see
also the survey article [31], where band and sparse matrices are studied).
On another front, Guionnet and Zeitouni [10] gave concentration results for functionals of the
empirical spectral measure of self-adjoint random matrices whose entries are independent and
either satisfy a logarithmic Sobolev inequality or are compactly supported. They obtained, for
such matrices, the subgaussian decay of the tails of the empirical spectral measure when it
deviates from its mean. They also noted that their technique could be applied to prove results
for the largest eigenvalue or for the spectral radius of such matrices. Alon, Krivelevich and Vu
[1] further obtained concentration results for any of the eigenvalues of a Wigner matrix with
uniformly bounded entries (see Ledoux [20] for more developments and references).
Our purpose in the present work is to deal with matrices whose entries form a general infinitely
divisible vector, and in particular a stable one (without independence assumption). As is well
known, unless degenerate, an infinitely divisible random variable cannot be bounded. We
obtain concentration results for functionals of the corresponding empirical spectral measure,
allowing for any type of light or heavy tails. The methodologies developed here apply as well to
the largest eigenvalue or to the spectral radius of such random matrices.
Following the lead of Guionnet and Zeitouni [10], let us start by setting our notation and
framework.
Let $\mathcal{M}_{N\times N}(\mathbb{C})$ be the set of $N \times N$ Hermitian matrices with complex entries, which is throughout
equipped with the Hilbert-Schmidt (or Frobenius or entrywise Euclidean) norm:
\[
\|M\| = \sqrt{\mathrm{tr}(M^*M)} = \sqrt{\sum_{i,j=1}^{N} |M_{i,j}|^2}.
\]
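The two expressions for the Hilbert-Schmidt norm can be compared numerically. The following is a minimal sketch in plain Python; the helper names (`hs_norm_entrywise`, `hs_norm_via_trace`) and the sample matrix are illustrative assumptions, not part of the paper.

```python
import math

def conjugate_transpose(m):
    """Return M* for a square matrix stored as nested lists of complex numbers."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain triple-loop product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def hs_norm_entrywise(m):
    """Square root of the sum of squared moduli of all entries."""
    return math.sqrt(sum(abs(entry) ** 2 for row in m for entry in row))

def hs_norm_via_trace(m):
    """Square root of tr(M* M); the trace is real and nonnegative."""
    product = matmul(conjugate_transpose(m), m)
    trace = sum(product[i][i] for i in range(len(m)))
    return math.sqrt(trace.real)

# Hypothetical 2 x 2 Hermitian example.
M = [[2 + 0j, 1 - 1j],
     [1 + 1j, -1 + 0j]]
norm_entrywise = hs_norm_entrywise(M)
norm_trace = hs_norm_via_trace(M)
```

Both routines agree up to rounding, which is the identity stated in the display.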
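Let $f$ be a real valued function on $\mathbb{R}$. The function $f$ can be viewed as mapping $\mathcal{M}_{N\times N}(\mathbb{C})$ to
$\mathcal{M}_{N\times N}(\mathbb{C})$. Indeed, for $M = (M_{i,j})_{1\le i,j\le N} \in \mathcal{M}_{N\times N}(\mathbb{C})$, so that $M = UDU^*$, where $D$ is a
diagonal matrix, with real entries $\lambda_1, \dots, \lambda_N$, and $U$ is a unitary matrix, set
\[
f(M) = U f(D) U^*, \qquad
f(D) = \begin{pmatrix}
f(\lambda_1) & 0 & \cdots & 0 \\
0 & f(\lambda_2) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & f(\lambda_N)
\end{pmatrix}.
\]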
Let $\mathrm{tr}(M) = \sum_{i=1}^{N} M_{i,i}$ be the trace operator on $\mathcal{M}_{N\times N}(\mathbb{C})$ and set also
\[
\mathrm{tr}_N(M) = \frac{1}{N} \sum_{i=1}^{N} M_{i,i}.
\]
For an $N \times N$ random Hermitian matrix with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_N$, let
$F_N(x) = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}_{\{\lambda_i \le x\}}$ be the corresponding empirical spectral distribution function. As is well known,
if $M$ is an $N \times N$ Hermitian Wigner matrix with $E[M_{1,1}] = E[M_{1,2}] = 0$, $E[|M_{1,2}|^2] = 1$,
and $E[M_{1,1}^2] < \infty$, the spectral measure of $M/\sqrt{N}$ converges to the semicircle law:
$\sigma(dx) = \sqrt{4 - x^2}\,\mathbf{1}_{\{|x|\le 2\}}\,dx/2\pi$ ([2]).
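As a quick sanity check on the normalization of the semicircle law, its total mass and second moment can be integrated numerically. This is only a sketch using a midpoint rule in plain Python; the function names and the step count are illustrative assumptions.

```python
import math

def semicircle_density(x):
    """Density sqrt(4 - x^2) / (2 pi) on [-2, 2], zero elsewhere."""
    if abs(x) > 2.0:
        return 0.0
    return math.sqrt(4.0 - x * x) / (2.0 * math.pi)

def midpoint_integral(g, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * width) for k in range(steps)) * width

total_mass = midpoint_integral(semicircle_density, -2.0, 2.0)
second_moment = midpoint_integral(
    lambda x: x * x * semicircle_density(x), -2.0, 2.0)
```

Both quantities come out close to 1, in line with the variance normalization $E[|M_{1,2}|^2] = 1$ of the off-diagonal entries.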
We study below the tail behavior of either the spectral measure or the linear statistic of f (M)
for classes of matrices M. Still following Guionnet and Zeitouni, we focus on a general random
matrix XA given as follows:
\[
X_A = ((X_A)_{i,j})_{1\le i,j\le N}, \qquad X_A = X_A^*, \qquad (X_A)_{i,j} = \frac{1}{\sqrt{N}} A_{i,j}\,\omega_{i,j},
\]
with $(\omega_{i,j})_{1\le i,j\le N} = (\omega^R_{i,j} + \sqrt{-1}\,\omega^I_{i,j})_{1\le i,j\le N}$, $\omega_{i,j} = \overline{\omega_{j,i}}$, and where $\omega_{i,j}$, $1 \le i \le j \le N$, is a
complex valued random variable with law $P_{i,j} = P^R_{i,j} + \sqrt{-1}\,P^I_{i,j}$, $1 \le i \le j \le N$, with $P^I_{i,i} = \delta_0$
(by the Hermitian property). Moreover, the matrix $A = (A_{i,j})_{1\le i,j\le N}$ is Hermitian with, in most
cases, non-random complex valued entries uniformly bounded, say, by $a$.
Different choices for the entries of $A$ make it possible to cover various types of ensembles. For instance, if
$\omega_{i,j}$, $1 \le i < j \le N$, and $\omega_{i,i}$, $1 \le i \le N$, are iid $N(0,1)$ random variables, taking $A_{i,i} = \sqrt{2}$
and $A_{i,j} = 1$, for $1 \le i < j \le N$, gives the GOE (Gaussian Orthogonal Ensemble). If
$\omega^R_{i,j}$, $\omega^I_{i,j}$, $1 \le i < j \le N$, and $\omega^R_{i,i}$, $1 \le i \le N$, are iid $N(0,1)$ random variables, taking $A_{i,i} = 1$
and $A_{i,j} = 1/\sqrt{2}$, for $1 \le i < j \le N$, gives the GUE (Gaussian Unitary Ensemble) (see [24]).
Moreover, if $\omega^R_{i,j}$, $\omega^I_{i,j}$, $1 \le i < j \le N$, and $\omega^R_{i,i}$, $1 \le i \le N$, are two independent families of real
valued random variables, taking $A_{i,j} = 0$ for $|i - j|$ large and $A_{i,j} = 1$ otherwise gives band
matrices.
Proper choices of non-random $A_{i,j}$ also make it possible to cover Wishart matrices, as seen in
the later part of this section. In certain instances, $A$ can also be chosen to be random, as in the
case of diluted matrices, in which case $A_{i,j}$, $1 \le i \le j \le N$, are iid Bernoulli random variables
(see [10]).
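The GOE choice of $A$ described above can be sketched as a small sampler. The code below is a hedged illustration in plain Python: the function names, the seed and the size $N = 60$ are assumptions made for the example, and only the real symmetric (GOE) case is covered.

```python
import math
import random

def sample_goe(n, rng):
    """Sample X_A with (X_A)_ij = A_ij * omega_ij / sqrt(n),
    where A_ii = sqrt(2), A_ij = 1 for i != j, and omega_ij are iid N(0, 1)."""
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            omega = rng.gauss(0.0, 1.0)
            a = math.sqrt(2.0) if i == j else 1.0
            x[i][j] = a * omega / math.sqrt(n)
            x[j][i] = x[i][j]  # symmetry: X_A equals its own transpose
    return x

def normalized_trace_of_square(x):
    """tr_N(X^2) = (1/N) * sum over i, j of X_ij * X_ji."""
    n = len(x)
    return sum(x[i][j] * x[j][i] for i in range(n) for j in range(n)) / n

rng = random.Random(0)  # fixed seed so the example is reproducible
N = 60
X = sample_goe(N, rng)
tr_n_x2 = normalized_trace_of_square(X)
```

With this scaling the expectation of `tr_n_x2` is $(N+1)/N$, so a single sample should land near 1, consistent with the semicircle normalization.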
On $\mathbb{R}^{N^2}$, let $P^N$ be the joint law of the random vector $X = (\omega^R_{i,i}, \omega^R_{i,j}, \omega^I_{i,j})$, $1 \le i < j \le N$,
where it is understood that the indices for $\omega^R_{i,i}$ are $1 \le i \le N$. Let $E^N$ be the corresponding
expectation. Denote by $\hat\mu^N_A$ the empirical spectral measure of the eigenvalues of $X_A$, and further
note that
\[
\mathrm{tr}_N f(X_A) = \frac{1}{N}\,\mathrm{tr}(f(X_A)) = \int_{\mathbb{R}} f(x)\,\hat\mu^N_A(dx),
\]
for any bounded Borel function $f$. For a Lipschitz function $f : \mathbb{R}^d \to \mathbb{R}$, set
\[
\|f\|_{Lip} = \sup_{x \ne y} \frac{|f(x) - f(y)|}{\|x - y\|},
\]
where throughout $\|\cdot\|$ is the Euclidean norm, and where we write $f \in \mathrm{Lip}(c)$ whenever $\|f\|_{Lip} \le c$.
Each element $M$ of $\mathcal{M}_{N\times N}(\mathbb{C})$ has a unique collection of eigenvalues $\lambda = \lambda(M) = (\lambda_1, \dots, \lambda_N)$
listed in non increasing order according to multiplicity in the simplex
\[
S^N = \{\lambda_1 \ge \cdots \ge \lambda_N : \lambda_i \in \mathbb{R},\ 1 \le i \le N\},
\]
where throughout $S^N$ is equipped with the Euclidean norm $\|\lambda\| = \sqrt{\sum_{i=1}^{N} \lambda_i^2}$. It is a classical
result, sometimes called Lidskii's theorem ([26]), that the map $\mathcal{M}_{N\times N}(\mathbb{C}) \to S^N$ which associates
to each Hermitian matrix its ordered list of real eigenvalues is 1-Lipschitz ([11], [19]). For
a matrix XA under consideration with eigenvalues λ(XA ), it is then clear that the map ϕ :
2
R , ωR , ωI )
7→ λ(XA ) is Lipschitz, from (RN , k · k) to (S N , k · k), with Lipschitz
(ωi,i
i,j 1≤i 0, let pγ = inf x > 0 : 0 < V 2 (x)/x2 ≤
kuk≤x
ª
γ . Let f ∈ Lip(1), then for any γ such that ν̄(pγ ) ≤ 1/4,
(i) any median $m(f(X))$ of $f(X)$ satisfies
\[
|m(f(X)) - f(0)| \le G_1(\gamma) := p_\gamma\big(\sqrt{\gamma} + 3k_\gamma(1/4)\big) + E_\gamma,
\]
(ii) the mean $E^N[f(X)]$ of $f(X)$, if it exists, satisfies
\[
|E^N[f(X)] - f(0)| \le G_2(\gamma) := p_\gamma\big(\sqrt{\gamma} + k_\gamma(1/4)\big) + E_\gamma,
\]
where $k_\gamma(x)$, $x > 0$, is the solution, in $y$, of the equation
\[
y - (y + \gamma)\ln\Big(1 + \frac{y}{\gamma}\Big) = \ln x,
\]
and where
Eγ =
Ã
2
N ³
X
k=1
hek , βi −
Z
pγ 0, let
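\[
\mathrm{Lip}_b(c) = \big\{ f : \mathbb{R} \to \mathbb{R} : \|f\|_{Lip} \le c,\ \|f\|_\infty \le b \big\},
\]
while for a fixed compact set $K \subset \mathbb{R}$, with diameter $|K| = \sup_{x,y\in K} |x - y|$, let
\[
\mathrm{Lip}_K(c) := \{ f : \mathbb{R} \to \mathbb{R} : \|f\|_{Lip} \le c,\ \mathrm{supp}(f) \subset K \},
\]
where $\mathrm{supp}(f)$ is the support of $f$.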
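The quantity $k_\gamma(x)$ entering $G_1(\gamma)$ and $G_2(\gamma)$ in Proposition 1.1 is defined only implicitly, through the equation $y - (y+\gamma)\ln(1 + y/\gamma) = \ln x$. A minimal bisection sketch is given below; it assumes $0 < x < 1$ (the case $x = 1/4$ used in the proposition), for which the left-hand side decreases from 0 and the positive root is unique. The function names and tolerance are illustrative.

```python
import math

def phi(y, gamma):
    """Left-hand side y - (y + gamma) * ln(1 + y / gamma); phi(0) = 0 and
    phi is strictly decreasing on y > 0 since phi'(y) = -ln(1 + y / gamma)."""
    return y - (y + gamma) * math.log1p(y / gamma)

def k_gamma(x, gamma, tol=1e-12):
    """Positive root y of phi(y, gamma) = ln(x), assuming 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("this sketch only handles 0 < x < 1")
    target = math.log(x)
    lo, hi = 0.0, 1.0
    while phi(hi, gamma) > target:  # enlarge the bracket until the root is inside
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if phi(mid, gamma) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k_quarter = k_gamma(0.25, gamma=1.0)
```

Once $p_\gamma$ and $E_\gamma$ are known for a given Lévy measure, a value such as `k_quarter` is all that is needed to evaluate $G_1(\gamma)$ and $G_2(\gamma)$.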
R , ωR , ωI )
law PN ∼
Theorem 1.2. Let X = (ωi,i
i,j 1≤i 0. Let T = sup{t ≥ 0 : EN etkXk < +∞}
and let $h^{-1}$ be the inverse of
\[
h(s) = \int_{\mathbb{R}^{N^2}} \|u\|\big(e^{s\|u\|} - 1\big)\,\nu(du), \qquad 0 < s < T.
\]
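(i) For any compact set $K \subset \mathbb{R}$,
\[
P^N\Big( \sup_{f \in \mathrm{Lip}_K(1)} \big|\mathrm{tr}_N(f(X_A)) - E^N[\mathrm{tr}_N(f(X_A))]\big| \ge \delta \Big)
\le \frac{8|K|}{\delta} \exp\Big\{ -\int_0^{\frac{N\delta^2}{8\sqrt{2}a|K|}} h^{-1}(s)\,ds \Big\},
\tag{1.4}
\]
for all $\delta > 0$ such that $\delta^2 < 8\sqrt{2}\,a|K|\,h(T^-)/N$.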
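(ii)
\[
P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f(X_A)) - E^N[\mathrm{tr}_N(f(X_A))]\big| \ge \delta \Big)
\le \frac{C(\delta,b)}{\delta} \exp\Big\{ -\int_0^{\frac{N\delta^2}{\sqrt{2}aC(\delta,b)}} h^{-1}(s)\,ds \Big\},
\tag{1.5}
\]
for all $\delta > 0$ such that $\delta^2 \le \sqrt{2}\,aC(\delta,b)\,h(T^-)/N$, where
\[
C(\delta,b) = C\Big( \frac{\sqrt{2}a}{\sqrt{N}}\big( G_2(\gamma) + h(t_0) \big) + b \Big),
\]
with $G_2(\gamma)$ as in Proposition 1.1, $C$ a universal constant, and with $t_0$ the solution, in $t$, of
$t h(t) - \int_0^t h(s)\,ds - \ln(12b/\delta) = 0$.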
Remark 1.3. (i) The order of $C(\delta,b)$ in part (ii) can be made more specific. Indeed, it will
be clear from the proof of this theorem (see (2.39)), that for any fixed $t^*$, $0 < t^* \le T$,
\[
C(\delta,b) \le C\,\frac{\sqrt{2}a}{\sqrt{N}}\Big( \frac{\ln\frac{12b}{\delta}}{t^*} + \frac{\int_0^{t^*} h(s)\,ds}{t^*} + G_2(\gamma) \Big).
\]
(ii) As seen from the proof (see (2.38)), in the statement of the above theorem, $G_2(\gamma)$ can be
replaced by $E^N[\|X\|]$. Now $E^N[\|X\|]$ is of order $N$, since
\[
N \min_{j=1,2,\dots,N^2} E^N[|X_j|] \le E^N[\|X\|] \le N \max_{j=1,2,\dots,N^2} \sqrt{E^N[X_j^2]},
\tag{1.6}
\]
where the $X_j$, $j = 1, 2, \dots, N^2$, are the components of $X$. Actually, an estimate more
precise than (1.6) is given by a result of Marcus and Rosiński [22] which asserts that if
$E[X] = 0$, then
\[
\frac{1}{4}\,x_0 \le E[\|X\|] \le \frac{17}{8}\,x_0,
\]
where $x_0$ is the solution of the equation:
\[
\frac{V^2(x)}{x^2} + \frac{U(x)}{x} = 1,
\tag{1.7}
\]
with $V^2(x)$ as defined before and $U(x) = \int_{\|u\|\ge x} \|u\|\,\nu(du)$, $x > 0$.
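(iii) As usual, one can easily pass from the mean $E^N[\mathrm{tr}_N(f)]$ to any median $m(\mathrm{tr}_N(f))$ in either
(1.4) or (1.5). Indeed, for any $0 \le \delta \le 2b$, if
\[
\sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f) - m(\mathrm{tr}_N(f))| \ge \delta,
\]
there exist a function $f \in \mathrm{Lip}_b(1)$ and a median $m(\mathrm{tr}_N(f))$ of $\mathrm{tr}_N(f)$, such that either
$\mathrm{tr}_N(f) - m(\mathrm{tr}_N(f)) \ge \delta$ or $\mathrm{tr}_N(f) - m(\mathrm{tr}_N(f)) \le -\delta$. Without loss of generality assuming
the former, otherwise dealing with the latter with $-f$, consider the function
$g(y) = \min(d(y, A), \delta)/2$, $y \in \mathbb{R}^{N^2}$, where $A = \{\mathrm{tr}_N(f) \le m(\mathrm{tr}_N(f))\}$. Clearly
$g \in \mathrm{Lip}_b(1)$, $E^N[\mathrm{tr}_N(g)] \le \delta/4$, and therefore $\mathrm{tr}_N(g) - E^N[\mathrm{tr}_N(g)] \ge \delta/4$, which indicates that
\[
\sup_{g \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(g) - E^N[\mathrm{tr}_N(g)]\big| \ge \frac{\delta}{4}.
\]
Hence,
\[
P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f) - m(\mathrm{tr}_N(f))\big| \ge \delta \Big)
\le P^N\Big( \sup_{g \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(g) - E^N[\mathrm{tr}_N(g)]\big| \ge \frac{\delta}{4} \Big).
\tag{1.8}
\]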
(iv) When the entries of X are independent, and under a finite exponential moment assumption,
the dependency in N of the function h (above and below) can sometimes be improved. We
refer the reader to [15] where some of these generic problems are discussed and tackled.
Next, recall (see [7], [19]) that the Wasserstein distance between any two probability measures
$\mu_1$ and $\mu_2$ on $\mathbb{R}$ is defined by
\[
d_W(\mu_1, \mu_2) = \sup_{f \in \mathrm{Lip}_b(1)} \Big| \int_{\mathbb{R}} f\,d\mu_1 - \int_{\mathbb{R}} f\,d\mu_2 \Big|.
\tag{1.9}
\]
Hence, Theorem 1.2 actually gives a concentration result, with respect to the Wasserstein distance,
for the empirical spectral measure $\hat\mu^N_A$, when it deviates from its mean $E^N[\hat\mu^N_A]$.
As in [10], we can also obtain a concentration result for the distance between any particular
probability measure and the empirical spectral measure.
R , ωR , ωI )
Proposition 1.4. Let X = (ωi,i
law PN ∼
i,j
i,j 1≤i 0. Let T = sup{t > 0 : EN etkXk < +∞}
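and let $h^{-1}$ be the inverse of $h(s) = \int_{\mathbb{R}^{N^2}} \|u\|(e^{s\|u\|} - 1)\,\nu(du)$, $0 < s < T$. Then, for any
probability measure $\mu$,
\[
P^N\big( d_W(\hat\mu^N_A, \mu) - E^N[d_W(\hat\mu^N_A, \mu)] \ge \delta \big)
\le \exp\Big\{ -\int_0^{\frac{N\delta}{\sqrt{2}a}} h^{-1}(s)\,ds \Big\},
\tag{1.10}
\]
for all $0 < \delta < \sqrt{2}\,a\,h(T^-)/N$.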
Of particular importance is the case of an infinitely divisible vector having boundedly supported
Lévy measure. We then have:
R , ωR , ωI )
N ∼
Corollary 1.5. Let X = (ωi,i
i,j
i,j 1≤i 0 : ν(x : kxk > r) = 0}, let
¡ 0, ν)
¢ that
$V^2 = V^2(R) = \int_{\mathbb{R}^{N^2}} \|u\|^2\,\nu(du)$, and for $x > 0$ let
\[
\ell(x) = (1 + x)\ln(1 + x) - x.
\]
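(i) For any $\delta > 0$,
\[
P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f(X_A)) - E^N[\mathrm{tr}_N(f(X_A))]\big| \ge \delta \Big)
\le \frac{C(\delta,b)}{\delta} \exp\Big\{ -\frac{V^2}{R^2}\,\ell\Big( \frac{NR\delta^2}{\sqrt{2}aC(\delta,b)V^2} \Big) \Big\},
\tag{1.11}
\]
where
\[
C(\delta,b) = C\Big( \frac{\sqrt{2}a}{\sqrt{N}}\Big( G_2(\gamma) + \frac{V^2}{R}\big(e^{t_0R} - 1\big) \Big) + b \Big),
\]
with $G_2(\gamma)$ as in Proposition 1.1, $C$ a universal constant, and $t_0$ the solution, in $t$, of
\[
\frac{V^2}{R^2}\big( tRe^{tR} - e^{tR} + 1 \big) = \ln\frac{12b}{\delta}.
\]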
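(ii) For any probability measure $\mu$ on $\mathbb{R}$, and any $\delta > 0$,
\[
P^N\big( d_W(\hat\mu^N_A, \mu) - E^N[d_W(\hat\mu^N_A, \mu)] \ge \delta \big)
\le \exp\Big\{ \frac{N\delta}{\sqrt{2}aR} - \Big( \frac{N\delta}{\sqrt{2}aR} + \frac{V^2}{R^2} \Big)\ln\Big( 1 + \frac{NR\delta}{\sqrt{2}aV^2} \Big) \Big\}.
\tag{1.12}
\]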
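The right-hand side of (1.12) is fully explicit, so it can be evaluated directly. The sketch below (plain Python, with hypothetical parameter values for $N$, $a$, $R$ and $V^2$) computes it in two ways: as written in (1.12), and in the equivalent form $\exp\{-(V^2/R^2)\,\ell(Rx/V^2)\}$ with $x = N\delta/(\sqrt{2}a)$, which follows by expanding $\ell$.

```python
import math

def ell(x):
    """ell(x) = (1 + x) ln(1 + x) - x."""
    return (1.0 + x) * math.log1p(x) - x

def tail_bound_1_12(delta, n, a, r, v2):
    """Right-hand side of (1.12), written exactly as in the corollary."""
    x = n * delta / (math.sqrt(2.0) * a)
    return math.exp(x / r - (x / r + v2 / r ** 2) * math.log1p(r * x / v2))

def tail_bound_via_ell(delta, n, a, r, v2):
    """Same bound rewritten as exp(-(V^2 / R^2) * ell(R x / V^2))."""
    x = n * delta / (math.sqrt(2.0) * a)
    return math.exp(-(v2 / r ** 2) * ell(r * x / v2))

# Hypothetical parameters, chosen only for illustration.
params = dict(n=100, a=1.0, r=1.0, v2=50.0)
deltas = [0.1, 0.2, 0.5, 1.0]
bounds = [tail_bound_1_12(d, **params) for d in deltas]
bounds_ell = [tail_bound_via_ell(d, **params) for d in deltas]
```

The bound decreases in $\delta$, roughly like a Gaussian tail for small $\delta$ and like a Poisson tail for large $\delta$.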
Remark 1.6. (i) As in Theorem 1.2, the dependency of $C(\delta,b)$ in $\delta$ and $b$ can be made more
precise. A key step in the proof of (1.11) is to choose $\tau$ such that
\[
E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \le \delta/12b,
\]
and then $C(\delta,b)$ is determined by $\tau$. Minimizing, in $t$, the right hand side of (2.38), leads
to the following estimate
\[
E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \le \exp\Big\{ -\frac{V^2}{R^2}\,\ell\Big( \frac{R\big( \frac{\sqrt{N}}{\sqrt{2}a}\tau - G_2(\gamma) \big)}{V^2} \Big) \Big\},
\]
where $\ell(x) = (1 + x)\ln(1 + x) - x$. For $x \ge 1$, $2\ell(x) \ge x\ln x$. Hence one can choose $\tau$ to
be the solution, in $x$, of the equation
\[
\frac{x}{R}\ln\frac{xR}{V^2} = 2\ln\frac{12b}{\delta}.
\]
It then follows that $C(\delta,b)$ can be taken to be
\[
C\Big( \frac{\sqrt{2}a}{\sqrt{N}}\big( G_2(\gamma) + \tau \big) + b \Big).
\]
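The choice of $\tau$ in Remark 1.6 rests on the elementary inequality $2\ell(x) \ge x\ln x$ for $x \ge 1$ and on solving $(x/R)\ln(xR/V^2) = 2\ln(12b/\delta)$. Both can be checked numerically. The following sketch assumes $\delta < 12b$ (so the right-hand side is positive) and looks for the root on $x > V^2/R$, where the left-hand side is positive and increasing; the names and sample values are illustrative.

```python
import math

def ell(x):
    """ell(x) = (1 + x) ln(1 + x) - x."""
    return (1.0 + x) * math.log1p(x) - x

def psi(x, r, v2):
    """Left-hand side (x / R) * ln(x R / V^2) of the equation in Remark 1.6."""
    return (x / r) * math.log(x * r / v2)

def solve_tau(r, v2, b, delta, tol=1e-12):
    """Root of psi(x) = 2 ln(12 b / delta) on x > V^2 / R, by bisection."""
    target = 2.0 * math.log(12.0 * b / delta)
    if target <= 0.0:
        raise ValueError("this sketch assumes delta < 12 b")
    lo = v2 / r            # psi vanishes here
    hi = 2.0 * lo
    while psi(hi, r, v2) < target:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if psi(mid, r, v2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical values R = 1, V^2 = 2, b = 1, delta = 0.5.
tau = solve_tau(r=1.0, v2=2.0, b=1.0, delta=0.5)
grid = [1.0 + 0.01 * k for k in range(10000)]
inequality_holds = all(2.0 * ell(x) >= x * math.log(x) for x in grid)
```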
Without the finite exponential moment assumption, an interesting class of random matrices with
infinitely divisible entries are the ones with stable entries, which we now analyze.
Recall that $X$ in $\mathbb{R}^d$ is $\alpha$-stable ($0 < \alpha < 2$) if its Lévy measure $\nu$ is given, for any Borel set
$B \in \mathcal{B}(\mathbb{R}^d)$, by
\[
\nu(B) = \int_{S^{d-1}} \sigma(d\xi) \int_0^{+\infty} \mathbf{1}_B(r\xi)\,\frac{dr}{r^{1+\alpha}},
\tag{1.13}
\]
where $\sigma$, the spherical component of the Lévy measure, is a finite positive measure on $S^{d-1}$, the
unit sphere of $\mathbb{R}^d$. Since the expected value of the spectral measure of a matrix with $\alpha$-stable
entries might not exist, we study the deviation from a median. Here is a sample result.
R , ωR , ωI )
Theorem 1.7. Let 0 < α < 2, and let X = (ωi,i
i,j 1≤i
√
(1.14)
h
i1/α
2
, and where C(α) = 4α (2 − α + eα)/α(2 − α).
2a 2σ(S N −1 )C(α)
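(ii) Let $\lambda_{\max}(X_A)$ be the largest eigenvalue of $X_A$, and let $m(\lambda_{\max}(X_A))$ be any median of
$\lambda_{\max}(X_A)$, then
\[
P^N\big( \lambda_{\max}(X_A) - m(\lambda_{\max}(X_A)) \ge \delta \big) \le C(\alpha)\,(\sqrt{2}a)^\alpha\,\frac{\sigma(S^{N^2-1})}{N^{\alpha/2}\delta^\alpha},
\tag{1.15}
\]
whenever $\delta\sqrt{N} > \sqrt{2}a\,\big[ 2\sigma(S^{N^2-1})C(\alpha) \big]^{1/\alpha}$, and where $C(\alpha) = 4^\alpha(2 - \alpha + e\alpha)/\alpha(2 - \alpha)$.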
I
Remark 1.8. Let M be a Wigner matrix whose entries Mi,i , 1≤i≤N , MR
i,j , 1≤i 0,
\[
P(|M_{1,1}| > \delta) = \frac{L(\delta)}{\delta^\alpha},
\]
for some slowly varying positive function $L$ such that $\lim_{\delta\to\infty} L(t\delta)/L(\delta) = 1$, for all $t > 0$. Soshnikov
[30] showed that, for any $\delta > 0$,
\[
\lim_{N\to\infty} P^N\big( \lambda_{\max}(b_N^{-1}M) \ge \delta \big) = 1 - \exp(-\delta^{-\alpha}),
\]
where $b_N$ is a normalizing factor such that $\lim_{N\to\infty} N^2L(b_N)/b_N^\alpha = 2$ and where $\lambda_{\max}(b_N^{-1}M)$ is
the largest eigenvalue of $b_N^{-1}M$. In fact $\lim_{N\to\infty} N^{\frac{2}{\alpha}-\epsilon}/b_N = 0$ and $\lim_{N\to\infty} b_N/N^{\frac{2}{\alpha}+\epsilon} = 0$, for any
$\epsilon > 0$. As stated in [13], when the random vector $X$ is in the domain of attraction of an $\alpha$-stable
distribution, concentration inequalities similar to (1.14) or (1.15) can be obtained for general
Lipschitz functions. In particular, if the Lévy measure of $X$ is given by
\[
\nu(B) = \int_{S^{N^2-1}} \sigma(d\xi) \int_0^{+\infty} \mathbf{1}_B(r\xi)\,\frac{L(r)\,dr}{r^{1+\alpha}},
\tag{1.16}
\]
for some slowly varying function $L$ on $[0, +\infty)$, and if we still choose the normalizing factor $b_N$
such that $\lim_{N\to\infty} \sigma(S^{N^2-1})L(b_N)/b_N^\alpha$ is constant, then,
\[
P^N\big( \lambda_{\max}(b_N^{-1}M) - m(\lambda_{\max}(b_N^{-1}M)) \ge \delta \big)
\le \frac{C(\alpha)\,\sigma(S^{N^2-1})\,2^{\alpha/2}}{b_N^\alpha}\,\frac{L\big(b_N\delta/\sqrt{2}\big)}{\delta^\alpha},
\tag{1.17}
\]
whenever $(\delta b_N)^\alpha \ge 2^{1+\alpha/2}\,C(\alpha)\,\sigma(S^{N^2-1})\,L\big(b_N\delta/\sqrt{2}\big)$.
Now, recall that for an $N^2$ dimensional vector with iid entries, $\sigma(S^{N^2-1}) = N^2(\hat\sigma(1) + \hat\sigma(-1))$,
where $\hat\sigma(1)$ is short for $\sigma(1, 0, \dots, 0)$ and similarly for $\hat\sigma(-1)$. Thus, for fixed $N$, our result gives
the correct order of the upper bound for large values of $\delta$, since for $\delta > 1$,
\[
\frac{e - 1}{e\,\delta^\alpha} \le 1 - e^{-\delta^{-\alpha}} \le \frac{1}{\delta^\alpha}.
\]
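The two-sided comparison between the limiting tail $1 - e^{-\delta^{-\alpha}}$ and $\delta^{-\alpha}$ comes from the elementary bounds $(1 - e^{-1})x \le 1 - e^{-x} \le x$ for $0 < x \le 1$, applied with $x = \delta^{-\alpha}$. A small numerical check, with an arbitrary illustrative grid of $\delta > 1$ and $\alpha \in (0, 2)$, might look as follows.

```python
import math

def limiting_tail(delta, alpha):
    """1 - exp(-delta^(-alpha)), computed with expm1 for accuracy."""
    return -math.expm1(-delta ** (-alpha))

def bounds_hold(delta, alpha):
    """Check (e - 1) / (e delta^alpha) <= tail <= 1 / delta^alpha."""
    lower = (math.e - 1.0) / (math.e * delta ** alpha)
    upper = 1.0 / delta ** alpha
    return lower <= limiting_tail(delta, alpha) <= upper

checks = [bounds_hold(1.0 + 0.25 * k, alpha)
          for alpha in (0.5, 1.0, 1.5, 1.9)
          for k in range(1, 200)]
all_ok = all(checks)
```

So the limiting tail and $\delta^{-\alpha}$ differ by at most a constant factor on $\delta > 1$, which is what "correct order" means here.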
Moreover, in the stable case, $L(\delta)$ becomes constant, and $b_N = N^{2/\alpha}$. Since $\lambda_{\max}(N^{-2/\alpha}M)$ is
a Lipschitz function of the entries of the matrix $M$ with Lipschitz constant at most $\sqrt{2}N^{-2/\alpha}$,
for any median $m(\lambda_{\max}(N^{-2/\alpha}M))$ of $\lambda_{\max}(N^{-2/\alpha}M)$, we have,
¡
¢
¡
¢
σ̂(1)+ σ̂(−1) 1
2
2
N
−α
−α
P λmax (N M) − m(λmax (N M)) ≥ δ ≤ C(α)
,
(1.18)
δα
2α/2
whenever $\delta \ge \big[ 2C(\alpha)\big( \hat\sigma(1) + \hat\sigma(-1) \big) \big]^{1/\alpha}$. Furthermore, using Theorem 1 in [14], it is not
difficult to see that $m(\lambda_{\max}(N^{-2/\alpha}M))$ can be upper and lower bounded independently of $N$.
Finally, an argument as in Remark 1.15 below will give a lower bound on $\lambda_{\max}(N^{-2/\alpha}M)$ of the
same order as (1.18).
The following proposition will give an estimate on any median of a Lipschitz function of X,
where X is a stable vector. It is the version of Proposition 1.1 for α-stable vectors.
R , ωR , ωI )
Proposition 1.9. Let 0 < α < 2, and let X = (ωi,i
i,j 1≤i 0, is the solution, in y, of the equation
\[
y - \Big( y + \frac{\alpha}{4(2-\alpha)} \Big)\ln\Big( 1 + \frac{4(2-\alpha)y}{\alpha} \Big) = \ln x,
\]
and where
E=
Ã
2
N ³
X
k=1
Z
hek , βi− ¡
¢
2
4σ(S N −1 ) 1/α
+
α
Z
1 0 : EK,N [etkXk ] < +∞}
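and let $h^{-1}$ be the inverse of
\[
h(s) = \int_{\mathbb{R}^{2KN}} \|u\|\big(e^{s\|u\|} - 1\big)\,\nu(du), \qquad 0 < s < T.
\]
Then,
\[
P^{K,N}\big( \lambda_{\max}(M^{1/2}) - E^{K,N}[\lambda_{\max}(M^{1/2})] \ge \delta \big) \le e^{-\int_0^{\delta/\sqrt{2}} h^{-1}(s)\,ds},
\tag{1.29}
\]
for all $0 < \delta < h(T^-)$.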
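(ii) Let $X = (Y^R_{i,j}, Y^I_{i,j})_{1\le i\le K, 1\le j\le N}$ be an $\alpha$-stable random vector with Lévy measure $\nu$ given
by $\nu(B) = \int_{S^{2KN-1}} \sigma(d\xi) \int_0^{+\infty} \mathbf{1}_B(r\xi)\,dr/r^{1+\alpha}$. Then,
\[
P^{K,N}\big( \lambda_{\max}(M^{1/2}) - m(\lambda_{\max}(M^{1/2})) \ge \delta \big) \le C(\alpha)\,(\sqrt{2})^\alpha\,\frac{\sigma(S^{2KN-1})}{\delta^\alpha},
\]
whenever $\delta > \sqrt{2}a\,\big[ 2\sigma(S^{2KN-1})C(\alpha) \big]^{1/\alpha}$, and where $C(\alpha) = 4^\alpha(2 - \alpha + e\alpha)/\alpha(2 - \alpha)$.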
Remark 1.15. (i) As already mentioned, Soshnikov and Fyodorov ([32]) obtained the asymptotic
behavior of the largest eigenvalue of the Wishart matrix $Y^*Y$ when the entries of
the $K \times N$ matrix $Y$ are iid Cauchy random variables. They further argue that although
the typical eigenvalues of $Y^*Y$ are of order $KN$, the correct order for the largest one is
$K^2N^2$. The above corollary combined with Remark 1.10 and the estimate (1.19), shows
that when the entries of $Y$ form an $\alpha$-stable random vector, the largest eigenvalue of $Y^*Y$
is of order at most $\sigma(S^{2KN-1})^{2/\alpha}$. There is also a lower concentration result, described
next, which leads to a lower bound on the order of this largest eigenvalue. Thus, from
these two estimates, if the entries of $Y$ are iid $\alpha$-stable, the largest eigenvalue of $Y^*Y$ is
of order $K^{2/\alpha}N^{2/\alpha}$.
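(ii) Let $X \sim ID(\beta, 0, \nu)$ in $\mathbb{R}^d$, then (see Lemma 5.4 in [6]) for any $x > 0$, and any norm
$\|\cdot\|_{\mathcal{N}}$ on $\mathbb{R}^d$,
\[
P\big( \|X\|_{\mathcal{N}} \ge x \big) \ge \frac{1}{4}\Big( 1 - \exp\big\{ -\nu\big( \{ u \in \mathbb{R}^d : \|u\|_{\mathcal{N}} \ge 2x \} \big) \big\} \Big).
\]
But $\lambda_{\max}(M^{1/2})$ is a norm of the vector $X = (Y^R_{i,j}, Y^I_{i,j})$, which we denote by $\|X\|_\lambda$. Hence, if
$X$ is a stable vector in $\mathbb{R}^{2KN}$,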
³
´
PK,N λmax (M1/2 ) − m(λmax (M1/2 )) ≥ δ
³
´
= PK,N λmax (M1/2 ) ≥ δ + m(λmax (M1/2 ))
n
¡©
¡
¢ª¢o´
1³
≥
1 − exp − ν λmax (M1/2 ) ≥ 2 δ + m(λmax (M1/2 ))
4
n
¡©
¡
¢ª¢o´
1³
1 − exp − ν kXkλ ≥ 2 δ + m(λmax (M1/2 ))
≥
4
¡ 2KN −1 ¢
¾¶
µ
½
σ̃ Sk·k
1
λ
¢α
,
=
1 − exp − ¡
4
α δ + m(λmax (M1/2 ))
(1.30)
where $S^{2KN-1}_{\|\cdot\|_\lambda}$ is the unit sphere relative to the norm $\|\cdot\|_\lambda$ and where $\tilde\sigma$ is the spherical
part of the Lévy measure corresponding to this norm. Moreover, if the components of
$X$ are independent, in which case the Lévy measure is supported on the axes of $\mathbb{R}^{2KN}$,
$\tilde\sigma\big(S^{2KN-1}_{\|\cdot\|_\lambda}\big)$ is of order $KN$, and so, as above, the largest eigenvalue of $M^{1/2}$ is of order
$K^{1/\alpha}N^{1/\alpha}$.
(iii) For any function f such that g(x) = f (x2 ) is Lipschitz with Lipschitz constant kgkLip :=
2
|||f |||L , tr(g(XA ))
√ = tr(f (X
√ A )) is a Lipschitz function of the entries of Y with Lipschitz
constant at most 2|||f |||L K + N . Hence, under the assumptions of part (i) of Corollary
1.14,
³
K +N´
K,N
K,N
P
trN (f (M)) − E
[trN (f (M))] ≥ δ
± N
½ Z √
¾
≤ exp
−
2(K+N )δ |||f |||L
h−1 (s)ds ,
(1.31)
0
p
for all 0 < δ < |||f |||L h (T − ) / 2(K + N ).
(iv) Under the assumptions of part (ii) of Corollary 1.14, for any function f such that g(x) =
f (x2 ) is Lipschitz with kgkLip = |||f |||L , and any median m(trN (f (M))) of trN (f (M)) we
have:
µ
¶
K +N
K,N
P
trN (f (M)) − m(trN (f (M))) ≥ δ
N
α
|||f |||L
σ(S 2KN −1 )
,
(1.32)
≤ C(α) p
δα
2α (K + N )α
£
¤1/α p
/ 2(K + N ), and where C(α) = 4α (2 − α +
whenever δ > |||f |||L 2σ(S 2KN −1 )C(α)
eα)/α(2 − α).
Remark 1.16. In the absence of finite exponential moments, the methods described in the
present paper extend beyond the heavy tail case and apply to any random matrix whose entries
on and above the main diagonal form an infinitely divisible vector X. However, to obtain explicit
concentration estimates, we do need explicit bounds on $V^2$ and on $\bar\nu$. Such bounds are not always
available when further knowledge on the Lévy measure of X is lacking.
2 Proofs
We start with a proposition, which is a direct consequence of the concentration inequalities
obtained in [12] for general Lipschitz functions of infinitely divisible random vectors with finite
exponential moments.
R , ωR , ωI )
Proposition 2.1. Let X = (ωi,i
PN ∼
i,j 1≤i 0 and let T = sup{t > 0 : EN etkXk <
+∞}. Let $h^{-1}$ be the inverse of
\[
h(s) = \int_{\mathbb{R}^{N^2}} \|u\|\big(e^{s\|u\|} - 1\big)\,\nu(du), \qquad 0 < s < T.
\]
(i) For any Lipschitz function $f$,
\[
P^N\big( \mathrm{tr}_N(f(X_A)) - E^N[\mathrm{tr}_N(f(X_A))] \ge \delta \big) \le \exp\Big\{ -\int_0^{\frac{N\delta}{\sqrt{2}a\|f\|_{Lip}}} h^{-1}(s)\,ds \Big\},
\]
for all $0 < \delta < \sqrt{2}\,a\,\|f\|_{Lip}\,h(T^-)/N$.
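(ii) Let $\lambda_{\max}(X_A)$ be the largest eigenvalue of the matrix $X_A$. Then,
\[
P^N\big( \lambda_{\max}(X_A) - E^N[\lambda_{\max}(X_A)] \ge \delta \big) \le \exp\Big\{ -\int_0^{\frac{\sqrt{N}\delta}{\sqrt{2}a}} h^{-1}(s)\,ds \Big\},
\]
for all $0 < \delta < \sqrt{2}\,a\,h(T^-)/\sqrt{N}$.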
Proof of Theorem 1.2:
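For part (i), following the proof of Theorem 1.3 of [10], without loss of generality, by shift
invariance, assume that $\min\{x : x \in K\} = 0$. Next, for any $v > 0$, let
\[
g_v(x) = \begin{cases}
0 & \text{if } x \le 0 \\
x & \text{if } 0 < x < v \\
v & \text{if } x \ge v.
\end{cases}
\tag{2.33}
\]
Clearly $g_v \in \mathrm{Lip}(1)$ with $\|g_v\|_\infty = v$. Next for any function $f \in \mathrm{Lip}_K(1)$, any $\Delta > 0$, define
recursively $f_\Delta(x) = 0$ for $x \le 0$, and for $(j-1)\Delta \le x \le j\Delta$, $j = 1, \dots, \lceil \frac{x}{\Delta} \rceil$, let
\[
f_\Delta(x) = \sum_{j=1}^{\lceil x/\Delta \rceil} g_\Delta^{(j)},
\]
where $g_\Delta^{(j)} := \big( 2\cdot\mathbf{1}_{\{f(j\Delta) > f_\Delta((j-1)\Delta)\}} - 1 \big)\,g_\Delta(x - (j-1)\Delta)$. Then $|f - f_\Delta| \le \Delta$ and the 1-Lipschitz
function $f_\Delta$ is the sum of at most $|K|/\Delta$ functions $g_\Delta^{(j)} \in \mathrm{Lip}(1)$, regardless of the function $f$.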
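The recursive construction of $f_\Delta$ from the ramps $g_\Delta^{(j)}$ can be sketched in a few lines. The code below is an illustration only: it takes $K = [0, 1]$, $\Delta = 0.05$ and the hypothetical test function $f(x) = 0.3\sin(\pi x)$ on $[0,1]$ (extended by zero), which is 1-Lipschitz and supported in $K$.

```python
import math

def g(v, x):
    """Clamp g_v from (2.33): 0 for x <= 0, x on (0, v), v for x >= v."""
    return min(max(x, 0.0), v)

def build_signs(f, width, step):
    """Signs of the ramps g_Delta^(j): +1 if f(j*Delta) exceeds the current
    level f_Delta((j-1)*Delta), and -1 otherwise."""
    count = math.ceil(width / step)
    signs, level = [], 0.0
    for j in range(1, count + 1):
        sign = 1.0 if f(j * step) > level else -1.0
        signs.append(sign)
        level += sign * step
    return signs

def f_delta(x, signs, step):
    """Sum of signed ramps; ramp number j + 1 starts at j * step."""
    return sum(s * g(step, x - j * step) for j, s in enumerate(signs))

def f(x):
    """Hypothetical 1-Lipschitz function supported in K = [0, 1]."""
    return 0.3 * math.sin(math.pi * x) if 0.0 <= x <= 1.0 else 0.0

step = 0.05
signs = build_signs(f, 1.0, step)
max_error = max(abs(f(k / 1000.0) - f_delta(k / 1000.0, signs, step))
                for k in range(1001))
```

On this example the uniform error stays below $\Delta$, and the number of ramps is about $|K|/\Delta = 20$, independently of the particular $f$.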
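Now, for $\delta > 2\Delta$,
\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_K(1)} \big|\mathrm{tr}_N(f(X_A)) - E^N[\mathrm{tr}_N(f(X_A))]\big| \ge \delta \Big) \\
&\quad\le P^N\Big( \sup_{f \in \mathrm{Lip}_K(1)} \Big\{ \big|\mathrm{tr}_N(f_\Delta(X_A)) - E^N(\mathrm{tr}_N(f_\Delta(X_A)))\big| + \big|\mathrm{tr}_N(f(X_A)) - \mathrm{tr}_N(f_\Delta(X_A))\big| \\
&\qquad\qquad + \big|E^N[\mathrm{tr}_N(f(X_A))] - E^N[\mathrm{tr}_N(f_\Delta(X_A))]\big| \Big\} \ge \delta \Big) \\
&\quad\le P^N\Big( \sup_{f_\Delta} \big|\mathrm{tr}_N(f_\Delta(X_A)) - E^N(\mathrm{tr}_N(f_\Delta(X_A)))\big| > \delta - 2\Delta \Big) \\
&\quad\le \frac{|K|}{\Delta} \sup_{g_\Delta^{(j)} \in \mathrm{Lip}(1)} P^N\Big( \big|\mathrm{tr}_N(g_\Delta^{(j)}(X_A)) - E^N[\mathrm{tr}_N(g_\Delta^{(j)}(X_A))]\big| \ge \frac{\Delta(\delta - 2\Delta)}{|K|} \Big) \\
&\quad\le \frac{8|K|}{\delta} \exp\Big\{ -\int_0^{\frac{N\delta^2}{8\sqrt{2}a|K|}} h^{-1}(s)\,ds \Big\},
\end{aligned}
\tag{2.34}
\]
whenever $0 < \delta < \sqrt{8\sqrt{2}\,a|K|\,h(T^-)/N}$, and where the last inequality follows from part (i) of
the previous proposition by taking also $\Delta = \delta/4$.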
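In order to prove part (ii), for any $f \in \mathrm{Lip}_b(1)$, i.e., such that $\|f\|_{Lip} \le 1$, $\|f\|_\infty \le b$, and any
$\tau > 0$, let $f_\tau$ be given via:
\[
f_\tau(x) = \begin{cases}
f(x) & \text{if } |x| < \tau \\
f(\tau) - \mathrm{sign}(f(\tau))(x - \tau) & \text{if } \tau \le x < \tau + |f(\tau)| \\
f(-\tau) + \mathrm{sign}(f(-\tau))(x + \tau) & \text{if } -\tau - |f(-\tau)| < x \le -\tau \\
0 & \text{otherwise.}
\end{cases}
\tag{2.35}
\]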
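The truncation (2.35) keeps $f$ on $(-\tau, \tau)$ and then brings it linearly to zero with slope $\pm 1$. A minimal sketch follows, using the hypothetical choice $f = \sin$ (so $b = 1$) and $\tau = 2$; the helper names are illustrative.

```python
import math

def sign(v):
    """Sign of v as -1, 0 or 1."""
    return (v > 0) - (v < 0)

def truncate(f, tau):
    """Return the function f_tau of (2.35) built from f and tau."""
    right, left = f(tau), f(-tau)

    def f_tau(x):
        if abs(x) < tau:
            return f(x)
        if tau <= x < tau + abs(right):
            return right - sign(right) * (x - tau)   # slope -sign(f(tau))
        if -tau - abs(left) < x <= -tau:
            return left + sign(left) * (x + tau)     # slope +sign(f(-tau))
        return 0.0

    return f_tau

tau = 2.0
f_tau = truncate(math.sin, tau)

# Empirical Lipschitz check on a fine grid of [-4, 4].
h = 0.001
grid = [-4.0 + h * k for k in range(8000)]
max_slope = max(abs(f_tau(x + h) - f_tau(x)) / h for x in grid)
```

On this example the grid slopes never exceed 1 and $f_\tau$ vanishes outside $[-\tau - b, \tau + b]$, in line with the two properties used in the proof.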
Clearly fτ ∈ Lip(1) and supp(fτ ) ⊂ [−τ − |f (−τ )|, τ + |f (τ )|]. Moreover,
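\[
\begin{aligned}
&\sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A)))\big| \\
&\quad\le \sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))\big| \\
&\qquad + \sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f(X_A) - f_\tau(X_A)) - E^N[\mathrm{tr}_N(f(X_A) - f_\tau(X_A))]\big| \\
&\quad\le \sup_{f \in \mathrm{Lip}_b(1)} \big|\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))\big| \\
&\qquad + 2\,\mathrm{tr}_N(g_b(|X_A| - \tau)) + 2E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))],
\end{aligned}
\tag{2.36}
\]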
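with $g_b$ given as in (2.33). Now,
\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A)))| \ge \delta \Big) \\
&\quad\le P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))| \ge \frac{\delta}{3} \Big) \\
&\qquad + P^N\Big( 2\,\mathrm{tr}_N(g_b(|X_A| - \tau)) + 2E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))] \ge \frac{2\delta}{3} \Big) \\
&\quad\le P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))| \ge \frac{\delta}{3} \Big) \\
&\qquad + P^N\Big( \mathrm{tr}_N(g_b(|X_A| - \tau)) - E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))] \ge \frac{\delta}{3} - 2E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))] \Big) \\
&\quad\le P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))| \ge \frac{\delta}{3} \Big) \\
&\qquad + P^N\Big( \mathrm{tr}_N(g_b(|X_A| - \tau)) - E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))] \ge \frac{\delta}{3} - 2bE^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \Big).
\end{aligned}
\tag{2.37}
\]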
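Let us first bound the second probability in (2.37). Recall that the spectral radius
$\rho(X_A) = \max_{1\le i\le N} |\lambda_i|$ is a Lipschitz function of $X$ with Lipschitz constant at most $a\sqrt{2/N}$. Hence, for any
$0 < t \le T$, and $\gamma > 0$ such that $\bar\nu(p_\gamma) \le 1/4$,
\[
\begin{aligned}
E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] &= \frac{1}{N}\sum_{i=1}^{N} P^N\big( |\lambda_i(X_A)| \ge \tau \big) \\
&\le P^N\big( \rho(X_A) \ge \tau \big) \\
&\le P^N\Big( \frac{\sqrt{N}}{\sqrt{2}a}\rho(X_A) - \frac{\sqrt{N}}{\sqrt{2}a}E^N[\rho(X_A)] \ge \frac{\sqrt{N}}{\sqrt{2}a}\tau - G_2(\gamma) \Big) \\
&\le \exp\Big\{ H(t) - \Big( \frac{\sqrt{N}}{\sqrt{2}a}\tau - G_2(\gamma) \Big)t \Big\},
\end{aligned}
\tag{2.38}
\]
where, above, we have used Proposition 1.1 in the next to last inequality and where the last
inequality follows from Theorem 1 in [12] (p. 1233) with
\[
H(t) = \int_0^t h(s)\,ds = \int_{\mathbb{R}^{N^2}} \big( e^{t\|u\|} - t\|u\| - 1 \big)\,\nu(du).
\]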
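We want to choose $\tau$, such that $E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \le \delta/12b$. This can be achieved if
\[
\frac{\sqrt{N}}{\sqrt{2}a}\tau - G_2(\gamma) \ge \frac{\ln\frac{12b}{\delta} + H(t)}{t}.
\tag{2.39}
\]
Since
\[
\frac{d}{dt}\Big( \frac{\ln\frac{12b}{\delta} + H(t)}{t} \Big) = \frac{t h(t) - \ln\frac{12b}{\delta} - H(t)}{t^2},
\]
and
\[
\frac{d^2}{dt^2}\Big( \frac{\ln\frac{12b}{\delta} + H(t)}{t} \Big) = \frac{t^3H''(t) - 2t\big( t h(t) - \ln\frac{12b}{\delta} - H(t) \big)}{t^4},
\]
it is clear that the right hand side of (2.39) is minimized when $t = t_0$, where $t_0$ is the solution of
\[
t h(t) - H(t) - \ln\frac{12b}{\delta} = 0,
\]
and the minimum is then $h(t_0)$.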
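The minimization step can be illustrated on a toy Lévy measure. The sketch below assumes, purely for illustration, that $\nu$ is a point mass of size `C_MASS` at a vector of norm `R`, so that $h(s) = cR(e^{sR} - 1)$ and $H(t) = c(e^{tR} - 1 - tR)$ in closed form; it then solves $t h(t) - H(t) = \ln(12b/\delta)$ by bisection and compares the value of the right hand side of (2.39) at $t_0$ with $h(t_0)$.

```python
import math

# Hypothetical Levy measure: mass C_MASS at a single point of norm R.
C_MASS, R = 2.0, 1.0

def h(s):
    """h(s) = C_MASS * R * (exp(s R) - 1) for the point-mass measure."""
    return C_MASS * R * math.expm1(s * R)

def big_h(t):
    """H(t) = integral of h over [0, t] = C_MASS * (exp(t R) - 1 - t R)."""
    return C_MASS * (math.expm1(t * R) - t * R)

def objective(t, log_term):
    """Right hand side of (2.39): (ln(12 b / delta) + H(t)) / t."""
    return (log_term + big_h(t)) / t

def solve_t0(log_term, tol=1e-13):
    """Root of t h(t) - H(t) - log_term = 0; the left side increases in t."""
    def stationarity(t):
        return t * h(t) - big_h(t) - log_term
    lo, hi = 0.0, 1.0
    while stationarity(hi) < 0.0:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if stationarity(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

log_term = math.log(12.0 * 1.0 / 0.1)   # ln(12 b / delta) with b = 1, delta = 0.1
t0 = solve_t0(log_term)
min_value = objective(t0, log_term)
```

At the stationary point the objective coincides with $h(t_0)$, and nearby values of $t$ give a larger right hand side, as the derivative computation predicts.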
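Thus, if
\[
\tau = C_0(\delta,b) := \frac{\sqrt{2}a}{\sqrt{N}}\big( G_2(\gamma) + h(t_0) \big),
\tag{2.40}
\]
then
\[
E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \le \frac{\delta}{12b},
\]
and so,
\[
\begin{aligned}
&P^N\Big( \mathrm{tr}_N(g_b(|X_A| - \tau)) - E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))] \ge \frac{\delta}{3} - 2bE^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \Big) \\
&\quad\le P^N\Big( \mathrm{tr}_N(g_b(|X_A| - \tau)) - E^N[\mathrm{tr}_N(g_b(|X_A| - \tau))] \ge \frac{\delta}{6} \Big) \\
&\quad\le \exp\Big\{ -\int_0^{\frac{N\delta}{6\sqrt{2}a}} h^{-1}(s)\,ds \Big\},
\end{aligned}
\tag{2.41}
\]
for all $0 < \delta < 6\sqrt{2}\,a\,h(T^-)/N$, where Proposition 2.1 is used in the last inequality.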
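For $\tau$ chosen as in (2.40), letting $K = [-\tau - b, \tau + b]$, it follows that for any $f \in \mathrm{Lip}_b(1)$,
$f_\tau \in \mathrm{Lip}_K(1)$. By part (i), the first term in (2.37) is such that
\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))| \ge \frac{\delta}{3} \Big) \\
&\quad\le P^N\Big( \sup_{f_\tau \in \mathrm{Lip}_K(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N[\mathrm{tr}_N(f_\tau(X_A))]| \ge \frac{\delta}{3} \Big) \\
&\quad\le \frac{48(C_0(\delta,b) + b)}{\delta} \exp\Big\{ -\int_0^{\frac{N\delta^2}{144\sqrt{2}a(C_0(\delta,b)+b)}} h^{-1}(s)\,ds \Big\},
\end{aligned}
\tag{2.42}
\]
for all $0 < \delta^2 \le 144\sqrt{2}\,a\big( C_0(\delta,b) + b \big)h(T^-)/N$.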
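Hence, returning to (2.37), using (2.41) and (2.42) and for
\[
\delta < \min\Big\{ 6\sqrt{2}\,a\,h(T^-)/N,\ \sqrt{144\sqrt{2}\,a\big( C_0(\delta,b) + b \big)h(T^-)/N} \Big\},
\]
we have
\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A)))| \ge \delta \Big) \\
&\quad\le 2\,\frac{24(C_0(\delta,b) + b)}{\delta} \exp\Big\{ -\int_0^{\frac{N\delta}{6\sqrt{2}a}\cdot\frac{\delta}{24(C_0(\delta,b)+b)}} h^{-1}(s)\,ds \Big\}
+ \exp\Big\{ -\int_0^{\frac{N\delta}{6\sqrt{2}a}} h^{-1}(s)\,ds \Big\} \\
&\quad\le \Big( 2 + \frac{1}{12} \Big)\frac{24(C_0(\delta,b) + b)}{\delta} \exp\Big\{ -\int_0^{\frac{N\delta^2}{144\sqrt{2}a(C_0(\delta,b)+b)}} h^{-1}(s)\,ds \Big\},
\end{aligned}
\tag{2.43}
\]
since only the case $\delta \le 2b$ presents some interest (otherwise the probability in the statement of
the theorem is zero). Part (ii) is then proved.
✷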
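Proof of Proposition 1.4:
As a function of $x \in \mathbb{R}^{N^2}$, $d_W(\hat\mu^N_A, \mu)(x)$ is Lipschitz with Lipschitz constant at most $\sqrt{2}a/N$.
Indeed, for $x, y \in \mathbb{R}^{N^2}$,
\[
\begin{aligned}
d_W(\hat\mu^N_A, \mu)(x) &= \sup_{f \in \mathrm{Lip}_b(1)} \Big| \mathrm{tr}_N\big( f(X_A)(x) \big) - \int_{\mathbb{R}} f\,d\mu \Big| \\
&\le \sup_{f \in \mathrm{Lip}_b(1)} \big| \mathrm{tr}_N(f(X_A)(x)) - \mathrm{tr}_N(f(X_A)(y)) \big|
+ \sup_{f \in \mathrm{Lip}_b(1)} \Big| \mathrm{tr}_N(f(X_A)(y)) - \int_{\mathbb{R}} f\,d\mu \Big| \\
&\le \frac{\sqrt{2}a}{N}\|x - y\| + d_W(\hat\mu^N_A, \mu)(y).
\end{aligned}
\tag{2.44}
\]
Proposition 1.4 then follows from Theorem 1 in [12].
✷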
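Proof of Corollary 1.5:
For Lévy measures with bounded support, $E^N[e^{t\|X\|}] < +\infty$, for all $t \ge 0$, and moreover
\[
h(t) \le V^2\Big( \frac{e^{tR} - 1}{R} \Big).
\]
Hence
\[
H(t) = \int_0^t h(s)\,ds \le \frac{V^2}{R^2}\big( e^{tR} - 1 - tR \big),
\]
and
\[
\exp\Big\{ -\int_0^x h^{-1}(s)\,ds \Big\} \le \exp\Big\{ \frac{x}{R} - \Big( \frac{x}{R} + \frac{V^2}{R^2} \Big)\ln\Big( 1 + \frac{Rx}{V^2} \Big) \Big\}.
\]
Thus, one can take
\[
C(\delta,b) = C\Big( \frac{\sqrt{2}a}{\sqrt{N}}\Big( G_2(\gamma) + \frac{V^2}{R}\big( e^{t_0R} - 1 \big) \Big) + b \Big),
\]
where $t_0$ is the solution, in $t$, of
\[
\frac{V^2}{R^2}\big( tRe^{tR} - e^{tR} + 1 \big) = \ln\frac{12b}{\delta}.
\]
Applying Theorem 1.2 (ii) yields the result.
✷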
In order to prove Theorem 1.11, we first need the following lemma, whose proof is essentially the
same as the proof of Theorem 1 in [13].
R , ωR , ωI )
Lemma 2.2. Let X = (ωi,i
i,j 1≤i 0, let gx0 ,x1 (x) = gx1 (x − x0 ), where gx1 (x) is
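defined as in (2.33). Then,
\[
P^N\Big( \big|\mathrm{tr}_N(g_{x_0,x_1}(X_A)) - E^N[\mathrm{tr}_N(g_{x_0,x_1}(X_A))]\big| \ge \delta \Big) \le C(\alpha)\,\frac{a^\alpha\sigma(S^{N^2-1})}{N^\alpha\delta^\alpha},
\]
whenever $\delta^{1+\alpha} > \big( 2\sqrt{2}a \big)^{1+\alpha}\sigma(S^{N^2-1})\,x_1/\alpha N^{1+\alpha}$ and where $C(\alpha) = 2^{5\alpha/2}(2e\alpha + 2 - \alpha)/\alpha(2 - \alpha)$.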
Proof of Theorem 1.11
For part (i), first consider f ∈ LipK (1). Using the same approximation as in Theorem 1.2, any
function f ∈ LipK (1) can be approximated by f∆ , which is the sum of at most |K|/∆ functions
(j)
g∆ ∈ Lip(1), regardless of the function f . Now, and as before, for δ > 2∆,
PN
Ã
sup
f ∈LipK (1)
|K|
sup
≤
∆ g(j) ∈Lip
∆
j=1,··· ,⌈
≤
|trN (f (XA )) − EN (trN (f (XA )))| ≥ δ
!
µ¯
¯ ∆(δ − 2∆)¶
¯
¯
(j)
(j)
N
P
¯trN (g∆ (XA )) − E [trN (g∆ (XA ))]¯ ≥
|K|
(1)
N
b
|K|
⌉
∆
4|K| 8α aα C2 (α)σ(S N
δ
N α δ 2α
whenever
2 −1
)|K|α
,
(2.45)
√
2
1
δ2
2 2a ³ σ(S N −1 )δ ´ 1+α
,
>
8|K|
N
4α
(2.46)
and where the last inequality follows from Lemma 2.2, taking also ∆ = δ/4.
For any f ∈ Lipb (1), and any τ > 0, let fτ be given as in (2.35). Then, fτ ∈ LipK (1), where
K = [−τ − b, τ + b], and moreover,
P
N
µ
N
sup |trN (f (XA )) − E (trN (f (XA )))| ≥ δ
f ∈Lipb (1)
¶
´
³
δ
≤ PN trN (gτ,b (|XA|))−EN[trN (gτ,b (|XA|))] ≥ −2bEN[trN (1{|XA |≥τ } ]
3
³
δ´
N
N
+P
sup |trN (fτ (XA )) − E (trN (fτ (XA )))| ≥
.
3
fτ ∈LipK (1)
The spectral radius ρ(XA ) is a Lipschitz function of X with Lipschitz constant at most
Then by Theorem 1 in [13],
(2.47)
√
√
2a/ N .
N
´
1 X N³
P |λi (XA )| ≥ τ
N
i=1
³
´
N
≤ P ρ(XA ) > τ
Ã
EN [trN (1{|XA |≥τ } )] =
≤ PN
!
√
2a
ρ(XA ) − m(ρ(XA )) > τ − √ J1 (α)
N
2
whenever
C1 (α)2α/2 aα σ(S N −1 )
√
≤
¡
¢α ,
N α/2 τ − √2a
J
(α)
1
N
√
¶α
µ
2
2C1 (α)2α/2 aα σ(S N −1 )
2a
≥
,
τ − √ J1 (α)
N α/2
N
and where C1 (α) = 4α (2 − α + eα)/α(2 − α). Now, if τ is chosen such that
2
δ
C1 (α)2α/2 aα σ(S N −1 )
√
,
≤
¢
¡
α
12b
J
(α)
N α/2 τ − √2a
1
N
(2.48)
(2.49)
that is, if
it then follows that
√
µ
¶α
2
12bC1 (α)2α/2 aα σ(S N −1 )
2a
τ − √ J1 (α)
≥
,
δN α/2
N
EN [trN (1{|XA |≥τ } )] ≤
(2.50)
δ
.
12b
Since gτ,b (|XA|) is the sum of two functions of the type studied in Lemma 2.2 with x1 = b, we
have,
³
´
δ
PN trN (gτ,b (|XA |))−EN [trN (gτ,b (|XA |))] ≥ −2bEN [trN (1{|XA |≥τ } ]
3
³
δ ´
N
N
≤ 2P trN (gτ,b (XA )) − E [trN (gτ,b (XA ))] ≥
12
2 −1
α
α
N
12 a σ(S
)
≤ 2C2 (α)
,
α
α
N δ
whenever
1+α
(2.51)
³ 2√2a ´1+α 121+α σ(S N 2 −1 )b
,
(2.52)
N
α
and where C2 (α) = 25α/2 (2eα + 2 − α)/α(2 − α). The respective ranges (2.50) and (2.52) suggest
that one can choose, for example,
√
√
2a
2a
τ = √ J1 (α) + √ δ.
N
N
δ
>
Then, there exists δ(α, a, N, ν) such that for δ > δ(α, a, N, ν),
³
´
PN
sup |trN (f (XA )) − EN [trN (f (XA ))]| ≥ δ
f ∈Lipb (1)
≤ PN
³
sup
fτ LipK (1)
|trN (fτ (XA )) − EN [trN (fτ (XA ))]| ≥
δ´
3
´
³
δ
+ PN trN (gτ,b (|XA |))−EN [trN (gτ,b (|XA |))] ≥ −2bEN [trN (1{|XA |≥τ } ]
3
´1+α
³√
√
2
2
C3 (α)aα σ(S N −1 ) √2a
J (α) + b + √2a
δ
C4 (α)aα σ(S N −1 )
N 1
N
≤
+
,
N α δ 1+2α
N αδα
where C3 (α) = 24+2α 12α C2 (α), C4 (α) = 2(12α )C2 (α) and δ(α, a, N, ν) is such that (2.46) and
(2.52) hold.
Part (ii) is a direct consequence of Theorem 1 of [13], since $d_W(\hat\mu^N_A, \mu) \in \mathrm{Lip}(\sqrt{2}a/N)$ as shown
in the proof of Proposition 1.4.
✷
Proof of Theorem 1.12. For any $f \in \mathrm{Lip}(1)$, Theorem 1 in [13] gives a concentration inequality
for $f(X)$, when it deviates from one of its medians. For $1 < \alpha < 2$, a completely similar
(even simpler) argument gives the following result,
\[
P^N\big( f(X) - E^N[f(X)] \ge x \big) \le \frac{C(\alpha)\,\sigma(S^{N^2-1})}{x^\alpha},
\tag{2.53}
\]
whenever $x^\alpha \ge K(\alpha)\,\sigma(S^{N^2-1})$, where $C(\alpha) = 2^\alpha(e\alpha + 2 - \alpha)/(\alpha(2 - \alpha))$ and
$K(\alpha) = \max\big\{ 2^\alpha/(\alpha - 1),\ C(\alpha) \big\}$.
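Next, following the proof of Theorem 1.2, approximate any function $f \in \mathrm{Lip}_b(1)$ by
$f_\tau \in \mathrm{Lip}_{[-\tau-b,\tau+b]}(1)$ defined via (2.35). Hence,
\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A)))| \ge \delta \Big) \\
&\quad\le P^N\Big( \sup_{f_\tau \in \mathrm{Lip}_K(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))| \ge \frac{\delta}{3} \Big) \\
&\qquad + P^N\Big( \mathrm{tr}_N(g_{\tau,b}(|X_A|)) - E^N[\mathrm{tr}_N(g_{\tau,b}(|X_A|))] \ge \frac{\delta}{3} - 2bE^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A|\ge\tau\}})] \Big).
\end{aligned}
\tag{2.54}
\]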
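For $\rho(X_A)$ the spectral radius of the matrix $X_A$, and for any $\tau$ such that
\[
\tau - E^N[\rho(X_A)] \ge \Bigl(\frac{\sqrt{2}a}{\sqrt{N}}K(\alpha)\sigma(S^{N^2-1})\Bigr)^{1/\alpha},
\]
\[
E^N\bigl(\mathrm{tr}_N(1_{\{|X_A|>\tau\}})\bigr) \le P^N\Bigl(\rho(X_A) - E^N[\rho(X_A)] \ge \tau - E^N[\rho(X_A)]\Bigr) \le \frac{\Bigl(\frac{\sqrt{2}a}{\sqrt{N}}\Bigr)^{\alpha}C(\alpha)\sigma(S^{N^2-1})}{\bigl(\tau - E^N[\rho(X_A)]\bigr)^{\alpha}}, \tag{2.55}
\]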
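where we have used, in the last inequality, (2.53) and the fact that $\rho(X_A) \in \mathrm{Lip}(\sqrt{2}a/\sqrt{N})$. For $Q > 0$, let $\tau = E^N[\rho(X_A)] + Q\delta^{-1/\alpha}$. With this choice, we then have:
\[
E^N\bigl(\mathrm{tr}_N(1_{\{|X_A|>\tau\}})\bigr) \le \frac{\Bigl(\frac{\sqrt{2}a}{\sqrt{N}}\Bigr)^{\alpha}C(\alpha)\sigma(S^{N^2-1})}{\bigl(\tau - E^N[\rho(X_A)]\bigr)^{\alpha}} \le \delta\,\frac{\Bigl(\frac{\sqrt{2}a}{\sqrt{N}}\Bigr)^{\alpha}C(\alpha)\sigma(S^{N^2-1})}{Q^{\alpha}} \le \frac{\delta}{12b}, \tag{2.56}
\]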
provided $Q^{\alpha}/\delta > \sqrt{2}aK(\alpha)\sigma(S^{N^2-1})/\sqrt{N}$, and $\bigl(\sqrt{2}a/\sqrt{N}\bigr)^{\alpha}C(\alpha)\sigma(S^{N^2-1})/Q^{\alpha} \le 1/(12b)$. Now, taking $Q = \sqrt{2}a\bigl(12bC(\alpha)\sigma(S^{N^2-1})\bigr)^{1/\alpha}/\sqrt{N}$, and recalling, for $1 < \alpha < 2$, the lower range concentration result for stable vectors (Theorem 1 and Remark 3 in [5]): for any $\epsilon > 0$, there exists $\eta_0(\epsilon)$ such that for all $0 < \delta < \sqrt{2}a\|f\|_{Lip}\eta_0(\epsilon)/N$,
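\[
P^N\bigl(\mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A))) \ge \delta\bigr)
\le (1+\epsilon)\exp\Biggl\{-\frac{\bigl(\frac{2-\alpha}{10}\bigr)^{\frac{\alpha}{\alpha-1}}\frac{\alpha-1}{\alpha}}{\bigl(\sigma(S^{N^2-1})\bigr)^{1/(\alpha-1)}}\Bigl(\frac{N}{\sqrt{2}a\|f\|_{Lip}}\Bigr)^{\frac{\alpha}{\alpha-1}}\delta^{\frac{\alpha}{\alpha-1}}\Biggr\}. \tag{2.57}
\]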
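As a numerical sanity check on the choice of $\tau$ and $Q$ made around (2.56), the sketch below computes $Q$, $\tau$ and the resulting bound, and verifies that the bound equals $\delta/(12b)$. It is only an illustration under stated assumptions: `mean_rho` stands for $E^N[\rho(X_A)]$ and `sigma_mass` for $\sigma(S^{N^2-1})$, both of which are hypothetical inputs that the paper does not compute, and the range test uses the condition on $Q^{\alpha}/\delta$ exactly as it is stated after (2.56).

```python
import math


def c_alpha(alpha):
    """C(alpha) from (2.53), reading 'e alpha' as e * alpha."""
    return 2.0 ** alpha * (math.e * alpha + 2.0 - alpha) / (alpha * (2.0 - alpha))


def k_alpha(alpha):
    """K(alpha) = max{2^alpha / (alpha - 1), C(alpha)}."""
    return max(2.0 ** alpha / (alpha - 1.0), c_alpha(alpha))


def choose_tau(delta, alpha, a, b, n, sigma_mass, mean_rho):
    """Return (Q, tau, bound, in_range) for the choice made around (2.56).

    mean_rho and sigma_mass are hypothetical inputs standing for
    E^N[rho(X_A)] and sigma(S^{N^2-1}).
    """
    lip = math.sqrt(2.0) * a / math.sqrt(n)  # Lipschitz constant of rho(X_A)
    q = lip * (12.0 * b * c_alpha(alpha) * sigma_mass) ** (1.0 / alpha)
    gap = q * delta ** (-1.0 / alpha)        # tau - E^N[rho(X_A)]
    tau = mean_rho + gap
    # Right-hand side of (2.55) evaluated at this tau.
    bound = lip ** alpha * c_alpha(alpha) * sigma_mass / gap ** alpha
    # Range condition Q^alpha / delta > sqrt(2) a K(alpha) sigma / sqrt(N), as stated.
    in_range = gap ** alpha > lip * k_alpha(alpha) * sigma_mass
    return q, tau, bound, in_range


if __name__ == "__main__":
    q, tau, bound, ok = choose_tau(0.5, 1.5, 1.0, 1.0, 100, 1.0e4, 2.0)
    print(q, tau, bound, ok)
```

With this $Q$ the middle inequality in (2.56) holds with equality, so the computed bound should agree with $\delta/(12b)$ up to rounding.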
With arguments as in the
i
on
Electr
o
u
a
rn l
o
f
P
c
r
ob
abil
ity
Vol. 13 (2008), Paper no. 5, pages 107–134.
Journal URL
http://www.math.washington.edu/~ejpecp/
Concentration of the Spectral Measure for Large
Random Matrices with Stable Entries
Christian Houdré
∗
Hua Xu
†
Abstract
We derive concentration inequalities for functions of the empirical measure of large random
matrices with infinitely divisible entries, in particular, stable or heavy tails ones. We also
give concentration results for some other functionals of these random matrices, such as the
largest eigenvalue or the largest singular value.
Key words: Spectral Measure, Random Matrices, Infinitely divisibility, Stable Vector, Concentration.
AMS 2000 Subject Classification: Primary 60E07, 60F10, 15A42, 15A52.
Submitted to EJP on June 12, 2007, final version accepted January 18, 2008.
∗
†
Georgia Institute of Technology, School of Mathematics, Atlanta, Georgia, 30332, [email protected]
Georgia Institute of Technology, School of Mathematics, Atlanta, Georgia, 30332, [email protected]
107
1
Introduction and Statements of Results:
Large random matrices have recently attracted a lot of attention in fields such as statistics, mathematical physics or combinatorics (e.g., see Mehta [24], Bai and Silverstein [4], Johnstone [18],
Anderson, Guionnet and Zeitouni [2]). For various classes of matrix ensembles, the asymptotic
behavior of the, properly centered and normalized, spectral measure or of the largest eigenvalue
is understood. Many of these results hold true for matrices with independent entries satisfying
some moment conditions (Wigner [35], Tracy and Widom [33], Soshnikov [28], Girko [8], Pastur
[25], Bai [3], Götze and Tikhomirov [9]).
There is relatively little work outside the independent or finite second moment assumptions. Let
us mention Soshnikov [30] who, using ideas from perturbation theory, studied the distribution of
the largest eigenvalue of Wigner matrices with entries having heavy tails. (Recall that a real (or
complex) Wigner matrix is a symmetric (or Hermitian) matrix whose entries Mi,i , 1 ≤ i ≤ N ,
and Mi,j , 1 ≤ i < j ≤ N , form two independent families of iid (complex valued in the Hermitian
case) random variables.) In particular, (see [30]), for a properly normalized Wigner matrix
with entries belonging to the domain of attraction of an α-stable law, limN →∞ PN (λmax ≤ x) =
exp (−x−α ) (here λmax is the largest eigenvalue of such a normalized matrix). Further, Soshnikov
and Fyodorov [32], using the method of determinants, derived results for the largest singular
value of K × N rectangular matrices with independent Cauchy entries, showing that the largest
singular value of such a matrix is of order K 2 N 2 (see also the survey article [31], where band
and sparse matrices are studied).
On another front, Guionnet and Zeitouni [10], gave concentration results for functionals of the
empirical spectral measure of, self-adjoint, random matrices whose entries are independent and
either satisfy a logarithmic Sobolev inequality or are compactly supported. They obtained, for
such matrices, the subgaussian decay of the tails of the empirical spectral measure when it
deviates from its mean. They also noted that their technique could be applied to prove results
for the largest eigenvalue or for the spectral radius of such matrices. Alon, Krivelevich and Vu
[1] further obtained concentration results for any of the eigenvalues of a Wigner matrix with
uniformly bounded entries (see, Ledoux [20] for more developments and references).
Our purpose in the present work is to deal with matrices whose entries form a general infinitely
divisible vector, and in particular a stable one (without independence assumption). As well
known, unless degenerated, an infinitely divisible random variable cannot be bounded. We
obtain concentration results for functionals of the corresponding empirical spectral measure,
allowing for any type of light or heavy tails. The methodologies developed here apply as well to
the largest eigenvalue or to the spectral radius of such random matrices.
Following the lead of Guionnet and Zeitouni [10], let us start by setting our notation and
framework.
Let MN ×N (C) be the set of N ×N Hermitian matrices with complex entries, which is throughout
equipped with the Hilbert-Schmidt (or Frobenius or entrywise Euclidean) norm:
v
u N
p
uX
kMk = tr(M∗ M) = t
|Mi,j |2 .
i,j=1
Let f be a real valued function on R. The function f can be viewed as mapping MN ×N (C) to
MN ×N (C). Indeed, for M = (Mi,j )1≤i,j≤N ∈ MN ×N (C), so that M = UDU∗ , where D is a
108
diagonal matrix, with real entries λ1 , ..., λN , and U is a unitary matrix, set
f (λ1 ) 0
···
0
0
f (λ2 ) · · ·
0
f (M) = Uf (D)U∗ , f (D) =
..
..
.. .
..
.
.
.
.
0
0
· · · f (λN )
Let tr(M) =
PN
i=1 Mi,i
be the trace operator on MN ×N (C) and set also
N
1 X
trN (M) =
Mi,i .
N
i=1
ForP a N × N random Hermitian matrix with eigenvalues λ1 , λ2 , ..., λN , let FN (x) =
N
1
i=1 1{λi ≤x} be the corresponding empirical spectral distribution function. As well known,
N
2
if M is a N × N Hermitian Wigner matrix with
√ E[M1,1 ] = E[M1,2 ] = 0, E[|M1,2 | ] = 1,
2
and E[M1,1 ] < ∞, the spectral measure of M/ N converges to the semicircle law: σ(dx) =
√
4 − x2 1{|x|≤2} dx/2π ([2]).
We study below the tail behavior of either the spectral measure or the linear statistic of f (M)
for classes of matrices M. Still following Guionnet and Zeitouni, we focus on a general random
matrix XA given as follows:
1
XA = ((XA )i,j )1≤i,j≤N , XA = X∗A , (XA )i,j = √ Ai,j ωi,j ,
N
√
I )
R +
−1ωi,j
with (ωi,j )1≤i,j≤N = (ωi,j
1≤i,j≤N , ωi,j = ωj,i , and where ωi,j , 1 ≤ i ≤ j ≤ N is a
√
R + −1P I , 1 ≤ i ≤ j ≤ N , with P I = δ
complex valued random variable with law Pi,j = Pi,j
0
i,j
i,i
(by the Hermite property). Moreover, the matrix A = (Ai,j )1≤i,j≤N is Hermitian with, in most
cases, non-random complex valued entries uniformly bounded, say, by a.
Different choices for the entries of A allow to cover various types of ensembles. For instance,√if
ωi,j , 1 ≤ i < j ≤ N , and ωi,i , 1 ≤ i ≤ N , are iid N (0, 1) random variables, taking Ai,i = 2
and Ai,j = 1, for 1 ≤ i < j ≤ N gives the GOE (Gaussian Orthogonal Ensemble). If
R , ω I , 1 ≤ i < j ≤ N , and ω R , 1 ≤ i ≤ N , are iid N (0, 1) random variables, taking A
ωi,j
i,i = 1
i,i
i,j
√
and Ai,j = 1/ 2, for 1 ≤ i < j ≤ N gives the GUE (Gaussian Unitary Ensemble) (see [24]).
R , ω I , 1 ≤ i < j ≤ N , and ω R , 1 ≤ i ≤ N , are two independent families of real
Moreover, if ωi,j
i,j
i,i
valued random variables, taking Ai,j = 0 for |i − j| large and Ai,j = 1 otherwise, gives band
matrices.
Proper choices of non-random Ai,j also make it possible to cover Wishart matrices, as seen in
the later part of this section. In certain instances, A can also be chosen to be random, like in the
case of diluted matrices, in which case Ai,j , 1 ≤ i ≤ j ≤ N , are iid Bernoulli random variables
(see [10]).
2
R , ω R , ω I ), 1 ≤ i < j ≤ N ,
On RN , let PN be the joint law of the random vector X = (ωi,i
i,j
i,j
R are 1 ≤ i ≤ N . Let EN be the corresponding
where it is understood that the indices for ωi,i
expectation. Denote by µ̂N
A the empirical spectral measure of the eigenvalues of XA , and further
note that
Z
1
f (x)µ̂N
trN f (XA ) = tr(f (XA )) =
A (dx),
N
R
109
for any bounded Borel function f . For a Lipschitz function f : Rd → R, set
|f (x) − f (y)|
,
kx − yk
x6=y
kf kLip = sup
where throughout k·k is the Euclidean norm, and where we write f ∈ Lip(c) whenever kf kLip ≤ c.
Each element M of MN ×N (C) has a unique collection of eigenvalues λ = λ(M) = (λ1 , · · · , λN )
listed in non increasing order according to multiplicity in the simplex
S N = {λ1 ≥ · · · ≥ λN : λi ∈ R, 1 ≤ i ≤ N },
qP
N
2
where throughout S N is equipped with the Euclidian norm kλk =
i=1 λi . It is a classical
result sometimes called Lidskii’s theorem ([26]), that the map MN ×N (C) → S N which associates
to each Hermitian matrix its ordered list of real eigenvalues is 1-Lipschitz ([11], [19]). For
a matrix XA under consideration with eigenvalues λ(XA ), it is then clear that the map ϕ :
2
R , ωR , ωI )
7→ λ(XA ) is Lipschitz, from (RN , k · k) to (S N , k · k), with Lipschitz
(ωi,i
i,j 1≤i 0, let pγ = inf x > 0 : 0 < V 2 (x)/x2 ≤
kuk≤x
ª
γ . Let f ∈ Lip(1), then for any γ such that ν̄(pγ ) ≤ 1/4,
(i) any median m(f (X)) of f (X) satisfies
|m(f (X)) − f (0)| ≤ G1 (γ) := pγ
³√
(ii) the mean EN [f (X)] of f (X), if it exists, satisfies
|EN [f (X)] − f (0)| ≤ G2 (γ) := pγ
´
γ + 3kγ (1/4) + Eγ ,
³√
´
γ + kγ (1/4) + Eγ ,
where kγ (x), x > 0, is the solution, in y, of the equation
¶
µ
y
= ln x,
y − (y + γ) ln 1 +
γ
and where
Eγ =
Ã
2
N ³
X
k=1
hek , βi −
Z
pγ 0, let
n
o
Lipb (c) = f : R → R : kf kLip ≤ c, kf k∞ ≤ b ,
while for a fixed compact set K ⊂ R, with diameter |K| = sup |x − y|, let
x,y∈K
LipK (c) := {f : R → R : kf kLip ≤ c, supp(f ) ⊂ K},
where supp(f ) is the support of f .
R , ωR , ωI )
law PN ∼
Theorem 1.2. Let X = (ωi,i
i,j 1≤i 0. Let T = sup{t ≥ 0 : EN etkXk < +∞}
and let h−1 be the inverse of
Z
¡
¢
kuk eskuk − 1 ν(du),
0 < s < T.
h(s) =
RN
2
(i) For any compact set K ⊂ R,
³
´
N
N
P
sup |trN (f (XA )) − E [trN (f (XA ))] | ≥ δ
f ∈LipK (1)
8|K|
≤
exp
δ
√
for all δ > 0 such that δ 2 < 8 2a|K|h (T − ) /N .
111
(
−
Z
8
0
2
√N δ
2a|K|
)
h−1 (s)ds ,
(1.4)
(ii)
PN
³
sup |trN (f (XA )) − EN [trN (f (XA ))] | ≥ δ
f ∈Lipb (1)
´
)
( Z N δ2
√
2aC(δ,b)
C(δ, b)
−1
h (s)ds ,
exp −
≤
δ
0
(1.5)
√
2aC(δ, b)h(T − )/N , where
¶
µ√ ³
´
2a
G2 (γ) + h(t0 ) + b ,
C(δ, b) = C √
N
for all δ > 0 such that δ 2 ≤
with G2R(γ) as in Proposition 1.1, C a universal constant, and with t0 the solution, in t, of
t
th(t) − 0 h(s)ds − ln(12b/δ) = 0.
Remark 1.3. (i) The order of C(δ, b) in part (ii) can be made more specific. Indeed, it will
be clear from the proof of this theorem (see (2.39)), that for any fixed t∗ , 0 < t∗ ≤ T ,
R t∗
µ √ ³ 12b
´¶
h(s)ds
2a ln δ
0
C(δ, b) ≤ C √
+
+ G2 (γ) .
t∗
t∗
N
(ii) As seen from the
£ proof
¤ (see (2.38)),
£
¤in the statement of the above theorem, G2 (γ) can be
N
N
replaced by E kXk . Now E kXk is of order N , since
q £ ¤
£
¤
£
¤
EN Xj2 ,
(1.6)
N min EN |Xj | ≤ EN kXk ≤ N max
j=1,2,...,N 2
j=1,2,...,N 2
where the Xj , j = 1, 2, . . . , N 2 are the components of X. Actually, an estimate more
precise than (1.6) is given by a result of Marcus and Rosiński [22] which asserts that if
E[X] = 0, then
£
¤ 17
1
x0 ≤ E kXk ≤ x0 ,
4
8
where x0 is the solution of the equation:
V 2 (x) U (x)
+
= 1,
x2
x
R
with V 2 (x) as defined before and U (x) = kuk≥x kukν(du), x > 0.
(1.7)
(iii) As usual, one can easily pass from the mean EN [trN (f )] to any median m(trN (f )) in either
(1.4) or (1.5). Indeed, for any 0 ≤ δ ≤ 2b, if
sup |trN (f ) − m(trN (f ))| ≥ δ,
f ∈Lipb (1)
there exist a function f ∈ Lipb (1) and a median m(trN (f )) of trN (f ), such that either trN (f ) − m(trN (f )) ≥ δ or trN (f ) − m(trN (f )) ≤ −δ. Without loss of generality assuming the former, otherwise dealing with the latter with −f , consider the func2
tion g(y) = min (d(y, A), δ) /2, y ∈ RN , where A = {trN (f ) ≤ m(trN (f )}. Clearly
112
g ∈ Lipb (1), EN [trN (g)] ≤ δ/4, and therefore trN (g) − EN [trN (g)] ≥ δ/4, which indicates that
¯ δ
¯
sup ¯trN (g) − EN [trN (g)] ¯ ≥ .
4
g∈Lipb (1)
Hence,
P
N
µ
¯
¯
sup ¯trN (f ) − m(trN (f ))¯ ≥ δ
f ∈Lipb (1)
≤P
N
µ
¶
¶
¯ δ
¯
N
¯
¯
sup trN (g) − E [trN (g)] ≥
.
4
g∈Lipb (1)
(1.8)
(iv) When the entries of X are independent, and under a finite exponential moment assumption,
the dependency in N of the function h (above and below) can sometimes be improved. We
refer the reader to [15] where some of these generic problems are discussed and tackled.
Next, recall (see [7], [19]) that the Wasserstein distance between any two probability measures
µ1 and µ2 on R is defined by
Z
¯
¯Z
¯
¯
f dµ2 ¯.
dW (µ1 , µ2 ) = sup ¯ f dµ1 −
(1.9)
f ∈Lipb (1)
R
R
Hence, Theorem 1.2 actually gives a concentration result, with respect to the Wasserstein disN N
tance, for the empirical spectral measure µ̂N
A , when it deviates from its mean E [µ̂A ].
As in [10], we can also obtain a concentration result for the distance between any particular
probability measure and the empirical spectral measure.
R , ωR , ωI )
Proposition 1.4. Let X = (ωi,i
law PN ∼
i,j
i,j 1≤i 0. Let T = sup{t > 0 : EN etkXk < +∞}
R
and let h−1 be the inverse of h(s) = RN 2 kuk(eskuk − 1)ν(du), 0 < s < T . Then, for any
probability measure µ,
¡
¢
N
N
PN dW (µ̂N
A , µ) − E [dW (µ̂A , µ)] ≥ δ ≤ exp
for all 0 < δ <
√
½
−
Z
0
Nδ
√
2a
¾
h−1 (s)ds ,
(1.10)
2ah (T − ) /N .
Of particular importance is the case of an infinitely divisible vector having boundedly supported
Lévy measure. We then have:
R , ωR , ωI )
N ∼
Corollary 1.5. Let X = (ωi,i
i,j
i,j 1≤i 0 : ν(x : kxk > r) = 0}, let
¡ 0, ν)
¢ that
R
2
2
V = V (R) = RN 2 kuk2 ν(du), and for x > 0 let
ℓ(x) = (1 + x) ln(1 + x) − x.
113
(i) For any δ > 0,
PN
³
sup |trN (f (XA )) − EN [trN (f (XA ))]| ≥ δ
f ∈Lipb (1)
C(δ, b)
exp
≤
δ
(
´
µ
¶)
N Rδ 2
V2
− 2ℓ √
,
R
2aC(δ, b)V 2
(1.11)
where
µ√ ³
¶
¢´
2a
V 2 ¡ t0 R
C(δ, b) = C √
e
−1 +b ,
G2 (γ) +
R
N
with G2 (γ) as in Proposition 1.1, C a universal constant, and t0 the solution, in t, of
´
V 2³
12b
tR
tR
.
tRe
−
e
+
1
= ln
2
R
δ
(ii) For any probability measure µ on R, and any δ > 0,
¡
¢
N
N
PN dW (µ̂N
A , µ) − E [dW (µ̂A , µ)] ≥ δ
!)
Ã
! Ã
(
V2
Nδ
N Rδ 2
Nδ
.
− √
+
ln 1 + √
≤ exp √
2aR
2aR R2
2aV 2
(1.12)
Remark 1.6. (i) As in Theorem 1.2, the dependency of C(δ, b) in δ and b can be made more
precise. A key step in the proof of (1.11) is to choose τ such that
EN [trN (1{|XA |≥τ } )] ≤ δ/12b,
and then C(δ, b) is determined by τ . Minimizing, in t, the right hand side of (2.38), leads
to the following estimate
´ !)
(
à ³ √N
√ τ − G2 (γ)
R
2
V
2a
EN [trN (1{|XA |≥τ } )] ≤ exp − 2 ℓ
,
R
V2
where ℓ(x) = (1 + x) ln(1 + x) − x. For x ≥ 1, 2ℓ(x) ≥ x ln x. Hence one can choose τ to
be the solution, in x, of the equation
x xR
12b
ln 2 = 2 ln
.
R V
δ
It then follows that C(δ, b) can be taken to be
µ√ ³
¶
´
2a
C √
G2 (γ) + τ + b .
N
Without the finite exponential moment assumption, an interesting class of random matrices with
infinitely divisible entries are the ones with stable entries, which we now analyze.
Recall that X in Rd is α-stable, (0 < α < 2), if its Lévy measure ν is given, for any Borel set
B ∈ B(Rd ), by
Z +∞
Z
dr
(1.13)
1B (rξ) 1+α ,
σ(dξ)
ν(B) =
r
0
S d−1
114
where σ, the spherical component of the Lévy measure, is a finite positive measure on S d−1 , the
unit sphere of Rd . Since the expected value of the spectral measure of a matrix with α-stable
entries might not exist, we study the deviation from a median. Here is a sample result.
R , ωR , ωI )
Theorem 1.7. Let 0 < α < 2, and let X = (ωi,i
i,j 1≤i
√
(1.14)
h
i1/α
2
, and where C(α) = 4α (2 − α + eα)/α(2 − α).
2a 2σ(S N −1 )C(α)
(ii) Let λmax (XA ) be the largest eigenvalue of XA , and let m(λmax (XA )) be any median of
λmax (XA ), then
P
N
¡
√ α σ(S N 2 −1 )
λmax (XA ) − m(λmax (XA )) ≥ δ ≤ C(α)( 2a)
,
N α/2 δ α
¢
(1.15)
i1/α
√
√ h
2
whenever δ N > 2a 2σ(S N −1 )C(α)
, and where C(α) = 4α (2 − α + eα)/α(2 − α).
I
Remark 1.8. Let M be a Wigner matrix whose entries Mi,i , 1≤i≤N , MR
i,j , 1≤i 0,
P(|M1,1 | > δ) =
L(δ)
,
δα
for some slowly varying positive function L such that lim L(tδ)/L(δ) = 1, for all t > 0. Soshδ→∞
nikov [30] showed that, for any δ > 0,
−α
lim PN (λmax (b−1
),
N M) ≥ δ) = 1 − exp(−δ
N →∞
where bN is a normalizing factor such that lim N 2 L(bN )/bαN = 2 and where λmax (b−1
N M) is
N →∞
2
2
−ǫ
α
the largest eigenvalue of b−1
/bN = 0 and lim bN /N α +ǫ = 0, for any
N M. In fact lim N
N →∞
N →∞
ǫ > 0. As stated in [13], when the random vector X is in the domain of attraction of an α-stable
distribution, concentration inequalities similar to (1.14) or (1.15) can be obtained for general
Lipschitz function. In particular, if the Lévy measure of X is given by
Z
Z +∞
L(r)dr
ν(B) =
σ(dξ)
1B (rξ) 1+α ,
(1.16)
2
r
S N −1
0
for some slowly varying function L on [0, +∞), and if we still choose the normalizing factor bN
2
such that limN →∞ σ(S N −1 )L(bN )/bαN is constant, then,
115
¡
¢
−1
PN λmax (b−1
N M) − m(λmax (bN M)) ≥ δ
≤
whenever
2
C(α)σ(S N −1 )2α/2
(δbN )α ≥ 21+α/2 C(α)σ(S N
bαN
2 −1
´
³
L bN √δ2
δα
,
(1.17)
√ ¢
¡
)L bN δ/ 2 .
2
Now, recall that for an N 2 dimensional vector with iid entries, σ(S N −1 ) = N 2 (σ̂(1) + σ̂(−1)),
where σ̂(1) is short for σ(1, 0, . . . , 0) and similarly for σ̂(−1). Thus, for fixed N , our result gives
the correct order of the upper bound for large values of δ, since for δ > 1,
1
e−1
−α
≤ 1 − e−δ ≤ α .
α
eδ
δ
Moreover, in the stable case, L(δ) becomes constant, and bN = N 2/α . Since λmax (N −2/α
is
√ M)
−2/α
a Lipschitz function of the entries of the matrix M with Lipschitz constant at most 2N
,
for any median m(λmax (N −2/α M)) of λmax (N −2/α M), we have,
¡
¢
¡
¢
σ̂(1)+ σ̂(−1) 1
2
2
N
−α
−α
P λmax (N M) − m(λmax (N M)) ≥ δ ≤ C(α)
,
(1.18)
δα
2α/2
£
¡
¢¤1/α
whenever δ ≥ 2C(α) σ̂(1) + σ̂(−1)
. Furthermore, using Theorem 1 in [14], it is not
difficult to see that m(λmax (N −2/α M)) can be upper and lower bounded independently of N .
Finally, an argument as in Remark 1.15 below will give a lower bound on λmax (N −2/α M) of the
same order as (1.18).
The following proposition will give an estimate on any median of a Lipschitz function of X,
where X is a stable vector. It is the version of Proposition 1.1 for α-stable vectors.
R , ωR , ωI )
Proposition 1.9. Let 0 < α < 2, and let X = (ωi,i
i,j 1≤i 0, is the solution, in y, of the equation
µ
α
y− y+
4(2 − α)
¶
µ
¶
4(2 − α)y
ln 1 +
= ln x,
α
and where
E=
Ã
2
N ³
X
k=1
Z
hek , βi− ¡
¢
2
4σ(S N −1 ) 1/α
+
α
Z
1 0 : EK,N [etkXk ] < +∞}
and let h−1 be the inverse of
Z
kuk(eskuk − 1)ν(du),
0 < s < T.
h(s) =
R2KN
120
Then,
³
´
R δ/√2 −1
h (s)ds
,
PK,N λmax (M1/2 ) − EK,N [λmax (M1/2 )] ≥ δ ≤ e− 0
(1.29)
for all 0 < δ < h (T − ).
R , YI )
(ii) Let X = (Yi,j
i,j 1≤i≤K,1≤j≤N be an α-stable random vector with Lévy measure ν given
R
R +∞
by ν(B) = S 2KN −1 σ(dξ) 0 1B (rξ)dr/r1+α . Then,
³
´
√
σ(S 2KN −1 )
,
PK,N λmax (M1/2 ) − m(λmax (M1/2 )) ≥ δ ≤ C(α)( 2)α
δα
√ £
¤1/α
and where C(α) = 4α (2 − α + eα)/α(2 − α).
whenever δ > 2a 2σ(S 2KN −1 )C(α)
Remark 1.15. (i) As already mentioned, Soshnikov and Fyodorov ([32]) obtained the asymptotic behavior of the largest eigenvalue of the Wishart matrix Y∗ Y when the entries of,
the K × N matrix, Y are iid Cauchy random variables. They further argue that although
the typical eigenvalues of Y∗ Y are of order KN , the correct order for the largest one is
K 2 N 2 . The above corollary combined with Remark 1.10 and the estimate (1.19), shows
that when the entries of Y form an α-stable random vector, the largest eigenvalue of Y∗ Y
is of order at most σ(S 2KN −1 )2/α . There is also a lower concentration result, described
next, which leads to a lower bound on the order of this largest eigenvalue. Thus, from
these two estimates, if the entries of Y are iid α-stable, the largest eigenvalue of Y ∗ Y is
of order K 2/α N 2/α .
(ii) Let X ∼ ID(β, 0, ν) in Rd , then (see Lemma 5.4 in [6]) for any x > 0, and any norm
k · kN on Rd ,
n
¡
¢ 1³
¡©
ª¢o´
P kXkN ≥ x ≥
1 − exp − ν u ∈ Rd : kukN ≥ 2x
.
4
R , Y I ), which we denote by kXk , if
But, λmax (M1/2 ) is a norm of the vector X = (Yi,j
λ
i,j
2KN
X is a stable vector in R
.
³
´
PK,N λmax (M1/2 ) − m(λmax (M1/2 )) ≥ δ
³
´
= PK,N λmax (M1/2 ) ≥ δ + m(λmax (M1/2 ))
n
¡©
¡
¢ª¢o´
1³
≥
1 − exp − ν λmax (M1/2 ) ≥ 2 δ + m(λmax (M1/2 ))
4
n
¡©
¡
¢ª¢o´
1³
1 − exp − ν kXkλ ≥ 2 δ + m(λmax (M1/2 ))
≥
4
¡ 2KN −1 ¢
¾¶
µ
½
σ̃ Sk·k
1
λ
¢α
,
=
1 − exp − ¡
4
α δ + m(λmax (M1/2 ))
(1.30)
2KN −1
where Sk·k
is the unit sphere relative to the norm k · kλ and where σ̃ is the spherical
λ
part of the Lévy measure corresponding to this norm. Moreover, if the components of
X¡ are independent,
in which case the Lévy measure is supported on the axes of R2KN ,
¢
2KN −1
is of order KN , and so, as above, the largest eigenvalue of M1/2 is of order
σ̃ Sk·k
λ
K 1/α N 1/α .
121
(iii) For any function f such that g(x) = f (x2 ) is Lipschitz with Lipschitz constant kgkLip :=
2
|||f |||L , tr(g(XA ))
√ = tr(f (X
√ A )) is a Lipschitz function of the entries of Y with Lipschitz
constant at most 2|||f |||L K + N . Hence, under the assumptions of part (i) of Corollary
1.14,
³
K +N´
K,N
K,N
P
trN (f (M)) − E
[trN (f (M))] ≥ δ
± N
½ Z √
¾
≤ exp
−
2(K+N )δ |||f |||L
h−1 (s)ds ,
(1.31)
0
p
for all 0 < δ < |||f |||L h (T − ) / 2(K + N ).
(iv) Under the assumptions of part (ii) of Corollary 1.14, for any function f such that g(x) =
f (x2 ) is Lipschitz with kgkLip = |||f |||L , and any median m(trN (f (M))) of trN (f (M)) we
have:
µ
¶
K +N
K,N
P
trN (f (M)) − m(trN (f (M))) ≥ δ
N
α
|||f |||L
σ(S 2KN −1 )
,
(1.32)
≤ C(α) p
δα
2α (K + N )α
£
¤1/α p
/ 2(K + N ), and where C(α) = 4α (2 − α +
whenever δ > |||f |||L 2σ(S 2KN −1 )C(α)
eα)/α(2 − α).
Remark 1.16. In the absence of finite exponential moments, the methods described in the
present paper extend beyond the heavy tail case and apply to any random matrix whose entries
on and above the main diagonal form an infinitely divisible vector X. However, to obtain explicit
concentration estimates, we do need explicit bounds on V 2 and on ν̄. Such bounds are not always
available when further knowledge on the Lévy measure of X is lacking.
2
Proofs:
We start with a proposition, which is a direct consequence of the concentration inequalities
obtained in [12] for general Lipschitz function of infinitely divisible random vectors with finite
exponential moment.
R , ωR , ωI )
Proposition 2.1. Let X = (ωi,i
PN ∼
i,j 1≤i 0 and let T = sup{t > 0 : EN etkXk <
+∞}. Let h−1 be the inverse of
Z
¡
¢
kuk eskuk − 1 ν(du),
0 < s < T.
h(s) =
RN
2
(i) For any Lipschitz function f ,
¡
¢
PN trN (f (XA )) − EN [trN (f (XA ))] ≥ δ ≤ exp
for all 0 < δ <
√
2akf kLip h (T − ) /N .
122
(
−
Z
0
√ Nδ
2akf kLip
)
h−1 (s)ds ,
(ii) Let λmax (XA ) be the largest eigenvalue of the matrix XA . Then,
( Z √N δ
)
√
¡
¢
2a
N
N
−1
P λmax (XA ) − E [λmax (XA )] ≥ δ ≤ exp −
h (s)ds ,
0
for all 0 < δ <
√
√
2ah (T − ) / N .
Proof of Theorem 1.2:
For part (i), following the proof of Theorem 1.3 of [10], without loss of generality, by shift
invariance, assume that min{x : x ∈ K} = 0. Next, for any v > 0, let
0 if x ≤ 0
gv (x) = x if 0 < x < v
(2.33)
v if x ≥ v.
Clearly gv ∈ Lip(1) with kgv k∞ = v. Next for any function f ∈ LipK (1), any ∆ > 0, define
x
recursively f∆ (x) = 0 for x ≤ 0, and for (j − 1)∆ ≤ x ≤ j∆, j = 1, . . . , ⌈ ∆
⌉, let
f∆ (x) =
x
⌉
⌈∆
X
(j)
g∆ ,
j=1
(j)
where g∆ := (21{f (j∆)>f∆ ((j−1)∆)} − 1)g∆ (x − (j − 1)∆). Then |f − f∆ | ≤ ∆ and the 1-Lipschitz
(j)
function f∆ is the sum of at most |K|/∆ functions g∆ ∈ Lip(1), regardless of the function f .
Now, for δ > 2∆,
!
Ã
¯
¯
N
N
P
sup ¯trN (f (XA )) − E [trN (f (XA ))]¯ ≥ δ
f ∈LipK (1)
≤P
N
µ
sup
f ∈LipK (1)
½
¯
¯ ¯
¯trN (f∆ (XA )) − EN (trN (f∆ (XA )))¯ + ¯trN (f (XA ))
¯ ¯
¯
− trN (f∆ (XA ))¯ + ¯EN [trN (f (XA ))] − EN [trN (f∆ (XA ))]¯
¶
µ
¯
¯
≤ PN sup¯trN (f∆ (XA )) − EN (trN (f∆ (XA )))¯ > δ − 2∆
¾
≥δ
¶
f∆
|K|
≤
sup PN
∆ g(j) ∈Lip(1)
∆
¶
µ
¯
¯
¯trN (g (j) (XA )) − EN [trN (g (j) (XA ))]¯ ≥ ∆(δ − 2∆)
∆
∆
|K|
¾
½ Z √N δ2
8 2a|K|
8|K|
−1
h (s)ds ,
(2.34)
≤
exp −
δ
0
q √
whenever 0 < δ < 8 2a|K|h (T − ) /N , and where the last inequality follows from part (i) of
the previous proposition by taking also ∆ = δ/4.
123
In order to prove part (ii), for any f ∈ Lipb (1), i.e, such that kf kLip ≤ 1, kf k∞ ≤ b, and any
τ > 0, let fτ be given via:
f (x)
f (τ ) − sign(f (τ ))(x − τ )
fτ (x) =
f (−τ ) + sign(f (−τ ))(x + τ )
0
if |x| < τ
if τ ≤ x < τ + |f (τ )|
if −τ − |f (−τ )| < x ≤ −τ
otherwise.
(2.35)
Clearly fτ ∈ Lip(1) and supp(fτ ) ⊂ [−τ − |f (−τ )|, τ + |f (τ )|]. Moreover,
¯
¯
¯
¯
sup ¯trN (f (XA ))−EN (trN (f (XA )))¯
f ∈Lipb (1)
≤
¯
¯
¯
¯
sup ¯trN (fτ (XA ))−EN (trN (fτ (XA )))¯
f ∈Lipb (1)
+
¯
¯
¯
¯
sup ¯trN (f (XA ) − fτ (XA )) − EN [trN (f (XA ) − fτ (XA ))]¯
f ∈Lipb (1)
≤
¯
¯
¯
¯
sup ¯trN (fτ (XA )) − EN (trN (fτ (XA )))¯
f ∈Lipb (1)
+ 2trN (gb (|XA | − τ )) +2EN [trN (gb (|XA | − τ ))],
with gb given as in (2.33). Now,
µ
¶
N
N
sup |trN (f (XA )) − E (trN (f (XA )))| ≥ δ
P
f ∈Lipb (1)
≤ PN
³
sup |trN (fτ (XA )) − EN (trN (fτ (XA )))| ≥
f ∈Lipb (1)
(2.36)
δ´
3
³
2δ ´
+ PN 2trN (gb (|XA | − τ )) + 2EN [trN (gb (|XA | − τ ))] ≥
3
´
³
δ
≤ PN
sup |trN (fτ (XA ))−EN (trN (fτ (XA )))| ≥
3
f ∈Lipb (1)
´
³
δ
+PN trN (gb (|XA|−τ ))−EN[trN (gb (|XA|−τ ))] ≥ −2EN[trN (gb (|XA|−τ ))]
3
³
δ´
≤ PN
sup |trN (fτ (XA ))−EN (trN (fτ (XA )))| ≥
3
f ∈Lipb (1)
´
³
δ
+PN trN (gb (|XA |−τ ))−EN [trN (gb (|XA |−τ ))] ≥ − 2bEN[trN (1{|XA |≥τ } ] .
3
(2.37)
Let us first bound the second probability in (2.37). Recall that the spectral
radius ρ(XA ) =
p
max |λi | is a Lipschitz function of X with Lipschitz constant at most a 2/N . Hence, for any
1≤i≤N
124
0 < t ≤ T , and γ > 0 such that ν̄(pγ ) ≤ 1/4,
N
´
1 X N³
P |λi (XA )| ≥ τ
N
i=1
¡
¢
N
≤ P ρ(XA ) ≥ τ
√
√
¶
µ√
¤
N
N N£
N
N
√ ρ(XA ) − √ E ρ(XA ) ≥ √ τ − G2 (γ)
≤P
2a
2a
2a
½
µ√
¶¾
N
≤ exp H(t) − √ τ − G2 (γ) t
2a
EN [trN (1{|XA |≥τ } )] =
(2.38)
where, above, we have used Proposition 1.1 in the next to last inequality and where the last
inequality follows from Theorem 1 in [12] (p. 1233) with
Z
Z t
¡ tkuk
¢
e
− tkuk − 1 ν(du).
h(s)ds =
H(t) =
RN
0
2
We want to choose τ , such that EN [trN (1{|XA |≥τ } )] ≤ δ/12b. This can be achieved if
√
ln 12b
N
δ + H(t)
√ τ − G2 (γ) ≥
.
t
2a
Since
(2.39)
µ
¶
th(t) − ln 12b
d ln 12b
δ + H(t)
δ − H(t)
=
,
2
dt
t
t
and
d2
dt2
µ
ln 12b
δ + H(t)
t
¶
′′
t3 H (t) − 2t(th(t) − ln 12b
δ − H(t))
,
=
4
t
it is clear that the right hand side of (2.39) is minimized when t = t0 , where t0 is the solution of
th(t) − H(t) − ln
12b
= 0,
δ
and the minimum is then h(t0 ).
Thus, if
√ µ
¶
2a
G2 (γ) + h(t0 ) ,
τ = C0 (δ, b) := √
N
then
EN [trN (1{|XA |≥τ } )] ≤
(2.40)
δ
,
12b
and so,
µ
¶
δ
N
N
P trN (gb (|XA|−τ ))−E [trN (gb (|XA|−τ ))] ≥ −2bE [trN (1{|XA |≥τ } ]
3
µ
¶
δ
N
N
≤ P trN (gb (|XA |−τ ))−E [trN (gb (|XA |−τ ))] ≥
6
)
( Z Nδ
N
≤ exp
−
6
√
2a
h−1 (s)ds ,
0
125
(2.41)
√
for all 0 < δ < 6 2ah (T − ) /N , where Proposition 2.1 is used in the last inequality.
For τ chosen as in (2.40), letting K = [−τ − b, τ + b], it follows that for any f ∈ Lipb (1),
fτ ∈ LipK (1). By part (i), the first term in (2.37) is such that
PN
³
sup |trN (fτ (XA ))−EN (trN (fτ (XA )))| ≥
f ∈Lipb (1)
µ
δ´
3
δ
≤P
sup |trN (fτ (XA )) − E [trN (fτ (XA ))]| ≥
3
fτ ∈LipK (1)
)
( Z
2
√ Nδ
144 2a(C0 (δ,b)+b)
48(C0 (δ, b) + b)
h−1 (s)ds ,
exp −
≤
δ
0
N
N
¶
(2.42)
√ ¡
¢
for all 0 < δ 2 ≤ 144 2a C0 (δ, b) + b h(T − )/N .
Hence, returning to (2.37), using (2.41) and (2.42) and for
q √ ¡
o
n √
¡ ¢
¢
δ < min 6 2ah T − /N, 144 2a C0 (δ, b) + b h(T − )/N ,
we have
P
N
µ
N
sup |trN (f (XA )) − E (trN (f (XA )))| ≥ δ
f ∈Lipb (1)
¶
( Z Nδ
)
( Z Nδ
)
δ
√
√
6 2a 24(C0 (δ,b)+b)
6 2a
24(C0 (δ, b)+b)
−1
−1
exp −
h (s)ds +exp −
h (s)ds
≤2
δ
0
0
)
( Z
2
¶
µ
√ Nδ
144 2a(C0 (δ,b)+b)
1 24(C0 (δ, b) + b)
−1
h (s)ds ,
exp −
≤ 2+
12
δ
0
(2.43)
since only the case δ ≤ 2b presents some interest (otherwise the probability in the statement of
the theorem is zero). Part (ii) is then proved.
✷
Proof of Proposition 1.4:
2
As a function of x ∈ RN , dW (µ̂N
A , µ)(x) is Lipschitz with Lipschitz constant at most
2
N
Indeed, for x, y ∈ R ,
dW (µ̂N
A , µ)(x)
=
sup
f ∈Lipb (1)
2a/N .
¯
¯
Z
¯
¯
¡
¢
¯trN f (XA )(x) −
f dµ¯¯
¯
R
¯
¯
¯
¯
≤ sup ¯¯trN (f (XA )(x)) − trN (f (XA )(y))¯¯
f ∈Lipb (1)
¯
¯
Z
¯
¯
+ sup ¯¯trN (f (XA )(y)) −
f dµ¯¯
≤
√
f ∈Lipb (1)
√
2a
kx − yk + dW (µ̂N
A , µ)(y).
N
126
R
(2.44)
Theorem 1.4 then follows from Theorem 1 in [12].
✷
Proof of Corollary 1.5:
£
¤
For Lévy measures with bounded support, EN etkXk < +∞, for all t ≥ 0, and moreover
h(t) ≤ V 2
Hence
H(t) =
Z
t
0
and
exp
½
−
Z
x
−1
h
(s)ds
0
µ
¾
h(s)ds ≤
≤ exp
½
¶
etR − 1
.
R
¢
V 2 ¡ tR
s − 1 − tR ,
2
R
x
−
R
µ
x
V2
+ 2
R R
¶
µ
Rx
ln 1 + 2
V
¶¾
.
Thus, one can take
¶
µ√ ³
¢´
2a
V 2 ¡ t0 R
e
−1 +b ,
C(δ, b) = C √
G2 (γ) +
R
N
where t0 is the solution, in t, of
´
V 2³
12b
tR
tR
.
tRe
−
e
+
1
= ln
2
R
δ
Applying Theorem 1.2 (ii) yields the result.
✷
In order to prove Theorem 1.11, we first need the following lemma, whose proof is essentially as
the proof of Theorem 1 in [13].
R , ωR , ωI )
Lemma 2.2. Let X = (ωi,i
i,j 1≤i 0, let gx0 ,x1 (x) = gx1 (x − x0 ), where gx1 (x) is
defined as in (2.33). Then,
¶
µ¯
2
¯
aα σ(S N −1 )
¯
¯
PN ¯trN (gx0 ,x1 (XA )) − EN [trN (gx0 ,x1 (XA ))]¯ ≥ δ ≤ C(α)
,
N αδα
¡ √ ¢1+α
2
whenever δ 1+α > 2 2a
σ(S N −1 )x1 /αN 1+α and where C(α) = 25α/2 (2eα + 2 − α)/α(2 − α).
Proof of Theorem 1.11
For part (i), first consider f ∈ LipK (1). Using the same approximation as in Theorem 1.2, any
function f ∈ LipK (1) can be approximated by f∆ , which is the sum of at most |K|/∆ functions
(j)
g∆ ∈ Lip(1), regardless of the function f . Now, and as before, for δ > 2∆,
127
PN
Ã
sup
f ∈LipK (1)
|K|
sup
≤
∆ g(j) ∈Lip
∆
j=1,··· ,⌈
≤
|trN (f (XA )) − EN (trN (f (XA )))| ≥ δ
!
µ¯
¯ ∆(δ − 2∆)¶
¯
¯
(j)
(j)
N
P
¯trN (g∆ (XA )) − E [trN (g∆ (XA ))]¯ ≥
|K|
(1)
N
b
|K|
⌉
∆
4|K| 8α aα C2 (α)σ(S N
δ
N α δ 2α
whenever
2 −1
)|K|α
,
(2.45)
√
2
1
δ2
2 2a ³ σ(S N −1 )δ ´ 1+α
,
>
8|K|
N
4α
(2.46)
and where the last inequality follows from Lemma 2.2, taking also ∆ = δ/4.
For any f ∈ Lipb (1), and any τ > 0, let fτ be given as in (2.35). Then, fτ ∈ LipK (1), where
K = [−τ − b, τ + b], and moreover,
P
N
µ
N
sup |trN (f (XA )) − E (trN (f (XA )))| ≥ δ
f ∈Lipb (1)
¶
´
³
δ
≤ PN trN (gτ,b (|XA|))−EN[trN (gτ,b (|XA|))] ≥ −2bEN[trN (1{|XA |≥τ } ]
3
³
δ´
N
N
+P
sup |trN (fτ (XA )) − E (trN (fτ (XA )))| ≥
.
3
fτ ∈LipK (1)
The spectral radius ρ(XA ) is a Lipschitz function of X with Lipschitz constant at most
Then by Theorem 1 in [13],
(2.47)
√
√
2a/ N .
N
´
1 X N³
P |λi (XA )| ≥ τ
N
i=1
³
´
N
≤ P ρ(XA ) > τ
Ã
EN [trN (1{|XA |≥τ } )] =
≤ PN
!
√
2a
ρ(XA ) − m(ρ(XA )) > τ − √ J1 (α)
N
2
whenever
C1 (α)2α/2 aα σ(S N −1 )
√
≤
¡
¢α ,
N α/2 τ − √2a
J
(α)
1
N
√
¶α
µ
2
2C1 (α)2α/2 aα σ(S N −1 )
2a
≥
,
τ − √ J1 (α)
N α/2
N
and where C1 (α) = 4α (2 − α + eα)/α(2 − α). Now, if τ is chosen such that
2
δ
C1 (α)2α/2 aα σ(S N −1 )
√
,
≤
¢
¡
α
12b
J
(α)
N α/2 τ − √2a
1
N
128
(2.48)
(2.49)
that is, if
it then follows that
√
µ
¶α
2
12bC1 (α)2α/2 aα σ(S N −1 )
2a
τ − √ J1 (α)
≥
,
δN α/2
N
EN [trN (1{|XA |≥τ } )] ≤
(2.50)
δ
.
12b
Since gτ,b (|XA|) is the sum of two functions of the type studied in Lemma 2.2 with x1 = b, we
have,
³
´
δ
PN trN (gτ,b (|XA |))−EN [trN (gτ,b (|XA |))] ≥ −2bEN [trN (1{|XA |≥τ } ]
3
³
δ ´
N
N
≤ 2P trN (gτ,b (XA )) − E [trN (gτ,b (XA ))] ≥
12
2 −1
α
α
N
12 a σ(S
)
≤ 2C2 (α)
,
α
α
N δ
whenever
1+α
(2.51)
³ 2√2a ´1+α 121+α σ(S N 2 −1 )b
,
(2.52)
N
α
and where C2 (α) = 25α/2 (2eα + 2 − α)/α(2 − α). The respective ranges (2.50) and (2.52) suggest
that one can choose, for example,
√
√
2a
2a
τ = √ J1 (α) + √ δ.
N
N
δ
>
Then, there exists δ(α, a, N, ν) such that for δ > δ(α, a, N, ν),
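\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f(X_A)) - E^N[\mathrm{tr}_N(f(X_A))]| \ge \delta \Big) \\
&\quad \le P^N\Big( \sup_{f_\tau \in \mathrm{Lip}_K(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N[\mathrm{tr}_N(f_\tau(X_A))]| \ge \frac{\delta}{3} \Big) \\
&\qquad + P^N\Big( \mathrm{tr}_N(g_{\tau,b}(|X_A|)) - E^N[\mathrm{tr}_N(g_{\tau,b}(|X_A|))] \ge \frac{\delta}{3} - 2b\,E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A| \ge \tau\}})] \Big) \\
&\quad \le \frac{C_3(\alpha)\, a^\alpha \sigma(S^{N^2-1}) \Big( \frac{\sqrt{2}a}{\sqrt{N}} J_1(\alpha) + b + \frac{\sqrt{2}a}{\sqrt{N}} \delta \Big)^{1+\alpha}}{N^\alpha \delta^{1+2\alpha}}
+ \frac{C_4(\alpha)\, a^\alpha \sigma(S^{N^2-1})}{N^\alpha \delta^\alpha},
\end{aligned}
\]
where $C_3(\alpha) = 2^{4+2\alpha}\, 12^\alpha C_2(\alpha)$, $C_4(\alpha) = 2(12^\alpha) C_2(\alpha)$, and $\delta(\alpha, a, N, \nu)$ is such that (2.46) and (2.52) hold.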
Part (ii) is a direct consequence of Theorem 1 of [13], since $d_W(\hat{\mu}^N_A, \mu) \in \mathrm{Lip}(\sqrt{2}a/N)$, as shown in the proof of Proposition 1.4.
✷
Proof of Theorem 1.12. For any f ∈ Lip(1), Theorem 1 in [13] gives a concentration inequality for f (X), when it deviates from one of its medians. For 1 < α < 2, a completely similar
(even simpler) argument gives the following result,
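\[
P^N\big( f(X) - E^N[f(X)] \ge x \big) \le \frac{C(\alpha)\, \sigma(S^{N^2-1})}{x^\alpha},
\tag{2.53}
\]
whenever $x^\alpha \ge K(\alpha)\, \sigma(S^{N^2-1})$, where $C(\alpha) = 2^\alpha (e\alpha + 2 - \alpha)/(\alpha(2 - \alpha))$ and $K(\alpha) = \max\{ 2^\alpha/(\alpha - 1),\, C(\alpha) \}$.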
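To make the range condition in (2.53) concrete, here is a minimal Python sketch that evaluates $C(\alpha)$, $K(\alpha)$ and the right-hand side of (2.53). The function names are hypothetical, the total mass $\sigma(S^{N^2-1})$ is supplied by the caller as `sigma_total` rather than computed, and "eα" is read as $e$ times $\alpha$.

```python
import math


def c_const(alpha):
    """C(alpha) = 2^alpha (e*alpha + 2 - alpha) / (alpha (2 - alpha)), as stated after (2.53).

    Assumption: 'e alpha' in the text means e times alpha.
    """
    return 2.0 ** alpha * (math.e * alpha + 2.0 - alpha) / (alpha * (2.0 - alpha))


def k_const(alpha):
    """K(alpha) = max{2^alpha / (alpha - 1), C(alpha)}."""
    return max(2.0 ** alpha / (alpha - 1.0), c_const(alpha))


def tail_bound_253(x, alpha, sigma_total):
    """Right-hand side of (2.53), or None when x is outside the stated range.

    sigma_total stands for sigma(S^{N^2-1}) and is an input (hypothetical name).
    """
    if not 1.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (1, 2)")
    if x <= 0.0 or sigma_total <= 0.0:
        raise ValueError("x and sigma_total must be positive")
    # Range condition: x^alpha >= K(alpha) * sigma(S^{N^2-1}).
    if x ** alpha < k_const(alpha) * sigma_total:
        return None
    return c_const(alpha) * sigma_total / x ** alpha
```

Since $K(\alpha) \ge C(\alpha)$, any value returned inside the range is at most 1, so the bound is never vacuous where it is claimed. For $\alpha$ close to 1 the term $2^\alpha/(\alpha - 1)$ dominates $K(\alpha)$, which pushes the admissible range of $x$ outward.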
Next, following the proof of Theorem 1.2, approximate any function f ∈ Lipb (1) by fτ ∈
Lip[−τ −b,τ +b] (1) defined via (2.35). Hence,
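\[
\begin{aligned}
&P^N\Big( \sup_{f \in \mathrm{Lip}_b(1)} |\mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A)))| \ge \delta \Big) \\
&\quad \le P^N\Big( \sup_{f_\tau \in \mathrm{Lip}_K(1)} |\mathrm{tr}_N(f_\tau(X_A)) - E^N(\mathrm{tr}_N(f_\tau(X_A)))| \ge \frac{\delta}{3} \Big) \\
&\qquad + P^N\Big( \mathrm{tr}_N(g_{\tau,b}(|X_A|)) - E^N[\mathrm{tr}_N(g_{\tau,b}(|X_A|))] \ge \frac{\delta}{3} - 2b\,E^N[\mathrm{tr}_N(\mathbf{1}_{\{|X_A| \ge \tau\}})] \Big).
\end{aligned}
\tag{2.54}
\]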
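For $\rho(X_A)$ the spectral radius of the matrix $X_A$, and for any $\tau$ such that $\tau - E^N[\rho(X_A)] \ge \frac{\sqrt{2}a}{\sqrt{N}} \big( K(\alpha)\, \sigma(S^{N^2-1}) \big)^{1/\alpha}$,
\[
\begin{aligned}
E^N\big( \mathrm{tr}_N(\mathbf{1}_{\{|X_A| > \tau\}}) \big)
&\le P^N\big( \rho(X_A) - E^N[\rho(X_A)] \ge \tau - E^N[\rho(X_A)] \big) \\
&\le \frac{\big( \frac{\sqrt{2}a}{\sqrt{N}} \big)^{\alpha} C(\alpha)\, \sigma(S^{N^2-1})}{\big( \tau - E^N[\rho(X_A)] \big)^{\alpha}},
\end{aligned}
\tag{2.55}
\]
where we have used, in the last inequality, (2.53) and the fact that $\rho(X_A) \in \mathrm{Lip}(\sqrt{2}a/\sqrt{N})$. For $Q > 0$, let $\tau = E^N[\rho(X_A)] + Q\delta^{-1/\alpha}$. With this choice, we then have: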
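\[
E^N\big( \mathrm{tr}_N(\mathbf{1}_{\{|X_A| > \tau\}}) \big)
\le \frac{\big( \frac{\sqrt{2}a}{\sqrt{N}} \big)^{\alpha} C(\alpha)\, \sigma(S^{N^2-1})}{\big( \tau - E^N[\rho(X_A)] \big)^{\alpha}}
\le \delta\, \frac{\big( \frac{\sqrt{2}a}{\sqrt{N}} \big)^{\alpha} C(\alpha)\, \sigma(S^{N^2-1})}{Q^{\alpha}}
\le \frac{\delta}{12b},
\tag{2.56}
\]
provided $Q^\alpha/\delta > \sqrt{2}a K(\alpha)\sigma(S^{N^2-1})/\sqrt{N}$, and $\big( \frac{\sqrt{2}a}{\sqrt{N}} \big)^{\alpha} C(\alpha)\sigma(S^{N^2-1})/Q^\alpha \le 1/(12b)$. Now, taking $Q = \sqrt{2}a \big( 12bC(\alpha)\sigma(S^{N^2-1}) \big)^{1/\alpha}/\sqrt{N}$, and recalling, for $1 < \alpha < 2$, the lower range concentration result for stable vectors (Theorem 1 and Remark 3 in [5]): For any $\epsilon > 0$, there exists $\eta_0(\epsilon)$, such that for all $0 < \delta < \sqrt{2}a\|f\|_{\mathrm{Lip}}\,\eta_0(\epsilon)/N$,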
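\[
P^N\big( \mathrm{tr}_N(f(X_A)) - E^N(\mathrm{tr}_N(f(X_A))) \ge \delta \big)
\le (1 + \epsilon) \exp\Bigg\{ -\frac{\alpha - 1}{\alpha}\, \frac{\big( \frac{2-\alpha}{10} \big)^{\frac{\alpha}{\alpha-1}}}{\big( \sigma(S^{N^2-1}) \big)^{1/(\alpha-1)}} \Bigg( \frac{N}{\sqrt{2}a\|f\|_{\mathrm{Lip}}} \Bigg)^{\frac{\alpha}{\alpha-1}} \delta^{\frac{\alpha}{\alpha-1}} \Bigg\}.
\tag{2.57}
\]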
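Before continuing, the choice of $Q$ made ahead of (2.57) can be sanity-checked numerically. The sketch below is only illustrative: the function names are hypothetical, `sigma_total` stands for $\sigma(S^{N^2-1})$ and is supplied by the caller, and "eα" in $C(\alpha)$ is read as $e$ times $\alpha$. It evaluates the right-hand side of (2.55) at $\tau = E^N[\rho(X_A)] + Q\delta^{-1/\alpha}$ and tests the range condition stated just before (2.55).

```python
import math


def c_const(alpha):
    # C(alpha) as stated after (2.53); 'e alpha' is read here as e times alpha.
    return 2.0 ** alpha * (math.e * alpha + 2.0 - alpha) / (alpha * (2.0 - alpha))


def k_const(alpha):
    # K(alpha) = max{2^alpha / (alpha - 1), C(alpha)}.
    return max(2.0 ** alpha / (alpha - 1.0), c_const(alpha))


def check_q_choice(a, b, n, alpha, sigma_total, delta):
    """Hypothetical helper: evaluate Q, tau - E[rho], and the bound in (2.55).

    sigma_total stands for sigma(S^{N^2-1}); it is an input, not computed.
    Returns (q, gap, bound, in_range).
    """
    if not 1.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (1, 2)")
    if min(a, b, n, sigma_total, delta) <= 0:
        raise ValueError("a, b, n, sigma_total and delta must be positive")
    lip = math.sqrt(2.0) * a / math.sqrt(n)   # Lipschitz constant of rho(X_A)
    c = c_const(alpha)
    # Q = sqrt(2) a (12 b C(alpha) sigma)^(1/alpha) / sqrt(N)
    q = lip * (12.0 * b * c * sigma_total) ** (1.0 / alpha)
    gap = q * delta ** (-1.0 / alpha)         # tau - E[rho(X_A)]
    # Right-hand side of (2.55) evaluated at this tau.
    bound = lip ** alpha * c * sigma_total / gap ** alpha
    # Range condition stated before (2.55).
    in_range = gap >= lip * (k_const(alpha) * sigma_total) ** (1.0 / alpha)
    return q, gap, bound, in_range
```

With this $Q$ the bound collapses algebraically to $\delta/(12b)$, matching the last term of (2.56). The range condition before (2.55) reduces to $\delta \le 12bC(\alpha)/K(\alpha)$, so it can fail when $\delta$ is large.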
With arguments as in the