
arXiv:1404.7080v1 [math.ST] 28 Apr 2014

A test for the equality of covariance operators
Graciela Boente, Daniela Rodriguez and Mariela Sued
Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina
e-mail: gboente@dm.uba.ar, drodrig@dm.uba.ar, msued@dm.uba.ar

Abstract
In many situations, when dealing with several populations, equality of the covariance
operators is assumed. An important issue is to study if this assumption holds before
making other inferences. In this paper, we develop a test for comparing covariance
operators of several functional data samples. The proposed test is based on the squared
norm of the difference between the estimated covariance operators of each population.
We derive the asymptotic distribution of the test statistic under the null hypothesis and
for the situation of two samples, under a set of contiguous alternatives related to the
functional common principal component model. Since the null asymptotic distribution
depends on parameters of the underlying distribution, we also propose a bootstrap test.

1 Introduction

In many applications, we study phenomena that are continuous in time or space and can be
considered as smooth curves or functions. On the other hand, when working with more than
one population, as in the finite dimensional case, the equality of the covariance operators
associated with each population is often assumed. In the case of finite-dimensional data,
tests for equality of covariance matrices have been extensively studied, see for example
Seber (1984) and Gupta and Xu (2006). This problem has been considered even for high
dimensional data, i.e., when the sample size is smaller than the number of variables under
study; we refer among others to Ledoit and Wolf (2002) and Schott (2007).
For functional data, most of the literature on hypothesis testing deals with tests on the
mean function including the functional linear model, see, for instance, Fan and Lin (1998),
Cardot et al. (2003), Cuevas et al. (2004) and Shen and Faraway (2004). Tests on the
covariance operators related to serial correlation were considered by Gabrys and Kokoszka
(2007), Gabrys et al. (2010) and Horváth et al. (2010). On the other hand, Benko
et al. (2009) proposed two-sample bootstrap tests for specific aspects of the spectrum of
functional data, such as the equality of a subset of eigenfunctions, while Ferraty et al. (2007) considered tests for the comparison of groups of curves based on a comparison of their covariances. The hypotheses tested by the latter are those of equality, proportionality, and others based on the spectral decomposition of the covariances. Their approach is high dimensional since
they either approximate the curves over a grid of points, or use a projection approach. More
recently, Panaretos et al. (2010) considered the problem of testing whether two samples of continuous, zero mean, i.i.d. Gaussian processes share the same covariance structure.
In this paper, we go one step further and consider the functional setting. Our goal
is to provide a test statistic to test the hypothesis that the covariance operators of several
independent samples are equal in a fully functional setting. To fix ideas, we will first describe
the two sample situation. Let us assume that we have two independent populations with
b 1 and Γ
b 2 consistent estimators of Γ1 and Γ2 ,
covariance operators Γ1 and Γ2 . Denote by Γ
respectively, such as the sample covariance estimators studied in Dauxois et al. (1982).
It is clear that under the standard null hypothesis Γ1 = Γ2 , the difference between the
covariance operator estimators should be small. For that reason, a test statistic based on
b1 − Γ
b 2 may be helpful to study the hypothesis of equality.
the norm of Γ


The paper is organized as follows. Section 2 introduces the notation and reviews some basic concepts which are used in later sections. Section 3 introduces the test statistic for the two-sample problem. Its asymptotic distribution under the null hypothesis is established
in Section 3.1 while a bootstrap test is described in Section 3.2. An important issue is to
describe the set of alternatives that the proposed statistic is able to detect. For that purpose,
the asymptotic distribution under a set of contiguous alternatives based on the functional
common principal component model is studied in Section 3.3. Finally, an extension to
several populations is provided in Section 4. Proofs are relegated to the Appendix.

2 Preliminaries and notation

Let us consider independent random elements $X_1,\dots,X_k$ in a separable Hilbert space $H$ (often $L^2(I)$) with inner product $\langle\cdot,\cdot\rangle$ and norm $\|u\| = \langle u,u\rangle^{1/2}$, and assume that $E\|X_i\|^2 < \infty$. Denote by $\mu_i\in H$ the mean of $X_i$, $\mu_i = E(X_i)$, and by $\Gamma_i: H\to H$ the covariance operator of $X_i$. Let $\otimes$ stand for the tensor product on $H$, i.e., for $u,v\in H$, the operator $u\otimes v: H\to H$ is defined as $(u\otimes v)w = \langle v,w\rangle\,u$. With this notation, the covariance operator $\Gamma_i$ can be written as $\Gamma_i = E\{(X_i-\mu_i)\otimes(X_i-\mu_i)\}$. The operator $\Gamma_i$ is linear, self-adjoint and continuous.

In particular, if $H = L^2(I)$ and $\langle u,v\rangle = \int_I u(s)v(s)\,ds$, the covariance operator is defined through the covariance function of $X_i$, $\gamma_i(s,t) = \mathrm{cov}(X_i(s),X_i(t))$ for $s,t\in I$, as $(\Gamma_i u)(t) = \int_I \gamma_i(s,t)u(s)\,ds$. It is usually assumed that $\int_I\int_I \gamma_i^2(t,s)\,dt\,ds < \infty$; hence, $\Gamma_i$ is a Hilbert–Schmidt operator. Hilbert–Schmidt operators have a countable number of eigenvalues, all of them being real.
Let $\mathcal{F}$ denote the Hilbert space of Hilbert–Schmidt operators with inner product defined by $\langle H_1,H_2\rangle_{\mathcal{F}} = \mathrm{trace}(H_1H_2) = \sum_{\ell=1}^{\infty}\langle H_1 u_\ell, H_2 u_\ell\rangle$ and norm $\|H\|_{\mathcal{F}} = \langle H,H\rangle_{\mathcal{F}}^{1/2} = \{\sum_{\ell=1}^{\infty}\|Hu_\ell\|^2\}^{1/2}$, where $\{u_\ell:\ell\ge1\}$ is any orthonormal basis of $H$, while $H_1$, $H_2$ and $H$ are Hilbert–Schmidt operators, i.e., such that $\|H\|_{\mathcal{F}} < \infty$. Choosing an orthonormal basis $\{\phi_{i,\ell}:\ell\ge1\}$ of eigenfunctions of $\Gamma_i$ related to the eigenvalues $\{\lambda_{i,\ell}:\ell\ge1\}$ such that $\lambda_{i,\ell}\ge\lambda_{i,\ell+1}$, we get $\|\Gamma_i\|_{\mathcal{F}}^2 = \sum_{\ell=1}^{\infty}\lambda_{i,\ell}^2$. In particular, if $H = L^2(I)$, we have $\|\Gamma_i\|_{\mathcal{F}} = \|\gamma_i\|$.
Our goal is to test whether the covariance operators $\Gamma_i$ of several populations are equal or not. For that purpose, let us consider independent samples of each population; that is, let us assume that we have independent observations $X_{i,1},\dots,X_{i,n_i}$, $1\le i\le k$, with $X_{i,j}\sim X_i$. A natural way to estimate the covariance operators $\Gamma_i$, for $1\le i\le k$, is through their empirical versions. The sample covariance operator $\widehat{\Gamma}_i$ is defined as
$$\widehat{\Gamma}_i = \frac{1}{n_i}\sum_{j=1}^{n_i}\big(X_{i,j}-\overline{X}_i\big)\otimes\big(X_{i,j}-\overline{X}_i\big),$$
where $\overline{X}_i = (1/n_i)\sum_{j=1}^{n_i}X_{i,j}$. Dauxois et al. (1982) obtained the asymptotic behaviour of $\widehat{\Gamma}_i$. In particular, they have shown that, when $E(\|X_{i,1}\|^4) < \infty$, $\sqrt{n_i}\,\big(\widehat{\Gamma}_i-\Gamma_i\big)$ converges in distribution to a zero mean Gaussian random element $U_i$ of $\mathcal{F}$ with covariance operator $\Upsilon_i$ given by
$$\Upsilon_i = \sum_{m,r,o,p} s_{im}s_{ir}s_{io}s_{ip}\,E[f_{im}f_{ir}f_{io}f_{ip}]\;\phi_{i,m}\otimes\phi_{i,r}\,\widetilde{\otimes}\,\phi_{i,o}\otimes\phi_{i,p} \;-\; \sum_{m,r}\lambda_{im}\lambda_{ir}\;\phi_{i,m}\otimes\phi_{i,m}\,\widetilde{\otimes}\,\phi_{i,r}\otimes\phi_{i,r}, \qquad (1)$$

where $\widetilde{\otimes}$ stands for the tensor product in $\mathcal{F}$ and, as mentioned above, $\{\phi_{i,\ell}:\ell\ge1\}$ is an orthonormal basis of eigenfunctions of $\Gamma_i$ with associated eigenvalues $\{\lambda_{i,\ell}:\ell\ge1\}$ such that $\lambda_{i,\ell}\ge\lambda_{i,\ell+1}$. The coefficients $s_{im}$ are such that $s_{im}^2 = \lambda_{i,m}$, while $f_{im}$ are the standardized coordinates of $X_i-\mu_i$ on the basis $\{\phi_{i,\ell}:\ell\ge1\}$, that is, $f_{im} = \langle X_i-\mu_i,\phi_{i,m}\rangle/\lambda_{i,m}^{1/2}$. Note that $E(f_{im}) = 0$. Using that $\mathrm{cov}\,(\langle u, X_i-\mu_i\rangle, \langle v, X_i-\mu_i\rangle) = \langle u,\Gamma_i v\rangle$, we get that $E(f_{im}^2) = 1$ and $E(f_{im}f_{is}) = 0$ for $m\ne s$. In particular, the Karhunen–Loève expansion leads to
$$X_i = \mu_i + \sum_{\ell=1}^{\infty}\lambda_{i,\ell}^{1/2}\,f_{i\ell}\,\phi_{i,\ell}. \qquad (2)$$
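The expansion (2) also gives a concrete recipe for simulating functional data. The following minimal Python sketch draws curves on a grid from a truncated version of (2); the sine basis, Gaussian scores and geometric eigenvalue decay are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

def simulate_kl(n, grid, lambdas, rng):
    """Simulate n curves on `grid` from a truncated Karhunen-Loeve
    expansion (2) with mean zero and standardized scores f_l."""
    L = len(lambdas)
    # phi_l(t) = sqrt(2) sin(l * pi * t), l = 1..L: orthonormal in L^2([0, 1])
    phis = np.array([np.sqrt(2.0) * np.sin((l + 1) * np.pi * grid)
                     for l in range(L)])                # (L, T)
    f = rng.standard_normal((n, L))                     # E f = 0, E f^2 = 1
    return (np.sqrt(lambdas) * f) @ phis                # (n, T) curves

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
X = simulate_kl(200, grid, lambdas=0.5 ** np.arange(1, 6), rng=rng)
```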

It is worth noticing that $E\|U_i\|_{\mathcal{F}}^2 < \infty$, so the sum of the eigenvalues of $\Upsilon_i$ is finite, implying that $\Upsilon_i$ is a linear operator over $\mathcal{F}$ which is Hilbert–Schmidt. Thus, any linear combination of the operators $\Upsilon_i$, $\Upsilon = \sum_{i=1}^{k}a_i\Upsilon_i$ with $a_i\ge0$, will be a Hilbert–Schmidt operator. Therefore, if $\{\theta_\ell\}_{\ell\ge1}$ stand for the eigenvalues of $\Upsilon$ ordered in decreasing order, then $\theta_\ell\ge0$ and $\sum_{\ell\ge1}\theta_\ell < \infty$. This property will be used later in Theorem 3.1.

When $H = L^2(I)$, smooth estimators $\widehat{\Gamma}_{i,h}$ of the covariance operators were studied in Boente and Fraiman (2000). The smoothed operator is the operator induced by the smooth covariance function
$$\widehat{\gamma}_{i,h}(t,s) = \frac{1}{n_i}\sum_{j=1}^{n_i}\big(X_{i,j,h}(t)-\overline{X}_{i,h}(t)\big)\big(X_{i,j,h}(s)-\overline{X}_{i,h}(s)\big),$$
where $X_{i,j,h}(t) = \int_I K_h(t-x)X_{i,j}(x)\,dx$ are the smoothed trajectories, $K_h(\cdot) = h^{-1}K(\cdot/h)$ is a nonnegative kernel function, and $h$ a smoothing parameter. Boente and Fraiman (2000) have shown that, under mild conditions, the smooth estimators have the same asymptotic distribution as the empirical version.
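For trajectories recorded on a grid, the smoothing step can be approximated by a discrete convolution. A minimal sketch with a Gaussian kernel follows, reusing the numpy import above; the bandwidth and the row normalization (a crude boundary correction enforcing $\int K_h\approx1$) are illustrative choices rather than the scheme of Boente and Fraiman (2000).

```python
def smooth_curves(X, grid, h=0.05):
    """Return kernel-smoothed trajectories X_{i,j,h} on the same grid."""
    # K[t, x] approximates K_h(t - x) dx, renormalized so each row sums to one
    K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / h) ** 2)
    K /= K.sum(axis=1, keepdims=True)
    return X @ K.T                                      # (n, T) smoothed curves
```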
3 Test statistics for the two-sample problem

We first consider the problem of testing the hypothesis
$$H_0: \Gamma_1 = \Gamma_2 \qquad\text{against}\qquad H_1: \Gamma_1\ne\Gamma_2. \qquad (3)$$
A natural approach is to consider the empirical covariance operators $\widehat{\Gamma}_i$ of each population and to construct a statistic $T_n$ based on the difference between the covariance operator estimators, i.e., to define $T_n = n\,\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2$, where $n = n_1+n_2$.
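For curves observed on a common equispaced grid, $T_n$ can be computed directly from the estimated covariance functions, since for $H = L^2(I)$ the Hilbert–Schmidt norm of the operator equals the $L^2$ norm of its kernel (Section 2). A minimal sketch under these assumptions:

```python
def cov_kernel(X):
    """Sample covariance function gamma(s, t) on the grid; X is (n, T)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def T_stat(X1, X2):
    """T_n = n || hat(Gamma)_1 - hat(Gamma)_2 ||_F^2 via a Riemann sum."""
    n = X1.shape[0] + X2.shape[0]
    dt = 1.0 / (X1.shape[1] - 1)                        # grid spacing on [0, 1]
    D = cov_kernel(X1) - cov_kernel(X2)
    return n * np.sum(D ** 2) * dt * dt                 # int int D(s, t)^2 ds dt
```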

3.1 The null asymptotic distribution of the test statistic

The following result allows us to study the asymptotic behaviour of $T_n = n\,\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2$ when $\Gamma_1 = \Gamma_2$ and thus to construct a test for the hypothesis (3) of equality of covariance operators.
Theorem 3.1. Let $X_{i,1},\dots,X_{i,n_i}$, for $i=1,2$, be independent observations from two independent samples in $H$ with mean $\mu_i$ and covariance operator $\Gamma_i$. Let $n = n_1+n_2$ and assume also that $n_i/n\to\tau_i$ with $\tau_i\in(0,1)$. Let $\widetilde{\Gamma}_i$, $i=1,2$, be independent estimators of the $i$-th population covariance operator such that $\sqrt{n_i}\,\big(\widetilde{\Gamma}_i-\Gamma_i\big)\xrightarrow{D}U_i$, with $U_i$ a zero mean Gaussian random element with covariance operator $\Upsilon_i$. Denote by $\{\theta_\ell\}_{\ell\ge1}$ the eigenvalues of the operator $\Upsilon = \tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$, with $\sum_{\ell\ge1}\theta_\ell < \infty$. Then,
$$n\,\|(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2)\|_{\mathcal{F}}^2 \xrightarrow{D} \sum_{\ell\ge1}\theta_\ell Z_\ell^2, \qquad (4)$$
where the $Z_\ell$ are i.i.d. standard normal random variables. In particular, if $\Gamma_1 = \Gamma_2$, we have that $n\,\|\widetilde{\Gamma}_1-\widetilde{\Gamma}_2\|_{\mathcal{F}}^2\xrightarrow{D}\sum_{\ell\ge1}\theta_\ell Z_\ell^2$.



Remark 3.1.

a) The results in Theorem 3.1 apply, in particular, when considering the sample covariance operator, i.e., when $\widetilde{\Gamma}_i = \widehat{\Gamma}_i$. Effectively, when $E(\|X_{i,1}\|^4) < \infty$, $\sqrt{n_i}\,\big(\widehat{\Gamma}_i-\Gamma_i\big)$ converges in distribution to a zero mean Gaussian random element $U_i$ of $\mathcal{F}$ with covariance operator $\Upsilon_i$ given by (1). As mentioned in the Introduction, the fact that $E(\|X_{i,1}\|^4) < \infty$ entails that $\sum_{\ell\ge1}\theta_\ell < \infty$.

b) It is worth noting that if $q_n$ is a sequence of integers such that $q_n\to\infty$, the fact that $\sum_{\ell\ge1}\theta_\ell < \infty$ implies that the sequence $U_n = \sum_{\ell=1}^{q_n}\theta_\ell Z_\ell^2$ is Cauchy in $L^2$ and, therefore, the limit $U = \sum_{\ell\ge1}\theta_\ell Z_\ell^2$ is well defined. In fact, arguments analogous to those considered in Neuhaus (1980) allow one to show that the series converges almost surely. Moreover, since $Z_1^2\sim\chi^2_1$, $U$ has a continuous distribution function $F_U$, and so $F_{U_n}$, the distribution functions of $U_n$, converge to $F_U$ uniformly, as shown in Lemma 2.11 in Van der Vaart (2000).

Remark 3.2. Theorem 3.1 implies that, under the null hypothesis $H_0:\Gamma_1=\Gamma_2$, we have that $T_n = n\,\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2\xrightarrow{D}U = \sum_{\ell\ge1}\theta_\ell Z_\ell^2$; hence an asymptotic test based on $T_n$, rejecting for large values of $T_n$, allows us to test $H_0$. To obtain the critical value, the distribution of $U$, and thus the eigenvalues of $\tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$, need to be estimated. As mentioned in Remark 3.1, the distribution function of $U$ can be uniformly approximated by that of $U_n$, and so the critical values can be approximated by the $(1-\alpha)$-percentile of $U_n$. Gupta and Xu (2006) provide an approximation for the distribution function of any finite mixture of independent $\chi^2_1$ random variables that can be used in the computation of the $(1-\alpha)$-percentile of $\sum_{\ell=1}^{q_n}\widehat{\theta}_\ell Z_\ell^2$, where $\widehat{\theta}_\ell$ are estimators of $\theta_\ell$. It is also worth noticing that under $H_0:\Gamma_1=\Gamma_2$, we have that, for $i=1,2$, $\Upsilon_i$ given in (1) reduces to
$$\Upsilon_i = \sum_{m,r,o,p} s_m s_r s_o s_p\,E[f_{im}f_{ir}f_{io}f_{ip}]\;\phi_m\otimes\phi_r\,\widetilde{\otimes}\,\phi_o\otimes\phi_p \;-\; \sum_{m,r}\lambda_m\lambda_r\;\phi_m\otimes\phi_m\,\widetilde{\otimes}\,\phi_r\otimes\phi_r,$$
where, for the sake of simplicity, we have eliminated the subscript 1 and simply denote $s_m = \lambda_m^{1/2}$, with $\lambda_m$ the $m$-th largest eigenvalue of $\Gamma_1$ and $\phi_m$ its corresponding eigenfunction. In particular, if all the populations have the same underlying distribution except for the mean and covariance operator, as happens when comparing the covariance operators of Gaussian processes, the random variable $f_{2m}$ has the same distribution as $f_{1m}$ and so $\Upsilon_1 = \Upsilon_2$.
The previous comments motivate the use of bootstrap methods, due to the fact that the asymptotic distribution obtained in (4) depends on the unknown eigenvalues $\theta_\ell$. It is clear that when the underlying distribution of the process $X_i$ is assumed to be known, for instance, if both samples correspond to Gaussian processes differing only in their mean and covariance operators, a parametric bootstrap can be implemented. Effectively, denote by $G_{i,\mu_i,\Gamma_i}$ the distribution of $X_i$, where the parameters $\mu_i$ and $\Gamma_i$ are made explicit for later convenience. For each $1\le i\le k$, generate bootstrap samples $X_{i,j}^{\star}$, $1\le j\le n_i$, with distribution $G_{i,0,\widehat{\Gamma}_i}$. Note that the samples can be generated with mean 0 since our focus is on covariance operators. Besides, the sample covariance operator $\widehat{\Gamma}_i$ is a finite range operator; hence, the Karhunen–Loève expansion (2) allows one to generate $X_{i,j}^{\star}$ knowing the distribution of the random variables $f_{i\ell}$, the eigenfunctions $\widehat{\phi}_{i,\ell}$ of $\widehat{\Gamma}_i$ and its related eigenvalues $\widehat{\lambda}_{i,\ell}$, $1\le\ell\le n_i$, that is, the estimators of the first principal components of the process. Define $\widehat{\Gamma}_i^{\star}$ as the sample covariance operator of $X_{i,j}^{\star}$, $1\le j\le n_i$, and further, let $T_n^{\star} = n\,\|\widehat{\Gamma}_1^{\star}-\widehat{\Gamma}_2^{\star}\|_{\mathcal{F}}^2$. By replicating $N_{\mathrm{boot}}$ times, we obtain $N_{\mathrm{boot}}$ values of $T_n^{\star}$ that allow one to easily construct a bootstrap test.
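Under Gaussianity, the parametric bootstrap just described only needs the estimated eigenvalues and eigenfunctions of each $\widehat{\Gamma}_i$, since the scores in (2) are then independent $N(0,1)$ variables. A hedged sketch, reusing cov_kernel and T_stat from the sketch in Section 3; the truncation level n_comp is an arbitrary illustrative choice:

```python
def gaussian_boot(X1, X2, n_boot, rng, n_comp=10):
    """Parametric bootstrap replicates of T_n* under estimated Gaussian models."""
    dt = 1.0 / (X1.shape[1] - 1)
    eigs = []
    for X in (X1, X2):
        # operator eigenproblem on the grid: (G dt) v = lambda v, phi = v / sqrt(dt)
        lam, v = np.linalg.eigh(cov_kernel(X) * dt)
        lam = np.clip(lam[::-1][:n_comp], 0.0, None)    # leading eigenvalues
        phi = v[:, ::-1][:, :n_comp] / np.sqrt(dt)      # (T, n_comp) eigenfunctions
        eigs.append((lam, phi))
    stats = []
    for _ in range(n_boot):
        # regenerate each sample with fresh N(0, 1) scores and mean zero
        Xs = [(np.sqrt(lam) * rng.standard_normal((X.shape[0], len(lam)))) @ phi.T
              for X, (lam, phi) in zip((X1, X2), eigs)]
        stats.append(T_stat(*Xs))
    return np.array(stats)
```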

The drawback of the procedure described above is that it assumes that the underlying distribution is known; hence, it cannot be applied in many situations. For that reason, we will consider a bootstrap calibration for the distribution of the test statistic that can be described as follows.

Step 1 Given samples $X_{i,1},\dots,X_{i,n_i}$, let $\widehat{\Upsilon}_i$ be consistent estimators of $\Upsilon_i$ for $i=1,2$. Define $\widehat{\Upsilon} = \widehat{\tau}_1^{-1}\widehat{\Upsilon}_1+\widehat{\tau}_2^{-1}\widehat{\Upsilon}_2$ with $\widehat{\tau}_i = n_i/(n_1+n_2)$.

Step 2 For $1\le\ell\le q_n$, denote by $\widehat{\theta}_\ell$ the positive eigenvalues of $\widehat{\Upsilon}$.

Step 3 Generate $Z_1^*,\dots,Z_{q_n}^*$ i.i.d. such that $Z_i^*\sim N(0,1)$ and let $U_n^* = \sum_{j=1}^{q_n}\widehat{\theta}_j Z_j^{*\,2}$.

Step 4 Repeat Step 3 $N_{\mathrm{boot}}$ times to get $N_{\mathrm{boot}}$ values $U_{nr}^*$, for $1\le r\le N_{\mathrm{boot}}$.

The $(1-\alpha)$-quantile of the asymptotic distribution of $T_n$ can be approximated by the $(1-\alpha)$-quantile of the empirical distribution of $U_{nr}^*$, $1\le r\le N_{\mathrm{boot}}$. The $p$-value can be estimated by $\widehat{p} = s/N_{\mathrm{boot}}$, where $s$ is the number of $U_{nr}^*$ larger than or equal to the observed value of $T_n$.
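Steps 3 and 4 amount to simulating weighted sums of independent $\chi^2_1$ variables. A minimal sketch, assuming the estimated eigenvalues $\widehat{\theta}_\ell$ from Step 2 are available (one possible way of computing them is sketched after Remark 3.3):

```python
def boot_pvalue(theta_hat, t_obs, n_boot, rng):
    """Simulate U_n* = sum_j theta_hat_j Z_j^2 and return the bootstrap p-value."""
    Z2 = rng.standard_normal((n_boot, len(theta_hat))) ** 2
    U = Z2 @ theta_hat                      # N_boot replicates of U_n*
    return np.mean(U >= t_obs)              # hat(p) = s / N_boot
```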

Remark 3.3. Note that this procedure depends only on the asymptotic distribution of $\widehat{\Gamma}_i$. For the sample covariance estimator, the covariance operator $\Upsilon_i$ is given by (1). Hence, for Gaussian samples, using that the $f_{ij}$ are independent with $f_{ij}\sim N(0,1)$, $\Upsilon_i$ can be estimated using, as consistent estimators of the eigenvalues and eigenfunctions of $\Gamma_i$, the eigenvalues and eigenfunctions of the sample covariance operator. For non-Gaussian samples, $\Upsilon_i$ can be estimated by noticing that
$$s_{im}s_{ir}s_{io}s_{ip}\,E(f_{im}f_{ir}f_{io}f_{ip}) = E\big(\langle X_{i,1},\phi_{i,m}\rangle\langle X_{i,1},\phi_{i,r}\rangle\langle X_{i,1},\phi_{i,o}\rangle\langle X_{i,1},\phi_{i,p}\rangle\big).$$
When considering other asymptotically normally distributed estimators of $\Gamma_i$, such as the smoothed estimators $\widehat{\Gamma}_{i,h}$ for $L^2(I)$ trajectories, the estimators need to be adapted.
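One concrete way to estimate the $\theta_\ell$ without assuming Gaussianity, suggested by the identity above, is to represent each $\widehat{\Upsilon}_i$ on the tensor basis $\{\widehat{\phi}_m\otimes\widehat{\phi}_r\}$ built from the first $q$ eigenfunctions of a pilot covariance estimate (for instance, the pooled sample covariance), filling its entries with empirical second and fourth moments of the scores. The truncation $q$ and the choice of pilot basis are assumptions of this sketch, not prescriptions of the paper; phi below is a (T, q) matrix of basis values on a grid with spacing dt, as produced, e.g., by the eigendecomposition in the Gaussian bootstrap sketch above.

```python
def upsilon_hat(X, phi, dt):
    """Matrix of hat(Upsilon)_i on the basis {phi_m (x) phi_r}: empirical fourth
    moments minus products of second moments of the scores b_m = <X - mean, phi_m>."""
    B = (X - X.mean(axis=0)) @ phi * dt                 # (n, q) scores
    n, q = B.shape
    fourth = np.einsum('jm,jr,jo,jp->mrop', B, B, B, B) / n
    second = B.T @ B / n                                # E[b_m b_r]
    M = fourth - second[:, :, None, None] * second[None, None, :, :]
    return M.reshape(q * q, q * q)

def theta_hat(X1, X2, phi, dt, q_n):
    """Leading eigenvalues of hat(Upsilon) = tau1^{-1} hat(Ups)_1 + tau2^{-1} hat(Ups)_2."""
    n1, n2 = X1.shape[0], X2.shape[0]
    n = n1 + n2
    U = (n / n1) * upsilon_hat(X1, phi, dt) + (n / n2) * upsilon_hat(X2, phi, dt)
    vals = np.linalg.eigvalsh((U + U.T) / 2)[::-1]      # symmetrize, sort descending
    return np.clip(vals[:q_n], 0.0, None)
```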

3.2 Validity of the bootstrap procedure

The following theorem entails the validity of the bootstrap calibration method. It states
that, under H0 , the bootstrap distribution of Un∗ converges to the asymptotic null distribution of Tn . This fact ensures that the asymptotic significance level of the test based on the
bootstrap critical value is indeed α.

Theorem 3.2. Let $q_n$ be such that $q_n/\sqrt{n}\to0$ and let $\widetilde{\mathcal{X}}_n = (X_{1,1},\dots,X_{1,n_1},X_{2,1},\dots,X_{2,n_2})$. Denote $F_{U_n^*\mid\widetilde{\mathcal{X}}_n}(\cdot) = P(U_n^*\le\cdot\mid\widetilde{\mathcal{X}}_n)$. Then, under the assumptions of Theorem 3.1, if $\sqrt{n}\,\|\widehat{\Upsilon}-\Upsilon\| = O_P(1)$, we have that
$$\rho_k\big(F_{U_n^*\mid\widetilde{\mathcal{X}}_n},\,F_U\big)\xrightarrow{p}0, \qquad (5)$$
where $F_U$ denotes the distribution function of $U = \sum_{\ell\ge1}\theta_\ell Z_\ell^2$, with $Z_\ell\sim N(0,1)$ independent of each other, and $\rho_k(F,G)$ stands for the Kolmogorov distance between the distribution functions $F$ and $G$.

3.3 Behaviour under contiguous alternatives

In this section, we study the behaviour of the test statistic $T_n$ under a set of contiguous alternatives. The contiguous alternatives to be considered consist in assuming that discrepancies from the null hypothesis arise only in the eigenvalues, and not in the eigenfunctions, of the covariance operators $\Gamma_i$; i.e., we assume that we are approximating the null hypothesis with alternatives satisfying a functional common principal component model. In this sense, under those local alternatives, the processes $X_i$, $i=1,2$, can be written as
$$X_1 = \mu_1 + \sum_{\ell=1}^{\infty}\lambda_\ell^{1/2}\,f_{1\ell}\,\phi_\ell \qquad\text{and}\qquad X_2 = \mu_2 + \sum_{\ell=1}^{\infty}\big(\lambda_{2,\ell}^{(n)}\big)^{1/2}\,f_{2\ell}\,\phi_\ell, \qquad (6)$$
with $\lambda_1\ge\lambda_2\ge\dots\ge0$ and $\lambda_{2,\ell}^{(n)}\to\lambda_\ell$ at a given rate, while the $f_{i\ell}$ are random variables such that $E(f_{i\ell}) = 0$, $E(f_{i\ell}^2) = 1$ and $E(f_{i\ell}f_{is}) = 0$ for $\ell\ne s$. For simplicity, we have omitted the subscript 1 in $\lambda_{1,\ell}$. Hence, we are considering as alternatives a functional common principal component model, which includes as a particular case proportional alternatives of the form $\Gamma_{2,n} = \rho_n\Gamma_1$, with $\rho_n\to1$. For details on the functional common principal component model, see, for instance, Benko et al. (2009) and Boente et al. (2010).
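For a quick power check, samples under the local alternatives (6) are easy to generate with the simulate_kl helper from Section 2: the second population simply uses the perturbed eigenvalues $\lambda_\ell(1+\Delta_\ell/\sqrt{n})$. The constant $\Delta_\ell\equiv2$ below yields a proportional alternative and is an arbitrary illustrative choice.

```python
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 101)
lam = 0.5 ** np.arange(1, 6)
n1 = n2 = 100
Delta = np.full(lam.shape, 2.0)              # Delta_l = 2: Gamma_{2,n} = rho_n Gamma_1
X1 = simulate_kl(n1, grid, lam, rng)
X2 = simulate_kl(n2, grid, lam * (1.0 + Delta / np.sqrt(n1 + n2)), rng)
# T_stat(X1, X2) can now be compared with the bootstrap reference distribution
```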
Theorem 3.3. Let $X_{i,1},\dots,X_{i,n_i}$, for $i=1,2$, be independent observations from two independent distributions in $H$, with mean $\mu_i$ and covariance operator $\Gamma_i$ such that $\Gamma_2 = \Gamma_{2,n} = \Gamma_1+n^{-1/2}\,\Gamma$, with $\Gamma = \sum_{\ell\ge1}\Delta_\ell\lambda_\ell\,\phi_\ell\otimes\phi_\ell$. Furthermore, assume that $X_{i,j}\sim X_i$, where the $X_i$ satisfy (6) with $\lambda_{2,\ell}^{(n)} = \lambda_\ell(1+n^{-1/2}\Delta_\ell)$, and that $E(\|X_{i,1}\|^4) < \infty$ for $i=1,2$. Let $n = n_1+n_2$ and assume also that $n_i/n\to\tau_i$ with $\tau_i\in(0,1)$. Let $\widehat{\Gamma}_i$ be the sample covariance operator of the $i$-th population and denote by
$$\Upsilon_i = \sum_{m,r,o,p} s_m s_r s_o s_p\,E[f_{im}f_{ir}f_{io}f_{ip}]\;\phi_m\otimes\phi_r\,\widetilde{\otimes}\,\phi_o\otimes\phi_p \;-\; \sum_{m,r}\lambda_m\lambda_r\;\phi_m\otimes\phi_m\,\widetilde{\otimes}\,\phi_r\otimes\phi_r,$$
where $s_m = \lambda_m^{1/2}$. Then, if $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell < \infty$, $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell\sigma_{4,\ell} < \infty$, $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell^2\sigma_{4,\ell} < \infty$, $\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell^2 < \infty$ and $\sum_{\ell=1}^{\infty}\lambda_\ell\sigma_{4,\ell} < \infty$, with $\sigma_{4,\ell}^2 = E(f_{2\ell}^4)$, we get that

a) $\sqrt{n_2}\,\big(\widehat{\Gamma}_2-\Gamma_1\big)\xrightarrow{D}U_2+\tau_2^{1/2}\,\Gamma$, with $U_2$ a zero mean Gaussian random element with covariance operator $\Upsilon_2$.

b) Denote by $\{\theta_\ell\}_{\ell\ge1}$ the eigenvalues of the operator $\Upsilon = \tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$. Moreover, let $\{\upsilon_\ell\}_{\ell\ge1}$ be an orthonormal basis of $\mathcal{F}$ such that $\upsilon_\ell$ is the eigenfunction of $\Upsilon$ related to $\theta_\ell$, and consider the expansion $\Gamma = \sum_{\ell\ge1}\eta_\ell\,\upsilon_\ell$, with $\sum_{\ell\ge1}\eta_\ell^2 < \infty$. Then,
$$T_n = n\,\|\widehat{\Gamma}_1-\widehat{\Gamma}_2\|_{\mathcal{F}}^2 \xrightarrow{D} \sum_{\ell\ge1}\theta_\ell\Big(Z_\ell+\frac{\eta_\ell}{\theta_\ell^{1/2}}\Big)^{2},$$
where the $Z_\ell$ are independent and $Z_\ell\sim N(0,1)$.

4 Test statistics for $k$ populations

In this section, we consider tests for the equality of the covariance operators of $k$ populations. That is, if $\Gamma_i$ denotes the covariance operator of the $i$-th population, we wish to test the null hypothesis
$$H_0:\Gamma_1=\dots=\Gamma_k \qquad\text{against}\qquad H_1: \exists\, i\ne j \ \text{such that}\ \Gamma_i\ne\Gamma_j. \qquad (7)$$

Let $n = n_1+\dots+n_k$ and assume that $n_i/n\to\tau_i$, $0<\tau_i<1$, $\sum_{i=1}^{k}\tau_i = 1$. A natural generalization of the proposal given in Section 3 is to consider the following test statistic
$$T_{k,n} = n\sum_{j=2}^{k}\|\widehat{\Gamma}_j-\widehat{\Gamma}_1\|_{\mathcal{F}}^2, \qquad (8)$$
where $\widehat{\Gamma}_i$ stands for the sample covariance operator of the $i$-th population. The following result states the asymptotic distribution of $T_{k,n}$ under the null hypothesis.
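On a grid, the $k$-sample statistic extends the two-sample sketch of Section 3 directly, again identifying the Hilbert–Schmidt norm with the $L^2$ norm of the kernel difference:

```python
def T_k_stat(samples):
    """T_{k,n} = n sum_{j>=2} ||hat(Gamma)_j - hat(Gamma)_1||_F^2 for a list of
    (n_i, T) arrays on a common equispaced grid; reuses cov_kernel from Section 3."""
    n = sum(X.shape[0] for X in samples)
    dt = 1.0 / (samples[0].shape[1] - 1)
    G1 = cov_kernel(samples[0])
    return n * sum(np.sum((cov_kernel(X) - G1) ** 2) * dt * dt
                   for X in samples[1:])
```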

Theorem 4.1. Let $X_{i,1},\dots,X_{i,n_i}$, for $1\le i\le k$, be independent observations from $k$ independent distributions in $H$, with mean $\mu_i$ and covariance operator $\Gamma_i$ such that $E(\|X_{i,1}\|^4) < \infty$. Let $\widehat{\Gamma}_i$ be the sample covariance operator of the $i$-th population. Assume that $n_i/n\to\tau_i$ with $\tau_i\in(0,1)$, where $n = \sum_{i=1}^{k}n_i$. Denote by $\Upsilon_w$ the linear operator $\Upsilon_w:\mathcal{F}^{k-1}\to\mathcal{F}^{k-1}$ defined as
$$\Upsilon_w(u_1,\dots,u_{k-1}) = \Big(\frac{1}{\tau_2}\,\Upsilon_2(u_1),\dots,\frac{1}{\tau_k}\,\Upsilon_k(u_{k-1})\Big)+\frac{1}{\tau_1}\,\Upsilon_1\Big(\sum_{i=1}^{k-1}u_i\Big),$$
where the last term is added to each of the $k-1$ components and the $\Upsilon_i$ are given in (1). Let $\{\theta_\ell\}_{\ell\ge1}$ stand for the sequence of eigenvalues of $\Upsilon_w$ ordered in decreasing order. Under $H_0:\Gamma_1=\dots=\Gamma_k$, we have
$$n\sum_{j=2}^{k}\|\widehat{\Gamma}_j-\widehat{\Gamma}_1\|_{\mathcal{F}}^2 \xrightarrow{D} \sum_{\ell\ge1}\theta_\ell Z_\ell^2,$$
where the $Z_\ell\sim N(0,1)$ are independent.

As mentioned in the Introduction, the fact that $E(\|X_{i,1}\|^4) < \infty$ entails that $\sum_{\ell\ge1}\theta_\ell < \infty$.

Remark 4.1. Note that Theorem 4.1 is a natural extension of its analogue in the finite-dimensional case. To be more precise, let $Z_{ij}\in\mathbb{R}^p$, with $1\le i\le k$ and $1\le j\le n_i$, be independent random vectors and let $\widehat{\Sigma}_i$ be their sample covariance matrix. Then, $\sqrt{n_i}\,V_i = \sqrt{n_i}\,(\widehat{\Sigma}_i-\Sigma_i)$ converges to a multivariate normal distribution with mean zero and covariance matrix $\Upsilon_i$. Let
$$A = \begin{pmatrix} -I_p & I_p & 0 & \dots & 0\\ -I_p & 0 & I_p & \dots & 0\\ \vdots & \vdots & & \ddots & \vdots\\ -I_p & 0 & \dots & 0 & I_p \end{pmatrix},$$
where $I_p$ stands for the identity matrix of order $p$. Then, straightforward calculations allow one to show that $\sqrt{n}\,A(V_1,\dots,V_k)^{\mathrm{t}}\xrightarrow{D}N(0,\Upsilon)$, where
$$\Upsilon = \begin{pmatrix} \tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2 & \tau_1^{-1}\Upsilon_1 & \dots & \tau_1^{-1}\Upsilon_1\\ \tau_1^{-1}\Upsilon_1 & \tau_1^{-1}\Upsilon_1+\tau_3^{-1}\Upsilon_3 & \dots & \tau_1^{-1}\Upsilon_1\\ \vdots & \vdots & \ddots & \vdots\\ \tau_1^{-1}\Upsilon_1 & \tau_1^{-1}\Upsilon_1 & \dots & \tau_1^{-1}\Upsilon_1+\tau_k^{-1}\Upsilon_k \end{pmatrix}.$$
Therefore, under the null hypothesis of equality of the covariance matrices $\Sigma_i$, we have that $n\sum_{i=2}^{k}\|\widehat{\Sigma}_i-\widehat{\Sigma}_1\|^2 = \|\sqrt{n}\,A\mathbf{V}\|^2\xrightarrow{D}\sum_{\ell=1}^{(k-1)p^2}\theta_\ell Z_\ell^2$, where $\mathbf{V} = (V_1,\dots,V_k)^{\mathrm{t}}$ and $\theta_1,\dots,\theta_{(k-1)p^2}$ are the eigenvalues of $\Upsilon$. Note that the matrix $\Upsilon$ is the finite-dimensional version of the covariance operator $\Upsilon_w$.
Remark 4.2. The conclusion of Theorem 4.1 still holds if, instead of the sample covariance operator, one considers consistent and asymptotically normally distributed estimators $\widetilde{\Gamma}_i$ of the covariance operator $\Gamma_i$ such that $\sqrt{n_i}\,(\widetilde{\Gamma}_i-\Gamma_i)\xrightarrow{D}U_i$, where $U_i$ is a zero mean Gaussian random element of $\mathcal{F}$ with Hilbert–Schmidt covariance operator $\widetilde{\Upsilon}_i$. For instance, the scatter estimators proposed by Locantore et al. (1999) and further developed by Gervini (2008) may be considered if one suspects that outliers may be present in the sample. These estimators weight each observation according to its distance to the center of the sample. To be more precise, let us define the spatial median of the $i$-th population as the value $\eta_i$ such that
$$\eta_i = \mathop{\mathrm{argmin}}_{\theta\in H}\,E\big(\|X_i-\theta\|-\|X_i\|\big) \qquad (9)$$
and the spatial covariance operator $\Gamma_i^{s}$ as
$$\Gamma_i^{s} = E\big((X_i-\eta_i)\otimes(X_i-\eta_i)/\|X_i-\eta_i\|^2\big), \qquad (10)$$
with $\eta_i$ being the spatial median. It is well known that, when second moments exist, $\Gamma_i^{s}$ is not equal to the covariance operator of the $i$-th population, even though they share the same eigenfunctions when $X_i$ has a finite Karhunen–Loève expansion and the components $f_{i\ell}$ in (2) have a symmetric distribution; see Gervini (2008). Effectively, under symmetry of the $f_{i\ell}$, $\eta_i = \mu_i$ and we have that $\Gamma_i^{s} = \sum_{\ell\ge1}\lambda_{i,\ell}^{s}\,\phi_{i,\ell}\otimes\phi_{i,\ell}$ with
$$\lambda_{i,\ell}^{s} = \lambda_{i,\ell}\,E\left(\frac{f_{i\ell}^2}{\sum_{s\ge1}\lambda_{i,s}f_{is}^2}\right).$$
The point to be noted here is that even if $\Gamma_i^{s}$ is not proportional to $\Gamma_i$, under the null hypothesis $H_0:\Gamma_1=\dots=\Gamma_k$ we also have that $H_0^{s}:\Gamma_1^{s}=\dots=\Gamma_k^{s}$ is true when the components $f_{i\ell}$ are such that $f_{i\ell}\sim f_{1\ell}$ for $2\le i\le k$, $\ell\ge1$, which means that all the populations have the same underlying distribution, except for the location parameter and the covariance operator. Thus, one can test $H_0^{s}$ through a statistic analogous to $T_{k,n}$ defined in (8) but based on estimators of $\Gamma_i^{s}$.
Estimators of $\eta_i$ and $\Gamma_i^{s}$ are defined through their empirical versions as follows. The estimator of the spatial median is the value $\widehat{\eta}_i$ minimizing over $\mu$ the quantity $\sum_{j=1}^{n_i}\|X_{i,j}-\mu\|$, while the spatial covariance operator estimator is defined as
$$\widehat{\Gamma}_i^{s} = \frac{1}{n_i}\sum_{j=1}^{n_i}\frac{(X_{i,j}-\widehat{\eta}_i)\otimes(X_{i,j}-\widehat{\eta}_i)}{\|X_{i,j}-\widehat{\eta}_i\|^2}.$$
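For curves on a grid, $\widehat{\eta}_i$ can be computed by Weiszfeld-type iterations and $\widehat{\Gamma}_i^{s}$ then follows directly from its definition. A hedged sketch; the iteration cap, tolerance and zero-distance guard are implementation choices, not part of the estimators' definition.

```python
def spatial_median(X, dt, n_iter=100, tol=1e-8):
    """Weiszfeld-type iterations for the minimizer of sum_j ||X_j - mu||."""
    eta = X.mean(axis=0)
    for _ in range(n_iter):
        d = np.maximum(np.sqrt(np.sum((X - eta) ** 2, axis=1) * dt), tol)
        eta_new = np.average(X, axis=0, weights=1.0 / d)
        if np.sqrt(np.sum((eta_new - eta) ** 2) * dt) < tol:
            return eta_new
        eta = eta_new
    return eta

def spatial_cov_kernel(X, dt):
    """Kernel of hat(Gamma)_i^s: average of normalized rank-one terms."""
    Xc = X - spatial_median(X, dt)
    norms2 = np.sum(Xc ** 2, axis=1) * dt               # ||X_{i,j} - eta||^2
    return (Xc / norms2[:, None]).T @ Xc / X.shape[0]
```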

Gervini (2008) derived the consistency of these estimators and the asymptotic normality of $\widehat{\eta}_i$. To the best of our knowledge, the asymptotic distribution of $\widehat{\Gamma}_i^{s}$ has not been derived yet. However, we conjecture that, when the components $f_{i\ell}$ in (2) have a symmetric distribution, its asymptotic behaviour will be the same as that of
$$\widetilde{\Gamma}_i^{s} = \frac{1}{n_i}\sum_{j=1}^{n_i}\frac{(X_{i,j}-\mu_i)\otimes(X_{i,j}-\mu_i)}{\|X_{i,j}-\mu_i\|^2},$$
since $\widehat{\eta}_i$ is a root-$n$ consistent estimator of $\eta_i = \mu_i$. The asymptotic distribution of $\widehat{\Gamma}_i^{s}$ is beyond the scope of this paper, while that of $\widetilde{\Gamma}_i^{s}$ can be derived from the results in Dauxois et al. (1982), allowing us to apply the results in Theorem 4.1, at least when the center of all the populations is assumed to be known.
Remark 4.3. As in Section 3, a bootstrap procedure can be considered. In order to estimate the $\theta_\ell$, we can consider estimators of the operators $\Upsilon_i$, for $1\le i\le k$, and thus estimate $\Upsilon_w$. Therefore, if $\widehat{\theta}_\ell$ are the positive eigenvalues of $\widehat{\Upsilon}_w$, a bootstrap procedure can be defined using Steps 3 and 4 in Section 3.
Acknowledgments

This research was partially supported by Grants X-018 and X-447 from the Universidad de Buenos Aires, PID 216 and PIP 592 from CONICET, and PICT 821 and 883 from ANPCYT, Argentina.

Appendix
Proof of Theorem 3.1. Since $n_i/n\to\tau_i\in(0,1)$, the independence between the two estimated operators allows us to conclude that
$$\sqrt{n}\,\big\{(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2)\big\}\xrightarrow{D}\frac{1}{\sqrt{\tau_1}}\,U_1-\frac{1}{\sqrt{\tau_2}}\,U_2\sim U,$$
where $U$ is a Gaussian random element of $\mathcal{F}$ with covariance operator given by $\Upsilon = \tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$. Then, we easily get
$$n\,\big\langle(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2),\,(\widetilde{\Gamma}_1-\Gamma_1)-(\widetilde{\Gamma}_2-\Gamma_2)\big\rangle_{\mathcal{F}}\xrightarrow{D}\sum_{\ell\ge1}\theta_\ell Z_\ell^2,$$
where $\{\theta_\ell\}_{\ell\ge1}$ are the eigenvalues associated with the operator $\Upsilon$.

Proof of Theorem 3.2. Let $\widetilde{\mathcal{X}}_n = (X_{1,1},\dots,X_{1,n_1},X_{2,1},\dots,X_{2,n_2})$, $\widetilde{Z}_n = (Z_1,\dots,Z_{q_n})$ and $\widetilde{Z} = \{Z_\ell\}_{\ell\ge1}$, with $Z_i\sim N(0,1)$ independent. Define $\widehat{U}_n(\widetilde{\mathcal{X}}_n,\widetilde{Z}_n) = \sum_{\ell=1}^{q_n}\widehat{\theta}_\ell Z_\ell^2$, $U_n(\widetilde{Z}_n) = \sum_{\ell=1}^{q_n}\theta_\ell Z_\ell^2$ and $U(\widetilde{Z}) = \sum_{\ell=1}^{\infty}\theta_\ell Z_\ell^2$.

First note that $|\widehat{\theta}_\ell-\theta_\ell|\le\|\widehat{\Upsilon}-\Upsilon\|$ for each $\ell$ (see Kato, 1966), which implies that
$$\sum_{\ell=1}^{q_n}|\widehat{\theta}_\ell-\theta_\ell|\le\frac{q_n}{\sqrt{n}}\,\sqrt{n}\,\|\widehat{\Upsilon}-\Upsilon\|. \qquad (A.1)$$

On the other hand, we have
$$E\big[\,|\widehat{U}_n-U|\,\big|\,\widetilde{\mathcal{X}}_n\big] = E\big[\,|\widehat{U}_n-U_n+U_n-U|\,\big|\,\widetilde{\mathcal{X}}_n\big]\le\sum_{\ell=1}^{q_n}|\widehat{\theta}_\ell-\theta_\ell|+\sum_{\ell>q_n}\theta_\ell,$$
which, together with (A.1), the fact that $\sqrt{n}\,\|\widehat{\Upsilon}-\Upsilon\| = O_P(1)$, $q_n/\sqrt{n}\to0$ and $\sum_{\ell\ge1}\theta_\ell < \infty$, implies that $E\big[\,|\widehat{U}_n-U|\,\big|\,\widetilde{\mathcal{X}}_n\big]\xrightarrow{p}0$.

We also have the following inequalities:
$$\begin{aligned} P(\widehat{U}_n\le t\mid\widetilde{\mathcal{X}}_n) &= P(\widehat{U}_n\le t\cap|\widehat{U}_n-U|<\epsilon\mid\widetilde{\mathcal{X}}_n)+P(\widehat{U}_n\le t\cap|\widehat{U}_n-U|\ge\epsilon\mid\widetilde{\mathcal{X}}_n)\\ &\le P(U\le t+\epsilon)+P(|\widehat{U}_n-U|\ge\epsilon\mid\widetilde{\mathcal{X}}_n)\\ &\le F_U(t+\epsilon)+\frac{1}{\epsilon}\,E(|\widehat{U}_n-U|\mid\widetilde{\mathcal{X}}_n)\le F_U(t)+\Delta_\epsilon(t)+\frac{1}{\epsilon}\,E(|\widehat{U}_n-U|\mid\widetilde{\mathcal{X}}_n), \end{aligned}$$
where $\Delta_\epsilon(t) = \sup_{|\delta|\le\epsilon}|F_U(t+\delta)-F_U(t)|$. Besides,
$$\begin{aligned} P(\widehat{U}_n\le t\mid\widetilde{\mathcal{X}}_n) &\ge P(U\le t-\epsilon\cap|\widehat{U}_n-U|<\epsilon\mid\widetilde{\mathcal{X}}_n)\\ &\ge F_U(t-\epsilon)-\frac{1}{\epsilon}\,E(|\widehat{U}_n-U|\mid\widetilde{\mathcal{X}}_n)\ge F_U(t)-\Delta_\epsilon(t)-\frac{1}{\epsilon}\,E(|\widehat{U}_n-U|\mid\widetilde{\mathcal{X}}_n). \end{aligned}$$
Therefore,
$$|P(\widehat{U}_n\le t\mid\widetilde{\mathcal{X}}_n)-F_U(t)|\le\Delta_\epsilon(t)+\frac{1}{\epsilon}\,E(|\widehat{U}_n-U|\mid\widetilde{\mathcal{X}}_n).$$
As we mentioned in Remark 3.1, $F_U$ is a continuous distribution function on $\mathbb{R}$ and so uniformly continuous; hence, $\lim_{\epsilon\to0}\sup_{t\in\mathbb{R}}\Delta_\epsilon(t) = 0$, which implies that $\rho_k(F_{U_n^*\mid\widetilde{\mathcal{X}}_n},F_U) = \sup_t|P(\widehat{U}_n\le t\mid\widetilde{\mathcal{X}}_n)-F_U(t)|\xrightarrow{p}0$.
Proof of Theorem 3.3. Using the Karhunen–Loève representation, we can write
$$X_{1,j} = \mu_1+\sum_{\ell=1}^{\infty}\lambda_\ell^{1/2}\,f_{1\ell j}\,\phi_\ell,\quad 1\le j\le n_1, \qquad X_{2,j} = \mu_2+\sum_{\ell=1}^{\infty}\lambda_\ell^{1/2}\Big(1+\frac{\Delta_\ell}{\sqrt{n}}\Big)^{1/2}f_{2\ell j}\,\phi_\ell,\quad 1\le j\le n_2,$$
where $f_{i\ell j}\sim f_{i\ell}$ in (6).

a) For $1\le j\le n_2$, let
$$Z_j = \mu_2+\sum_{\ell=1}^{\infty}\lambda_\ell^{1/2}\,f_{2\ell j}\,\phi_\ell = \mu_2+Z_{0,j}, \qquad V_j = \sum_{\ell=1}^{\infty}\lambda_\ell^{1/2}\Big[\Big(1+\frac{\Delta_\ell}{\sqrt{n}}\Big)^{1/2}-1\Big]f_{2\ell j}\,\phi_\ell.$$
Define the following operators that will be used in the sequel: $\widetilde{\Gamma}_2 = (1/n_2)\sum_{j=1}^{n_2}(X_{2,j}-\mu_2)\otimes(X_{2,j}-\mu_2)$, $\widehat{\Gamma}_{Z_0} = (1/n_2)\sum_{j=1}^{n_2}Z_{0,j}\otimes Z_{0,j}$, $\widehat{\Gamma}_V = (1/n_2)\sum_{j=1}^{n_2}V_j\otimes V_j$ and, finally, $\widetilde{A} = (1/n_2)\sum_{j=1}^{n_2}(Z_{0,j}\otimes V_j+V_j\otimes Z_{0,j})$. Using that $X_{2,j}-\mu_2 = Z_{0,j}+V_j$, we obtain the expansion $\widetilde{\Gamma}_2-\Gamma_1 = (\widehat{\Gamma}_{Z_0}-\Gamma_1)+\widehat{\Gamma}_V+\widetilde{A}$.

The proof will be carried out in several steps, by showing that
$$\sqrt{n_2}\,\big(\widehat{\Gamma}_2-\widetilde{\Gamma}_2\big) = o_P(1), \qquad (A.2)$$
$$\sqrt{n_2}\,\widehat{\Gamma}_V = o_P(1), \qquad (A.3)$$
$$\sqrt{n_2}\,\widetilde{A}\xrightarrow{p}\tau_2^{1/2}\,\Gamma, \qquad (A.4)$$
$$\sqrt{n_2}\,\big(\widehat{\Gamma}_{Z_0}-\Gamma_1\big)\xrightarrow{D}U_2, \qquad (A.5)$$
where $U_2$ is a zero mean Gaussian random element with covariance operator $\Upsilon_2$. Using that the covariance operator of $Z_{0,j}$ is $\Gamma_1$, (A.5) follows from Dauxois et al. (1982).

We will now derive (A.2). Note that $\widehat{\Gamma}_2-\widetilde{\Gamma}_2 = -\big(\overline{X}_2-\mu_2\big)\otimes\big(\overline{X}_2-\mu_2\big)$. Then, it is enough to prove that $\sqrt{n_2}\,\big(\overline{X}_2-\mu_2\big) = \sqrt{n_2}\,\big(\overline{Z}_0+\overline{V}\big) = O_P(1)$. By the central limit theorem in Hilbert spaces, we get that $\sqrt{n_2}\,\overline{Z}_0$ converges in distribution, and so it is tight, i.e., $\sqrt{n_2}\,\overline{Z}_0 = O_P(1)$. On the other hand, to derive that $\sqrt{n_2}\,\overline{V} = O_P(1)$, we will further show that $\sqrt{n_2}\,\overline{V} = o_P(1)$. To do so, note that $E[\|\overline{V}\|^2] = (1/n_2)\sum_{\ell=1}^{\infty}\lambda_\ell\big[(1+\Delta_\ell/\sqrt{n})^{1/2}-1\big]^2$. Using the inequality $(1+a)^{1/2}-1\le a^{1/2}$ for any $a\ge0$, we get that $E\big(\|\sqrt{n_2}\,\overline{V}\|^2\big)\le(1/\sqrt{n})\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell$, concluding the proof of (A.2).
To obtain (A.3), note that
$$V_j\otimes V_j = \sum_{\ell,s}\lambda_\ell^{1/2}\lambda_s^{1/2}\Big[\Big(1+\frac{\Delta_\ell}{\sqrt{n}}\Big)^{1/2}-1\Big]\Big[\Big(1+\frac{\Delta_s}{\sqrt{n}}\Big)^{1/2}-1\Big]f_{2\ell j}f_{2sj}\,\phi_\ell\otimes\phi_s$$
and
$$\widehat{\Gamma}_V = \frac{1}{n_2}\sum_{j=1}^{n_2}V_j\otimes V_j = \sum_{\ell,s}\lambda_\ell^{1/2}\lambda_s^{1/2}\Big[\Big(1+\frac{\Delta_\ell}{\sqrt{n}}\Big)^{1/2}-1\Big]\Big[\Big(1+\frac{\Delta_s}{\sqrt{n}}\Big)^{1/2}-1\Big]U_{\ell s}\,\phi_\ell\otimes\phi_s,$$
where $U_{\ell s} = (1/n_2)\sum_{j=1}^{n_2}f_{2\ell j}f_{2sj}$. Note that $f_{2\ell j}\sim f_{2\ell}$ and so
$$E(U_{\ell s}^2) = \mathrm{var}(U_{\ell s})+E(U_{\ell s})^2 = \frac{1}{n_2}\,\mathrm{var}(f_{2\ell}f_{2s})+E^2(f_{2\ell}f_{2s})\le\frac{1}{n_2}\,E(f_{2\ell}^2f_{2s}^2)+E^2(f_{2\ell}f_{2s})\le\frac{1}{n_2}\,\sigma_{4,\ell}\sigma_{4,s}+1, \qquad (A.6)$$
where the last bound follows from the Cauchy–Schwarz inequality and the facts that $E(f_{2s}^2) = 1$ and $\sigma_{4,s}^2 = E(f_{2s}^4)$. Thus, using the inequality $(1+a)^{1/2}-1\le a/2$, we get that $\big[(1+\Delta_\ell/\sqrt{n})^{1/2}-1\big]^2\le\Delta_\ell^2/(4n)$, which together with (A.6) implies that
$$E\big(n_2\|\widehat{\Gamma}_V\|_{\mathcal{F}}^2\big) = n_2\sum_{\ell,s}\lambda_\ell\lambda_s\Big[\Big(1+\frac{\Delta_\ell}{\sqrt{n}}\Big)^{1/2}-1\Big]^2\Big[\Big(1+\frac{\Delta_s}{\sqrt{n}}\Big)^{1/2}-1\Big]^2E(U_{\ell s}^2)\le n_2\sum_{\ell,s}\lambda_\ell\lambda_s\,\frac{\Delta_\ell^2\Delta_s^2}{16\,n^2}\Big(\frac{1}{n_2}\,\sigma_{4,\ell}\sigma_{4,s}+1\Big) = \frac{1}{16\,n^2}\Big(\sum_{\ell}\lambda_\ell\Delta_\ell^2\sigma_{4,\ell}\Big)^{2}+\frac{n_2}{16\,n^2}\Big(\sum_{\ell}\lambda_\ell\Delta_\ell^2\Big)^{2},$$
and so $E\big(n_2\|\widehat{\Gamma}_V\|_{\mathcal{F}}^2\big)\to0$, concluding the proof of (A.3).

Finally, to derive (A.4), note that analogous arguments allow one to show that
$$E\big(n_2\,\|\widetilde{A}-E(\widetilde{A})\|^2\big)\le\frac{1}{\sqrt{n}}\sum_{\ell,s}\lambda_\ell\lambda_s\Delta_s\,\sigma_{4,\ell}\sigma_{4,s} = \frac{1}{\sqrt{n}}\Big(\sum_{\ell}\lambda_\ell\sigma_{4,\ell}\Big)\Big(\sum_{\ell}\lambda_\ell\Delta_\ell\sigma_{4,\ell}\Big),$$
while
$$E\big(\sqrt{n_2}\,\widetilde{A}\big) = \sqrt{n_2}\,E\big(Z_{0,1}\otimes V_1+V_1\otimes Z_{0,1}\big) = 2\sqrt{n_2}\sum_{\ell=1}^{\infty}\lambda_\ell\Big[\Big(1+\frac{\Delta_\ell}{\sqrt{n}}\Big)^{1/2}-1\Big]\phi_\ell\otimes\phi_\ell = 2\,\sqrt{n_2}\,\frac{1}{\sqrt{n}}\sum_{\ell=1}^{\infty}\lambda_\ell\,\frac{\Delta_\ell}{\big(1+\frac{\Delta_\ell}{\sqrt{n}}\big)^{1/2}+1}\,\phi_\ell\otimes\phi_\ell\;\to\;\tau_2^{1/2}\sum_{\ell=1}^{\infty}\lambda_\ell\Delta_\ell\,\phi_\ell\otimes\phi_\ell = \tau_2^{1/2}\,\Gamma,$$
concluding the proof of (A.4). The proof of a) follows easily by combining (A.2) to (A.5).

b) From a), we have that $\sqrt{n}\,\big(\widehat{\Gamma}_2-\Gamma_1\big)\xrightarrow{D}\Gamma+(1/\sqrt{\tau_2})\,U_2$, where $U_2$ is a zero mean Gaussian random element with covariance operator $\Upsilon_2$. On the other hand, the results in Dauxois et al. (1982) entail that $\sqrt{n_1}\,\big(\widehat{\Gamma}_1-\Gamma_1\big)\xrightarrow{D}U_1$, where $U_1$ is a zero mean Gaussian random element with covariance operator $\Upsilon_1$, and so $\sqrt{n}\,\big(\widehat{\Gamma}_1-\Gamma_1\big)\xrightarrow{D}(1/\sqrt{\tau_1})\,U_1$. The fact that the two populations are independent implies that $U_1$ and $U_2$ can be chosen to be independent, so
$$\sqrt{n}\,\big(\widehat{\Gamma}_2-\widehat{\Gamma}_1\big) = \sqrt{n}\,\big\{(\widehat{\Gamma}_2-\Gamma_1)-(\widehat{\Gamma}_1-\Gamma_1)\big\}\xrightarrow{D}\Gamma+\frac{1}{\sqrt{\tau_2}}\,U_2-\frac{1}{\sqrt{\tau_1}}\,U_1\sim\Gamma+U,$$
where $U$ is a Gaussian random element of $\mathcal{F}$ with covariance operator $\Upsilon = \tau_1^{-1}\Upsilon_1+\tau_2^{-1}\Upsilon_2$. Therefore, $T_n = \|\sqrt{n}\,(\widehat{\Gamma}_2-\widehat{\Gamma}_1)\|_{\mathcal{F}}^2\xrightarrow{D}\|\Gamma+U\|_{\mathcal{F}}^2$.

To conclude the proof, we have to obtain the distribution of $\|\Gamma+U\|_{\mathcal{F}}^2$. Since $U$ is a zero mean Gaussian random element of $\mathcal{F}$ with covariance operator $\Upsilon$, $U$ can be written as $\sum_{\ell\ge1}\theta_\ell^{1/2}Z_\ell\,\upsilon_\ell$, where the $Z_\ell$ are i.i.d. random variables such that $Z_\ell\sim N(0,1)$. Hence, $\Gamma+U = \sum_{\ell\ge1}\big(\eta_\ell+\theta_\ell^{1/2}Z_\ell\big)\,\upsilon_\ell$ and so $\|\Gamma+U\|_{\mathcal{F}}^2 = \sum_{\ell\ge1}\big(\eta_\ell+\theta_\ell^{1/2}Z_\ell\big)^2 = \sum_{\ell\ge1}\theta_\ell\big(Z_\ell+\eta_\ell/\theta_\ell^{1/2}\big)^2$, concluding the proof.

Proof of Theorem 4.1. Consider the process $V_{k,n} = \{\sqrt{n}\,(\widehat{\Gamma}_i-\Gamma_i)\}_{1\le i\le k}$. The independence of the samples within and among populations, together with the results stated in Dauxois et al. (1982), allows one to show that $V_{k,n}$ converges in distribution to a zero mean Gaussian random element $\mathbf{U}$ of $\mathcal{F}^k$ with covariance operator $\widetilde{\Upsilon}$. More precisely, we get that
$$\{\sqrt{n}\,(\widehat{\Gamma}_i-\Gamma_i)\}_{1\le i\le k}\xrightarrow{D}\mathbf{U} = (U_1,\dots,U_k),$$
where $U_1,\dots,U_k$ are independent random elements of $\mathcal{F}$ with covariance operators $\tau_i^{-1}\Upsilon_i$, respectively.

Let $A:\mathcal{F}^k\to\mathcal{F}^{k-1}$ be the linear operator given by $A(V_1,\dots,V_k) = (V_2-V_1,\dots,V_k-V_1)$. The continuous mapping theorem guarantees that $A\big(\sqrt{n}\,(\widehat{\Gamma}_1-\Gamma_1),\dots,\sqrt{n}\,(\widehat{\Gamma}_k-\Gamma_k)\big)\xrightarrow{D}W$, where $W$ is a zero mean Gaussian random element of $\mathcal{F}^{k-1}$ with covariance operator $\Upsilon_w = A\widetilde{\Upsilon}A^*$, with $A^*$ the adjoint operator of $A$. It is easy to see that the adjoint operator $A^*:\mathcal{F}^{k-1}\to\mathcal{F}^k$ is given by $A^*(u_1,\dots,u_{k-1}) = \big(-\sum_{i=1}^{k-1}u_i,\,u_1,\dots,u_{k-1}\big)$. Hence, as $U_1,\dots,U_k$ are independent, we conclude that
$$\Upsilon_w(u_1,\dots,u_{k-1}) = \Big(\frac{1}{\tau_2}\,\Upsilon_2(u_1),\dots,\frac{1}{\tau_k}\,\Upsilon_k(u_{k-1})\Big)+\frac{1}{\tau_1}\,\Upsilon_1\Big(\sum_{i=1}^{k-1}u_i\Big).$$
Finally,
$$T_{k,n} = n\sum_{j=2}^{k}\|\widehat{\Gamma}_j-\widehat{\Gamma}_1\|_{\mathcal{F}}^2\xrightarrow{D}\sum_{\ell\ge1}\theta_\ell Z_\ell^2,$$
where the $Z_\ell$ are i.i.d. standard normal random variables and the $\theta_\ell$ are the eigenvalues of the operator $\Upsilon_w$.

References

[1] Benko, M., Härdle, W. & Kneip, A. (2009). Common functional principal components. Annals of Statistics, 37, 1-34.

[2] Boente, G. & Fraiman, R. (2000). Kernel-based functional principal components. Statistics and Probability Letters, 48, 335-345.

[3] Boente, G., Rodriguez, D. & Sued, M. (2010). Inference under functional proportional and common principal components models. Journal of Multivariate Analysis, 101, 464-475.

[4] Cardot, H., Ferraty, F., Mas, A. & Sarda, P. (2003). Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics, 30, 241-255.

[5] Cuevas, A., Febrero, M. & Fraiman, R. (2004). An anova test for functional data. Computational Statistics & Data Analysis, 47, 111-122.

[6] Dauxois, J., Pousse, A. & Romain, Y. (1982). Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference. Journal of Multivariate Analysis, 12, 136-154.

[7] Fan, J. & Lin, S.-K. (1998). Tests of significance when the data are curves. Journal of the American Statistical Association, 93, 1007-1021.

[8] Ferraty, F., Vieu, Ph. & Viguier-Pla, S. (2007). Factor-based comparison of groups of curves. Computational Statistics & Data Analysis, 51, 4903-4910.

[9] Gabrys, R. & Kokoszka, P. (2007). Portmanteau test of independence for functional observations. Journal of the American Statistical Association, 102, 1338-1348.

[10] Gabrys, R., Horváth, L. & Kokoszka, P. (2010). Tests for error correlation in the functional linear model. Journal of the American Statistical Association, 105, 1113-1125.

[11] Gervini, D. (2008). Robust functional estimation using the median and spherical principal components. Biometrika, 95, 587-600.

[12] Gupta, A. & Xu, J. (2006). On some tests of the covariance matrix under general conditions. Annals of the Institute of Statistical Mathematics, 58, 101-114.

[13] Horváth, L., Hušková, M. & Kokoszka, P. (2010). Testing the stability of the functional autoregressive process. Journal of Multivariate Analysis, 101, 352-367.

[14] Kato, T. (1966). Perturbation Theory for Linear Operators. Springer-Verlag, New York.

[15] Ledoit, O. & Wolf, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Annals of Statistics, 30, 1081-1102.

[16] Locantore, N., Marron, J. S., Simpson, D. G., Tripoli, N., Zhang, J. T. & Cohen, K. L. (1999). Robust principal component analysis for functional data. Test, 8, 1-73.

[17] Neuhaus, G. (1980). A note on computing the distribution of the norm of Hilbert space valued Gaussian random variables. Journal of Multivariate Analysis, 10, 19-25.

[18] Panaretos, V. M., Kraus, D. & Maddocks, J. H. (2010). Second-order comparison of Gaussian random functions and the geometry of DNA minicircles. Journal of the American Statistical Association, 105, 670-682.

[19] Schott, J. (2007). A test for the equality of covariance matrices when the dimension is large relative to the sample sizes. Computational Statistics & Data Analysis, 51, 6535-6542.

[20] Seber, G. (1984). Multivariate Observations. John Wiley and Sons, New York.

[21] Shen, Q. & Faraway, J. (2004). An F-test for linear models with functional responses. Statistica Sinica, 14, 1239-1257.

[22] Van der Vaart, A. (2000). Asymptotic Statistics. Cambridge University Press, Cambridge.