for nonlinear Granger causality is illustrated in two examples. In the first example, we examine the main explanations of the asymmetric volatility phenomenon using high-frequency data on S&P 500 Index futures contracts and find evidence of nonlinear leverage and volatility feedback effects. In the second application, we investigate the Granger causality between S&P 500 Index returns and trading volume. We find convincing evidence of linear and nonlinear feedback effects from stock returns to volume, but only weak evidence of a nonlinear feedback effect from volume to stock returns.
The rest of the article is organized as follows. The conditional independence test using the Hellinger distance and the Bernstein copula is introduced in Section 2. Section 3 provides the test statistic and its asymptotic properties. In Section 4, we investigate the finite-sample size and power properties of our test and compare it with Su and White's (2008) test. Section 5 contains the two applications described above. Section 6 concludes. The proofs of the asymptotic results are presented in the Technical Appendix, which is available online.
2. NULL HYPOTHESIS, HELLINGER DISTANCE, AND THE BERNSTEIN COPULA
Let \{(X_t', Y_t', Z_t')' \in R^{d_1} \times R^{d_2} \times R^{d_3}, t = 1, \ldots, T\} be a sample of stochastic processes in R^d, where d = d_1 + d_2 + d_3, with joint distribution function F_{XYZ} and density function f_{XYZ}.
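To fix ideas, such a sample can be simulated with a toy data-generating process; the AR(1)-type dynamics and coefficients below are our own illustrative choices, not a specification from the article, with d_1 = d_2 = d_3 = 1 so that d = 3.

```python
import numpy as np

# Toy sample {(X_t, Y_t, Z_t), t = 1, ..., T} with d1 = d2 = d3 = 1 (d = 3).
rng = np.random.default_rng(0)
T = 500
x, y, z = np.zeros(T), np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    z[t] = 0.3 * z[t - 1] + rng.standard_normal()
    # Y depends on its own past and on past X, but not on past Z
    y[t] = 0.4 * y[t - 1] + 0.2 * x[t - 1] + rng.standard_normal()

sample = np.column_stack([x, y, z])  # row t holds the observation (X_t, Y_t, Z_t)
```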
We wish to test the conditional independence between Y and Z given X. Formally, the null hypothesis can be written in terms of densities as

H_0 : \Pr\{ f_{Y|X,Z}(y \mid X, Z) = f_{Y|X}(y \mid X) \} = 1, \forall y \in R^{d_2}, or
H_0 : \Pr\{ f(y, X, Z) f(X) = f(y, X) f(X, Z) \} = 1, \forall y \in R^{d_2},      (1)

and the alternative hypothesis as

H_1 : \Pr\{ f_{Y|X,Z}(y \mid X, Z) = f_{Y|X}(y \mid X) \} < 1, for some y \in R^{d_2},

where f_{\cdot|\cdot}(\cdot \mid \cdot) denotes the conditional density. As we mentioned in the introduction, Granger noncausality is a form of conditional independence; to see this, consider the following example. For (Y, Z)' a Markov process of order 1, the null hypothesis that corresponds to Granger noncausality from Z to Y is given by

H_0 : \Pr\{ f_{Y|X,Z}(y_t \mid y_{t-1}, z_{t-1}) = f_{Y|X}(y_t \mid y_{t-1}) \} = 1,

where in this case y = y_t, x = y_{t-1}, z = z_{t-1}, and d_1 = d_2 = d_3 = 1.

Next, we reformulate the null hypothesis
(1) in terms of copulas. This will allow us to keep only the terms that involve the dependence among the random vectors. It is well known from Sklar (1959) that the distribution function of the joint process (X', Y', Z')' can be expressed via a copula:

F_{XYZ}(x, y, z) = C_{XYZ}( \bar F_X(x), \bar F_Y(y), \bar F_Z(z) ),      (2)
where C_{XYZ}(\cdot) is a copula function defined on [0, 1]^d that captures the dependence of (X', Y', Z')', and for simplicity of notation we denote

\bar F_X(x) = ( F_{X_1}(x_1), \ldots, F_{X_{d_1}}(x_{d_1}) ),
\bar F_Y(y) = ( F_{Y_1}(y_1), \ldots, F_{Y_{d_2}}(y_{d_2}) ),
\bar F_Z(z) = ( F_{Z_1}(z_1), \ldots, F_{Z_{d_3}}(z_{d_3}) ),

where F_{Q_i}(\cdot), for Q = X, Y, Z, is the marginal distribution function of the ith element of the vector Q. If we differentiate Equation (2) with respect to (x', y', z')', we obtain the density function of the joint process (X', Y', Z')', which can be expressed as

f_{XYZ}(x, y, z) = \prod_{j=1}^{d_1} f_{X_j}(x_j) \times \prod_{j=1}^{d_2} f_{Y_j}(y_j) \times \prod_{j=1}^{d_3} f_{Z_j}(z_j) \times c_{XYZ}( \bar F_X(x), \bar F_Y(y), \bar F_Z(z) ),      (3)
where f_{Q_j}(\cdot), for Q = X, Y, Z, is the marginal density of the jth element of the vector Q, and c_{XYZ}(\cdot) is the copula density of (X', Y', Z')' defined on [0, 1]^d. Using Equation (3), we can show that the null hypothesis in (1) can be rewritten in terms of copula densities as

H_0 : \Pr\{ c_{XYZ}( \bar F_X(X), \bar F_Y(y), \bar F_Z(Z) ) c_X( \bar F_X(X) ) = c_{XY}( \bar F_X(X), \bar F_Y(y) ) c_{XZ}( \bar F_X(X), \bar F_Z(Z) ) \} = 1, \forall y \in R^{d_2},

against the alternative hypothesis

H_1 : \Pr\{ c_{XYZ}( \bar F_X(X), \bar F_Y(y), \bar F_Z(Z) ) c_X( \bar F_X(X) ) = c_{XY}( \bar F_X(X), \bar F_Y(y) ) c_{XZ}( \bar F_X(X), \bar F_Z(Z) ) \} < 1, for some y \in R^{d_2},
where c_X(\cdot), c_{XY}(\cdot), and c_{XZ}(\cdot) are the copula densities of the processes X, (X', Y')', and (X', Z')', respectively. Observe that under H_0, the dependence of the vector (X', Y', Z')' is controlled by the dependence of X, (X', Y')', and (X', Z')', and not by that of (Y', Z')'. Note also that in the typical case where d_1 = 1, we have c_X(u) = 1, so c_X does not need to be estimated below.
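This copula factorization can be checked numerically in a case where conditional independence holds by construction. For a trivariate Gaussian copula with scalar components (so c_X ≡ 1), Y and Z are conditionally independent given X exactly when ρ_YZ = ρ_XY ρ_XZ, and the null then holds pointwise: c_XYZ = c_XY c_XZ. A minimal sketch, with arbitrary illustrative correlation values of our own choosing, not taken from the article:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gaussian_copula_density(u, corr):
    """Gaussian copula density at a point u in (0, 1)^d."""
    q = norm.ppf(np.asarray(u))
    return multivariate_normal(cov=np.asarray(corr)).pdf(q) / np.prod(norm.pdf(q))

rho_xy, rho_xz = 0.5, 0.4
rho_yz = rho_xy * rho_xz  # enforces Y independent of Z given X in the Gaussian case
R3 = [[1, rho_xy, rho_xz], [rho_xy, 1, rho_yz], [rho_xz, rho_yz, 1]]

u, v, w = 0.3, 0.7, 0.6   # arbitrary evaluation point
lhs = gaussian_copula_density([u, v, w], R3)  # c_XYZ * c_X, with c_X = 1
rhs = (gaussian_copula_density([u, v], [[1, rho_xy], [rho_xy, 1]])
       * gaussian_copula_density([u, w], [[1, rho_xz], [rho_xz, 1]]))
```

Both sides coincide at every point (u, v, w); choosing ρ_YZ away from ρ_XY ρ_XZ breaks the equality.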
Note that in Equation (3), the term c_{XYZ}( \bar F_X(x), \bar F_Y(y), \bar F_Z(z) ) corresponds to the copula density defined on all the univariate components of X, Y, and Z. Alternatively, we can equivalently rewrite Equation (3) in terms of the product of the densities of the multivariate random vectors (X, Y, Z), say f_X(x) \times f_Y(y) \times f_Z(z), and the copula density c_{XYZ}( F_X(x), F_Y(y), F_Z(z) ) that is now defined in terms of the cumulative distribution functions of the multivariate random vectors X, Y, Z, rather than the marginal distributions of their respective univariate components. Redefining the null hypothesis H_0 in this way allows us to avoid estimating the copula of the components of X. However, this approach requires us to estimate nonparametrically the joint cumulative distribution functions F_X(X), F_Y(y), F_Z(Z). The null could also be written in terms of conditional copulas, but this would similarly require us to estimate conditional distributions nonparametrically. The differences between these approaches will be investigated in future work.
Given the null hypothesis, our test statistic is based on the Hellinger distance between the copula densities c_{XYZ}(u, v, w) c_X(u) and c_{XY}(u, v) c_{XZ}(u, w), for u \in [0, 1]^{d_1}, v \in [0, 1]^{d_2}, w \in [0, 1]^{d_3}:

H(c, C) = \int_{[0,1]^d} \left( 1 - \sqrt{ \frac{ c_{XY}(u, v)\, c_{XZ}(u, w) }{ c_{XYZ}(u, v, w)\, c_X(u) } } \right)^2 dC_{XYZ}(u, v, w).      (4)
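Because the integral in (4) is taken with respect to C_XYZ, it can be approximated by a Monte Carlo average of the squared integrand over draws from the copula. The sketch below is our own illustration, reusing a trivariate Gaussian copula with scalar components (so c_X ≡ 1, and conditional independence corresponds to ρ_YZ = ρ_XY ρ_XZ); it shows the measure vanishes under conditional independence and is positive when it fails:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gaussian_copula_density(q, corr):
    # Gaussian copula density evaluated at u = Phi(q), written in terms of q
    return multivariate_normal(cov=np.asarray(corr)).pdf(q) / np.prod(norm.pdf(q), axis=-1)

def hellinger_mc(rho_xy, rho_xz, rho_yz, n=2000, seed=1):
    """Monte Carlo approximation of H(c, C) in (4) for a trivariate Gaussian copula."""
    R3 = np.array([[1, rho_xy, rho_xz], [rho_xy, 1, rho_yz], [rho_xz, rho_yz, 1]])
    q = np.random.default_rng(seed).multivariate_normal(np.zeros(3), R3, size=n)
    c_xyz = gaussian_copula_density(q, R3)  # denominator includes c_X = 1
    c_xy = gaussian_copula_density(q[:, [0, 1]], [[1, rho_xy], [rho_xy, 1]])
    c_xz = gaussian_copula_density(q[:, [0, 2]], [[1, rho_xz], [rho_xz, 1]])
    return np.mean((1.0 - np.sqrt(c_xy * c_xz / c_xyz)) ** 2)

h_null = hellinger_mc(0.5, 0.4, 0.5 * 0.4)  # conditional independence holds
h_alt = hellinger_mc(0.5, 0.4, 0.8)         # conditional independence violated
```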
Under the null hypothesis, the measure H(c, C) is equal to zero. Note that if we want to test in particular directions, this is easily done by using a weighting function in (4). The Hellinger distance is often used for measuring the closeness between two densities because it is simpler to handle than the L_\infty and L_1 distances. Furthermore, it is symmetric, invariant to continuous monotonic transformations, and gives lower weight to outliers (see, e.g., Beran 1977). The Hellinger distance in (4) can be estimated by
\hat H = H(\hat c, C_T) = \int_{[0,1]^d} \left( 1 - \sqrt{ \frac{ \hat c_{XY}(u, v)\, \hat c_{XZ}(u, w) }{ \hat c_{XYZ}(u, v, w)\, \hat c_X(u) } } \right)^2 dC_{XYZ,T}(u, v, w)

= \frac{1}{T} \sum_{t=1}^{T} \left( 1 - \sqrt{ \frac{ \hat c_{XY}( \bar F_{X,T}(X_t), \bar F_{Y,T}(Y_t) )\, \hat c_{XZ}( \bar F_{X,T}(X_t), \bar F_{Z,T}(Z_t) ) }{ \hat c_{XYZ}( \bar F_{X,T}(X_t), \bar F_{Y,T}(Y_t), \bar F_{Z,T}(Z_t) )\, \hat c_X( \bar F_{X,T}(X_t) ) } } \right)^2 ,
where \bar F_{X,T}(X_t), \bar F_{Y,T}(Y_t), and \bar F_{Z,T}(Z_t), with subscript T, indicate the empirical analogs of the distribution functions in \bar F_X(X), \bar F_Y(Y), and \bar F_Z(Z); C_{XYZ,T}(\cdot) is the empirical copula defined by Deheuvels (1979); and \hat c_X(\cdot), \hat c_{XY}(\cdot), \hat c_{XZ}(\cdot), and \hat c_{XYZ}(\cdot) are the estimators of the copula densities c_X(\cdot), c_{XY}(\cdot), c_{XZ}(\cdot), and c_{XYZ}(\cdot), respectively, obtained using the Bernstein density copula defined below. Let us first set some additional notation. In what follows, we denote

G_t = (G_{t1}, \ldots, G_{td}) = ( \bar F_X(X_t), \bar F_Y(Y_t), \bar F_Z(Z_t) ),

and its empirical analog

\hat G_t = (\hat G_{t1}, \ldots, \hat G_{td}) = ( \bar F_{X,T}(X_t), \bar F_{Y,T}(Y_t), \bar F_{Z,T}(Z_t) ).
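In practice the empirical transforms Ĝ_t are simply rescaled ranks. A minimal sketch, using the rank/T convention (one common normalization, assumed here for illustration rather than taken from the article):

```python
import numpy as np
from scipy.stats import rankdata

def pseudo_observations(data):
    """Map each column of a (T, d) array to empirical-CDF values in (0, 1]."""
    data = np.asarray(data, dtype=float)
    return rankdata(data, axis=0) / data.shape[0]

# Example: T = 4 observations of a d = 2 vector
G_hat = pseudo_observations([[0.3, 10.0], [-1.2, 7.0], [2.5, 8.5], [0.9, 12.0]])
```

Each column is transformed separately, so the result is invariant to any increasing transformation of the marginals.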
The Bernstein density copula estimator of c_{XYZ}(\cdot) at a given value s = (s_1, \ldots, s_d) is defined by

\hat c_{XYZ}(s_1, \ldots, s_d) = \hat c_{XYZ}(s) = \frac{1}{T} \sum_{t=1}^{T} K_k(s, \hat G_t),      (5)
where

K_k(s, \hat G_t) = k^d \sum_{n_1=0}^{k-1} \cdots \sum_{n_d=0}^{k-1} A_{\hat G_t, n} \prod_{j=1}^{d} p_{n_j}(s_j),

the integer k represents a bandwidth parameter, p_{n_j}(s_j) is the binomial probability

p_{n_j}(s_j) = \binom{k-1}{n_j} s_j^{n_j} (1 - s_j)^{k - n_j - 1}, for n_j = 0, \ldots, k - 1,

and A_{\hat G_t, n} is an indicator function, A_{\hat G_t, n} = 1\{ \hat G_t \in B_n \}, with

B_n = \left[ \frac{n_1}{k}, \frac{n_1 + 1}{k} \right) \times \cdots \times \left[ \frac{n_d}{k}, \frac{n_d + 1}{k} \right).
The Bernstein estimators \hat c_X(\cdot), \hat c_{XY}(\cdot), and \hat c_{XZ}(\cdot) of c_X(\cdot), c_{XY}(\cdot), and c_{XZ}(\cdot), respectively, are defined in the same way as \hat c_{XYZ}(\cdot). Observe that the kernel K_k(s, \hat G_t) can be rewritten as

K_k(s, \hat G_t) = \sum_{n_1=0}^{k-1} \cdots \sum_{n_d=0}^{k-1} A_{\hat G_t, n} \prod_{j=1}^{d} B(s_j; n_j + 1, k - n_j),

where B(x; n_j + 1, k - n_j) is a beta density with shape parameters n_j + 1 and k - n_j evaluated at x. Hence, K_k(s, \hat G_t) can be viewed as a smoother of the empirical density estimator by beta densities. To implement the Bernstein density copula estimator in the simulations and applications, we define
k^{*t} = (k^{*t}_1, \ldots, k^{*t}_d) = [k \hat G_t], where [\cdot] denotes the integer part of each element. Consequently, we have

K_k(s, \hat G_t) = k^d \prod_{j=1}^{d} p_{k^{*t}_j}(s_j),

which is straightforward to program.
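Concretely, p_{n_j}(s_j) is the Binomial(k − 1, s_j) probability mass at n_j, so the collapsed product form of K_k can be coded in a few lines. The sketch below is our own illustration; clipping [k Ĝ_{tj}] at k − 1 (needed at the boundary case Ĝ_{tj} = 1) is a practical adjustment, and the d = 1 check verifies that the estimate integrates to one:

```python
import numpy as np
from scipy.stats import binom

def bernstein_copula_density(s, G_hat, k):
    """Bernstein density copula estimator at s in [0,1]^d, as in (5),
    using the collapsed form K_k(s, G_t) = k^d * prod_j p_{k*_j}(s_j)."""
    s = np.asarray(s, dtype=float)
    G_hat = np.asarray(G_hat, dtype=float)
    T, d = G_hat.shape
    # n* = [k * G_hat], clipped to k - 1 for observations with G_hat = 1
    n_star = np.minimum(np.floor(k * G_hat).astype(int), k - 1)
    # p_{n}(s) = C(k-1, n) s^n (1-s)^(k-1-n) = Binomial(k-1, s) pmf at n
    pmf = binom.pmf(n_star, k - 1, s)      # shape (T, d); s broadcasts over rows
    return (k ** d) * np.mean(np.prod(pmf, axis=1))

# d = 1 check: the estimate is a density on [0, 1], so it integrates to one
rng = np.random.default_rng(0)
G_hat = rng.uniform(size=(200, 1))
grid = np.linspace(0.0, 1.0, 1001)
vals = np.array([bernstein_copula_density([s], G_hat, k=5) for s in grid])
dx = grid[1] - grid[0]
integral = float(np.sum((vals[:-1] + vals[1:]) * 0.5 * dx))  # trapezoid rule
```

The multivariate estimators \hat c_{XY}, \hat c_{XZ}, and \hat c_{XYZ} are obtained by calling the same function on the relevant columns of Ĝ_t.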
The Bernstein density copula estimator in (5) is easy to implement, nonnegative, integrates to one, and is free from the boundary bias problem that often occurs with conventional nonparametric kernel estimators. Bouezmarni, Rombouts, and Taamouti (2009) established the asymptotic bias, variance, and the uniform and almost sure convergence of the Bernstein density copula estimator for α-mixing data. These properties are necessary to prove the asymptotic normality of our test statistic. Notice that some other nonparametric copula density estimators have been proposed in the literature. For example, Gijbels and Mielniczuk (1990) suggested nonparametric kernel methods and used the reflection method to overcome the boundary bias problem, and more recently Chen and Huang (2007) used the local linear estimator. Fermanian and Scaillet (2003) derived the asymptotic properties of kernel estimators of nonparametric copulas and their derivatives in the context of time series data. With respect to empirical copula processes, Fermanian, Radulovic, and Wegkamp (2004) studied weak convergence, and Doukhan and Lang (2009) stated a multidimensional functional central limit theorem.

Note that since \hat H contains the random denominators \hat c_X(u) and \hat c_{XYZ}(u, v, w), the test statistic may be ill-behaved for \hat c