$v(C)$ denotes the $n(n+1)/2$ vector that is obtained from $\operatorname{vec} C$ by eliminating all supradiagonal elements of an $n \times n$ matrix $C$. In this way, for symmetric $C$, $v(C)$ contains only the generically distinct elements of $C$. For a random vector $X$ or for a pair of random vectors $X$ and $Y$, $\operatorname{Cov}(X)$ and $\operatorname{Cov}(X, Y)$ indicate the variance-covariance matrix of $X$ and of $\operatorname{vec}(X, Y)$, respectively. The trace of a square matrix $C$ is denoted by $\operatorname{tr} C$ and the determinant is denoted by $\det C$. The Kronecker product of any matrices $A$ and $B$ is denoted by $A \otimes B$, whereas the sum of two vector subspaces $H_1$ and $H_2$ is denoted by $H_1 \oplus H_2$. The lag operator is denoted by $L$ and the difference operator is denoted by $\Delta = 1 - L$.
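As a small illustration of the $v(\cdot)$ operator defined above, the following sketch stacks the columns of a square matrix and drops the entries above the diagonal. The function name `vech` and the nested-list representation of a matrix are assumptions made for this example only.

```python
def vech(c):
    """Stack the columns of the square matrix c, keeping only the entries on
    or below the diagonal (vec C with the supradiagonal elements removed)."""
    n = len(c)
    if any(len(row) != n for row in c):
        raise ValueError("c must be square")
    # Column j contributes rows j, j+1, ..., n-1.
    return [c[i][j] for j in range(n) for i in range(j, n)]


# Hypothetical symmetric 3 x 3 matrix: v(C) has n(n+1)/2 = 6 elements.
sym = [[1, 2, 3],
       [2, 4, 5],
       [3, 5, 6]]
print(vech(sym))
```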
2. Causal measures for cointegrated VAR model
This section introduces the overall and frequency-wise measures of one-way effect (OMO and FMO) for non-deterministic stationary time series and then extends them to non-stationary time series in cointegrated relations (see Hosoya, 1991, 1997a, for mathematical details). At the end of this section, we discuss long- and short-run relationships expressed by those one-way effect measures.
The construction of the causal measures, in particular the measures of one-way effect, is closely related to the prediction theory of stationary processes. Suppose that $\{U(t), V(t), t \in \mathbb{Z}\}$ is a zero-mean jointly covariance stationary process where the $U(t)$ and $V(t)$ are $p_1 \times 1$ and $p_2 \times 1$ real vectors, respectively ($p = p_1 + p_2$). Suppose also that the process $\{U(t), V(t)\}$ is non-deterministic and has the $p \times p$ spectral density matrix

$$ f(\lambda) = \begin{bmatrix} f_{11}(\lambda) & f_{12}(\lambda) \\ f_{21}(\lambda) & f_{22}(\lambda) \end{bmatrix}, \qquad -\pi < \lambda \le \pi, $$

where $f_{11}(\lambda)$ is the $p_1 \times p_1$ spectral density of $\{U(t)\}$, and that $f(\lambda)$ satisfies

$$ \int_{-\pi}^{\pi} \log \det f(\lambda)\, d\lambda > -\infty. \tag{2.1} $$

Under condition (2.1), $f(\lambda)$ has a factorization such that

$$ f(\lambda) = \frac{1}{2\pi}\, \Lambda(e^{-i\lambda})\, \Lambda(e^{-i\lambda})^{*}, \tag{2.2} $$

where $\Lambda(e^{-i\lambda})$ is the boundary value $\lim_{\rho \to 1^{-}} \Lambda(\rho e^{-i\lambda})$ of a $p \times p$ matrix-valued function $\Lambda(z)$ which is analytic and has no zeros inside the unit disc $\{z : |z| < 1\}$ of the complex plane. Such a factorization is said to be a canonical factorization in the sequel. Let $\Sigma$ be the covariance of the one-step ahead linear prediction error
228 F. Yao, Y. Hosoya
Journal of Econometrics 98 2000 225}255
of the process $\{U(t), V(t)\}$ by its own past; then, we have

$$ \det\{\Lambda(0)\Lambda(0)^{*}\} = \det \Sigma = (2\pi)^{p} \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \det f(\lambda)\, d\lambda \right\} \tag{2.3} $$

(see Rozanov, 1967, pp. 71–7, for example). The relationship (2.2) is the frequency-domain version of the time-domain Wold decomposition representation

$$ \begin{bmatrix} U(t) \\ V(t) \end{bmatrix} = \sum_{j=0}^{\infty} \tilde{\Lambda}(j)\, \varepsilon(t-j), $$

where $\{\varepsilon(t)\}$ is a white-noise process with $\operatorname{Var}\{\varepsilon(t)\} = I_p$ and the matrices $\tilde{\Lambda}(j)$ are the real-matrix coefficients in the expansion of the analytic function $\Lambda(z)$; namely $\Lambda(z) = \sum_{j=0}^{\infty} \tilde{\Lambda}(j) z^{j}$.

The one-way effect component of $V(t)$ is the component which causes $\{U(t)\}$ one-sidedly but suffers no feedback from it in the Granger sense. We can extract such a component, denoted by $V_{0,-1}(t)$, from $V(t)$ as the linear regression residual obtained by regressing $V(t)$ on $\{U(t+1-j), V(t-j), j \in \mathbb{Z}^{+}\}$. It turns out that $\{V_{0,-1}(t)\}$ is a white-noise process with covariance matrix $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ and that $\{V(t)\}$ does not cause $\{V_{0,-1}(t)\}$ in the Granger sense (see Hosoya, 1991). Although we have no way of measuring directly how the addition of the series $\{V(t)\}$ improves the one-step ahead prediction of $\{U(t)\}$, the series $\{V_{0,-1}(t)\}$ enables us to do so. A series $\{V(t)\}$ causes another $\{U(t)\}$ in Granger's sense if $\{V(t-j), j \in \mathbb{Z}^{+}\}$ is informative in predicting $U(t)$, whence it would be natural to measure the strength of causality by the extent to which the one-step ahead prediction error of $U(t)$ is reduced by adding the information $H\{V(t-j), j \in \mathbb{Z}^{+}\}$. Since $\{V(t)\}$ does not cause $\{V_{0,-1}(t)\}$ and $U(t)$ is orthogonal to $V_{0,-1}(t)$, the $U(t)$ has the Sims representation

$$ U(t) = \sum_{j=1}^{\infty} \tilde{\Pi}(j)\, V_{0,-1}(t-j) + N(t), \qquad t \in \mathbb{Z}, \tag{2.4} $$
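where the $\tilde{\Pi}(j)$ are $p_1 \times p_2$ matrices and $N(t)$ is a dependent stationary process which is orthogonal to the process $\{V_{0,-1}(t)\}$. It follows from (2.4) that $f_{11}(\lambda)$, the spectral density matrix of $\{U(t)\}$, is decomposed as

$$ f_{11}(\lambda) = g_1(\lambda) + g_2(\lambda), \tag{2.5} $$

where

$$ g_1(\lambda) = \frac{1}{2\pi} \left\{ \sum_{j=1}^{\infty} \tilde{\Pi}(j) e^{-ij\lambda} \right\} \left( \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} \right) \left\{ \sum_{j=1}^{\infty} \tilde{\Pi}(j) e^{-ij\lambda} \right\}^{*} $$

and $g_2(\lambda)$ is the spectral density matrix of the process $\{N(t)\}$. Also it follows from (2.4) that the regression residual of $U(t)$ on $H\{V_{0,-1}(t-j), U(t-j), j \in \mathbb{Z}^{+}\}$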
is equal to the one of $N(t)$ on $H\{N(t-j), j \in \mathbb{Z}^{+}\}$, so that the determinant of that covariance matrix is provided in view of (2.3) by

$$ (2\pi)^{p_1} \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \det g_2(\lambda)\, d\lambda \right\}, $$

whereas the corresponding quantity of the residual by regressing $\{U(t)\}$ on $H\{U(t-j), j \in \mathbb{Z}^{+}\}$ is equal to

$$ (2\pi)^{p_1} \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \det f_{11}(\lambda)\, d\lambda \right\}. $$

Since $\log \det f_{11}(\lambda) \ge \log \det g_2(\lambda)$ for each $\lambda$ by (2.5), the quantity $\log\{\det f_{11}(\lambda) / \det g_2(\lambda)\}$ and its integral over $[-\pi, \pi]$ can be regarded as measuring, respectively, the frequency-wise and the overall improvement of prediction by the use of the information $H\{V_{0,-1}(t-j), j \in \mathbb{Z}^{+}\}$, or equivalently as measuring the strength of causality of $\{V_{0,-1}(t)\}$ to $\{U(t)\}$.
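The prediction error formula (2.3), which underlies both determinants above, can be checked numerically in the scalar case $p = 1$. The sketch below is only an illustration under stated assumptions: the MA(1) process, its parameter values, the function names and the midpoint-rule integration are all hypothetical choices, not part of the original construction.

```python
import math


def prediction_error_variance(f, n=4096):
    """Right-hand side of (2.3) for p = 1: 2*pi times the exponential of the
    mean of log f over (-pi, pi], approximated by the midpoint rule."""
    h = 2 * math.pi / n
    mean_log = sum(math.log(f(-math.pi + (k + 0.5) * h)) for k in range(n)) / n
    return 2 * math.pi * math.exp(mean_log)


def ma1_spectral_density(theta, sigma2):
    """Spectral density of the hypothetical MA(1) process
    x(t) = e(t) + theta * e(t-1) with Var e(t) = sigma2."""
    def f(lam):
        z = complex(math.cos(lam), -math.sin(lam))  # e^{-i lam}
        return sigma2 / (2 * math.pi) * abs(1 + theta * z) ** 2
    return f


# Zero of 1 + theta*z outside the unit circle: the factor is canonical and
# the one-step prediction error variance equals sigma2.
print(prediction_error_variance(ma1_spectral_density(0.5, 2.0)))

# Zero inside the unit circle: the same formula returns sigma2 * theta**2,
# the innovation variance of the canonical representation.
print(prediction_error_variance(ma1_spectral_density(2.0, 2.0)))
```

The second call illustrates why the factorization in (2.2) is required to be canonical: the spectral density alone determines the prediction error, whichever moving-average representation generated it.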
For the purpose of obtaining the explicit analytic expression of the prediction improvement, it is convenient to translate the above rather time-domain-oriented construction in terms of frequency-domain representations. In contrast to the Wold decomposition of $\{U(t), V(t)\}$, which is a decomposition into an orthogonal sum in the time domain, $\{U(t), V(t)\}$ is known to have the spectral representation

$$ U(t) = \int_{-\pi}^{\pi} e^{i\lambda t}\, d\tilde{U}(\lambda) \quad \text{and} \quad V(t) = \int_{-\pi}^{\pi} e^{i\lambda t}\, d\tilde{V}(\lambda), $$

where $\tilde{U}(\lambda)$ and $\tilde{V}(\lambda)$ are frequency-wise orthogonal random measures such that

$$ \operatorname{Cov}\{d\tilde{U}(\lambda), d\tilde{V}(\lambda)\} = f(\lambda), $$

namely, the processes $\{U(t)\}$ and $\{V(t)\}$ are interpreted as weighted sums of harmonic oscillations with orthogonal random weight for the respective frequency. On the other hand, the prediction error formula (2.3) implies that the one-step ahead prediction error of $U(t)$, measured in terms of the determinant of the prediction error covariance matrix, is (up to the factor $(2\pi)^{p_1}$) the geometric mean of the $\det \operatorname{Cov}\{d\tilde{U}(\lambda)\}$ over the frequency domain $-\pi < \lambda \le \pi$. In other words, the variability of $d\tilde{U}(\lambda)$ expresses the frequency-wise contribution to the one-step ahead prediction error of $U(t)$. In the case of the joint one-step ahead prediction of $\{U(t), V(t)\}$, a similar argument applies and the variability expressed by $\det \operatorname{Cov}\{d\tilde{U}(\lambda), d\tilde{V}(\lambda)\}$ indicates the contribution of the $\lambda$-frequency oscillation to the joint prediction error of $U(t)$ and $V(t)$.

Now in order to measure the strength of Granger causality, the questions to be asked are how much of the prediction error reduction in $U(t)$ is attributed to
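the other series $\{V(s), s \le t-1\}$ when it is added for the prediction of $U(t)$, and which portion of the variability in the pair $\{d\tilde{U}(\lambda), d\tilde{V}(\lambda)\}$, which is correlated in general, is attributable to the series $\{V(t)\}$. Although these questions are intractable if we try to work directly with the pair $\{U(t), V(t)\}$, the pairing $\{U(t), V_{0,-1}(t)\}$ enables us to solve these questions. In view of the construction of $V_{0,-1}(t)$, the projection residual of $U(t)$ onto $H\{V_{0,-1}(t-j); j \in \mathbb{Z}^{+}\}$ is given by

$$ N(t) = \int_{-\pi}^{\pi} e^{it\lambda} \left\{ d\tilde{U}(\lambda) - \tilde{f}_{12}(\lambda)\, \tilde{f}_{22}(\lambda)^{-1}\, d\tilde{V}_{0,-1}(\lambda) \right\}, \tag{2.6} $$

where the spectral density matrix of the process $\{U(t), V_{0,-1}(t)\}$ is denoted by the $(p_1, p_2)$-partitioned matrix

$$ \tilde{f}(\lambda) = \begin{bmatrix} \tilde{f}_{11}(\lambda) & \tilde{f}_{12}(\lambda) \\ \tilde{f}_{21}(\lambda) & \tilde{f}_{22}(\lambda) \end{bmatrix} $$

and $\tilde{f}_{11}(\lambda) = f_{11}(\lambda)$, $\tilde{f}_{21}(\lambda) = \{-\Sigma_{21}\Sigma_{11}^{-1}, I_{p_2}\}\, \Lambda(0)\, \Lambda(e^{-i\lambda})^{-1} f_{\cdot 1}(\lambda)$, where $f_{\cdot 1}(\lambda)$ is the $p \times p_1$ matrix which consists of the first $p_1$ columns of $f(\lambda)$, and $\tilde{f}_{22}(\lambda) = \frac{1}{2\pi}\{\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\}$ (see Hosoya, 1991, pp. 432–3, and also see Whittle, 1963, for the spectral regression (2.6)). Relation (2.6) is nothing but the frequency-domain version of (2.4). Since the one-step ahead prediction error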
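of $U(t)$ on the basis of $H\{U(t-j), V_{0,-1}(t-j), j \in \mathbb{Z}^{+}\}$ is the same as that of $N(t)$ on the basis of its own past, it follows that

$$ \det \Sigma_{11} = (2\pi)^{p_1} \exp\left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \det \operatorname{Cov}\left\{ d\tilde{U}(\lambda) - \tilde{f}_{12}(\lambda)\, \tilde{f}_{22}(\lambda)^{-1}\, d\tilde{V}_{0,-1}(\lambda) \right\} d\lambda \right], \tag{2.7} $$

where $\Sigma_{11}$ denotes the covariance matrix of the one-step ahead prediction error of $U(t)$ on the basis of that information; whereas as for the prediction of $U(t)$ by its own past values, we have the relation

$$ \det \bar{\Sigma}_{11} = (2\pi)^{p_1} \exp\left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \det \operatorname{Cov}\{ d\tilde{U}(\lambda) \}\, d\lambda \right], \tag{2.8} $$

where $\bar{\Sigma}_{11}$ denotes the covariance matrix of the one-step ahead prediction error of $U(t)$ based on $H\{U(t-j), j \in \mathbb{Z}^{+}\}$. The comparison of (2.7) and (2.8) implies that the prediction improvement by the additional information of $V_{0,-1}(t)$ is given by

$$ M_{V \to U} = \log\{\det \bar{\Sigma}_{11} / \det \Sigma_{11}\} \tag{2.9} $$

and that the frequency-wise reduction of the variability of $d\tilde{U}(\lambda)$ is given by

$$ M_{V \to U}(\lambda) = \log\left[ \det \operatorname{Cov}\{d\tilde{U}(\lambda)\} \Big/ \det \operatorname{Cov}\left\{ d\tilde{U}(\lambda) - \tilde{f}_{12}(\lambda)\, \tilde{f}_{22}(\lambda)^{-1}\, d\tilde{V}_{0,-1}(\lambda) \right\} \right]. \tag{2.10} $$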
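It turns out that $\{V(t)\}$ does not cause $\{U(t)\}$ in the Granger sense if and only if $M_{V \to U} = 0$ (see Hosoya, 1991, p. 432). Consequently, in conformity with Granger's causality concept, we might call $M_{V \to U}$ the overall measure of one-way effect (OMO) from $V$ to $U$ and $M_{V \to U}(\lambda)$ the frequency-wise measure of one-way effect (FMO). It is obvious that $M_{V \to U}(\lambda)$ in (2.10) can also be expressed by

$$ M_{V \to U}(\lambda) = \log\left[ \det f_{11}(\lambda) \Big/ \det\left\{ f_{11}(\lambda) - \tilde{f}_{12}(\lambda)\, \tilde{f}_{22}(\lambda)^{-1}\, \tilde{f}_{21}(\lambda) \right\} \right]. \tag{2.11} $$

Then the OMO from $V$ to $U$ can be expressed by

$$ M_{V \to U} = \frac{1}{2\pi} \int_{-\pi}^{\pi} M_{V \to U}(\lambda)\, d\lambda. \tag{2.12} $$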
As a next step, in order to extend this causal analysis of non-deterministic stationary time series to non-stationary processes, consider the process $\{X(t), Y(t)\}$ which is generated by

$$ A(L) \begin{bmatrix} X(t) \\ Y(t) \end{bmatrix} = \begin{bmatrix} U(t) \\ V(t) \end{bmatrix}, \qquad t = 1, 2, \ldots, \tag{2.13} $$

where $\{U(t), V(t), t \in \mathbb{Z}\}$ is the stationary process defined as before, and the lag polynomial matrix $A(L)$ is a $p \times p$ matrix such that

$$ A(L) = \begin{bmatrix} A_{11}(L) & 0 \\ 0 & A_{22}(L) \end{bmatrix} = \begin{bmatrix} \sum_{j=0}^{l} A_{11,j} L^{j} & 0 \\ 0 & \sum_{j=0}^{l} A_{22,j} L^{j} \end{bmatrix} $$

for some positive $l$, where $A_{11,0} = I_{p_1}$ and $A_{22,0} = I_{p_2}$. Suppose in the sequel that $X(t)$ and $Y(t)$ for $t \le 0$ are random vectors which belong to $H\{U(t), t \le 0\}$ and $H\{V(t), t \le 0\}$ respectively. The process given by (2.13) has the characteristic that the one-step ahead prediction residuals of $X(t)$ based on $H\{X(t-j), j \ge 1\} \oplus H\{U(t), t \le 0\}$ and of $Y(t)$, $t \ge 1$, based on $H\{Y(t-j), j \in \mathbb{Z}^{+}\} \oplus H\{V(t), t \le 0\}$ are the same as those of $U(t)$ and $V(t)$ based on $H\{U(t-j), j \in \mathbb{Z}^{+}\}$ and $H\{V(t-j), j \in \mathbb{Z}^{+}\}$ respectively. Similarly, the joint prediction residual of $\{X(t), Y(t)\}$ based on $H\{X(t-j), Y(t-j), j \in \mathbb{Z}^{+}\} \oplus H\{U(t), V(t), t \le 0\}$ is the same as that of $\{U(t), V(t)\}$ based on $H\{U(t-j), V(t-j), j \in \mathbb{Z}^{+}\}$. Therefore the predictional properties of the process $\{X(t), Y(t)\}$ for $t \ge 1$ are entirely determined by those of the generating stationary process $\{U(t), V(t)\}$. Since the one-way effect structure of $\{X(t), Y(t)\}$ is determined only by its predictional properties, it follows that it is given by the corresponding structure of $\{U(t), V(t)\}$. Namely, the OMO and the FMO between $\{X(t)\}$ and $\{Y(t)\}$ can be equated with the corresponding measures between the generating processes $\{U(t)\}$ and $\{V(t)\}$. This is the basic idea for the extension of the definitions of OMO and FMO to non-stationary processes.
It should be noted, however, that the relationship (2.13) is not very well defined. Suppose that $B(L)$ is another block diagonal matrix given by

$$ B(L) = \begin{bmatrix} B_{11}(L) & 0 \\ 0 & B_{22}(L) \end{bmatrix}, $$

where $B_{11}(L)$ and $B_{22}(L)$ are lag polynomials such that $B_{11,0} = I_{p_1}$ and $B_{22,0} = I_{p_2}$. The left multiplication of $B(L)$ to each member of equation (2.13) produces a different representation of the process $\{X(t), Y(t)\}$. Unless $B(L) = I_p$, the resulting generating process $\{B_{11}(L)U(t), B_{22}(L)V(t)\}$ might possibly possess a spectral structure different from that of $\{U(t), V(t)\}$. In order to retain invariance of the one-way effect structure under such a multiplication, a certain restriction on the generating mechanism (2.13) is required. Let $f_{11}(\lambda) = \frac{1}{2\pi}\Lambda_1(e^{-i\lambda})\Lambda_1(e^{-i\lambda})^{*}$ and $f_{22}(\lambda) = \frac{1}{2\pi}\Lambda_2(e^{-i\lambda})\Lambda_2(e^{-i\lambda})^{*}$ be canonical factorizations, respectively.

Assumption 2.1. The process (2.13) satisfies either
(i) the zeros of $\det A_{11}(z)$ and $\det A_{22}(z)$ are all on or outside the unit circle; or
(ii) there are no common zeros between $\det A_{11}(z)$ and $\det \Lambda_1(z)$ and between $\det A_{22}(z)$ and $\det \Lambda_2(z)$.

Remark 2.1. Assumption (i) is convenient to deal with such unit-root type processes as cointegration processes, where non-stationarity is generated by a unit-root common trend. Since, then, $B(L)$ is limited to such lag polynomials for which the zeros of $\det B_{11}(z)$ and $\det B_{22}(z)$ are on or outside the unit circle, the one-way effect structure between $B_{11}(L)U(t)$ and $B_{22}(L)V(t)$ remains invariant, thanks to the relations that

$$ \int_{-\pi}^{\pi} \log \det\{ B_{ii}(e^{-i\lambda})\, B_{ii}(e^{-i\lambda})^{*} \}\, d\lambda = 0, \qquad i = 1, 2, $$

which follow from a more basic relationship in calculus that for real $r$ such that $|r| \le 1$,

$$ \int_{-\pi}^{\pi} \log |1 - 2r\cos\lambda + r^{2}|\, d\lambda = 0. $$

If some zeros of $\det A(z)$ are allowed to be inside the unit circle so that the process has a greater-than-unity root, this invariance property does not hold any longer. To deal with such a circumstance, assumption (ii) would be useful in order to identify the generating process.
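The calculus relationship quoted in Remark 2.1 is easy to check numerically. The following sketch approximates the integral by the midpoint rule; the function name, the number of nodes and the test values of $r$ are assumptions chosen for illustration. For $|r| > 1$ the integral equals $4\pi \log |r|$ rather than zero, which is the scalar counterpart of the loss of invariance mentioned at the end of the remark.

```python
import math


def log_integral(r, n=4096):
    """Midpoint-rule approximation of the integral over (-pi, pi) of
    log|1 - 2 r cos(lam) + r^2| with respect to lam."""
    h = 2 * math.pi / n
    return h * sum(
        math.log(abs(1 - 2 * r * math.cos(-math.pi + (k + 0.5) * h) + r * r))
        for k in range(n)
    )


for r in (0.5, -0.9, 1.0, 2.0):
    # Values with |r| <= 1 should be close to zero; on the boundary |r| = 1
    # the integrand has a logarithmic singularity, so convergence is slower.
    print(r, log_integral(r))
```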
The preceding consideration leads us to the following extended definitions of the Granger non-causality and of the measures OMO and FMO. Suppose the process $\{X(t), Y(t), t = 1, 2, \ldots\}$ generated by (2.13) satisfies Assumption 2.1.
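Definition 2.1. $\{Y(t)\}$ is said not to cause $\{X(t)\}$ if and only if the prediction error covariance matrices of $X(t)$ based on $H\{U(s), V(s), s \le t-1\}$ and based on $H\{U(s), s \le t-1\}$ are identical.

Definition 2.2. The OMO $M_{Y \to X}$ and the FMO $M_{Y \to X}(\lambda)$ are defined by $M_{Y \to X} \equiv M_{V \to U}$ and $M_{Y \to X}(\lambda) \equiv M_{V \to U}(\lambda)$, respectively.

Remark 2.2. Note that we have $H\{U(t-j), V(t-j), j \in \mathbb{Z}^{+}\} = H\{X(t-j), Y(t-j), j \in \mathbb{Z}^{+};\ U(s), V(s), s \le 0\}$ and $H\{U(t-j), j \in \mathbb{Z}^{+}\} = H\{X(t-j), j \in \mathbb{Z}^{+};\ U(s), s \le 0\}$, and also that $\{Y(t)\}$ does not cause $\{X(t)\}$ if and only if $\{V(t)\}$ does not cause $\{U(t)\}$.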
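Now consider the $p$-dimensional process $Z(t) = \{X(t)^{*}, Y(t)^{*}\}^{*}$ represented by a finite $a$-th order VAR model

$$ Z(t) = \sum_{j=1}^{a} \Pi(j)\, Z(t-j) + \varepsilon(t), \qquad t = 1, 2, \ldots, \tag{2.14} $$

where the $\Pi(j)$'s are $p \times p$ matrices, $\{\varepsilon(t)\}$ is a $p$-dimensional white noise process such that

$$ E\{\varepsilon(t)\} = 0, \qquad \operatorname{Cov}\{\varepsilon(t)\} = \Sigma, $$

and $\operatorname{rank} \Sigma = p$. Set $A(L) = I_p - \sum_{j=1}^{a} \Pi(j) L^{j}$, where the zeros of $\det A(z)$ are assumed to be either on or outside the unit circle. Denote by $C(L)$ the adjoint matrix of $A(L)$ so that $C(L)A(L) \equiv D(L)$, where $D(L)$ is the diagonal matrix having $d(L) \equiv \det A(L)$ as the common diagonal element; $d(L) = \sum_{j=0}^{b} d_j L^{j}$ is a lag polynomial with scalar coefficients such that $d_0 = 1$ and the zeros of $\sum_{j=0}^{b} d_j z^{j}$ are either on or outside the unit circle. Left-multiplying $C(L)$ to the members of Eq. (2.14), we have

$$ \begin{bmatrix} d(L) & & \\ & \ddots & \\ & & d(L) \end{bmatrix} \begin{bmatrix} X(t) \\ Y(t) \end{bmatrix} = C(L)\, \varepsilon(t) \equiv \begin{bmatrix} U(t) \\ V(t) \end{bmatrix}, \tag{2.15} $$

where we set $W(t) \equiv [U(t)^{*}, V(t)^{*}]^{*}$ for $p_1$ and $p_2$ vectors $U(t)$ and $V(t)$ (this representation is given by Granger and Lin, 1995, and a similar idea is also used in Engle and Granger, 1987). It follows from the above construction that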
$\{U(t), V(t)\}$ is a stationary MA process and that the process $\{X(t), Y(t)\}$ satisfies Assumption 2.1 (i). Therefore, in view of Definition 2.2 above, all the measures of one-way effect for the possibly non-stationary processes $\{X(t), Y(t)\}$ are determined by the corresponding measures of the stationary processes $\{U(t), V(t)\}$. Moreover, since the zeros of $\det C(z)$ are either on or outside the unit circle, the covariance matrix of the one-step ahead prediction error of $W(t)$ is equal to $\Sigma$, and if the spectral density matrix of $\{W(t)\}$ is denoted by $f(\lambda)$, it has a canonical factorization

$$ f(\lambda) = \frac{1}{2\pi}\, \Lambda(e^{-i\lambda})\, \Lambda(e^{-i\lambda})^{*}, \tag{2.16} $$

where $\Lambda(e^{-i\lambda}) = C(e^{-i\lambda})\, \Sigma^{1/2}$ for the Cholesky factor $\Sigma^{1/2}$ of $\Sigma$ such that $\Sigma = \Sigma^{1/2} (\Sigma^{1/2})^{*}$. Then the causal measures can be calculated in view of (2.11) and (2.12) by means of the spectral density $f(\lambda)$ defined by (2.16) and its factor $\Lambda(e^{-i\lambda})$.

A variety of causal measures can be derived on the basis of the OMO and the
FMO between $\{X(t)\}$ and $\{Y(t)\}$ for the purposes of the long-run or short-run characterization of the causal relation. In case $M_{Y \to X} \ne 0$, for example, the contribution of a long-run effect in the overall one-way effect is given by

$$ D_{Y \to X}(\epsilon) = \frac{1}{2\pi} \int_{-\epsilon}^{\epsilon} M_{Y \to X}(\lambda)\, d\lambda \Big/ M_{Y \to X} $$

for a certain low-frequency band $[-\epsilon, \epsilon]$, or one might be rather interested in the contribution of the relative effect for a given period band $[t_1, t_2]$ ($2 \le t_1 < t_2$):

$$ D_{Y \to X}(t_1, t_2) = \frac{1}{\pi} \int_{2\pi/t_2}^{2\pi/t_1} M_{Y \to X}(\lambda)\, d\lambda \Big/ M_{Y \to X}, $$

where period implies the time-length of a cycle and we used the relation $t = 2\pi/\lambda$ between period $t$ and frequency $\lambda$ ($\lambda > 0$). The long-run effect may be measured in another way, by the mean FMO which is given by

$$ \bar{D}_{Y \to X}(\epsilon) = \frac{1}{2\epsilon} \int_{-\epsilon}^{\epsilon} M_{Y \to X}(\lambda)\, d\lambda, $$

where $\epsilon$ is a small positive number, or its limit as $\epsilon \to 0$. In order to summarize the one-way effect in a period band $[t_1, t_2]$,

$$ \bar{D}_{Y \to X}(t_1, t_2) = \frac{t_1 t_2}{2\pi (t_2 - t_1)} \int_{2\pi/t_2}^{2\pi/t_1} M_{Y \to X}(\lambda)\, d\lambda $$

might be useful. In any case, in order to interpret those quantities on the basis of empirical data, we need a statistical testing theory.
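The calculation route described above, namely (2.16) for the factor, (2.11) for the FMO, (2.12) for the OMO and the period-band share $D_{Y \to X}(t_1, t_2)$, can be sketched for the simplest case. The sketch assumes a stationary bivariate VAR(1) with scalar $X$ and $Y$ ($p_1 = p_2 = 1$), so that $\Lambda(e^{-i\lambda})$ is invertible at every frequency. The coefficient values, the function names and the midpoint-rule integration are hypothetical choices for illustration, not the authors' implementation.

```python
import cmath
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ctranspose(a):
    """Conjugate transpose of a matrix stored as nested lists."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def factor(Pi, chol, z):
    """Lambda(z) = C(z) Sigma^{1/2}, with C(z) the adjoint of A(z) = I - Pi z."""
    A = [[1 - Pi[0][0] * z, -Pi[0][1] * z],
         [-Pi[1][0] * z, 1 - Pi[1][1] * z]]
    C = [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]
    return matmul(C, chol)


def fmo(Pi, Sigma, lam):
    """FMO M_{Y->X}(lam) computed from (2.11) for scalar X and Y."""
    l11 = math.sqrt(Sigma[0][0])
    l21 = Sigma[1][0] / l11
    chol = [[l11, 0.0], [l21, math.sqrt(Sigma[1][1] - l21 * l21)]]
    K = factor(Pi, chol, cmath.exp(-1j * lam))   # Lambda(e^{-i lam})
    K0 = factor(Pi, chol, 0.0)                   # Lambda(0)
    f = [[x / (2 * math.pi) for x in row] for row in matmul(K, ctranspose(K))]
    row = [[-Sigma[1][0] / Sigma[0][0], 1.0]]
    # f~21 = {-S21 S11^{-1}, 1} Lambda(0) Lambda^{-1} f_{.1}
    f21 = matmul(matmul(matmul(row, K0), inv2(K)), [[f[0][0]], [f[1][0]]])[0][0]
    f22 = (Sigma[1][1] - Sigma[1][0] * Sigma[0][1] / Sigma[0][0]) / (2 * math.pi)
    f11 = f[0][0].real
    return math.log(f11 / (f11 - abs(f21) ** 2 / f22))


def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def omo(Pi, Sigma):
    """OMO M_{Y->X} computed from (2.12)."""
    return midpoint(lambda lam: fmo(Pi, Sigma, lam), -math.pi, math.pi) / (2 * math.pi)


def band_share(Pi, Sigma, t1, t2):
    """D_{Y->X}(t1, t2): share of the OMO attributable to periods in [t1, t2]."""
    part = midpoint(lambda lam: fmo(Pi, Sigma, lam), 2 * math.pi / t2, 2 * math.pi / t1)
    return part / math.pi / omo(Pi, Sigma)


# Hypothetical VAR(1): Y enters the X equation, X does not enter the Y equation.
Pi = [[0.5, 0.5], [0.0, 0.5]]
Sigma = [[1.0, 0.0], [0.0, 1.0]]
print(omo(Pi, Sigma), band_share(Pi, Sigma, 4, 40))
```

Swapping the off-diagonal coefficients, so that $Y$ no longer enters the $X$ equation, should drive both the FMO and the OMO to zero, in line with the non-causality characterization $M_{V \to U} = 0$.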
Remark 2.3. A notable peak of $M_{V \to U}(\lambda)$ at $\lambda = \pi/2$, for example for quarterly data, would indicate that there is a significant one-way effect from $\{V(t)\}$ to $\{U(t)\}$ in a one-year period cycle. But this does not imply that $\{V(t)\}$ causes $\{U(t)\}$ with a one-year lag. The time-lag relationship in $\{V(t)\}$ causing $\{U(t)\}$ should be observed in Sims' distributed-lag representation (2.4), which connects $\{V_{0,-1}(t)\}$ and $\{U_{-1,0}(t)\}$ in the time domain.

Remark 2.4. The existence of the Nyquist frequency (see, for example, Yao, 1985) should not be ignored. The highest discernible frequency is $\lambda = \pi$, which corresponds to two periods ($t = 2\pi/\lambda = 2$); namely, half a year for quarterly data. The economic implication is that we cannot discern a one-way effect with a cycle shorter than half a year for quarterly data.
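The period arithmetic used in Remarks 2.3 and 2.4 follows directly from $t = 2\pi/\lambda$. A minimal sketch, in which the function names and the `observations_per_year` parameter are assumptions introduced for illustration:

```python
import math


def period(lam):
    """Period, in sampling intervals, of an oscillation with frequency lam in (0, pi]."""
    if not 0 < lam <= math.pi:
        raise ValueError("lam must lie in (0, pi]")
    return 2 * math.pi / lam


def period_in_years(lam, observations_per_year=4):
    """Same period expressed in years; the default of 4 corresponds to quarterly data."""
    return period(lam) / observations_per_year


print(period_in_years(math.pi / 2))  # peak frequency discussed in Remark 2.3
print(period_in_years(math.pi))      # Nyquist frequency of Remark 2.4
```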
3. Testing causality in cointegrated VAR processes