2. Methodology
In modeling the correlation between security returns, we use two approaches to assemble the returns data. The first uses each of the n available single period returns as an observation from which to infer the prevailing correlation. The second groups the single period returns data into m sequential nonoverlapping intervals of length ts, with n = m × ts. Under this second approach, we assume that the correlation is constant within each interval of length ts and we model the correlation as changing across the m periods so defined. While this latter approach reduces attendant measurement errors, it does so at the expense of having fewer observations from which to estimate the parameters of the model.
2.1. Single period returns

We first rely on a pair of single period returns to infer the correlation prevailing between two securities. Assuming we have demeaned the return series, it follows that for a given population correlation ρ the distribution of the sample correlation coefficient, r, degenerates into a discrete distribution with either r = +1 or r = −1. Under the assumption of conditional bivariate normality, from Abramowitz and Stegun (1965, p. 937) we have:

$$P[r = +1] = \frac{1}{2} + \frac{\arcsin \rho}{\pi},$$

$$P[r = -1] = \frac{1}{2} - \frac{\arcsin \rho}{\pi}.$$
To investigate the behavior of correlation, we require a model for the dynamics of ρ. For mathematical tractability, we transform the correlation onto the range of the real line and then model the transformed process as a Gaussian autoregressive process. In particular, we consider transformations of the form:

$$W = W(\rho),$$

such that W(0) = 0, W(1) = +∞, W(−1) = −∞, and dW/dρ > 0. Given the form of the discrete distribution function, we also seek a transformation that permits tractable use of the arcsin function. To do so, let Φ(·) denote the standard normal cumulative distribution function and implicitly define W(·) so that:

$$P[r = +1] = \frac{1}{2} + \frac{\arcsin \rho}{\pi} = \Phi(W),$$

or equivalently,

$$W = \Phi^{-1}\!\left(\frac{1}{2} + \frac{\arcsin \rho}{\pi}\right).$$

That is, $\pi \Phi(W) - \pi/2 = \arcsin \rho$, or $\rho = \sin\!\big(\pi \Phi(W) - \pi/2\big)$.
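This transformation and its inverse are straightforward to implement. The following sketch (ours, for illustration only) uses `scipy.stats.norm` for Φ and Φ⁻¹ and checks the round trip and the property W(0) = 0.

```python
import numpy as np
from scipy.stats import norm

def W_of_rho(rho):
    """Map rho in (-1, 1) to the real line: W = Phi^{-1}(1/2 + arcsin(rho)/pi)."""
    return norm.ppf(0.5 + np.arcsin(rho) / np.pi)

def rho_of_W(w):
    """Inverse transformation: rho = sin(pi * Phi(w) - pi/2)."""
    return np.sin(np.pi * norm.cdf(w) - np.pi / 2)

rho = np.array([-0.9, -0.5, 0.0, 0.5, 0.9])
w = W_of_rho(rho)

assert np.allclose(rho_of_W(w), rho)   # round trip recovers rho
assert W_of_rho(0.0) == 0.0            # W(0) = 0
```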
Combining, we have the following state space model: an observation equation,

$$P[r_t = +1] = \frac{1}{2} + \frac{\arcsin \rho_t}{\pi} = \Phi(W_t), \quad t = 1, \ldots, n;$$

and a transition equation,

$$W_t = a + b\, W_{t-1} + \sigma \varepsilon_t, \quad t = 1, \ldots, n,$$

which governs the dynamics of the latent variable, for {ε_t} a sequence of i.i.d. standard normals and where ρ_t = sin(πΦ(W_t) − π/2). The result is a stochastic probit model where the state variable W_t evolves so that the probability of the event (in this case, a sample correlation of +1) varies stochastically over time.

2.1.1. Integration-based filtering
There are a number of ways of estimating this nonlinear model. A full nonlinear filter may be run involving numerical integration of the latent variable. Alternatively, following Ball and Torous (1999), a single integration-based filter may be used. As we now demonstrate, in this case the nature of the filter is such that the integration may be implemented analytically, thereby maintaining accuracy while reducing computational effort significantly.
Denote the sample correlation at time t by r_t and the set of sample correlations through time t by R_t. For each time point t of a bivariate return series, either the two returns have the same sign or opposite signs. When the signs are the same, we have r_t = +1. To begin with, assume that the marginal distribution of W_{t−1} given observations through time t − 1 is Gaussian:

$$f(W_{t-1} \mid R_{t-1}) = N\big(m_{t-1},\, s^2_{t-1}\big).$$

Next project to obtain the conditional distribution of W_t given R_{t−1}, which will also be Gaussian,

$$f(W_t \mid R_{t-1}) = N\big(m_{t|t-1},\, s^2_{t|t-1}\big),$$

with mean

$$m_{t|t-1} = a + b\, m_{t-1}$$

and variance

$$s^2_{t|t-1} = b^2 s^2_{t-1} + \sigma^2.$$
Applying Bayes' theorem, we have:

$$f(W_t, r_t \mid R_{t-1}) = f(r_t \mid W_t, R_{t-1})\, f(W_t \mid R_{t-1}),$$

and integrating this expression with respect to W_t gives the conditional likelihood function f(r_t | R_{t−1}). An alternative application of Bayes' theorem generates the posterior distribution:

$$f(W_t \mid R_t) = f(r_t \mid W_t, R_{t-1})\, f(W_t \mid R_{t-1}) \,/\, f(r_t \mid R_{t-1}).$$

Proceeding sequentially in this manner, we may calculate the likelihood function:

$$\ln \mathrm{Like} = \sum_t \ln f(r_t \mid R_{t-1}).$$
From Frühwirth-Schnatter (1994), we make the additional assumption that the posterior distribution f(W_t | R_t) is also Gaussian¹ and obtain the mean and variance parameters which characterize this distribution by integration. In particular, define

$$G(m, s^2) = \int_{-\infty}^{+\infty} \Phi(w)\, (2\pi s^2)^{-0.5} \exp\!\big(-(w-m)^2/2s^2\big)\, dw,$$

$$F(m, s^2) = \int_{-\infty}^{+\infty} (w-m)\, \Phi(w)\, (2\pi s^2)^{-0.5} \exp\!\big(-(w-m)^2/2s^2\big)\, dw,$$

$$H(m, s^2) = \int_{-\infty}^{+\infty} (w-m)^2\, \Phi(w)\, (2\pi s^2)^{-0.5} \exp\!\big(-(w-m)^2/2s^2\big)\, dw.$$

Noting that the integrand in each case above involves the normal cumulative distribution function, a change of variable and differentiation with respect to m allow us to obtain the partial derivatives of G, F, H with respect to m as integral expressions that now involve the normal density function rather than the normal cumulative distribution function. Maintaining Gaussian distributions under convolution implies that the resultant integrals can be expressed as Gaussian densities. Subsequent integration with respect to m regenerates the original functions and we obtain the following results:
$$G(m, s^2) = \Phi\!\left(\frac{m}{(1+s^2)^{0.5}}\right),$$

$$F(m, s^2) = s^2 \big(2\pi(1+s^2)\big)^{-0.5} \exp\!\left(-\frac{m^2}{2(1+s^2)}\right),$$

$$H(m, s^2) = s^2 G(m, s^2) - \frac{s^2 m}{1+s^2}\, F(m, s^2).$$

¹ If the posterior is close to a normal density this approximation error is small. The results of Fahrmeir (1992) indicate that the posterior tends to normal even in cases where the observation density is extremely nonnormal.
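As an illustrative check (ours, not from the paper), the closed forms can be compared against direct numerical integration of the defining expressions for G, F, and H:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def G(m, s2):
    return norm.cdf(m / np.sqrt(1.0 + s2))

def F(m, s2):
    return s2 / np.sqrt(2 * np.pi * (1 + s2)) * np.exp(-m**2 / (2 * (1 + s2)))

def H(m, s2):
    return s2 * G(m, s2) - s2 * m / (1 + s2) * F(m, s2)

# Compare each closed form against quadrature of the defining integral.
m, s2 = 0.3, 0.5
dens = lambda w: np.exp(-(w - m)**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
G_num = quad(lambda w: norm.cdf(w) * dens(w), -np.inf, np.inf)[0]
F_num = quad(lambda w: (w - m) * norm.cdf(w) * dens(w), -np.inf, np.inf)[0]
H_num = quad(lambda w: (w - m)**2 * norm.cdf(w) * dens(w), -np.inf, np.inf)[0]

assert abs(G(m, s2) - G_num) < 1e-6
assert abs(F(m, s2) - F_num) < 1e-6
assert abs(H(m, s2) - H_num) < 1e-6
```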
From these results the likelihood function can be expressed as:

$$\mathrm{Like}(r_t \mid R_{t-1}) = G\big(r_t\, m_{t|t-1},\, s^2_{t|t-1}\big).$$

The updating step in the filter generates the following analytic values for the mean and variance at time t given available information through time t:

$$m_t = m_{t|t-1} + r_t\, \frac{F\big(r_t\, m_{t|t-1},\, s^2_{t|t-1}\big)}{G\big(r_t\, m_{t|t-1},\, s^2_{t|t-1}\big)},$$

$$s^2_t = \frac{HG - F^2}{G^2},$$

where G, F, and H are evaluated at (r_t m_{t|t−1}, s²_{t|t−1}). Maximum likelihood parameter estimation requires numerical optimization of this likelihood function across the parameter space. Additionally, asymptotic standard errors are obtained from the inverse Hessian computed at the maximum likelihood estimates.²
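To make the recursion concrete, the following is a minimal sketch of the analytic filter and its log-likelihood, applied to data simulated from the state-space model. The code and parameter values are our own illustrative assumptions, not the paper's.

```python
import numpy as np
from scipy.stats import norm

def log_likelihood(r, a, b, sigma, m0=0.0, s20=1.0):
    """Analytic integration-based filter for the stochastic probit model.

    r : array of observed sample correlations, each +1 or -1.
    Returns the log-likelihood sum_t ln f(r_t | R_{t-1}).
    """
    m, s2, ll = m0, s20, 0.0
    for rt in r:
        # Projection step: f(W_t | R_{t-1}) = N(m_pred, s2_pred).
        m_pred = a + b * m
        s2_pred = b**2 * s2 + sigma**2

        # Closed-form moments G, F, H evaluated at (rt * m_pred, s2_pred).
        mm = rt * m_pred
        G = norm.cdf(mm / np.sqrt(1 + s2_pred))
        F = s2_pred / np.sqrt(2 * np.pi * (1 + s2_pred)) \
            * np.exp(-mm**2 / (2 * (1 + s2_pred)))
        H = s2_pred * G - s2_pred * mm / (1 + s2_pred) * F

        ll += np.log(G)                      # Like(r_t | R_{t-1}) = G
        m = m_pred + rt * F / G              # posterior mean update
        s2 = (H * G - F**2) / G**2           # posterior variance update
    return ll

# Simulate from the model and evaluate the likelihood at the true parameters.
rng = np.random.default_rng(0)
a, b, sigma, n = 0.0, 0.9, 0.3, 1000
W = np.zeros(n)
for t in range(1, n):
    W[t] = a + b * W[t - 1] + sigma * rng.standard_normal()
p_plus = norm.cdf(W)                          # P[r_t = +1] = Phi(W_t)
r = np.where(rng.uniform(size=n) < p_plus, 1.0, -1.0)

ll = log_likelihood(r, a, b, sigma)
assert np.isfinite(ll) and ll < 0.0           # a sum of n Bernoulli log-probs
```

Maximum likelihood estimation would wrap `log_likelihood` in a numerical optimizer over (a, b, σ), as the text describes.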
2.2. Longer window methods

Assuming n available single period returns, the methodology outlined in Section 2.1 utilizes the maximum number of these observations. However, this methodology is subject to potentially significant observation error as inference about correlation at any point in time is based solely on a pair of corresponding single period returns. This observation error can be reduced by combining several single period returns and relying on the correlation calculated using this collection or window of returns.

Define a window of length ts as a collection of ts contiguous single period returns. We now observe the sample correlation between two return series calculated using nonoverlapping windows of corresponding single period returns of the two securities. As a result, given n single period returns, we have m observations where m × ts = n. We assume that the true correlation, ρ, remains constant for each pair of returns in a particular window.

When ts > 1, the sampling distribution of the sample correlation coefficient is no longer discrete. Anderson (1984) provides a detailed analysis of this sampling distribution under the assumption of conditional bivariate normality. Simple analytic expressions are available when ts = 2 or 3, but for larger values of ts either iterative formulae are needed or truncations of hypergeometric expansions must be relied upon.³

In general, the density of the sample correlation is highly nonnormal and converges slowly to normality as ts increases. Fisher (1921), however, noted that the following transformation of the sample correlation converges to a standard normal distribution extremely quickly:

$$T(r) = 0.5 \ln\!\left(\frac{1+r}{1-r}\right) = \tanh^{-1}(r).$$

Let

$$T(\rho) = 0.5 \ln\!\left(\frac{1+\rho}{1-\rho}\right) = \tanh^{-1}(\rho)$$

denote the corresponding transformation of the population correlation coefficient. This transformation is monotonically increasing and has the whole real line as its range.⁴

² The model may also be estimated using Gibbs sampling. To see this, define a process Z_t where Z_t ~ N(W_t, 1) and r_t = +1 if Z_t > 0, r_t = −1 otherwise, so that P[r_t = +1] = Φ(W_t) and P[r_t = −1] = Φ(−W_t). It will also be convenient to define W_{∼t} to represent all elements of W except W_t. The Gibbs sampler proceeds in the following series of steps: (1) Specify priors on the parameters Θ = {a, b, σ}. Prescribe initial values for W and Z. (2) For each t, draw from f(W_t | r_t, Z_t, W_{∼t}). (3) Draw from f(Z_t | W_t, r_t). (4) Draw Θ given Z and W. (5) Go back to step 2 and sweep through the sampler. Step 4 is simply assuming a normal-gamma conjugate prior. The more difficult computation is drawing in steps 2 and 3. Step 2 is actually straightforward also since the conditional distribution is Gaussian. Step 3 is a drawing from a truncated normal distribution which is still quite tractable to implement.
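The two equivalent forms of Fisher's transformation can be confirmed directly (a small illustrative check of ours; NumPy supplies `arctanh`):

```python
import numpy as np

r = np.array([-0.95, -0.5, 0.0, 0.5, 0.95])

# Fisher's transformation: T(r) = 0.5 * ln((1 + r) / (1 - r)) = arctanh(r).
T = 0.5 * np.log((1 + r) / (1 - r))

assert np.allclose(T, np.arctanh(r))   # the two forms coincide
assert np.allclose(np.tanh(T), r)      # tanh inverts the transformation
```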
Our econometric framework may now be expressed conveniently in state space form. The observation equation is

$$T(r_t) = T(\rho_t) + w_t, \quad t = 1, \ldots, m,$$

where the distribution of {w_t} will be approximately standard normal for large ts, for each t. For small values of ts, we require the exact distribution of {w_t}, which can be defined implicitly given the conditional distribution of T(r_t) | T(ρ_t).⁵ The transition equation is

$$T(\rho_t) = a + b\, T(\rho_{t-1}) + \sigma \varepsilon_t, \quad t = 1, \ldots, m,$$

where the errors {ε_t} are independent standard normals.⁶
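A minimal simulation sketch of this two-equation system (our own illustration; the parameter values and window length are assumptions, not estimates from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameter values (assumptions for this sketch).
a, b, sigma = 0.02, 0.95, 0.1
m, ts = 500, 22            # m windows of ts single-period return pairs each

# Transition: T(rho_t) = a + b * T(rho_{t-1}) + sigma * eps_t.
T_rho = np.zeros(m)
T_rho[0] = a / (1 - b)     # start at the unconditional mean a / (1 - b)
for t in range(1, m):
    T_rho[t] = a + b * T_rho[t - 1] + sigma * rng.standard_normal()
rho = np.tanh(T_rho)       # back-transform to correlations in (-1, 1)

# Observation: Fisher transform of the sample correlation over each window.
T_r = np.empty(m)
for t in range(m):
    cov = [[1.0, rho[t]], [rho[t], 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=ts).T
    T_r[t] = np.arctanh(np.corrcoef(x, y)[0, 1])

# The observed series tracks the latent series up to sampling noise.
assert np.corrcoef(T_r, T_rho)[0, 1] > 0.5
```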
³ From Johnson et al. (1995, Chap. 32), for ts = 2, the density of r is given by:

$$f_{ts=2}(r) = \pi^{-1}\,(1-r^2)^{-0.5}\,(1-\rho^2)\,(1-\rho^2 r^2)^{-1}\,\{1 + \rho r\, Q(\rho r)\},$$

while for ts = 3,

$$f_{ts=3}(r) = \pi^{-1}\,(1-\rho^2)^{1.5}\,(1-\rho^2 r^2)^{-2}\,\{3\rho r + (1 + 2\rho^2 r^2)\, Q(\rho r)\},$$

where Q(ρr) = (1 − ρ²r²)^{−0.5} arccos(−ρr). For larger values of ts, Johnson et al. (1995) provide an iterative formula to expand the density for increasing ts.

⁴ Recall that in the single returns case, ts = 1, we use the transformation W(ρ) = Φ^{−1}(1/2 + arcsin(ρ)/π), while for ts > 1 we use T(ρ) = tanh^{−1}(ρ). These transformations are similar to each other (verified in unreported calculations) and both share the characteristics that W(0) = 0, W(1) = +∞, W(−1) = −∞, and dW/dρ > 0. The choice of a particular transformation is based on technical convenience. The transformation W(·) permits an analytic solution to the integrated filter approach, while we use T(·) for ts > 1 because this transformation applied to r converges very quickly to normality for increasing ts.

⁵ Observe that for any x, P[T(r_t) ≤ x] = P[r_t ≤ tanh(x)]. The density of T(r_t) is given by

$$f_{T(r)}(x) = f_r(\tanh x)\, \frac{d \tanh x}{dx} = f_r(\tanh x)\,\big(1 - \tanh^2 x\big).$$

⁶ In the empirical results presented later, we estimate the model with the reparameterization μ = a/(1 − b).
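The ts = 2 and ts = 3 densities in footnote 3, as reconstructed here, can be checked numerically: each should integrate to one over (−1, 1) for any ρ. A sketch of ours:

```python
import numpy as np
from scipy.integrate import quad

def Q(x):
    return np.arccos(-x) / np.sqrt(1 - x**2)

def f_ts2(r, rho):
    # Density of the sample correlation for a window of ts = 2 return pairs.
    x = rho * r
    return (1 - rho**2) / (np.pi * np.sqrt(1 - r**2) * (1 - x**2)) * (1 + x * Q(x))

def f_ts3(r, rho):
    # Density for ts = 3.
    x = rho * r
    return (1 - rho**2)**1.5 / (np.pi * (1 - x**2)**2) * (3 * x + (1 + 2 * x**2) * Q(x))

for rho in (0.0, 0.4):
    assert abs(quad(f_ts2, -1, 1, args=(rho,))[0] - 1) < 1e-4
    assert abs(quad(f_ts3, -1, 1, args=(rho,))[0] - 1) < 1e-4
```

At ρ = 0 these reduce to the arcsine density and the uniform density on (−1, 1), respectively.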
We assume that the dynamics of T(ρ_t) are given by a first order autoregressive specification. The model may be easily extended to incorporate exogenous explanatory variables {Z_t} that are hypothesized to influence correlation:

$$T(\rho_t) = a + b\, T(\rho_{t-1}) + \theta Z_t + \sigma \varepsilon_t, \quad t = 1, \ldots, m.$$

For example, a natural choice for Z_t is a measure of return volatility. In this way we can statistically assess the effects of volatility on the behavior of stochastic correlation.
As before, we follow Ball and Torous (1999) and use a single integration-based filter to estimate the parameters of the model. Given a set of m observations T(r_1), ..., T(r_m), the likelihood function can be expressed as

$$\ln \mathrm{Like} = \sum_t \ln f\big(T(r_t) \mid T_{t-1}\big),$$

where T_t ≡ {T(r_t), T(r_{t−1}), ..., T(r_1)} denotes the history of the observable through time t and f(T(r_t) | T_{t−1}) denotes the conditional density of T(r_t) given the history of the observable through time t − 1.

Approximating the prior density by a normal density with the same first and second moments as the prior, the filter is then implemented in a similar fashion to the Gaussian case. The projection, based on a normal prior, preserves normality and so can be implemented analytically. The evaluation of the conditional likelihood requires numerical integration as the measurement error is non-Gaussian, but the approximation can be made highly accurate in our case as the integration is single dimensional. Computation of the first and second moments of the updated prior each requires an additional single dimensional numerical integration, and so the iterative scheme may be continued.
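One step of such a moment-matching filter can be sketched with Gauss-Hermite quadrature for the single-dimensional integrations. This is our own illustration, not the paper's implementation: the measurement density is passed in as a function, and the normal density with variance 1/(ts − 3) used below is purely a placeholder for the exact distribution of the Fisher-transformed sampling error.

```python
import numpy as np
from scipy.stats import norm

def filter_step(obs, m_prev, s2_prev, a, b, sigma, meas_pdf, n_nodes=64):
    # Projection (analytic, Gaussian): prior for T(rho_t) given the history.
    m_pred = a + b * m_prev
    s2_pred = b**2 * s2_prev + sigma**2

    # Gauss-Hermite nodes and weights for integrals against N(m_pred, s2_pred).
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    nodes = m_pred + np.sqrt(s2_pred) * x
    w = w / np.sqrt(2 * np.pi)

    # Conditional likelihood f(obs | history) by one-dimensional integration.
    lik = meas_pdf(obs - nodes)
    c = np.sum(w * lik)

    # Moment-match the updated (posterior) density to a normal.
    post = w * lik / c
    m_upd = np.sum(post * nodes)
    s2_upd = np.sum(post * (nodes - m_upd)**2)
    return np.log(c), m_upd, s2_upd

ts = 22
meas = lambda e: norm.pdf(e, scale=1 / np.sqrt(ts - 3))  # placeholder density
ll, m1, s1 = filter_step(0.3, 0.2, 0.05, a=0.02, b=0.95, sigma=0.1,
                         meas_pdf=meas)

assert s1 < 0.95**2 * 0.05 + 0.1**2   # the observation reduces the variance
```

Running this step sequentially over t = 1, ..., m and summing the log-likelihood contributions gives the objective for numerical optimization.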
3. Data and sample statistics