Journal of Business & Economic Statistics, April 2008, Vol. 26, No. 2, 176-193. DOI: 10.1198/073500107000000412

A Simulation-Based Specification Test for
Diffusion Processes
Geetesh BHARDWAJ
Bates White, San Diego, CA 92130 (geetesh.bhardwaj@gmail.com)

Valentina CORRADI
Department of Economics, University of Warwick, Coventry CV4 7AL, U.K. (v.corradi@warwick.ac.uk)

Norman R. SWANSON
Department of Economics, Rutgers University, New Brunswick, NJ 08901 (nswanson@econ.rutgers.edu)

This article makes two contributions. First, we outline a simple simulation-based framework for constructing conditional distributions for multifactor and multidimensional diffusion processes, for the case where
the functional form of the conditional density is unknown. The distributions can be used, for example, to
form predictive confidence intervals for time period t + τ , given information up to period t. Second, we
use the simulation-based approach to construct a test for the correct specification of a diffusion process.
The suggested test is in the spirit of the conditional Kolmogorov test of Andrews. However, in the present
context the null conditional distribution is unknown and is replaced by its simulated counterpart. The
limiting distribution of the test statistic is not nuisance parameter-free. In light of this, asymptotically
valid critical values are obtained via appropriate use of the block bootstrap. The suggested test has power
against a larger class of alternatives than tests that are constructed using marginal distributions/densities.
The findings of a small Monte Carlo experiment underscore the good finite sample properties of the proposed test, and an empirical illustration underscores the ease with which the proposed simulation and
testing methodology can be applied.
KEY WORDS: Block bootstrap; Finance, continuous time model; Parameter estimation error; Simulated GMM; Stochastic volatility.

1. INTRODUCTION
In this article we outline a simple simulation-based framework for constructing conditional distributions for multifactor
and multidimensional diffusion processes, in the case where
the functional form of the conditional density is unknown. The
estimated conditional distributions can be used, for example,
to form predictive confidence intervals for time period t + τ ,
given information up to period t. Further, we use the simulation-based approach to construct tests for the correct specification

of a given diffusion model. What distinguishes our approach
from that followed by numerous other authors is that we directly evaluate conditional distributions and confidence intervals, rather than focusing on marginal and/or joint distributions
when constructing our specification test.
Precedents to our article include Corradi and Swanson
(CS) (2005a), who constructed various tests, including a Kolmogorov-type test, based on the comparison of the empirical
cumulative distribution function and the cumulative distribution function (CDF) implied by the specification of the drift
and the variance functions of a diffusion, and Aït-Sahalia (AS)
(1996), who proposed an interesting specification test based on
a comparison of the marginal density of the process under the
null hypothesis and its kernel counterpart. The AS and CS tests
determine whether the drift and variance components of a particular single-factor continuous-time model are correctly specified, although the CS test is based on the comparison of CDFs,
whereas Aït-Sahalia’s is based on the comparison of densities,
and uses a nonparametric density estimator. The AS test is thus
characterized by a nonparametric rate, whereas the CS test has a
parametric rate. In the case of multifactor and multidimensional
models characterized by stochastic volatility, say, the functional

form of the invariant density of the return(s) is no longer guaranteed to be given in closed form, upon joint specification of
the drift and variance terms, so that the Kolmogorov-type test
of CS is no longer applicable. To get around this problem, CS

compared the empirical joint distribution of historical data with
the empirical joint distribution of (model) simulated data (see
also Corradi and Swanson 2007).
This article is different from that of CS in at least two important ways. First, we provide a correct specification test for
a diffusion that is based on the evaluation of an estimate of the
conditional distribution, rather than the joint or marginal distribution. This is a relevant departure from CS, because in the
literature on the evaluation of continuous-time financial models, the main goal is to construct specification tests based on
the transition density associated with a model. In fact, tests
based on comparisons of marginal distributions have no power
against iid alternatives with the same marginal, for example.
Second, we outline an easy-to-implement procedure for constructing confidence intervals for an asset price for a given period, say t + τ . This second feature is important in credit risk
management, for example, given that our confidence intervals
are predictive intervals whenever the model is estimated using
data prior to period t + 1. On the other hand, when models are
estimated using all available data, and when t + τ ≤ T, where T is the sample size, the methods in this article are strictly in-sample, so that our specification test is not ex ante in nature.

For discussion of theoretical issues that arise when our specification test is used in ex ante contexts, the reader is referred to
Corradi and Swanson (2006a).
If the functional form for the transition density were known,
we could test the hypothesis of correct specification of a diffusion via the probability integral transform approach of Diebold,
Gunther, and Tay (1998), the cross-spectrum approach of Hong
(2001), Hong and Li (2005), Hong, Li, and Zhao (2004), and
Thompson (2004), the test of Bai (2003) based on the joint use
of a Kolmogorov test and a martingalization method, or via the
normality transformation approach of Bontemps and Meddahi
(2005) and Duan (2003).
Alternative tests to that proposed in this article for the case of
unknown transition density functions have recently been suggested. For example, Aït-Sahalia, Fan, and Peng (2006) outlined a test based on the comparison of conditional kernel density estimators and approximations of conditional densities using Hermite polynomials (see also Aït-Sahalia 1999, 2002). In

principle, one can also compare conditional kernel density estimators based on historical and simulated data, along the lines
of Altissimo and Mele (2005). One feature of these tests is that
they are characterized by nonparametric rates. Our test, on the
other hand, is characterized by a parametric rate. This may seem
surprising, because it is well known that kernel estimators of
conditional distributions converge at nonparametric rates (see,
e.g., Fan, Yao, and Tong 1996 and Hall, Wolff, and Yao 1999).
Therefore, tests based on the comparison of conditional distributions of simulated and historical data are in general characterized by nonparametric rates. However, we obtain a parametric
rate. This is done by simulating S paths of length τ, all having
as common starting value the observable value at time t, say Xt .
Now, the empirical distribution of the simulated series provides a √T-consistent estimator of the conditional distribution of the
null model. We can thus construct a conditional Kolmogorov
test, along the lines of Andrews (1997).
Of final note is that the null diffusion model depends on unknown parameters that need to be estimated. Parameters are assumed to be estimated via the simulated generalized method of
moments (SGMM) approach of Duffie and Singleton (1993), assuming exact identification. This is because √T-consistency does not hold for overidentified (S)GMM estimators of misspecified models, as shown by Hall and Inoue (2003). Additionally, and as is common with the type of test proposed here,
limiting distributions are functionals of zero-mean Gaussian
processes with covariance kernels that reflect the contribution
of parameter estimation error (PEE). Thus, limiting distributions are not nuisance parameter-free and critical values cannot
be tabulated. [Note that in the special case of testing for normality, Bontemps and Meddahi (2005) provided a GMM-type
test based on moment conditions that is robust to parameter estimation error.] In light of this fact, we provide valid asymptotic
critical values via an extension of the empirical process version
of the block bootstrap which properly captures the contribution of PEE, for the case where parameters are estimated via
SGMM. Our bootstrap results stem in part from the fact that
when simulation error is negligible (i.e., the simulated sample
grows faster than the historical sample), and given exact identification, the results of Gonçalves and White (2004) for QMLE
estimators extend to SGMM estimators.


The potential usefulness of our proposed bootstrap-based
tests is examined via a series of Monte Carlo experiments using
discrete samples of 400 and 800 observations, and based on the
use of bootstrap critical values constructed using as few as 100
replications. For the larger sample, rejection rates under the null

are quite close to nominal values, and rejection rates under the
alternative are generally high. Additionally, an empirical illustration is provided which underscores the ease with which one
can apply our simulation and testing methodology.
The rest of the article is organized as follows. In Section 2 we
outline a setup that is appropriate for discussing our simulation
and testing methodology, in the context of one-dimensional diffusion processes. In Section 3 we discuss a simple approach to
simulation of conditional distributions and confidence intervals.
Section 4 outlines our specification test, and Section 5 extends
all results to the case of stochastic volatility models. Section 6
contains the results of our small Monte Carlo experiment, and
discusses the findings from an empirical illustration based on
the Eurodollar deposit rate. Concluding remarks are gathered
in Section 7, and all proofs are collected in the Appendix.
2. SETUP
In this section we outline the setup for the case of one-dimensional diffusion processes. All results carry through to
the more complicated cases of multidimensional and multifactor stochastic volatility models, as outlined in Section 5.
Let X(t), t ≥ 0, be a one-dimensional diffusion process solution to the following stochastic differential equation:
$$dX(t) = \mu_0(X(t), \theta_0)\,dt + \sigma_0(X(t), \theta_0)\,dW(t), \qquad (1)$$
where θ0 ∈ Θ, Θ ⊂ ℜp, and Θ is a compact set. In general, assume that the model that is specified and estimated is the same as above, but with θ0 replaced by its pseudo true analog, θ†. Namely, we consider models of the form
$$dX(t) = \mu(X(t), \theta^{\dagger})\,dt + \sigma(X(t), \theta^{\dagger})\,dW(t). \qquad (2)$$

Thus, correct specification of the diffusion process corresponds
to μ(·, ·) = μ0 (·, ·) and σ (·, ·) = σ0 (·, ·). Note that the drift and
variance terms [μ(·) and σ 2 (·), respectively] uniquely determine the stationary density, say f (x, θ † ), associated with the invariant probability measure of the above diffusion process (see,
e.g., Karlin and Taylor 1981, p. 241). In particular,
$$f(x, \theta^{\dagger}) = \frac{c(\theta^{\dagger})}{\sigma^2(x, \theta^{\dagger})} \exp\left( \int^{x} \frac{2\mu(v, \theta^{\dagger})}{\sigma^2(v, \theta^{\dagger})}\,dv \right), \qquad (3)$$
where c(θ † ) is a constant ensuring that the density integrates to
1. However, knowledge of the drift and variance terms does not
ensure knowledge of a closed functional form for the transition
density.
Now, suppose that we observe a discrete sample (skeleton)
of T observations, say (X1 , X2 , . . . , XT )′ , from the underlying
diffusion. Furthermore, suppose that we use these sample data
in conjunction with a simulated "path" to construct an estimator of θ, say $\hat{\theta}_{T,N,h}$, where N denotes the simulation path length and h is the discretization interval. For the case in which the moment conditions can be written in closed form, and so there is no need of simulations, we have that $\hat{\theta}_{T,N,h} = \hat{\theta}_T$. Finally, note that we use the notation X(t) to denote the continuous-time process, and the notation $X_t$ to denote the skeleton (i.e., the discrete sample). In light of this, let $X^{\theta}_{t,h}$ denote pathwise simulated data, constructed using some θ ∈ Θ, and discrete interval h, and sampled at the same frequency as the data.
Assume that $\hat{\theta}_{T,N,h}$ is the simulated generalized method of moments estimator, which is defined as
$$\hat{\theta}_{T,N,h} = \arg\min_{\theta \in \Theta} \left( \frac{1}{T}\sum_{t=1}^{T} g(X_t) - \frac{1}{N}\sum_{t=1}^{N} g(X^{\theta}_{t,h}) \right)' W_T \left( \frac{1}{T}\sum_{t=1}^{T} g(X_t) - \frac{1}{N}\sum_{t=1}^{N} g(X^{\theta}_{t,h}) \right) = \arg\min_{\theta \in \Theta} G_{T,N,h}(\theta)' W_T G_{T,N,h}(\theta), \qquad (4)$$
where g denotes a vector of p moment conditions, Θ ⊂ ℜp (so that we have as many moment conditions as parameters), and $X^{\theta}_{t,h} = X^{\theta}_{[Kth/N]}$, with N = Kh. Typically the p moment conditions are based on the difference between sample moments of historical and simulated data, or between sample moments and model implied moments, whenever the latter are known in closed form. Finally, $W_T$ is the inverse of a heteroscedasticity and autocorrelation (HAC) robust covariance matrix estimator.
That is,
$$W_T^{-1} = \frac{1}{T} \sum_{\nu=-l_T}^{l_T} w_{\nu} \sum_{t=\nu+1+l_T}^{T-l_T} \left( g(X_t) - \frac{1}{T}\sum_{t=1}^{T} g(X_t) \right) \left( g(X_{t-\nu}) - \frac{1}{T}\sum_{t=1}^{T} g(X_t) \right)', \qquad (5)$$
where $w_{\nu} = 1 - \nu/(l_T + 1)$. To construct simulated estimators, we require simulated sample paths. If we use a Milstein scheme (see, e.g., Pardoux and Talay 1985), then
$$X^{\theta}_{kh} - X^{\theta}_{(k-1)h} = b\big(X^{\theta}_{(k-1)h}, \theta\big)h + \sigma\big(X^{\theta}_{(k-1)h}, \theta\big)\epsilon_{kh} - \frac{1}{2}\sigma\big(X^{\theta}_{(k-1)h}, \theta\big)\sigma'\big(X^{\theta}_{(k-1)h}, \theta\big)h + \frac{1}{2}\sigma\big(X^{\theta}_{(k-1)h}, \theta\big)\sigma'\big(X^{\theta}_{(k-1)h}, \theta\big)\epsilon^2_{kh}, \qquad (6)$$
where $(W_{kh} - W_{(k-1)h}) = \epsilon_{kh} \sim$ iid $N(0, h)$, k = 1, . . . , K, Kh = N, and $\sigma(\cdot, \cdot)'$ is defined later (see Assumption A). Hereafter, $X^{\theta}_{kh}$ denotes the values of the diffusion at time kh, simulated under θ, and with a discrete interval equal to h, and so is a fine grain analog of $X^{\theta}_{t,h}$. Further, the pseudo true value, θ†, is defined to be
$$\theta^{\dagger} = \arg\min_{\theta \in \Theta} G_{\infty}(\theta)' W_0 G_{\infty}(\theta),$$
where $G_{\infty}(\theta)' W_0 G_{\infty}(\theta) = \mathrm{p}\lim_{N,T\to\infty,\,h\to 0} G_{T,N,h}(\theta)' W_T G_{T,N,h}(\theta)$, and where θ† = θ0 if the model is correctly specified. Note that the reason why we limit our attention to the exactly identified case is that this ensures that $G_{\infty}(\theta^{\dagger}) = 0$, even when the model used to simulate the diffusion is misspecified, in the sense of differing from the underlying data generating process (DGP) (see, e.g., Hall and Inoue 2003 for discussion of the asymptotic behavior of misspecified overidentified GMM estimators). Note also that the first-order conditions imply that
$$\nabla_{\theta} G_{\infty}(\theta^{\dagger})' W^{\dagger} G_{\infty}(\theta^{\dagger}) = 0.$$
However, in the case for which the number of parameters and the number of moment conditions is the same, $\nabla_{\theta} G_{\infty}(\theta^{\dagger})' W^{\dagger}$ is invertible, and so the first-order conditions also imply that $G_{\infty}(\theta^{\dagger}) = 0$.
In the sequel, we shall rely on the following assumption.
Assumption A.
(i) X(t), t ∈ ℜ+, is a strictly stationary, geometric ergodic diffusion.
(ii) μ(·, θ†) and σ(·, θ†), as defined in (2), are twice continuously differentiable. Also, μ(·, ·), μ(·, ·)′, σ(·, ·), and σ(·, ·)′ are Lipschitz, with Lipschitz constant independent of θ, where μ(·, ·)′ and σ(·, ·)′ denote derivatives with respect to the first argument of the function.
(iii) For any fixed h and θ ∈ Θ, $X^{\theta}_{kh}$ is geometrically ergodic and strictly stationary.
(iv) $W_T \xrightarrow{a.s.} W_0$, where $W_0^{-1} = \sum_{j=-\infty}^{\infty} E\big((g(X_1) - E(g(X_1)))(g(X_{1+j}) - E(g(X_{1+j})))'\big)$.
(v) Hereafter, let $\nabla_{\theta}X^{\theta}_{t,h}$ be the vector of derivatives with respect to θ of $X^{\theta}_{t,h}$. For θ ∈ Θ and for all h, $E\|g(X^{\theta}_{t,h})\|^{2+\delta} < C < \infty$, $g(X^{\theta}_{t,h})$ is Lipschitz, uniformly on Θ, $\theta \mapsto E(g(X^{\theta}_{t,h}))$ is continuous, and $g(X_t)$, $g(X^{\theta}_{t,h})$, $\nabla_{\theta}X^{\theta}_{t,h}$ are 2r-dominated (the last two also on Θ) for r > 3/2.
(vi) Unique identifiability: $G_{\infty}(\theta^{\dagger})' W_0 G_{\infty}(\theta^{\dagger}) < G_{\infty}(\theta)' W_0 G_{\infty}(\theta)$, for all θ ≠ θ†.
(vii) $\hat{\theta}_{T,N,h}$ and θ† are in the interior of Θ, $g(X^{\theta}_t)$ is twice continuously differentiable in the interior of Θ, and $D^{\dagger} = E\big(\partial g(X^{\theta}_t)/\partial\theta\,|_{\theta=\theta^{\dagger}}\big)$ exists and is of full rank, p.
Assumption A is rather standard. Assumption A(ii) ensures
that under both hypotheses there is a unique solution to the
stochastic differential equation in (2). Assumptions A(i) and
A(iii)–A(vii) ensure consistency and asymptotic normality of
the SGMM estimator.
All results stated in the sequel rely on the following lemma.
Lemma 1. Let Assumption A hold. Assume that T, N → ∞. Then, if h → 0, T/N → 0, and h²T → 0,
$$\sqrt{T}\big(\hat{\theta}_{T,N,h} - \theta^{\dagger}\big) \xrightarrow{d} N\big(0, (D^{\dagger\prime} W_0 D^{\dagger})^{-1}\big),$$
where $W_0$ and $D^{\dagger}$ are defined in Assumptions A(iv) and A(vii).
The above normality result for the SGMM estimator can be extended in a straightforward manner to the case of the EMM estimator of Gallant and Tauchen (1996) (see also the sequential
partial indirect inference approach of Dridi, Guay, and Renault
2006).
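To fix ideas, the following minimal sketch illustrates the two ingredients of this section: the Milstein recursion in (6) and the exactly identified SGMM criterion in (4). The square-root (CIR-type) drift and diffusion, the moment vector g(X) = (X, X², X³), and the identity weighting matrix are illustrative assumptions only and are not taken from the article, which uses a HAC-based weighting matrix as in (5).

```python
import numpy as np

def milstein_path(theta, x0, N, h, rng):
    """Simulate a skeleton from dX = mu(X)dt + sigma(X)dW with the Milstein
    scheme in (6); the CIR-type mu and sigma below are an assumption."""
    k, xbar, s = theta
    mu = lambda x: k * (xbar - x)
    sigma = lambda x: s * np.sqrt(max(x, 0.0))
    dsigma = lambda x: 0.5 * s / np.sqrt(max(x, 1e-12))  # derivative wrt the state
    K = int(N / h)              # number of fine steps, Kh = N
    step = max(1, int(1 / h))   # record at the data frequency (unit interval)
    x, path = x0, []
    for kstep in range(1, K + 1):
        eps = rng.normal(0.0, np.sqrt(h))
        x = (x + mu(x) * h + sigma(x) * eps
             - 0.5 * sigma(x) * dsigma(x) * h
             + 0.5 * sigma(x) * dsigma(x) * eps ** 2)
        if kstep % step == 0:
            path.append(x)
    return np.array(path)

def sgmm_objective(theta, data, N, h, rng):
    """Exactly identified SGMM criterion G_{T,N,h}(theta)' W_T G_{T,N,h}(theta);
    g(X) = (X, X^2, X^3) and W_T = identity, purely for illustration."""
    sim = milstein_path(theta, data[0], N, h, rng)
    g = lambda x: np.column_stack([x, x ** 2, x ** 3])
    G = g(data).mean(axis=0) - g(sim).mean(axis=0)
    return G @ np.eye(3) @ G

rng = np.random.default_rng(0)
data = milstein_path((0.3, 0.05, 0.1), 0.05, N=400, h=0.1, rng=rng)
print(sgmm_objective((0.3, 0.05, 0.1), data, N=4000, h=0.1, rng=rng))
```

In practice the simulation draws used to evaluate the criterion would be held fixed across candidate values of θ, and the identity matrix would be replaced by the HAC-based weighting matrix in (5).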
3. SIMULATED CONDITIONAL DISTRIBUTIONS
In this section, we outline how to construct in-sample τ -stepahead simulated conditional distributions, when the functional
form of the conditional distribution is not known in closed form.

Conditional confidence interval construction follows immediately, and is discussed in Section 6. Let S be the number of simulated paths. Then, for s = 1, . . . , S, t ≥ 1, and k = 1, . . . , τ/h, define
$$\begin{aligned} X^{\hat{\theta}_{T,N,h}}_{s,t+kh} - X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h} &= \mu\big(X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h}, \hat{\theta}_{T,N,h}\big)h + \sigma\big(X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h}, \hat{\theta}_{T,N,h}\big)\epsilon_{s,t+kh} \\ &\quad - \frac{1}{2}\sigma\big(X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h}, \hat{\theta}_{T,N,h}\big)\sigma'\big(X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h}, \hat{\theta}_{T,N,h}\big)h \\ &\quad + \frac{1}{2}\sigma\big(X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h}, \hat{\theta}_{T,N,h}\big)\sigma'\big(X^{\hat{\theta}_{T,N,h}}_{s,t+(k-1)h}, \hat{\theta}_{T,N,h}\big)\epsilon^2_{s,t+kh}, \end{aligned} \qquad (7)$$
where $\epsilon_{s,t+kh} \sim$ iid $N(0, h)$. That is, we simulate S paths of length τ, with τ finite, all having the same starting value, $X_t$. This allows for the preservation of the starting value effect on the finite-length simulation paths. It should be stressed, however, that the simulated diffusion is ergodic. Thus, the effect of the starting value approaches zero at an exponential rate, as τ → ∞. Hereafter, $X^{\hat{\theta}_{T,N,h}}_{s,t+kh}$ denotes the value for the diffusion at time t + kh, at simulation s, using parameters $\hat{\theta}_{T,N,h}$, a discrete interval equal to h, and using, as a starting value at time t, $X_t$; $X^{\hat{\theta}_{T,N,h}}_{s,t+\tau}$ is defined analogously, by setting kh = τ.
Now, for any given starting value, $X_t$, the simulated randomness is assumed to be independent across simulations, so that $E(\epsilon_{s,t+kh}\epsilon_{j,t+kh}) = 0$, for all s ≠ j. On the other hand, it is important to retain the same simulated randomness across different starting values (i.e., use the same set of random errors for each starting value), so that $E(\epsilon_{s,t+kh}\epsilon_{s,l+kh}) = h$, for any t, l. Note that the test discussed in the next section requires the construction of multiple paths of length τ, for a number of different starting values, and hence we have attempted to underscore the importance of keeping the random errors used in the construction of each set of paths for each starting value the same.
As an estimate for the distribution, at time t + τ, conditional on $X_t$, define
$$\hat{F}_{\tau}(u|X_t, \hat{\theta}_{T,N,h}) = \frac{1}{S}\sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le u\big\}.$$
As stated in Proposition 2 below, if the model is correctly specified, then $\frac{1}{S}\sum_{s=1}^{S} 1\{X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le u\}$ provides a consistent estimate of the "true" conditional distribution. A related approach for approximating distribution functions has been suggested by Thompson (2004), in the context of tests for the correct specification of term structure models. Hereafter, let $F_{\tau}(u|X_t, \theta^{\dagger})$ denote the conditional distribution of $X^{\theta^{\dagger}}_{t+\tau}$, given $X^{\theta^{\dagger}}_t = X_t$, that is, $F_{\tau}(u|X_t, \theta^{\dagger}) = \Pr\big(X^{\theta^{\dagger}}_{t+\tau} \le u\,|\,X^{\theta^{\dagger}}_t = X_t\big)$.
The statement of Proposition 2 requires the following assumption.
Assumption B.
(i) $F_{\tau}(u|X_t, \theta)$ is twice continuously differentiable in the interior of Θ. Also, $\nabla_{\theta}F_{\tau}(u|X_t, \theta)$ and $\nabla^2_{\theta}F_{\tau}(u|X_t, \theta)$ are jointly continuous in the interior of Θ, almost surely, and 2r-dominated on Θ, r > 2.
(ii) $X^{\theta}_{s,t+\tau}$ is continuously differentiable in the interior of Θ, for s = 1, . . . , S; and $\nabla_{\theta}X^{\theta}_{s,t+\tau}$ is 2r-dominated in Θ, uniformly in s, for r > 2.
Proposition 2. Let Assumptions A and B hold. Assume that T, N, S → ∞. Then, if h → 0, T/N → 0, h²T → 0, and T²/S → ∞, the following result holds for any $X_t$, t ≥ 1, uniformly in u:
$$\hat{F}_{\tau}(u|X_t, \hat{\theta}_{T,N,h}) - F_{\tau}(u|X_t, \theta^{\dagger}) \xrightarrow{pr} 0. \qquad (8)$$
In addition, if the model is correctly specified [i.e., if $\mu(\cdot, \cdot) = \mu_0(\cdot, \cdot)$ and $\sigma(\cdot, \cdot) = \sigma_0(\cdot, \cdot)$], then
$$\hat{F}_{\tau}(u|X_t, \hat{\theta}_{T,N,h}) - F_{0,\tau}(u|X_t, \theta_0) \xrightarrow{pr} 0, \qquad (9)$$
where $F_{0,\tau}(u|X_t, \theta_0) = \Pr(X_{t+\tau} \le u\,|\,X_t)$.
In practice, once we have an estimate for the τ-period-ahead conditional distribution, we still do not know whether our estimate is based on the correct model. This suggests that one use of the test for correct specification outlined in the next section is to assess the "relevance" of the distribution estimator.
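As a computational illustration of the estimator $\hat{F}_{\tau}(u|X_t, \hat{\theta}_{T,N,h})$, the sketch below simulates S paths of length τ from the common starting value $X_t$ with a recursion of the form in (7) and evaluates the empirical conditional distribution on a grid of u values. The CIR-type drift and diffusion and all numerical values are assumptions made only for this example; note that the same matrix of errors `eps` would be reused for every starting value, as emphasized above.

```python
import numpy as np

def simulate_tau_ahead(theta_hat, x_t, tau, h, S, eps):
    """Simulate S paths of length tau, all started at x_t, with a Milstein
    recursion as in (7). eps has shape (S, tau/h), so the same draws can be
    reused for every starting value. CIR-type example model (an assumption)."""
    k, xbar, s = theta_hat
    n_steps = int(round(tau / h))
    x = np.full(S, x_t, dtype=float)
    for j in range(n_steps):
        sig = s * np.sqrt(np.maximum(x, 0.0))
        dsig = 0.5 * s / np.sqrt(np.maximum(x, 1e-12))
        e = eps[:, j]
        x = (x + k * (xbar - x) * h + sig * e
             - 0.5 * sig * dsig * h + 0.5 * sig * dsig * e ** 2)
    return x

def conditional_cdf_hat(x_end, u_grid):
    """F_hat_tau(u | X_t, theta_hat) = (1/S) sum_s 1{X_{s,t+tau} <= u}."""
    return (x_end[:, None] <= u_grid[None, :]).mean(axis=0)

rng = np.random.default_rng(1)
S, tau, h = 10_000, 4, 0.1
eps = rng.normal(0.0, np.sqrt(h), size=(S, int(tau / h)))   # common error draws
x_end = simulate_tau_ahead((0.3, 0.05, 0.1), x_t=0.06, tau=tau, h=h, S=S, eps=eps)
u_grid = np.linspace(0.0, 0.12, 7)
print(conditional_cdf_hat(x_end, u_grid))   # simulated predictive CDF at the grid
```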
4. SPECIFICATION TESTING
In the first subsection, we outline the test. The second subsection discusses bootstrapping procedures for obtaining asymptotically valid critical values.
4.1 The Test
The test outlined in this section can be viewed as
a simulation-based extension of Andrews (1997) and Corradi
and Swanson (2005b, 2006b), which has been adapted to the
use of continuous-time models. Of additional note is that in the
Andrews and Corradi–Swanson articles, the functional form of the conditional distribution under the null hypothesis is assumed to be known, whereas in the current context we replace the conditional distribution (or conditional mean) with a simulation-based estimator. Furthermore, in the Andrews article, a conditional Kolmogorov test is developed under the assumption of iid
data, so that the bootstrap used in that article is not applicable
in our context.
Recall that in the previous section we discussed the construction of simulation paths for a given starting value. To carry out
a specification test, however, we now require the construction
of a sequence of T − τ conditional distributions that are τ steps
ahead, say. In this way, S paths of length τ are available for each starting value, t = 1, . . . , T − τ.
The hypotheses of interest are:
H0: $F_{\tau}(u|X_t, \theta^{\dagger}) = F_{0,\tau}(u|X_t, \theta_0)$, for all u, a.s.
HA: $\Pr\big(F_{\tau}(u|X_t, \theta^{\dagger}) - F_{0,\tau}(u|X_t, \theta_0) \neq 0\big) > 0$, for some u ∈ U, with nonzero Lebesgue measure.
Thus, the null hypothesis coincides with that of correct specification of the conditional distribution, and is implied by the correct specification of the drift and variance terms used in simulating the paths. The alternative is simply the negation of the null. In practice, we observe neither $F_{\tau}(u|X_t, \theta^{\dagger})$ nor $F_{0,\tau}(u|X_t, \theta_0)$. However, we can define the following statistic:
$$V_T = \sup_{u \times v \in U \times V} |V_T(u, v)|, \qquad (10)$$
where
$$V_T(u, v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{S} \sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le u\big\} - 1\{X_{t+\tau} \le u\} \right] 1\{X_t \le v\}, \qquad (11)$$
with U and V compact sets on the real line.
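Given the observed skeleton and, for each starting value, the S simulated τ-step-ahead values, the statistic in (10)–(11) reduces to a double loop over grid points. The following sketch computes $V_T$; the grids and the artificial inputs are arbitrary choices made only for illustration.

```python
import numpy as np

def vt_statistic(x, x_sim, tau, u_grid, v_grid):
    """V_T = sup_{u,v} |V_T(u,v)|, with V_T(u,v) as in (11).
    x     : observed skeleton, length T
    x_sim : simulated tau-step-ahead values, shape (T - tau, S); row t holds the
            S paths started at x[t] (built, e.g., with the earlier sketch)."""
    n_eff = len(x) - tau
    V = np.zeros((len(u_grid), len(v_grid)))
    for t in range(n_eff):
        F_hat = (x_sim[t][:, None] <= u_grid[None, :]).mean(axis=0)  # (1/S) sum 1{X_sim <= u}
        diff = F_hat - (x[t + tau] <= u_grid)                        # minus 1{X_{t+tau} <= u}
        V += np.outer(diff, (x[t] <= v_grid).astype(float))          # times 1{X_t <= v}
    return np.abs(V / np.sqrt(n_eff)).max()

# toy check with artificial data (purely illustrative)
rng = np.random.default_rng(2)
x = rng.normal(size=200)
x_sim = rng.normal(size=(196, 500))
print(vt_statistic(x, x_sim, tau=4, u_grid=np.linspace(-2, 2, 21),
                   v_grid=np.linspace(-2, 2, 21)))
```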
The statistic defined in (10) is a simulation-based version of the conditional Kolmogorov test of Andrews (1997). As stated in Proposition 2, $\frac{1}{S}\sum_{s=1}^{S} 1\{X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le u\}$ is a consistent estimator of $F_{0,\tau}(u|X_t, \theta_0)$, which is the conditional distribution implied by the null model. Thus, we compare the joint empirical distribution $\frac{1}{T-\tau}\sum_{t=1}^{T-\tau} 1\{X_{t+\tau} \le u\}1\{X_t \le v\}$ with its semiempirical/semiparametric analog given by $\frac{1}{T-\tau}\sum_{t=1}^{T-\tau} F_{0,\tau}(u|X_t, \theta_0)1\{X_t \le v\}$. Intuitively, if the null model used for simulating the data is correct, then the difference between the two approaches zero, and has a well-defined limiting distribution when properly scaled. The asymptotic behavior of $V_T$ is described by the following theorem.
Theorem 3. Let Assumptions A and B hold. Assume that T, N, S → ∞. Then, if h → 0, T/N → 0, T/S → 0, T²/S → ∞, Nh → 0, and h²T → 0, the following result holds under H0:
$$V_T \xrightarrow{d} \sup_{u \times v \in U \times V} |Z(u, v)|,$$
where Z(u, v) is a Gaussian process with covariance kernel K(u, u′, v, v′) given by
$$\begin{aligned} K(u, u', v, v') &= \sum_{j=-\infty}^{\infty} E\Big( \big[F_{0,\tau}(u|X_1, \theta_0) - 1\{X_{1+\tau} \le u\}\big] 1\{X_1 \le v\} \big[F_{0,\tau}(u'|X_{1+j}, \theta_0) - 1\{X_{1+\tau+j} \le u'\}\big] 1\{X_{1+j} \le v'\} \Big) \\ &\quad + \mu_{f_{0,\tau}}(u, v)' (D^{0\prime} W_0 D^0)^{-1} \mu_{f_{0,\tau}}(u, v) \\ &\quad - 2\mu_{f_{0,\tau}}(u, v)' (D^{0\prime} W_0 D^0)^{-1} D^{0\prime} W^0 \sum_{j=-\infty}^{\infty} E\Big( \big(g(X_{1+j}) - E(g(X_1))\big) \big[F_{0,\tau}(u|X_{1+j}, \theta_0) - 1\{X_{1+\tau+j} \le u'\}\big] 1\{X_{1+j} \le v'\} \Big), \end{aligned}$$
where
$$\mu_{f_{0,\tau}}(u, v) = E_X\Big( f_{0,\tau}(u|X_1, \theta_0)\, E_S\big(\nabla_{\theta_0} X^{\theta_0}_{s,1+\tau}\big) 1\{X_1 \le v\} \Big),$$
$E_X$ denotes the expectation under the law governing the sample and $E_S$ denotes the expectation under the measure governing the simulated randomness. Furthermore, under HA, there exists some ε > 0 such that
$$\lim_{T \to \infty} \Pr\left( \frac{1}{\sqrt{T}} V_T > \varepsilon \right) = 1.$$
Notice that the limiting distribution is a zero-mean Gaussian process, with a covariance kernel given by K(u, u′, v, v′). The first term is the long-run variance we would have if we knew $F_{0,\tau}(u|X_1, \theta_0)$; the second term captures the contribution of parameter estimation error; and the third term captures the correlation between the first two. As T/S → 0, the contribution of simulation error is asymptotically negligible. The limiting distribution is not nuisance parameter-free and hence critical values cannot be tabulated. In the next section we thus outline a bootstrap procedure for calculating asymptotically valid critical values for $V_T$.
4.2 Bootstrap Critical Values
Given that the limiting distribution of $V_T$ is not nuisance parameter-free, our approach is to construct bootstrap critical values for the test. To show the first-order validity of the bootstrap, we shall obtain the limiting distribution of the bootstrapped statistic and show that it coincides with the limiting distribution of the actual statistic, under H0. Then, a test with correct asymptotic size and unit asymptotic power can be obtained by comparing the value of the original statistic with bootstrapped critical values.
Asymptotically valid bootstrap critical values for the test should be constructed as follows:
Step 1. At each replication, draw b blocks (with replacement) of length l, where bl = T. Thus, each block is equal to $X_{i+1}, \ldots, X_{i+l}$, for some i = 0, . . . , T − l, with probability 1/(T − l + 1). More formally, let $I_k$, k = 1, . . . , b, be iid discrete uniform random variables on [0, 1, . . . , T − l]. Then, the resampled series, $X^*_t$, is such that $X^*_1, X^*_2, \ldots, X^*_l, X^*_{l+1}, \ldots, X^*_T = X_{I_1+1}, X_{I_1+2}, \ldots, X_{I_1+l}, X_{I_2+1}, \ldots, X_{I_b+l}$, and so a resampled series consists of b blocks that are discrete iid uniform random variables, conditional on the sample. Use these data to construct $\hat{\theta}^*_{T,N,h}$. For N/T → ∞, GMM and simulated GMM are asymptotically equivalent. Thus, we do not need to resample simulated moment conditions. More precisely, define
$$\hat{\theta}^*_{T,N,h} = \arg\min_{\theta \in \Theta} \left( \frac{1}{T}\sum_{t=1}^{T} g(X^*_t) - \frac{1}{N}\sum_{t=1}^{N} g(X^{\theta}_{t,h}) \right)' W_T \left( \frac{1}{T}\sum_{t=1}^{T} g(X^*_t) - \frac{1}{N}\sum_{t=1}^{N} g(X^{\theta}_{t,h}) \right),$$
where $W_T$ and g(·) are defined in (4).
Step 2. Using the same set of random errors used in the construction of the actual statistic, construct $X^{\hat{\theta}^*_{T,N,h}}_{s,t+\tau,*}$, s = 1, . . . , S and t = 1, . . . , T − τ. Note that we do not resample the simulated series (as S/T → ∞, simulation error is asymptotically negligible). Instead, simply simulate the series using bootstrap estimators and using bootstrapped values as starting values.

Step 3. Construct the following bootstrap statistic, which is the bootstrap counterpart of $V_T$:
$$V^*_T = \sup_{u \times v \in U \times V} |V^*_T(u, v)|, \qquad (12)$$
where
$$V^*_T(u, v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{S} \sum_{s=1}^{S} 1\big\{X^{\hat{\theta}^*_{T,N,h}}_{s,t+\tau,*} \le u\big\} - 1\{X^*_{t+\tau} \le u\} \right] 1\{X^*_t \le v\} - \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{S} \sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le u\big\} - 1\{X_{t+\tau} \le u\} \right] 1\{X_t \le v\}. \qquad (13)$$

Step 4. Repeat Steps 1–3 B times, and generate the empirical
distribution of the B bootstrap statistics.
Note that the first term on the right side of (13) is the bootstrap analog of (11). Because simulation error is asymptotically negligible (as T/S → 0), there is no need to resample the simulated data; however, to properly mimic the contribution of parameter estimation error, data are simulated using the
bootstrap estimator. Also, we need to use resampled observed
values as starting values for the simulated series. The second
term on the right side of (13) is the mean of the first, computed
under the law governing the bootstrap, and conditional on the
sample. It plays the role of a recentering term.
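Step 1 and the final decision rule are straightforward to code. The sketch below (an illustration, not the authors' implementation) draws b blocks of length l with replacement, exactly as described above, and compares $V_T$ with the (1 − α) empirical quantile of the B bootstrap statistics.

```python
import numpy as np

def block_resample(x, l, rng):
    """Draw b = T/l blocks of length l, with starts chosen uniformly on
    0, ..., T - l, and concatenate them into a resampled series of length T."""
    T = len(x)
    b = T // l
    starts = rng.integers(0, T - l + 1, size=b)          # iid uniform block starts
    return np.concatenate([x[i:i + l] for i in starts])[:T]

def bootstrap_decision(v_t, v_t_star, alpha=0.10):
    """Reject H0 if V_T exceeds the (1 - alpha) empirical quantile of the
    B bootstrap statistics V_T^*."""
    return v_t > np.quantile(v_t_star, 1.0 - alpha)

rng = np.random.default_rng(3)
x = rng.normal(size=400)
x_star = block_resample(x, l=10, rng=rng)   # one bootstrap sample; in practice repeat
print(x_star.shape)                         # B times, re-estimating theta and V_T^* each time
```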
For the iid case when the conditional distribution, $F_0(y_t|X_t)$, is known in closed form, Andrews (1997) suggested using a parametric bootstrap procedure [i.e., resample $y^*_t$, say, from $F(u|X_t, \hat{\theta}_T)$, keeping the conditioning variates, $X_t$, fixed]. In our dynamic context, the conditioning variables cannot be kept constant. One possibility, for the case where τ = 1, would be to draw $X^*_2$ from $\frac{1}{S}\sum_{s=1}^{S} 1\{X^{\hat{\theta}_{T,N,h}}_{s}(X_1) \le u\}$, simulate again using $X^*_2$ as the initial value and draw $X^*_3$ from $\frac{1}{S}\sum_{s=1}^{S} 1\{X^{\hat{\theta}_{T,N,h}}_{s}(X^*_2) \le u\}$, and so on, forming the resampled series $X^*_2, \ldots, X^*_T$. Then, construct the analog of (10). However, this bootstrap procedure may not properly mimic the contribution of parameter estimation error. We leave this issue to future research.
Theorem 4. Let Assumptions A and B hold. Assume that T, N, S → ∞. Then, if h → 0, T/N → 0, T/S → 0, T²/S → ∞, Nh → 0, h²T → 0, l → ∞, and l²/T → 0, the following result holds:
$$P\Big( \omega : \sup_{x \in \Re} \big| P^*\big(V^*_T(\omega) \le x\big) - P\big((V_T - E(V_T)) \le x\big) \big| > \varepsilon \Big) \to 0,$$
where $P^*$ denotes the probability law of the resampled series, conditional on the sample.

The above results suggest proceeding in the following manner. For any bootstrap replication, compute the bootstrap statistic, VT∗ . Perform B bootstrap replications (B large) and compute
the quantiles of the empirical distribution of the B bootstrap statistics. Reject H0 if VT is greater than the (1 − α)th percentile.
Otherwise, do not reject. Now, for all samples except a set with
probability measure approaching zero, VT has the same limiting
distribution as the corresponding bootstrapped statistic, ensuring asymptotic size equal to α. Under the alternative, VT diverges to (plus) infinity, whereas the corresponding bootstrap
statistic has a well-defined limiting distribution, ensuring unit
asymptotic power.
5. STOCHASTIC VOLATILITY MODELS
We now focus our attention on stochastic volatility models.
Extension to general multifactor and multidimensional models

follows directly. Consider the following model:
$$\begin{pmatrix} dX(t) \\ dV(t) \end{pmatrix} = \begin{pmatrix} \mu_1(X(t), \theta^{\dagger}) \\ \mu_2(V(t), \theta^{\dagger}) \end{pmatrix} dt + \begin{pmatrix} \sigma_{11}(V(t), \theta^{\dagger}) \\ 0 \end{pmatrix} dW_1(t) + \begin{pmatrix} \sigma_{21}(V(t), \theta^{\dagger}) \\ \sigma_{22}(V(t), \theta^{\dagger}) \end{pmatrix} dW_2(t), \qquad (14)$$
where W1,t and W2,t are independent standard Brownian motions, and where μ1 (·, ·), μ2 (·, ·), σ11 (·, ·), σ21 (·, ·), σ22 (·, ·),
and θ † are replaced with μ1,0 (·, ·), μ2,0 (·, ·), σ11,0 (·, ·),
σ21,0 (·, ·), σ22,0 (·, ·), and θ0 , respectively, under correct specification.
In the one-dimensional case, X(t) can be expressed as a
function of the driving Brownian motion, W(t). However, in
the multidimensional case, that is X(t) ∈ ℜp, X(t) cannot in general be expressed as a function of the p driving Brownian motions, but is instead a function of $\big(W_j(t), \int_0^t W_j(s)\,dW_i(s)\big)$, i, j = 1, . . . , p (see, e.g., Pardoux and Talay 1985, pp. 30–32).
For this reason, simple approximation schemes like the Euler
and Milstein schemes, which do not involve approximations of
stochastic integrals, may not be adequate. One case in which the
Milstein scheme does straightforwardly generalize to the multidimensional case is when the diffusion matrix is commutative.
Namely, write the diffusion matrix Σ(X) as
$$\Sigma(X) = \big( \sigma_1(X)\ \cdots\ \sigma_p(X) \big),$$
where $\sigma_i(X)$ is a p × 1 vector, for i = 1, . . . , p. If for all i, j = 1, . . . , p,
$$\left( \frac{\partial \sigma_j(X)}{\partial X_1}\ \cdots\ \frac{\partial \sigma_j(X)}{\partial X_p} \right) \sigma_i(X) = \left( \frac{\partial \sigma_i(X)}{\partial X_1}\ \cdots\ \frac{\partial \sigma_i(X)}{\partial X_p} \right) \sigma_j(X),$$
then Σ(X) is commutative.
However, almost all frequently used stochastic volatility
(SV) models violate the commutativity property (see, e.g., the
GARCH diffusion model of Nelson 1990; Heston 1993; and the
general eigenfunction stochastic volatility models of Meddahi
2001). A simple way of imposing commutativity is to assume
no leverage. Now, the nonleverage assumption is suitable for
exchange rates, but not for stock returns. In this situation, more
“sophisticated” approximation schemes are necessary.
It is immediate to see that the diffusion in (14) violates the commutativity property. Now, let
$$\mu(\cdot, \cdot) = \begin{pmatrix} \mu_1(\cdot, \cdot) \\ \mu_2(\cdot, \cdot) \end{pmatrix}, \qquad \sigma = \begin{pmatrix} \sigma_{11}(\cdot, \cdot) & 0 \\ \sigma_{21}(\cdot, \cdot) & \sigma_{22}(\cdot, \cdot) \end{pmatrix}, \qquad (15)$$
and define the following generalized Milstein scheme [see Kloeden and Platen 1999, p. 346, eq. (3.3)]:
$$\begin{aligned} X^{\theta}_{(k+1)h} &= X^{\theta}_{kh} + \tilde{\mu}_1\big(X^{\theta}_{kh}, \theta\big)h + \sigma_{11}\big(V^{\theta}_{kh}, \theta\big)\epsilon_{1,(k+1)h} + \sigma_{12}\big(V^{\theta}_{kh}, \theta\big)\epsilon_{2,(k+1)h} \\ &\quad + \frac{1}{2}\sigma_{22}\big(V^{\theta}_{kh}, \theta\big)\frac{\partial \sigma_{12}(V^{\theta}_{kh}, \theta)}{\partial V}\epsilon^2_{2,(k+1)h} + \sigma_{22}\big(V^{\theta}_{kh}, \theta\big)\frac{\partial \sigma_{11}(V^{\theta}_{kh}, \theta)}{\partial V} \int_{kh}^{(k+1)h}\!\!\int_{kh}^{s} dW_{2,\tau}\,dW_{1,s}, \end{aligned} \qquad (16)$$
$$V^{\theta}_{(k+1)h} = V^{\theta}_{kh} + \tilde{\mu}_2\big(V^{\theta}_{kh}, \theta\big)h + \sigma_{22}\big(V^{\theta}_{kh}, \theta\big)\epsilon_{2,(k+1)h} + \frac{1}{2}\sigma_{22}\big(V^{\theta}_{kh}, \theta\big)\frac{\partial \sigma_{22}(V^{\theta}_{kh}, \theta)}{\partial V}\epsilon^2_{2,(k+1)h}, \qquad (17)$$
where $h^{-1/2}\epsilon_{i,kh} \sim$ iid $N(0, 1)$, i = 1, 2, $E(\epsilon_{1,kh}\epsilon_{2,mh}) = 0$ for all k and m, and
$$\tilde{\mu}(V, \theta) = \begin{pmatrix} \tilde{\mu}_1(V, \theta) \\ \tilde{\mu}_2(V, \theta) \end{pmatrix} = \begin{pmatrix} \mu_1(V, \theta) - \frac{1}{2}\sigma_{22}(V, \theta)\frac{\partial \sigma_{12}(V, \theta)}{\partial V} \\ \mu_2(V, \theta) - \frac{1}{2}\sigma_{22}(V, \theta)\frac{\partial \sigma_{22}(V, \theta)}{\partial V} \end{pmatrix}.$$
The last terms on the right side of (16) involve stochastic integrals and cannot be explicitly computed. However, they can be approximated, up to an error of order o(h), by [see Kloeden and Platen 1999, p. 347, eq. (3.7)]
$$\int_{kh}^{(k+1)h}\!\!\int_{kh}^{s} dW_{2,\tau}\,dW_{1,s} \approx h\left[ \frac{1}{2}\xi_1\xi_2 + \sqrt{\rho_p}\,(v_{2,p}\xi_1 - v_{1,p}\xi_2) \right] + \frac{h}{2\pi}\sum_{r=1}^{p}\frac{1}{r}\left[ \varsigma_{2,r}\big(\sqrt{2}\,\xi_1 + \eta_{1,r}\big) - \varsigma_{1,r}\big(\sqrt{2}\,\xi_2 + \eta_{2,r}\big) \right], \qquad (18)$$
where for j = 1, 2, $\xi_j$, $v_{j,p}$, $\varsigma_{j,r}$, $\eta_{j,r}$ are iid N(0, 1), with $\xi_j = h^{-1/2}\epsilon_{j,(k+1)h}$, $\rho_p = \frac{1}{12} - \frac{1}{2\pi^2}\sum_{r=1}^{p}\frac{1}{r^2}$, and p is such that as h → 0, p → ∞.
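The truncated expansion in (18) can be implemented directly; the sketch below draws the auxiliary standard normals and returns one realization of the approximate double stochastic integral for a given pair of Brownian increments over the step. The truncation level p is left to the user and, as noted above, should grow as h shrinks; everything else follows (18).

```python
import numpy as np

def double_integral_approx(eps1, eps2, h, p, rng):
    """Approximate int_{kh}^{(k+1)h} int_{kh}^{s} dW_{2,tau} dW_{1,s} as in (18),
    where eps1, eps2 are the N(0, h) increments of W1, W2 over the step and
    xi_j = h^{-1/2} eps_j. p is the truncation level of the series."""
    xi1, xi2 = eps1 / np.sqrt(h), eps2 / np.sqrt(h)
    rho_p = 1.0 / 12.0 - sum(1.0 / r ** 2 for r in range(1, p + 1)) / (2.0 * np.pi ** 2)
    v1, v2 = rng.normal(size=2)            # v_{1,p}, v_{2,p}
    zeta = rng.normal(size=(2, p))         # varsigma_{1,r}, varsigma_{2,r}
    eta = rng.normal(size=(2, p))          # eta_{1,r}, eta_{2,r}
    r = np.arange(1, p + 1)
    series = np.sum((zeta[1] * (np.sqrt(2) * xi1 + eta[0])
                     - zeta[0] * (np.sqrt(2) * xi2 + eta[1])) / r)
    return (h * (0.5 * xi1 * xi2 + np.sqrt(rho_p) * (v2 * xi1 - v1 * xi2))
            + (h / (2.0 * np.pi)) * series)

rng = np.random.default_rng(4)
h = 0.01
e1, e2 = rng.normal(0.0, np.sqrt(h), size=2)
print(double_integral_approx(e1, e2, h, p=50, rng=rng))
```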
To construct conditional distributions, given the observable state variables, we need to perform the following steps.
Step 1. Simulate a path of length N using the schemes in (16), (17) and estimate θ by SGMM, as in (4). Also, retrieve $V^{\hat{\theta}_{T,N,h}}_{kh}$, for k = 1/h, . . . , K/h, with Kh = N, and hence obtain $V^{\hat{\theta}_{T,N,h}}_{j,h}$, j = 1, . . . , N (i.e., sample the simulated volatility at the same frequency as the data).
Step 2. Using the schemes in (16), (17), simulate S × N paths of length τ, setting the initial value for the observable state variable to be $X_t$. As we do not observe data on volatility, use the values simulated in the previous step as the initial value for the volatility process (i.e., as initial values for the unobservable state variable, use $V^{\hat{\theta}_{T,N,h}}_{j,h}$, j = 1, . . . , N). Also, keep the simulated randomness [i.e., $\epsilon_{1,kh}$, $\epsilon_{2,kh}$, $\int_{kh}^{(k+1)h}(\int_{kh}^{s} dW_{1,\tau})\,dW_{2,s}$] constant across j and t (i.e., constant across the different starting values for the unobservable and observable state variables). Define $X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau}$ to be the simulated τ-step-ahead value for the return series at replication s, using initial values $X_t$ and $V^{\hat{\theta}_{T,N,h}}_{j,h}$.
Step 3. As an estimator of $F_{\tau}(u|X_t, \theta^{\dagger})$, construct $\frac{1}{NS}\sum_{j=1}^{N}\sum_{s=1}^{S} 1\{X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau} \le u\}$. Note that, by averaging over the initial value of the volatility process, we have integrated out its effect. In other words, $\frac{1}{S}\sum_{s=1}^{S} 1\{X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau} \le u\}$ is an estimate of $F_{\tau}(u|X_t, V^{\hat{\theta}}_{j,h}, \hat{\theta})$.
Step 4. Construct the statistic of interest:
$$SV_T = \sup_{u \times v \in U \times V} |SV_T(u, v)|,$$
where
$$SV_T(u, v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{NS} \sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau} \le u\big\} - 1\{X_{t+\tau} \le u\} \right] 1\{X_t \le v\}.$$
Assumption A′. This assumption is the same as Assumption A, except that Assumption A(ii) is replaced by (ii′): μ(·, ·) and σ(·, ·) [as defined in (14) and (15)], and $\sigma_{lj}(V, \theta)\frac{\partial \sigma_{k\iota}(V, \theta)}{\partial V}$ are twice continuously differentiable, Lipschitz, with Lipschitz constant independent of θ, and grow at most at a linear rate, uniformly in Θ, for l, j, k, ι = 1, 2.
All of the results outlined in Sections 3 and 4 generalize to the current setting. In particular, the following results hold.
Proposition 5. Let Assumptions A′ and B hold. Assume that T, N, S → ∞. Then, if h → 0, T/N → 0, T²/S → ∞, and h²T → 0, the following result holds for any $X_t$, t ≥ 1:
$$\frac{1}{NS}\sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau} \le u\big\} - F_{\tau}(u|X_t, \theta^{\dagger}) \xrightarrow{pr} 0$$
uniformly in u. In addition, if the model is correctly specified, that is, if $\mu(\cdot, \cdot) = \mu_0(\cdot, \cdot)$ and $\sigma(\cdot, \cdot) = \sigma_0(\cdot, \cdot)$, then
$$\frac{1}{NS}\sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau} \le u\big\} - F_{0,\tau}(u|X_t, \theta_0) \xrightarrow{pr} 0$$
uniformly in u.
In the sequel, consider testing the hypotheses given in Section 4.1. However, in the present context correct specification of the conditional distribution of $X_{t+\tau}|X_t$ no longer implies correct specification of the underlying diffusion process. Indeed, correct specification of the stochastic volatility process is equivalent to correct specification of $X_{t+\tau}, V_{t+\tau}|X_t, V_t$. Also, although $X_t$ and $V_t$ are jointly Markovian, $X_t$ is no longer Markov. Therefore, by using only $X_t$ as a conditioning variable, we implicitly allow for dynamic misspecification under the null. Furthermore, it thus follows that the test has no power against diffusion processes having the same transition density as $X_{t+\tau}|X_t$.
Theorem 6. Let Assumptions A′ and B hold. Assume that T, N, S → ∞. Then, if h → 0, T/N → 0, T/S → 0, T²/S → ∞, Nh → 0, and h²T → 0, the following result holds under H0:

$$SV_T \xrightarrow{d} \sup_{u \times v \in U \times V} |SZ(u, v)|,$$
where SZ(u, v) is a Gaussian process with covariance kernel SK(u, u′, v, v′) given by
$$\begin{aligned} SK(u, u', v, v') &= \sum_{k=-\infty}^{\infty} E\Big( \big[F_{0,\tau}(u|X_1, \theta_0) - 1\{X_{1+\tau} \le u\}\big] 1\{X_1 \le v\} \big[F_{0,\tau}(u'|X_{1+k}, \theta_0) - 1\{X_{1+\tau+k} \le u'\}\big] 1\{X_{1+k} \le v'\} \Big) \\ &\quad + \mu_{SVf_{0,\tau}}(u, v)' (D^{0\prime} W_0 D^0)^{-1} \mu_{SVf_{0,\tau}}(u, v) \\ &\quad - 2\mu_{SVf_{0,\tau}}(u, v)' (D^{0\prime} W_0 D^0)^{-1} D^{0\prime} W^0 \sum_{j=-\infty}^{\infty} E\Big( \big(g(X_{1+j}) - E(g(X_1))\big) \big[F_{0,\tau}(u|X_{1+j}, \theta_0) - 1\{X_{1+\tau+j} \le u'\}\big] 1\{X_{1+j} \le v'\} \Big), \end{aligned}$$
where
$$\mu_{SVf_{0,\tau}}(u, v) = E\Big( f_{0,\tau}(u|X_1, \theta_0)\, E_{j,s}\big(\nabla_{\theta_0} X^{\theta_0}_{j,s,1+\tau}\big) 1\{X_1 \le v\} \Big)$$
and $E_{j,s}$ denotes the expectation with respect to the joint probability measure governing the simulated randomness and the volatility process, $V^{\theta_0}_{h,j}$, conditional on the sample. Furthermore, under HA, there exists some ε > 0 such that
$$\lim_{T \to \infty} \Pr\left( \frac{1}{\sqrt{T}} SV_T > \varepsilon \right) = 1.$$
Note that in a Bayesian context, Chib, Kim, and Shephard
(1998) and Elerian, Chib, and Shephard (2002) suggested likelihood ratio tests for the comparison of a stochastic volatility
model against a specific alternative model, for the discrete and
continuous cases, respectively. In their tests, the likelihood is
constructed via the joint use of probability integral transform
and particle filtering (see also Pitt and Shephard 1999; Pitt
2005). These tests differ from ours, as we are interested in assessing whether the conditional density of Xt+τ |Xt implied by
the null model is correct.
To obtain asymptotically valid critical values, resample as discussed above for the one-factor model, and note that there is no need to resample $V^{\theta}_{h,j}$. In particular, form bootstrap statistics as follows:
$$SV^*_T = \sup_{u \times v \in U \times V} |SV^*_T(u, v)|, \qquad (19)$$
where
$$SV^*_T(u, v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{NS} \sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{X^{\hat{\theta}^*_{i,T,N,h}}_{j,s,t+\tau,*} \le u\big\} - 1\{X^*_{t+\tau} \le u\} \right] 1\{X^*_t \le v\} - \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{NS} \sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{X^{\hat{\theta}_{i,T,N,h}}_{j,s,t+\tau} \le u\big\} - 1\{X_{t+\tau} \le u\} \right] 1\{X_t \le v\},$$
and where $X^{\hat{\theta}^*_{i,T,N,h}}_{j,s,t+\tau,*}$ is the simulated value at simulation s, constructed using $\hat{\theta}^*_{i,T,N,h}$ and using $X^*_t$ and $V^{\hat{\theta}^*_{i,T,N,h}}_{j,h}$ as initial values. This allows us to state the following theorem.
Theorem 7. Let Assumptions A′ and B hold. Assume that T, N, S → ∞. Then, if h → 0, T/N → 0, T/S → 0, T²/S → ∞, Nh → 0, h²T → 0, l → ∞, and l²/T → 0, the following result holds under H0:
$$P\Big( \omega : \sup_{x \in \Re} \big| P^*\big(SV^*_T(\omega) \le x\big) - P\big((SV_T - E(SV_T)) \le x\big) \big| > \varepsilon \Big) \to 0.$$

As in the single-factor case, the above results suggest proceeding in the following manner. For any bootstrap replication,
compute the bootstrap statistic, SV ∗T . Perform B bootstrap replications (B large) and compute the quantiles of the empirical
distribution of the B bootstrap statistics. Reject H0 , if SV T is
greater than the (1 − α)th percentile. Otherwise, do not reject.
Now, under the null, for all samples except a set with probability measure approaching zero, SV T and SV ∗T have the same limiting distribution, ensuring asymptotic size equal to α. Under
the alternative, SV T diverges to (plus) infinity, whereas the corresponding bootstrap statistic has a well-defined limiting distribution, ensuring unit asymptotic power.
6. EXPERIMENTAL AND EMPIRICAL RESULTS
In this section, we briefly illustrate the above testing methodology for the case in which we are interested in specification
testing from the perspective of conditional confidence intervals.
In particular, consider $V_T = \sup_{v \in V} |V_T(v)|$, where
$$V_T(v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{S} \sum_{s=1}^{S} 1\big\{\underline{u} \le X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le \bar{u}\big\} - 1\{\underline{u} \le X_{t+\tau} \le \bar{u}\} \right] 1\{X_t \le v\}.$$
Additionally, define the following bootstrap statistic: $V^*_T = \sup_{v \in V} |V^*_T(v)|$, where
$$V^*_T(v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{S} \sum_{s=1}^{S} 1\big\{\underline{u} \le X^{\hat{\theta}^*_{T,N,h}}_{s,t+\tau,*} \le \bar{u}\big\} - 1\{\underline{u} \le X^*_{t+\tau} \le \bar{u}\} \right] 1\{X^*_t \le v\} - \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{S} \sum_{s=1}^{S} 1\big\{\underline{u} \le X^{\hat{\theta}_{T,N,h}}_{s,t+\tau} \le \bar{u}\big\} - 1\{\underline{u} \le X_{t+\tau} \le \bar{u}\} \right] 1\{X_t \le v\}.$$

It is immediate to see that the above statistic is a version of the distributional test discussed above, so that all of the theoretical results outlined in Sections 4 and 5 hold. For the case of stochastic volatility models, consider the statistic $SV_T = \sup_{v \in V} |SV_T(v)|$, where
$$SV_T(v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{NS} \sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{\underline{u} \le X^{\hat{\theta}_{T,N,h}}_{j,s,t+\tau} \le \bar{u}\big\} - 1\{\underline{u} \le X_{t+\tau} \le \bar{u}\} \right] 1\{X_t \le v\},$$

and its bootstrap analog $SV^*_T = \sup_{v \in V} |SV^*_T(v)|$, where
$$SV^*_T(v) = \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{NS} \sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{\underline{u} \le X^{\hat{\theta}^*_{i,T,N,h}}_{j,s,t+\tau,*} \le \bar{u}\big\} - 1\{\underline{u} \le X^*_{t+\tau} \le \bar{u}\} \right] 1\{X^*_t \le v\} - \frac{1}{\sqrt{T - \tau}} \sum_{t=1}^{T-\tau} \left[ \frac{1}{NS} \sum_{j=1}^{N}\sum_{s=1}^{S} 1\big\{\underline{u} \le X^{\hat{\theta}_{i,T,N,h}}_{j,s,t+\tau} \le \bar{u}\big\} - 1\{\underline{u} \le X_{t+\tau} \le \bar{u}\} \right] 1\{X_t \le v\}.$$
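Relative to the distributional statistic of Section 4, the only change is that the CDF indicators are replaced by indicators of the interval $[\underline{u}, \bar{u}]$. The sketch below, with artificial inputs and the same conventions as the earlier $V_T$ sketch, computes the confidence-interval version $\sup_v |V_T(v)|$ for a single interval; it is illustrative only.

```python
import numpy as np

def vt_interval_statistic(x, x_sim, tau, u_lo, u_hi, v_grid):
    """sup_v |V_T(v)| with interval indicators 1{u_lo <= . <= u_hi} replacing
    the CDF indicators of the distributional test; x_sim has shape (T - tau, S),
    row t holding the S simulated tau-step-ahead values started at x[t]."""
    n_eff = len(x) - tau
    V = np.zeros(len(v_grid))
    for t in range(n_eff):
        cover_sim = ((x_sim[t] >= u_lo) & (x_sim[t] <= u_hi)).mean()  # simulated coverage
        cover_obs = float(u_lo <= x[t + tau] <= u_hi)                 # realized indicator
        V += (cover_sim - cover_obs) * (x[t] <= v_grid)
    return np.abs(V / np.sqrt(n_eff)).max()

rng = np.random.default_rng(5)
x = rng.normal(size=200)
x_sim = rng.normal(size=(196, 500))
print(vt_interval_statistic(x, x_sim, tau=4, u_lo=-0.5, u_hi=0.5,
                            v_grid=np.linspace(-2, 2, 21)))
```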

In the sequel we shall study a common version of the square root process discussed in Cox, Ingersoll, and Ross (CIR) (1985), the square root stochastic volatility model of Heston (1993), called SV, and a stochastic volatility model with jumps, called SVJ. The specifications of these models are as follows:

CIR: $dr(t) = k_r(\bar{r} - r(t))\,dt + \sigma_r\sqrt{r(t)}\,dW_r(t)$, where $k_r > 0$, $\sigma_r > 0$, and $2k_r\bar{r} \ge \sigma_r^2$.

SV: $dr(t) = k_r(\bar{r} - r(t))\,dt + \sqrt{V(t)}\,dW_r(t)$, and $dV(t) = k_v(\bar{v} - V(t))\,dt + \sigma_v\sqrt{V(t)}\,dW_v(t)$, where $W_r(t)$ and $W_v(t)$ are independent Brownian motions, and where $k_r > 0$, $\sigma_r > 0$, $k_v > 0$, $\sigma_v > 0$, and $2k_v\bar{v} > \sigma_v^2$.

SVJ: $dr(t) = k_r(\bar{r} - r(t))\,dt + \sqrt{V(t)}\,dW_r(t) + J_u\,dq_u - J_d\,dq_d$, and $dV(t) = k_v(\bar{v} - V(t))\,dt + \sigma_v\sqrt{V(t)}\,dW_v(t)$, where $W_r(t)$ and $W_v(t)$ are independent Brownian motions, and where $k_r > 0$, $\sigma_r > 0$, $k_v > 0$, $\sigma_v > 0$, and $2k_v\bar{v} > \sigma_v^2$. Further, $q_u$ and $q_d$ are Poisson processes with jump intensities $\lambda_u$ and $\lambda_d$, and are independent of the Brownian motions $W_r(t)$ and $W_v(t)$. Jump sizes are iid and are controlled by jump magnitudes $\zeta_u, \zeta_d > 0$, which are drawn from exponential distributions, with densities $f(J_u) = \frac{1}{\zeta_u}\exp(-\frac{J_u}{\zeta_u})$ and $f(J_d) = \frac{1}{\zeta_d}\exp(-\frac{J_d}{\zeta_d})$. Here, $\lambda_u$ is the probability of a jump up, $\Pr(dq_u(t) = 1) = \lambda_u$, and jump-up size is controlled by $J_u$, whereas $\lambda_d$ and $J_d$ control jump-down intensity and size. Note that the case of Poisson jumps with constant intensity and jump size with exponential density is covered by the assumptions stated in the previous sections.
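A minimal simulation of the SVJ specification is sketched below. It uses a simple Euler step for both equations, reflection of the variance at zero, and a Bernoulli approximation to the constant-intensity Poisson jumps over each small step; these are simplifications made only for illustration, since the article simulates the diffusion part with the schemes of Section 5 and, as noted in the next subsection, the jump component can in fact be simulated without error. The parameter ordering matches the SVJ parameterization listed in Section 6.1.

```python
import numpy as np

def simulate_svj(params, r0, v0, T, h, rng):
    """Euler-type discretization of the SVJ model with exponential jump sizes:
    dr = k_r(rbar - r)dt + sqrt(V)dW_r + J_u dq_u - J_d dq_d,
    dV = k_v(vbar - V)dt + sigma_v sqrt(V)dW_v."""
    kr, rbar, kv, vbar, sv, lam_u, zeta_u, lam_d, zeta_d = params
    n = int(T / h)
    r, v = r0, v0
    out = np.empty(n)
    for i in range(n):
        dwr, dwv = rng.normal(0.0, np.sqrt(h), size=2)
        # Bernoulli approximation to a jump over a step of length h (lam * h small)
        jump_u = rng.exponential(zeta_u) if rng.random() < lam_u * h else 0.0
        jump_d = rng.exponential(zeta_d) if rng.random() < lam_d * h else 0.0
        r += kr * (rbar - r) * h + np.sqrt(max(v, 0.0)) * dwr + jump_u - jump_d
        v = max(v + kv * (vbar - v) * h + sv * np.sqrt(max(v, 0.0)) * dwv, 0.0)
        out[i] = r
    return out

rng = np.random.default_rng(6)
params = (0.30, 0.05, 2.5, 0.0005, 0.01, 8.00, 0.001, 1.50, 0.001)  # SVJ values of Sec. 6.1
path = simulate_svj(params, r0=0.05, v0=0.0005, T=400, h=0.01, rng=rng)
print(path[:5])
```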
6.1 Monte Carlo Experiment
In this subsection, parameterizations for the above models
were selected via examination of models estimated using the
interest rate data discussed in the next subsection, and include:
CIR—$(k_r, \bar{r}, \sigma_r)$ = {(.15, .05, .10), (.30, .05, .10), (.50, .05, .10)}; SV—$(k_r, \bar{r}, k_v, \bar{v}, \sigma_v)$ = (.30, .05, 5.00, .0005, .01); SVJ—$(k_r, \bar{r}, k_v, \bar{v}, \sigma_v, \lambda_u, \zeta_u, \lambda_d, \zeta_d)$ = (.30, .05, 2.5, .0005, .01, 8.00, .001, 1.50, .001). Note that $k_r$ in the data was estimated to be approximately .30, and our DGPs set $k_r$ = .15, .30, .5. Thus, we include a case where there is substantially less mean reversion than that observed in the data. For a discussion of the crucial issue of persistence and its effect on tests related to that discussed in this paper, the reader is referred to Pritsker (1998). Note that the parameters estimated for the three models, ordered as CIR, SV, and SVJ, are $k_r$ = .31, .24, .50; $\bar{r}$ = .068, .059, .068; $\sigma_r$ = .11; $k_v$ = 5.12, 1.56; $\bar{v}$ = .0003, .0009; $\sigma_v$ = .014, .029; $\lambda_u$ = 7.86; $\zeta_u$ = .0004; $\lambda_d$ = 2.08; and $\zeta_d$ = .0008. Data were generated using the Milstein scheme discussed earlier with h = 1/T, for T = {400, 800}.
The jump component can be simulated without any error, because of the constancy of the intensity parameter.
The three models we study (i.e., the CIR, SV, and SVJ models) fall in the class of affine diffusions. Therefore, it is possible to compute in closed form the conditional characteristic
function; for the CIR model we follow Singleton (2001), for
the SV model Jiang and Knight (2002), and for the SVJ model
Chacko and Viceira (2003). Given the conditional characteristic function, we compute as many conditional moments as the
number of parameters to be estimated and thus we implement
exactly identified GMM. (Note that the criticism of Andersen
and Sorensen 1996 about the use of exact GMM does not necessarily apply to the case of conditional moments obtained via
the conditional characteristic function.)
In our experiment, all empirical bootstrap distributions were
constructed using 100 bootstrap replications, and critical values were set equal to the 90th percentile of the bootstrap distributions. For the bootstrap, block lengths of 5, 10, 20, and 50
were tried. Additionally, we set S = {10T, 20T}, and for models SV and SVJ we set N = S. Tests were carried out using τ-step-ahead confidence intervals, for τ = {1, 2, 4, 12}, and we set $(\underline{u}, \bar{u})$ equal to $\bar{X} \pm .5\sigma_X$ and $\bar{X} \pm \sigma_X$, where $\bar{X}$ and $\sigma_X$ are the mean and variance of an initial sample of data. Finally, $[\underline{v}, \bar{v}]$ was set equal to $[X_{\min}, X_{\max}]$, and a grid of 100 equally spaced values for v across this range was used, where $X_{\min}$ and $X_{\max}$ are again fixed using the initial sample. All results are based on 500 Monte
Carlo iterations.
Results are gathered in Tables 1–3. Tables 1 and 2 report results for