07350015%2E2012%2E754313
Journal of Business & Economic Statistics
ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20
Preaveraging-Based Estimation of Quadratic
Variation in the Presence of Noise and Jumps:
Theory, Implementation, and Empirical Evidence
Nikolaus Hautsch & Mark Podolskij
To cite this article: Nikolaus Hautsch & Mark Podolskij (2013) Preaveraging-Based Estimation
of Quadratic Variation in the Presence of Noise and Jumps: Theory, Implementation,
and Empirical Evidence, Journal of Business & Economic Statistics, 31:2, 165-183, DOI:
10.1080/07350015.2012.754313
To link to this article: http://dx.doi.org/10.1080/07350015.2012.754313
Accepted author version posted online: 04
Feb 2013.
Submit your article to this journal
Article views: 276
View related articles
Full Terms & Conditions of access and use can be found at
http://www.tandfonline.com/action/journalInformation?journalCode=ubes20
Download by: [Universitas Maritim Raja Ali Haji]
Date: 11 January 2016, At: 22:05
Preaveraging-Based Estimation of Quadratic
Variation in the Presence of Noise and Jumps:
Theory, Implementation, and Empirical Evidence
Nikolaus HAUTSCH
Institute for Statistics and Econometrics and Center for Applied Statistics and Economics (CASE),
Humboldt-Universität zu Berlin and Center for Financial Studies (CFS), Frankfurt, D-10178 Berlin, Germany
([email protected])
Mark PODOLSKIJ
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Institute of Applied Mathematics, University of Heidelberg, D-69120 Heidelberg, Germany
([email protected])
This article contributes to the theory for preaveraging estimators of the daily quadratic variation of asset
prices and provides novel empirical evidence. We develop asymptotic theory for preaveraging estimators in
the case of autocorrelated microstructure noise and propose an explicit test for serial dependence. Moreover,
we extend the theory on preaveraging estimators for processes involving jumps. We discuss several jumprobust measures and derive feasible central limit theorems for the general quadratic variation. Using
transaction data of different stocks traded at the New York Stock Exchange, we analyze the estimators’
sensitivity to the choice of the preaveraging bandwidth. Moreover, we investigate the dependence of
preaveraging-based inference on the sampling scheme, the sampling frequency, microstructure noise
properties, and the occurrence of jumps. As a result of a thorough empirical study, we provide guidance
for optimal implementation of preaveraging estimators and discuss potential pitfalls in practice.
KEY WORDS: High frequency data; Jump diffusions; Power variations; Sampling schemes; Volatility
estimation.
1.
INTRODUCTION
The availability of high-frequency data has significantly influenced research on volatility estimation during the last decade.
Inspired by the seminal articles by Andersen et al. (2001) and
Barndorff-Nielsen and Shephard (2002), the idea of estimating
daily volatility using realized measures relying on intraday data
has stimulated a new and active area of volatility modeling and
estimation.
This article provides new theory, implementation details, and
extensive and detailed empirical results for the class of preaveraging estimators originally introduced by Jacod et al. (2009)—
henceforth JLMPV09. The article’s aim is two-fold. First, we extend existing theory for preaveraging estimators in the presence
of serially dependent noise and jumps. We derive the asymptotic
properties of preaveraging estimators if market microstructure
noise is autocorrelated and suggest an explicit test for dependence up to q lags (so-called q dependence). To address the
occurrence of jumps, we derive a feasible central limit theorem for preaveraged multipower estimators and appropriate
local estimators for underlying components of the estimators’
asymptotic variance. Second, we provide a thorough empirical
study on the properties of preaveraging estimators in practice.
We study their empirical performance relative to alternative
state-of-the-art estimators and analyze the impact of implementation details, such as the choice of local interval lengths, the
role of sampling frequency, finite-sample adjustments, and the
impact of the sampling scheme depending on noise properties
and jumps.
The main difficulty in estimating the daily quadratic variation of asset prices using noisy high-frequency data is how to
optimally employ a maximum of information without being affected by so-called market microstructure noise arising from
market frictions, such as the bid–ask bounce or the discreteness
of prices, inducing a deviation of the price process from a continuous semimartingale process. As discussed, for example, by
Hansen and Lunde (2006), among others, market microstructure
noise can lead to severe biases and inconsistency of the estimators when the sampling frequency tends to infinity. To overcome
these effects, sparse sampling, for example, based on 20 min, has
been proposed as an ad hoc solution (see, e.g., Andersen et al.
2003). The major disadvantage of sparse sampling, however,
is the enormous data loss and the resulting inefficiency of the
estimator. As a result, various methods for bias corrections and
filtering of noise effects have been proposed in the literature. The
most prominent and general approaches are the realized kernel
(RK) estimator by Barndorff-Nielsen et al. (2008a) (henceforth
BHLS08a), the maximum likelihood estimator by Aı̈t-Sahalia,
Mykland, and Zhang (2005), and the preaveraging estimator by
JLMPV09. Andersen, Bollerslev, and Diebold (2008) provided
an overview of alternative approaches. For sound practical implementations, however, a good understanding of the estimators’
finite-sample behavior, their dependence on plug-in estimators
165
© 2013 American Statistical Association
Journal of Business & Economic Statistics
April 2013, Vol. 31, No. 2
DOI: 10.1080/07350015.2012.754313
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
166
and data-specific adjustments, and implementation details is crucial. In the case of kernel-type estimators, the empirical performance and properties are analyzed by Hansen and Lunde (2006)
and Barndorff-Nielsen et al. (2008b)—henceforth BHLS08b.
The latter shows that the estimators’ empirical (finite-sample)
properties can significantly deviate from theoretical (asymptotic) properties making finite-sample adjustments necessary
and ultimately important in financial practice. The first study
analyzing the performance of preaveraging estimators is Andersen, Dobrev, and Schaumburg (2010) who compared these
estimators with jump-robust nearest neighbor truncation estimators. Nevertheless, preaveraging estimators’ empirical properties and their sensitivity to implementation details as well as
to data features are still widely unknown.
The article’s theoretical contribution is two-fold: First, we develop asymptotic theory for preaveraging estimators in the case
of serially dependent market microstructure noise. In this context, we derive asymptotic theory for preaveraging estimators if
microstructure noise is q dependent and propose a corresponding autocorrelation test. Second, we construct a feasible central
limit theorem for the quadratic variation including a jump part
by estimating the variation’s asymptotic variance involving local jump-robust estimates of the noise and diffusive volatility
components. This makes the pure probabilistic results of Jacod, Podolskij, and Vetter (2009), who proved the asymptotic
mixed normality of the estimator, applicable for determining
confidence regions for the general quadratic variation.
The article’s second major contribution is of empirical nature
and is motivated by the need for a better understanding of how
preaveraging estimators and corresponding inference work in
practice, and how they should be optimally implemented. We
study the estimators’ dependence on the choice of the underlying
local preaveraging interval, the role of the sampling scheme (i.e.,
calendar time sampling vs. business time sampling, transaction
price sampling vs. midquote price sampling), and the impact of
the sampling frequency. Of particular interest is how the derived
noise-robust and jump-robust inference is reflected by the data
and how the estimators perform relative to alternative state-ofthe-art estimators.
Employing preaveraging-based inference to transaction and
quote data of a cross section of New York Stock Exchange
(NYSE) stocks covering different trading frequencies, we analyze the estimators’ sensitivity to the length of the underlying
local preaveraging interval. We show that too small choices of
the length of the local window induce biases and make estimators quite unstable. Addressing an inherent trade-off between
estimators’ bias and variance, we suggest a rule of thumb for
choosing the local window. In this context, finite-sample adjustments of the estimator and the underlying sampling scheme are
particularly important. Using an optimal preaveraging scheme,
we find jump proportions between 5% and 10% on average.
Ignoring the possibility of jumps in the price process leads to an
overestimation of confidence intervals that can be substantial in
the case of less liquid assets. Moreover, our results suggest implementing preaveraging estimators based on maximally high
sampling frequencies. A reduction in sampling frequency tends
to imply an “oversmoothing” of volatility resulting in negative
biases. As a result of this analysis, we derive suggestions for
optimal implementation of this class of estimators and tests in
practice.
Journal of Business & Economic Statistics, April 2013
The remainder of the article is organized as follows. In Section 2, we present the basic case with underlying conditionally
independent noise without jumps, discuss the underlying theoretical framework, and provide evidence on the optimal choice
of the preaveraging interval. Section 3 considers preaveraging
estimators under the assumption of continuous semimartingales
with dependent noise and presents empirical evidence based
on an explicit test. In Section 4, we discuss the discontinuous
case including the possibility of jumps. Finally, in Section 5,
the time series properties and the impact of sampling frequencies and schemes are analyzed, while Section 6 concludes. All
proofs are collected in the Appendix.
2. PREAVERAGING ESTIMATORS FOR THE
QUADRATIC VARIATION: THE CONTINUOUS CASE
2.1 The Basic Model
It is well known in finance that, under the no-arbitrage assumption, price processes must follow a semimartingale (see,
e.g., Delbaen and Schachermayer 1994). In this section, we
consider a continuous semimartingale (Xt )t≥0 of the form
t
t
Xt = X0 +
as ds +
σs dWs .
(2.1)
0
0
Here, W denotes a one-dimensional Brownian motion, (as )s≥0 is
a càglàd drift process, and (σs )s≥0 is an adapted càdlàg volatility
process. Moreover, t denotes (continuous) calendar time.
However, due to various market frictions, such as price discreteness or bid–ask spreads, the efficient price X is contaminated by noise. Thus, we can only observe a noisy version Z of
the process X. More precisely, we consider a filtered probability space (, F, (Ft )t≥0 , P) on which we define the process Z,
observed at n time points, indexed by i = 0, 1, . . . , n, as
Zt = Xt + Ut ,
t ≥ 0,
(2.2)
where Ut denotes the error term. In the case of transaction time
sampling (TRTS), i indexes the (irregular) time points associated
with each n -th trade. Hence, n = [N/n ] with N denoting the
number of trades until t. In the case of calendar time sampling
(CTS), i indexes equal-spaced time intervals of length n with
n = [t/n ]. Here, the price of the most previous observation
occurring before the end of the sampling interval is used (previous tick sampling). For convenience, we show all theoretical
relationships using the CTS notation. However, all relationships
also hold for TRTS with i, n , and n accordingly defined.
We are in the framework of high-frequency asymptotics or
so-called “in-fill” asymptotics, that is, n → 0. We assume that
Ut ’s are, conditional on the efficient price X = (Xs )s≥0 , centered
and independent, that is,
E(Ut |X) = 0,
Ut ⊥⊥ Us , t = s, conditional on X. (2.3)
Furthermore, we assume that the conditional variance of the
noise process U, defined as
(2.4)
αt2 = E Ut2 |X ,
is càdlàg, and introduce the process
Nt (q) = E(|Ut |q |X),
(2.5)
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
which denotes the qth conditional absolute moment of the process U. Note that the noise specification does not entail a noise
level tending to 0 as n → ∞. This property is an attractive
feature for empirical work and will also be illustrated in examples given later. Model (2.2) was originally introduced by
JLMPV09. Note that the process (2.2) with conditions (2.3) and
(2.4) is defined in continuous time (see JLMPV09 for the precise
construction of the underlying probability space). In Section 3,
we discuss a discrete version of the model to introduce serial
dependence in noise.
Model (2.2) allows in particular for time-varying variances of
the noise and dependence between the efficient price X and the
microstructure noise U. Let us give some examples to illustrate
the applicability of our model. (i) The standard additive iid noise
model
such that g ′ is piecewise Lipschitz, and g(0) = g(1) = 0. Then,
g is associated with the following real-valued numbers:
1
1
ψ1 =
(g ′ (s))2 ds, ψ2 =
(g(s))2 ds,
φ1 (s) =
Zt = γ [(Xt + Vt )/γ ],
where γ > 0 is a fixed rounding level, and Vt ’s are iid U([0, γ ])
distributed and independent of X. Then the process Ut = Zt −
Xt satisfies the assumption (2.3), and we have the identity
Xt
Xt 2
2
2
−
αt = γ
,
γ
γ
with {x} = x − [x]. The process
αt2
0
0
1
s
g ′ (u)g ′ (u − s)du,
Our main goal is the estimation of the integrated variance
(IV) (or quadratic variation) defined as
t
IVt = [X]t =
σu2 du.
(2.6)
0
The preaveraging method as originally introduced by Podolskij
and Vetter (2009a) is based on certain local moving averages that
reduce the influence of the noise process U. Below, we briefly
review the asymptotic theory for the preaveraging method as
presented in JLMPV09.
To construct the estimator, we choose a sequence kn of integers, which satisfies
(2.7)
kn n1/2 = θ + o n1/4
for some θ > 0, and a nonzero real-valued function g : [0, 1] →
R, which is continuous, piecewise continuously differentiable
1
s
g(u)g(u − s)du,
A typical example of a function g : [0, 1] → R, which we use
in the empirical applications, is given by g(x) = x ∧ (1 − x). In
this case, the aforementioned constants are
ψ1 = 1,
22 =
ψ2 =
1
,
12
11 =
1
,
6
12 =
1
,
96
151
.
80,640
The preaveraged returns are given as
n
Zi
kn
j
ni+j Z,
g
=
k
n
j =1
ni Z = Zin − Z(i−1)n . (2.8)
Note that the latter performs a weighted averaging of the increments nj Z in the local window [in , (i + kn )n ]. It is intuitively clear that such an averaging diminishes the influence
of the noise to some extent. The window size kn is chosen to
−1/2
be of order n to find a balance between the two conflicting
convergence rates that are due to the diffusive part and the noise
part. This choice will lead later to optimal convergence rates.
Our main statistical instruments are the realized variance (RV)
[t/n ]
RVnt
=
n Z 2 ,
i
(2.9)
i=0
and the preaveraging estimator
is obviously càdlàg.
2.2 The Preaveraging Method and Asymptotic Results
φ2 (s) =
i, j = 1, 2.
0
where Ut ’s are iid with E(Ut ) = 0, E(Ut2 ) = ̟ 2 , and independent of the efficient price X, is obviously included in our
framework. (ii) The pure rounding model
at fixed level γ > 0 does not satisfy the first assumption of
(2.3). In fact, it is easy to observe that the integrated variance
cannot be estimated in the pure rounding model, because only
observations on a fixed grid γ Z are available.
(iii) In contrast to the pure rounding process in (ii), iid noise
plus rounding models may satisfy our assumptions. Consider
the model
s ∈ [0, 1],
1
ij =
φi (s)φj (s)ds,
Zt = Xt + Ut ,
Zt = γ [Xt /γ ]
167
[t/n ]−kn
V (Z, 2)nt =
i=0
n 2
Z ,
i
(2.10)
which is a direct analog of RV based on preaveraged returns
n
Z i . We remark that RVnt is not an appropriate estimator of the
integrated volatility in the noisy diffusion model (2.2). More
precisely, we have that
t
n n P
RVt −→
αu2 du,
(2.11)
2
0
that is, a normalized version of RVnt converges to the integrated conditional variance of the noise process U (see, e.g.,
JLMPV09).
Denote a mixed normal distribution with expectation 0
and (conditional) variance F 2 by MN(0, F 2 ) and assume that
the process Nt (2) is locally bounded. Then, as shown by
JLMPV09, we have
√
t
t
n
ψ1
2
n P
α 2 du, (2.12)
σu du + 2
V (Z, 2)t −→
θ ψ2
θ ψ2 0 u
0
168
Journal of Business & Economic Statistics, April 2013
where by (2.11) a consistent estimator of IVt is obtained by
Ctn :=
√
n
ψ1 n
P
V (Z, 2)nt − 2 RVnt −→ IVt =
θ ψ2
2θ ψ2
for given constants a, b, c. Following the arguments of remark
4 in JLMPV09, a consistent estimator of T is obtained by
t
σu2 du.
0
(2.13)
Ttn =
a
2
3θ ψ22
[t/n ]−kn
i=0
[t/n ]−2kn
Hence, the consistency of
is achieved by subtracting the
noise-induced bias term that is similar to Zhang, Mykland, and
Aı̈t-Sahalia (2005). Furthermore, under the assumption that the
process Nt (8) is locally bounded, JLMPV09 showed that
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
22 θ σu4
σ 2α2
α4
+ 212 u u + 11 u3
θ
θ
.
(2.15)
1/4
The estimator Ctn has the convergence rate n , which is
known to be the best attainable (see Gloter and Jacod 2001).
To obtain a feasible version of stable convergence in (2.14),
JLMPV09 suggested a consistent estimator of the conditional
variance Ŵt given by
Ŵtn =
422
3θ ψ24
+
+
[t/n ]−kn
i=0
4n
θ3
n
θ3
n 4
Z
i
[t/
n ]−2kn
12
22 ψ1
−
3
ψ24
ψ2
i=0
n 2
Z
11
12 ψ1
22 ψ1
−2
+
2
3
ψ2
ψ24
ψ2
i+2k
n
i
2
j =i+kn +1
P
n Z 2 n Z 2 −→
Ŵt .
i+2
i
n 2
Z
j
(2.16)
i=1
The statistic Ŵtn is obtained by separately estimating the three
integrals appearing in the definition of Ŵt . It has the appealing property of being positive as long as the weight function
g(x) = x ∧ (1 − x) is used. Then, by the properties of stable
convergence, we deduce the convergence in distribution
−1/4
n
Ctn − IVt
d
n
−→ N(0, 1).
Ŵt
(2.17)
Clearly, we are also
t able to estimate other random quantities
of similar form as 0 γu2 du. Consider a random variable of the
type
Tt = a
t
0
σu4 du
+b
i
i+2k
n
j =i+kn +1
n 2
Z
aψ12 − bθ 2 ψ1 ψ2
c+
θ 4 ψ22
j
P
n Z 2 n Z 2 −→
Tt .
i+2
i
×
(2.18)
i=1
For instance, when taking (a, b, c) = (1, 0, 0),
t we obtain a consistent estimator of the integrated quarticity 0 σu4 du.
To reduce the finite-sample bias in the case of too small
sampling frequencies, we need to slightly modify our statistics.
First, let us introduce the finite-sample analogs of the quantities
ψ1 , ψ2 , ij , φ1 (s), φ2 (s) as given by
ψ1kn = kn
ψ2kn =
2
kn
j +1
j
g
−g
,
k
k
n
n
j =1
kn −1
j
1
,
g2
kn j =1
kn
kn
i
i−1
−g
g
k
k
n
n
i=j +1
i−j
i−j −1
× g
−g
,
kn
kn
kn
i−j
i
φ2kn (j ) =
g
,
g
kn
kn
i=j +1
⎞
⎛
k
n −1
kn 2 1 kn 2
kn
11 = kn ⎝
φ1 (j ) − φ1 (0) ⎠ ,
2
j =0
φ1kn (j ) =
[t/n ]−2
×
n
+
4
(2.14)
where the convergence is stable in distribution [see, e.g.,
Renyi
(1963) for the definition of stable convergence], and
t
Ŵt = 0 γu2 du and γu2 are defined by
4
= 2
ψ2
i=0
n 2
Z
[t/n ]−2
st
n
Ct − IVt −→ MN(0, Ŵt ),
−1/4
n
γu2
×
Ctn
2
n 4
Z + n bθ ψ2 − 2aψ1
i
2θ 4 ψ22
0
t
σu2 αu2 du
+c
t
0
αu4 du
k22n =
k12n
⎛
k
n −1
1 ⎝
kn3 j =0
⎞
kn 2 1 kn 2
φ2 (j ) − φ2 (0) ⎠ ,
2
⎛
⎞
kn −1
1 ⎝
1
=
φ kn (j )φ2kn (j ) − φ1kn (0)φ2kn (0)⎠ .
kn j =0 1
2
Note that ψikn → ψi , kiln → il , and φikn (j ) − φi (j/kn ) → 0
(i, l = 1, 2), but the above approximations are the “correct”
quantities (i.e., these are the quantities that really appear in the
proof). Now, we redefine our main statistics as follows:
−1
ψ1kn n
n
Ct,a = 1 −
2θ 2 ψ2kn
√
[t/n ] n
ψ1kn n
n
n
V (Z, 2)t −
RVt
×
([t/n ] − kn + 2)θ ψ2kn
2θ 2 ψ2kn
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
Table 1. Summary statistics of the underlying data
and
n
Ŵt,a
= 1−
ψ1kn n
2θ 2 ψ2kn
−2
[t/n ]−kn
4
Z̄ n +
i
4k22n [t/n ]
4
3θ ψ2kn ([t/n ] − kn + 1)
4n [t/n ]
3 ([t/ ]−k +1)
θ
n
n
i=0
kn
k n ψ k n
× k123 − 22k 14
ψ2 n
ψ2 n
×
[t/n ]−kn
n 2
Z̄
i+2k
n
n [t/n ]
3 ([t/ ] − 2)
θ
n
j =i+kn +1
i=0
kn
2kn ψ kn
kn (ψ kn )2
× k112 − 12k 31 + 22 k 14
ψ2 n
ψ2 n
ψ2 n
[t/n ]−2
2
n 2 n
i Z i+2 Z
×
,
×
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
169
i
|nj Z|2 +
i=1
where the subscript a stands for adjusted. Note that the finitesample adjustments do not influence the asymptotic results.
To explain the aforementioned modifications, let us consider
n
. First, the factor [t/n ]/([t/n ] − kn + 1) in
the statistic Ct,a
n
the definition of Ct,a
is an adjustment for the true number of
summands in V (Z, 2)nt . On the other hand, we have that
t
n n
n
RVt ≈
IVt ,
αu2 du +
2
2
0
where the error of this approximation is of order n and has
expectation 0. This means that the original statistic Ctn actually
estimates
ψ1kn n
IVt .
1−
2θ 2 ψ2kn
This explains the appearance of (1 −
n
of Ct,a
.
ψ1kn n −1
)
2θ 2 ψ2kn
in the definition
2.3 Choosing θ in Practice
To analyze the empirical properties of the preaveraging estimators and their sensitivity to the choice of θ , we employ
transaction and quote data of the stocks Exxon (XOM), Homedepot (HD), and Sonoco Products Co. (SON) traded at the NYSE
between May and August 2006 corresponding to 5 months or
88 trading days. We see this 5-month period as a good compromise between choosing a sample that, on the one hand, is
sufficiently long to guarantee empirically valid results but, on the
other hand, prevents aggregation over possibly many different
states of volatility. XOM and HD represent highly liquid stocks
with approximately 12 (80) and 8 (60) trade (quote) arrivals per
minute, respectively. Conversely, SON is significantly less liquid with approximately 2 (14) trade (quote) arrivals per minute.
As illustrated below, these stocks represent different empirical features and thus allow us to gain valuable insights into the
empirical performance of volatility estimators. However, to provide even more cross-sectional evidence, all empirical findings
reported in the article are replicated for three additional stocks,
Avg. time between trades (in sec)
Avg. time between quote arrivals (in sec)
Avg. time between nonzero quote changes
(in sec)
Avg. number of trades
Avg. number of quotes
Avg. proportion of nonzero trade returns
Avg. proportion of nonzero MQ returns
2
Avg. α̂ · 1e7
Avg. ξ̂
XOM
HD
SON
5.28
0.74
3.16
7.19
1.07
4.60
36.29
4.33
14.29
4428 3255 644
31,76 21,96 5405
0.73 0.60 0.62
0.23 0.23 0.30
0.10 0.08 0.69
0.31 0.23 0.23
NOTE: The table reports (daily) averages of the time between trades, quote arrivals, and
nonzero quote changes, the number of trades and quotes, the proportions of nonzero trade
2
(or midquote) returns, the (long-run) noise variance α̂ , the (long-run) noise variance per
trade ξ̂ , and first-order autocorrelations of underlying sampled returns.
that is, Citigroup (C), Tektronik (TEK), and Zale Corporation
(ZLC). These results are reported in the online Appendix.
Table 1 gives descriptive statistics on the number of trades,
quote arrivals, and nonzero returns for all stocks. It also yields
information on the magnitude of market
microstructure noise as
t
represented by its variance α 2t = 0 αu2 du and estimated by
⎧
N
N
N N
⎪
1 N N
⎪
⎪
−
i Z i−1 Z < 0,
Z
Z
,
if
⎪
i
i−1
⎪ N −1
⎨
i=2
i=2
2
α̂ t =
N N 2
⎪
⎪
i Z
⎪
⎪
⎪
,
otherwise,
⎩
2N
i=1
(2.19)
N
i Z
denotes trade-to-trade returns and N is the number
where
of all transaction returns. Equation (2.19) corresponds to the estimator proposed by Oomen (2006b) based on TRTS employing
all transactions, that is, n = N and = 1. As it provides only
positive variance estimates as long as the first-order autocorrela (Ni Z)2
tions of trade returns are negative, we replace it by N
i=1 2N
N
N
whenever N
i=2 (i Z)(i−1 Z) ≥ 0. The latter estimator is motivated by (2.11) in the case of iid noise. As suggested by Oomen
2
t /N) gives the noise-to-signal ratio per
(2006a), ξ̂t := α̂ t /(IV
t is computed by the maximum likelihood estitrade, where IV
mator (henceforth MLRV) proposed by Aı̈t-Sahalia, Mykland,
and Zhang (2005) (see the Appendix for implementation de2
tails). As shown in Table 1, α̂ t is varying substantially across
the different stocks, with ξ̂t being quite stable ranging from 0.2
to 0.3. We focus on four sampling schemes:
(i) calendar time sampling (CTS) using transaction prices,
(ii) calendar time sampling (CTS) using midquotes,
(iii) transaction time sampling (TRTS) using transaction
prices,
(iv) tick change time sampling (TTS) using midquotes.
CTS is commonly used in the empirical literature (see, e.g.,
Andersen et al. 2003) and is a natural choice if the sampling
frequency is moderate (e.g., 10 min). However, it faces the
problem of sampling mainly zero return intervals if the sampling frequency is very high. In this case, TRTS is a more natural
sampling scheme since one samples only whenever a transaction
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
170
Journal of Business & Economic Statistics, April 2013
Figure 1. Autocorrelations (up to three lags) of transaction price and midquote returns for XOM. Upper panel: CTS based on 1 sec and 5 sec.
Lower panel: T(R)TS based on 1 tick and 2 ticks.
has occurred avoiding sampling artificial zero returns. However,
even in this case, a high number of zero returns can be sampled
since many trades do not necessarily imply a price or quote
change. As an alternative, TTS ignores any zero returns and
samples only whenever quotes have been changed. As stressed
by Andersen, Dobrev, and Schaumburg (2010), TTS is justified
if volatility is constant in tick time (and thus observation times
of price movements and the quadratic variation of the price process are perfectly correlated), whereas CTS or TTS is optimal
when observation times are exogenous to the price process and
inference can be conditioned on observation times. However,
empirically it is rather unclear which assumption is justifiable
and thus which sampling scheme should be preferred. This motivates us to focus on all commonly applied sampling schemes.
Figures 1–3 show empirical autocorrelations for trade and
midquote returns sampled based on the different sampling
schemes. In the case of TRTS and TTS, we generally observe
negative first-order autocorrelations. The negative sign is well
confirmed by the literature and is predominantly driven by the
bid–ask bounce effect; see Roll (1984). Interestingly, midquote
returns tend to induce higher (negative) autocorrelations than
transaction returns. The same patterns also emerge based on
individual ask or bid quote returns. This is induced by the fact
that in the hybrid trading system of NYSE, transaction prices
are often located inside of the bid–ask spread and thus are less
discrete than (mid-)quote returns. Moreover, the empirical evidence shows that quote processes reveal strong reversals that are
naturally most pronounced in the case of TTS. Corresponding
evidence is also confirmed by Hautsch and Huang (2012) who
analyzed the quote impact of order arrivals based on blue chip
stocks. These findings show that using (mid-)quote returns does
not necessarily allow reducing the discreteness (and thus noise)
in the data. Overall, the serial dependence in CTS returns is
lower. This is particularly true in the case of trade price returns
Figure 2. Autocorrelations (up to three lags) of transaction price and midquote returns for HD. Upper panel: CTS based on 1 sec and 5 sec.
Lower panel: T(R)TS based on 1 tick and 2 ticks.
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
171
Figure 3. Autocorrelations (up to three lags) of transaction price and midquote returns for SON. Upper panel: CTS based on 1 sec and 5 sec.
Lower panel: T(R)TS based on 1 tick and 2 ticks.
and if the sampling frequency is higher than the average trade
arrival frequency. As a result, consecutive returns are mostly
zero and trade-to-trade (or tick-to-tick) dynamics break down.
Interestingly, returns based on XOM trading are positively autocorrelated. This is in contrast to common findings and makes
this stock an interesting special case to analyze the properties
of preaveraging estimators under various conditions. See also
Hansen and Lunde (2006), Oomen (2006b), and Gatheral and
Oomen (2010) for an extensive discussion of empirical properties of market microstructure noise effects.
To gain insights into the performance of the proposed preaveraging estimators, we benchmark them against the most common
competing approaches employed in recent literature. In particular, we implement a subsampled RV estimator computed as
[t/n ]−kn
RVSSt =
kn−1
i=0
|Z(i+kn )n − Zin |2 .
(2.20)
Hence, RVSS corresponds to the subsampling counterpart to the
preaveraging estimator as it runs on the same underlying return
grid (ni ) and samples over intervals of the length kn . In fact, it
corresponds to a preaveraging estimator that equally weights all
return observations within the local window [in , (i + kn )n ]
and thus allows us to study the role of the implicit preaveraging
weighting scheme. Moreover, we use the maximum likelihood
estimator (henceforth MLRV) proposed by Aı̈t-Sahalia, Mykland, and Zhang (2005) and the realized kernel (RK) estimator (henceforth (MLRV)) introduced by BHLS08a based on the
Tukey-Hanning2 kernel with optimal bandwidth. For implementation details, see the Appendix. The preaveraging estimators
and their asymptotic variances are computed as described in the
previous section. As the estimators are not necessarily positive
in all cases, we bound them from below by zero. This happens,
however, quite rarely and only in cases where either θ or n is
chosen to be very large or in the case of an insufficient number
of intraday observations (as sometimes occurs in the case of
more illiquid stocks).
n
for different
Figures 4 and 5 show plots of Ctn and Ct,a
choices of θ based on 3-sec CTS as well as TRTS and TTS using
transaction prices and midquote changes. Note that in the case
of high values for θ , that is, high values of kn , the preaveraging
estimators cannot necessarily be computed when there are not
sufficient intraday observations. This might occur particularly
in the case of less liquid assets. Then, the corresponding figures
are based on only those trading days for which all estimators
can be computed for all values of θ avoiding any sample
selection effects. The range of analyzed realizations of θ is
chosen to guarantee using at least 90% of the overall sample.
Figure 4. Daily averages of Ctn (multiplied by 1e5) depending on θ . Based on T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks:
MLRV and RK.
172
Journal of Business & Economic Statistics, April 2013
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
n
Figure 5. Daily averages of Ct,a
(multiplied by 1e5) depending on θ. Based on T(R)TS and 3-sec CTS using prices and midquotes.
Benchmarks: RVSS, MLRV, and RK.
The estimators’ sensitivity to the choice of θ , and thus the width
of the local preaveraging window, is highest if θ is small. This
sensitivity is particularly strong in the case of T(R)TS inducing
a strongly negative bias (relative to the other estimators) if θ is
small. This bias is obviously induced by the fact that in finite
samples, the statistic Ctn estimates An · IVt with
ψ1kn n
An = 1 −
2θ 2 ψ2kn
(see Section 2.2). For θ → 0, An is downward biased. This
effect is strongest in the case of T(R)TS. For CTS, the bias
is significantly smaller. Since 3-sec CTS and T(R)TS employ
essentially very similar underlying price or quote sampling information, the different θ -sensitivity of both sampling schemes
is only explained by a higher sampling frequency −1
n in the
case of CTS resulting in 7800 sampling intervals per day and
scaling up An toward one. Hence, CTS based on possibly high
frequencies is suggested to remove finite-sample biases. Conversely, as shown in Table 1, even for very actively traded assets,
T(R)TS does not induce sufficiently many sampling intervals to
ensure An ≈ 1. Hence, in the case of T(R)TS, scaling by A−1
n
n
as in Ct,a
is essential to reduce the estimator’s negative bias and
ensures stabilizing the estimates for small values of θ . However,
even after finite-sample adjustments, we still observe a slightly
negative bias of the estimator in the case of T(R)TS. This is obviously induced by a relatively strong impact of the noise-induced
component ψ1 n RVnt /(2θ 2 ψ2 ) driving down the estimator if kn
is small. In this case, the width of the local window is not sufficient to diminish the influence of noise. For larger values of θ
and thus kn , this effect vanishes and we observe a stabilization
of the estimator.
For larger values of θ , the impact of the underlying sampling
scheme diminishes and the estimates tend to stabilize. Preaveraging and RK estimators are quite similar (on average) for
values of θ around 0.4–0.6. This choice of θ seems to widely
resemble the optimal bandwidth choice in the RK according
to BHLS08b. If, however, the preaveraging interval becomes
too high, we observe a downward trend of the estimates. In
these cases, the estimators tend to “oversmooth” and thus yield
downward biased estimates. These results suggest choosing θ
not too small, but rather conservatively, confirming the results
by BHLS08b. Plotting also the subsampled estimators, RVSS,
allows us to analyze the impact of the implicit preaveraging
weighting schemes. We observe that in the case of longer local
intervals, the subsampling estimator is downward biased relative
to the preaveraging estimator and thus tends to “oversmooth”
even more strongly. If the preaveraging interval shrinks, also
the subsampling estimator becomes quite sensitive to the choice
of θ and reveals significant biases. However, interestingly, it is
less biased than T(R)TS-based preaveraging estimators. In these
situations, the implicit preaveraging weighting scheme induces
an even higher θ -sensitivity.
To analyze the impact of the underlying sampling fren
based
quency, Figure 6 shows the θ -sensitivity of Ctn and Ct,a
on TRTS employing different sampling frequencies (n ∈
{1, 3, 5, 10, 20}) based on CTS and T(R)TS. As expected, the
finite-sample adjustment becomes even more important when
n is large, that is, the sampling frequency is small. In these
scenarios, the negative biases for small values of θ become substantial. The lower panel of Figure 6 shows that in these cases,
the finite-sample adjustment is still very effective and significantly reduces the bias (note that the figures use different scales
on the vertical axis). Moreover, we observe a downward bias of
the estimator with increasing n . Hence, preaveraging based on
sparse sampling implies an oversmoothing and thus downward
bias of volatility. This is particularly true for the subsampled
RV estimator. In this case, an implicit equal weighting of underlying returns is clearly inferior to the preaveraging weighting
scheme g(·) (see Section 2) putting less weights on more distant
observations. We also observe that these effects become more
extreme if the underlying liquidity of the stock declines.
Finally, note that we consider average values of θ across
the entire sample. However, recall from (2.15) that an optimal
choice of θ depends on the local signal-to-noise ratio and thus
should also depend on intraday variations of volatility. With the
latter following the typical U-shape intraday pattern, a constant
θ tends to oversmooth in the morning and undersmooth around
midday. This effect is naturally stronger based on CTS than
based on T(R)TS. However, our empirical findings above show
that oversmoothing (implied by a too high θ ) is less harmful
for the estimator’s bias than undersmoothing (implied by a too
small θ ). Consequently, as practical guidance in situations where
the intraday volatility strongly fluctuates, we suggest using θ
rather conservatively with the choices of θ recommended above
serving as lower bounds. Theoretically, however, the parameter
θ should be indeed chosen according to the local signal-tonoise ratio. The precise description of such an approach and its
efficiency gain compared with the case of a constant choice of θ
are investigated by Jacod and Mykland (2012). However, their
method is computationally clearly more involved.
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
173
n
Figure 6. Daily averages of Ctn and MLRV (top panel) as well as Ct,a
and RVSSt (bottom panel) for different values of θ and n ∈
{1, 3, 5, 10, 20} using TRTS. All estimators multiplied by 1e5.
Figure 7 depicts the average values of the approximate standard deviations of
the preaveraging estimates with finite-sample
1/4
Ŵt,a . For the more liquid assets, we observe
adjustments, n
that the standard deviation is monotonically increasing in θ regardless of the finite-sample adjustment. Hence, using a larger
preaveraging window ultimately increases the estimation error.
However, for less liquid stocks (see also the online Appendix),
the corresponding plots reveal slight nonmonotonicities with increasing variances if θ becomes too small. In these cases, the
local window is obviously too short to effectively remove market microstructure noise. Hence, choosing too small values of θ
in the case of less liquid assets can be harmful not only for the
estimator’s unbiasedness but also for its efficiency.
The relationships revealed by Figures 4–7 indicate an obvious
trade-off between bias and efficiency. Minimizing the estimator’s variance generally yields very small values of θ and thus
the length of the local interval. For these realizations, however,
the “θ -signature plots” in Figures 4 and 5 reveal the highest
sensitivity of the estimators with respect to θ and the underlying
sampling schemes. Moreover, as shown in Section 4, choosing
−1/4 √
Figure 7. Averages of nt
prices and midquotes.
θ too small turns out to be also quite harmful for jump-robust
estimations and jump detections. Hence, aiming at finding a
“global” choice of θ balancing biases (in the case of too short
local intervals) against inefficiency (in the case of too long local
intervals), θ should be chosen as the smallest possible value at
which the signature plots tend to stabilize and the divergence
between different sampling schemes tends to vanish. According to Figures 4 and 5, the optimal choice of θ is thus around
0.4 based on liquid assets. In the case of less liquid stocks, we
suggest θ ≈ 0.6.
3.
3.1
CONTINUOUS SEMIMARTINGALES WITH
DEPENDENT NOISE
Preaveraging With Dependent Noise
Model (2.2) implies that the noise process is uncorrelated,
which is not realistic at very high frequencies (see, e.g., Hansen
and Lunde 2006). In this section, we allow for serial correlation
in the noise process and derive the corresponding asymptotic
Ŵ t,a (multiplied by 1e5) for different choices of θ. Based on highest frequency T(R)TS and 3-sec CTS using
174
Journal of Business & Economic Statistics, April 2013
results for the preaveraging method. For an alternative approach
of dealing with dependent noise, see Nolte and Voev (2009). We
consider a model
Zin = Xin + Ui ,
(3.21)
where X and U are independent, and (Ui )i≥0 is a stationary qdependent sequence (for some known q > 0), that is, Ui and Uj
are independent for |i − j | > q. As we remarked before, model
(3.21) only makes sense in discrete time due to the dependence
structure of the noise. We define the covariance function by
ρ(k) := cov(U1 , U1+k ).
statistic V (Z, 2)nt . The asymptotic results are presented in the
following theorem.
Theorem 1. Assume that EU12 < ∞. Then we have
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
with
ρ 2 = ρ(0) + 2
q
ρ(k).
where Ŵt (q) =
t
γu2 (q) =
4
ψ22
ni Zni+k Z,
i=0
k = 0, . . . , q + 1. (3.23)
In the econometric literature, such estimators are called realized
autocovariances (see, e.g., BHLS08a). The following lemma
describes the asymptotic behavior of γtn (k).
γu2 (q)du and γu2 (q) is defined by
0
ρ 2 σu2
ρ4
22 θ σu4 + 212
+ 11 3 . (3.26)
θ
θ
As before, we need to estimate the conditional variance Ŵt (q)
to obtain a feasible version of (3.25). In fact, the estimation
procedure is somewhat easier than the one presented in Section
2, because the quantity ρ 2 is not time varying.
Proposition 1. Assume that EU18 < ∞. Then, we obtain
Ŵtn (q)
P
P
γtn (k) −→ t(2ρ(k) − ρ(k − 1) − ρ(k + 1)),
P
Note that Lemma 2 provides estimates ρn (0), . . . , ρn (q) of
ρ(0), . . . , ρ(q) that are obtained by a simple recursion:
2
Thus,
q we deduce a2 n -consistent estimator ρn = ρn (0) +
2 k=1 ρn (k) of ρ that can be used for bias correcting the
i=0
n 4
Z
i
1/4
n
P
γtn (q + 1) −→ −tρ(q).
γtn (q + 1)
,
t
γ n (q)
ρn (q − 1) = − t
+ 2ρn (q),
t
γ n (q − 1)
+ 2ρn (q − 1) − ρn (q).
ρn (q − 2) = − t
t
and it holds that
γtn (q) −→ t(2ρ(q) − ρ(q − 1)),
ρn (q) = −
[t/n ]−kn
+
γtn (0) −→ 2t(ρ(0) − ρ(1)),
k = 1, . . . , q − 1,
422
=
3θ ψ24
√
[t/
n ]−2kn
n 2
22 ψ1
8ρn2 n 12
Z
−
i
4
3
2
θ
ψ2
ψ2
i=0
4ρ 4 11
22 ψ12
12 ψ1
P
+ 3n
+
−→ Ŵt (q),
−
2
θ
ψ22
ψ24
ψ23
Lemma 2. Assume that EU12 < ∞. Then, we obtain
−1/2
(3.25)
(3.22)
[t/n ]
st
n1/4 Ctn (q) − IVt −→ MN(0, Ŵt (q)),
k=1
Similar to (2.12), we obtain a bias that has to be estimated from
the data. The estimation procedure is a bit more involved than
in the previous section. We introduce a class of estimators given
by
γtn (k) = n
(3.24)
Furthermore, when EU18 < ∞, we deduce the associated central
limit theorem
Similar to the convergence in (2.12), we obtain the following
results.
Lemma 1. Assume that EU12 < ∞. Then, we have
√
t
n
ψ1 t
n P
σu2 du + 2 ρ 2
V (Z, 2)t −→
θ ψ2
θ
ψ2
0
√
n
ψ1 t
P
:=
V (Z, 2)nt − 2 ρn2 −→ IVt .
θ ψ2
θ ψ2
Ctn (q)
Ctn (q) − IVt
d
n
−→ N(0, 1).
Ŵt (q)
The corresponding finite-sample adjustments are obtained
similarly as in Section 2.2 and are given by
n
(q)
Ct,a
: = 1−
ψ1kn n
−1
2θ 2 ψ2kn
ψ1kn t 2
−
ρn ,
θ 2 ψ2kn
√
[t/n ] n
([t/n ]−kn +1)θ ψ2kn
V (Z, 2)nt
(3.27)
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
175
Figure 8. Daily averages of Ctn (1) (upper panel) and Ctn (2) (lower panel) for different values of θ. Estimators multiplied by 1e5. Based on
highest frequency T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks: RVSS, MLRV, and RK implemented based on trade prices.
and
−2
4k22n [t/n ]
kn 4
3θ ψ2 ([t/n ] − kn + 1)
√
[t/n ]−kn
2
n 4
Z + 8ρn n [t/n ]
×
i
θ 2 ([t/n ] − kn + 1)
i=0
[t/ ]−2k
n
n
n 2
k12n
k22n ψ1
Z
× 3 − 4
i
ψ2kn
ψ2kn
i=0
n
(q) = 1 −
Ŵt,a
+
×
ψ1kn n
2θ 2 ψ2kn
4ρn4 [t/n ]
3
θ ([t/n ] − kn +
k11n
kn 2
ψ2
−
1)
k n ψ k n
2 12 13
ψ2kn
+
2
k22n ψ1kn
kn 4
ψ2
. (3.28)
n
n
Figure 8 shows daily averages of Ct,a
(1) and Ct,a
(2) depending on θ . Comparing these signature plots with Figures 4 and 5,
we observe that in the case of CTS, the dependent noise-robust
version of the preaveraging estimator provides similar shapes
as the iid-noise version. Conversely, for T(R)TS, we observe
n
n
(q) > Ct,a
. Comparing the expressions
that in most cases Ct,a
n
n
for Ct,a (q) and Ct,a (and standardizing t to one), this can be
induced only by
n RVnt
> ρn2 .
2
Obviously, if there is negative (positive) serial dependence in
the noise process {U }, the estimator n RVnt /2 is upward (downward) biased. As shown below, we indeed find evidence for
significant serial correlations in the noise process inducing an
upward bias of the expression n RVnt /2. These effects are most
distinct for TTS for which also the strongest serial dependence
in the return process is found (see Table 1). Similar but weaker
effects are observable in the case of TRTS. The figures show
that the corresponding finite-sample adjustments seem to imply overcorrections of the estimators’ negative biases inducing
n
(q)
even slight upward biases if θ is small. This indicates that Ct,a
n
tends to be more sensitive to the choice of θ than Ct,a . Overall,
n
the signature plots reveal similar shapes as for Ct,a
. Hence, the
θ -sensitivity is clearly reduced for values θ > 0.4.
3.2
A Test for Dependence
In the previous section, we assumed that the number of nonvanishing covariances q is known. In practice, however, we need
a decision rule on the choice of q based on the discrete observations of the price process. Formally speaking, we require a test
procedure to decide whether ρ(k) = 0 for some k ≥ 1. To illustrate the underlying idea, consider a 1-dependent noise model.
In this case, we aim at testing whether ρ(1) = 0, that is, whether
the noise process is actually iid. Thus, we obtain the following
hypothesis:
H0 : ρ(1) = 0,
H1 : ρ(1) = 0.
As we will observe from the proof of Lemma 2, the statistics γtn (k) are asymptotically unaffected by the presence of the
process X (thus, only the noise process U drives their asymptotic behavior). Hence, by a standard central limit theorem for
q-dependent variables, we deduce that
−1/2
n
γtn (2)
τ2
d
−
,
− ρ(1) −→ N 0,
t
t
176
Journal of Business & Economic Statistics, April 2013
Figure 9. Histogram of AR(1) test statistic Sn . Based on highest frequency T(R)TS and 3-sec CTS.
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
where γtn (2) is defined by (3.23) and τ 2 is given by
τ 2 = τ (0) + 2
2
τ (k),
k=1
τ (k) = cov((U1 − U0 )(U3 − U2 ), (U1+k − Uk )(U3+k − U2+k )).
Obviously, we require a consistent estimator of τ (k), 0 ≤ k ≤ 2,
to construct a test statistic. Set Hin = ni Zni+2 Z. Then, by the
weak law of large numbers for q-dependent variables, we deduce
that
[t/ ]
n
n
− Hin Hi+3
n i=0 n Hin Hi+k
P
τn (k) =
−→ τ (k),
t
0 ≤ k ≤ 2.
The latter implies that τn2 = τn (0) + 2 2k=1 τn (k) is a consistent
estimator of τ 2 . Then, the test statistic is given by
Sn =
−1/2
−n γtn (2)
.
tτn2
Observe that under H0 : ρ(1) = 0, we obtain that
d
Sn −→ N(0, 1).
We reject H0 : ρ(1) = 0 at level α when
|Sn | > c1− α2 ,
where c1− α2 denotes the (1 − α2 ) quantile of N(0, 1) distribution. Note that this test is consistent against the alternative
H1 : ρ(1) = 0.
Figure 9 shows the (time series) distribution of the test statistic Sn based on highest frequency T(R)TS and 3-sec CTS. We
observe for liquid stocks that Sn takes highly negative values
supporting the evidence for significantly negative serial dependence in the noise process and confirming the results by Hansen
and Lunde (2006). The strength of the serial dependence is
decreasing if the underlying trading frequency of the stock becomes smaller. Correspondingly, the highest test statistics are
found for XOM, whereas for the less liquid stocks, the time
series distribution of the test statistic is virtually symmetrically
around zero. Moreover, the highest magnitudes are found for
TTS using midquotes for which we also observe the strongest
serial dependence in the underlying return series. Conversely,
3-sec CTS yields significant lower test statistics (in absolute
terms). Hence, we can conclude that the serial dependence in
underlying returns is mainly driven by a serial dependence in
the corresponding noise process. As reported in Section 2, these
autocorrelations and, thus, the test statistic Sn are obviously
strongly dependent on the underlying sampling scheme. This is
most evident for midquote-based sampling. Conversely, based
on CTS, Sn is close to zero on average in most cases. T
ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20
Preaveraging-Based Estimation of Quadratic
Variation in the Presence of Noise and Jumps:
Theory, Implementation, and Empirical Evidence
Nikolaus Hautsch & Mark Podolskij
To cite this article: Nikolaus Hautsch & Mark Podolskij (2013) Preaveraging-Based Estimation
of Quadratic Variation in the Presence of Noise and Jumps: Theory, Implementation,
and Empirical Evidence, Journal of Business & Economic Statistics, 31:2, 165-183, DOI:
10.1080/07350015.2012.754313
To link to this article: http://dx.doi.org/10.1080/07350015.2012.754313
Accepted author version posted online: 04
Feb 2013.
Submit your article to this journal
Article views: 276
View related articles
Full Terms & Conditions of access and use can be found at
http://www.tandfonline.com/action/journalInformation?journalCode=ubes20
Download by: [Universitas Maritim Raja Ali Haji]
Date: 11 January 2016, At: 22:05
Preaveraging-Based Estimation of Quadratic
Variation in the Presence of Noise and Jumps:
Theory, Implementation, and Empirical Evidence
Nikolaus HAUTSCH
Institute for Statistics and Econometrics and Center for Applied Statistics and Economics (CASE),
Humboldt-Universität zu Berlin and Center for Financial Studies (CFS), Frankfurt, D-10178 Berlin, Germany
([email protected])
Mark PODOLSKIJ
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Institute of Applied Mathematics, University of Heidelberg, D-69120 Heidelberg, Germany
([email protected])
This article contributes to the theory for preaveraging estimators of the daily quadratic variation of asset
prices and provides novel empirical evidence. We develop asymptotic theory for preaveraging estimators in
the case of autocorrelated microstructure noise and propose an explicit test for serial dependence. Moreover,
we extend the theory on preaveraging estimators for processes involving jumps. We discuss several jumprobust measures and derive feasible central limit theorems for the general quadratic variation. Using
transaction data of different stocks traded at the New York Stock Exchange, we analyze the estimators’
sensitivity to the choice of the preaveraging bandwidth. Moreover, we investigate the dependence of
preaveraging-based inference on the sampling scheme, the sampling frequency, microstructure noise
properties, and the occurrence of jumps. As a result of a thorough empirical study, we provide guidance
for optimal implementation of preaveraging estimators and discuss potential pitfalls in practice.
KEY WORDS: High frequency data; Jump diffusions; Power variations; Sampling schemes; Volatility
estimation.
1.
INTRODUCTION
The availability of high-frequency data has significantly influenced research on volatility estimation during the last decade.
Inspired by the seminal articles by Andersen et al. (2001) and
Barndorff-Nielsen and Shephard (2002), the idea of estimating
daily volatility using realized measures relying on intraday data
has stimulated a new and active area of volatility modeling and
estimation.
This article provides new theory, implementation details, and
extensive and detailed empirical results for the class of preaveraging estimators originally introduced by Jacod et al. (2009)—
henceforth JLMPV09. The article’s aim is two-fold. First, we extend existing theory for preaveraging estimators in the presence
of serially dependent noise and jumps. We derive the asymptotic
properties of preaveraging estimators if market microstructure
noise is autocorrelated and suggest an explicit test for dependence up to q lags (so-called q dependence). To address the
occurrence of jumps, we derive a feasible central limit theorem for preaveraged multipower estimators and appropriate
local estimators for underlying components of the estimators’
asymptotic variance. Second, we provide a thorough empirical
study on the properties of preaveraging estimators in practice.
We study their empirical performance relative to alternative
state-of-the-art estimators and analyze the impact of implementation details, such as the choice of local interval lengths, the
role of sampling frequency, finite-sample adjustments, and the
impact of the sampling scheme depending on noise properties
and jumps.
The main difficulty in estimating the daily quadratic variation of asset prices using noisy high-frequency data is how to
optimally employ a maximum of information without being affected by so-called market microstructure noise arising from
market frictions, such as the bid–ask bounce or the discreteness
of prices, inducing a deviation of the price process from a continuous semimartingale process. As discussed, for example, by
Hansen and Lunde (2006), among others, market microstructure
noise can lead to severe biases and inconsistency of the estimators when the sampling frequency tends to infinity. To overcome
these effects, sparse sampling, for example, based on 20 min, has
been proposed as an ad hoc solution (see, e.g., Andersen et al.
2003). The major disadvantage of sparse sampling, however,
is the enormous data loss and the resulting inefficiency of the
estimator. As a result, various methods for bias corrections and
filtering of noise effects have been proposed in the literature. The
most prominent and general approaches are the realized kernel
(RK) estimator by Barndorff-Nielsen et al. (2008a) (henceforth
BHLS08a), the maximum likelihood estimator by Aı̈t-Sahalia,
Mykland, and Zhang (2005), and the preaveraging estimator by
JLMPV09. Andersen, Bollerslev, and Diebold (2008) provided
an overview of alternative approaches. For sound practical implementations, however, a good understanding of the estimators’
finite-sample behavior, their dependence on plug-in estimators
165
© 2013 American Statistical Association
Journal of Business & Economic Statistics
April 2013, Vol. 31, No. 2
DOI: 10.1080/07350015.2012.754313
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
166
and data-specific adjustments, and implementation details is crucial. In the case of kernel-type estimators, the empirical performance and properties are analyzed by Hansen and Lunde (2006)
and Barndorff-Nielsen et al. (2008b)—henceforth BHLS08b.
The latter shows that the estimators’ empirical (finite-sample)
properties can significantly deviate from theoretical (asymptotic) properties making finite-sample adjustments necessary
and ultimately important in financial practice. The first study
analyzing the performance of preaveraging estimators is Andersen, Dobrev, and Schaumburg (2010) who compared these
estimators with jump-robust nearest neighbor truncation estimators. Nevertheless, preaveraging estimators’ empirical properties and their sensitivity to implementation details as well as
to data features are still widely unknown.
The article’s theoretical contribution is two-fold: First, we develop asymptotic theory for preaveraging estimators in the case
of serially dependent market microstructure noise. In this context, we derive asymptotic theory for preaveraging estimators if
microstructure noise is q dependent and propose a corresponding autocorrelation test. Second, we construct a feasible central
limit theorem for the quadratic variation including a jump part
by estimating the variation’s asymptotic variance involving local jump-robust estimates of the noise and diffusive volatility
components. This makes the pure probabilistic results of Jacod, Podolskij, and Vetter (2009), who proved the asymptotic
mixed normality of the estimator, applicable for determining
confidence regions for the general quadratic variation.
The article’s second major contribution is of empirical nature
and is motivated by the need for a better understanding of how
preaveraging estimators and corresponding inference work in
practice, and how they should be optimally implemented. We
study the estimators’ dependence on the choice of the underlying
local preaveraging interval, the role of the sampling scheme (i.e.,
calendar time sampling vs. business time sampling, transaction
price sampling vs. midquote price sampling), and the impact of
the sampling frequency. Of particular interest is how the derived
noise-robust and jump-robust inference is reflected by the data
and how the estimators perform relative to alternative state-ofthe-art estimators.
Employing preaveraging-based inference to transaction and
quote data of a cross section of New York Stock Exchange
(NYSE) stocks covering different trading frequencies, we analyze the estimators’ sensitivity to the length of the underlying
local preaveraging interval. We show that too small choices of
the length of the local window induce biases and make estimators quite unstable. Addressing an inherent trade-off between
estimators’ bias and variance, we suggest a rule of thumb for
choosing the local window. In this context, finite-sample adjustments of the estimator and the underlying sampling scheme are
particularly important. Using an optimal preaveraging scheme,
we find jump proportions between 5% and 10% on average.
Ignoring the possibility of jumps in the price process leads to an
overestimation of confidence intervals that can be substantial in
the case of less liquid assets. Moreover, our results suggest implementing preaveraging estimators based on maximally high
sampling frequencies. A reduction in sampling frequency tends
to imply an “oversmoothing” of volatility resulting in negative
biases. As a result of this analysis, we derive suggestions for
optimal implementation of this class of estimators and tests in
practice.
Journal of Business & Economic Statistics, April 2013
The remainder of the article is organized as follows. In Section 2, we present the basic case with underlying conditionally
independent noise without jumps, discuss the underlying theoretical framework, and provide evidence on the optimal choice
of the preaveraging interval. Section 3 considers preaveraging
estimators under the assumption of continuous semimartingales
with dependent noise and presents empirical evidence based
on an explicit test. In Section 4, we discuss the discontinuous
case including the possibility of jumps. Finally, in Section 5,
the time series properties and the impact of sampling frequencies and schemes are analyzed, while Section 6 concludes. All
proofs are collected in the Appendix.
2. PREAVERAGING ESTIMATORS FOR THE
QUADRATIC VARIATION: THE CONTINUOUS CASE
2.1 The Basic Model
It is well known in finance that, under the no-arbitrage assumption, price processes must follow a semimartingale (see,
e.g., Delbaen and Schachermayer 1994). In this section, we
consider a continuous semimartingale (Xt )t≥0 of the form
t
t
Xt = X0 +
as ds +
σs dWs .
(2.1)
0
0
Here, W denotes a one-dimensional Brownian motion, (as )s≥0 is
a càglàd drift process, and (σs )s≥0 is an adapted càdlàg volatility
process. Moreover, t denotes (continuous) calendar time.
However, due to various market frictions, such as price discreteness or bid–ask spreads, the efficient price X is contaminated by noise. Thus, we can only observe a noisy version Z of
the process X. More precisely, we consider a filtered probability space (, F, (Ft )t≥0 , P) on which we define the process Z,
observed at n time points, indexed by i = 0, 1, . . . , n, as
Zt = Xt + Ut ,
t ≥ 0,
(2.2)
where Ut denotes the error term. In the case of transaction time
sampling (TRTS), i indexes the (irregular) time points associated
with each n -th trade. Hence, n = [N/n ] with N denoting the
number of trades until t. In the case of calendar time sampling
(CTS), i indexes equal-spaced time intervals of length n with
n = [t/n ]. Here, the price of the most previous observation
occurring before the end of the sampling interval is used (previous tick sampling). For convenience, we show all theoretical
relationships using the CTS notation. However, all relationships
also hold for TRTS with i, n , and n accordingly defined.
We are in the framework of high-frequency asymptotics or
so-called “in-fill” asymptotics, that is, n → 0. We assume that
Ut ’s are, conditional on the efficient price X = (Xs )s≥0 , centered
and independent, that is,
E(Ut |X) = 0,
Ut ⊥⊥ Us , t = s, conditional on X. (2.3)
Furthermore, we assume that the conditional variance of the
noise process U, defined as
(2.4)
αt2 = E Ut2 |X ,
is càdlàg, and introduce the process
Nt (q) = E(|Ut |q |X),
(2.5)
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
which denotes the qth conditional absolute moment of the process U. Note that the noise specification does not entail a noise
level tending to 0 as n → ∞. This property is an attractive
feature for empirical work and will also be illustrated in examples given later. Model (2.2) was originally introduced by
JLMPV09. Note that the process (2.2) with conditions (2.3) and
(2.4) is defined in continuous time (see JLMPV09 for the precise
construction of the underlying probability space). In Section 3,
we discuss a discrete version of the model to introduce serial
dependence in noise.
Model (2.2) allows in particular for time-varying variances of
the noise and dependence between the efficient price X and the
microstructure noise U. Let us give some examples to illustrate
the applicability of our model. (i) The standard additive iid noise
model
such that g ′ is piecewise Lipschitz, and g(0) = g(1) = 0. Then,
g is associated with the following real-valued numbers:
1
1
ψ1 =
(g ′ (s))2 ds, ψ2 =
(g(s))2 ds,
φ1 (s) =
Zt = γ [(Xt + Vt )/γ ],
where γ > 0 is a fixed rounding level, and Vt ’s are iid U([0, γ ])
distributed and independent of X. Then the process Ut = Zt −
Xt satisfies the assumption (2.3), and we have the identity
Xt
Xt 2
2
2
−
αt = γ
,
γ
γ
with {x} = x − [x]. The process
αt2
0
0
1
s
g ′ (u)g ′ (u − s)du,
Our main goal is the estimation of the integrated variance
(IV) (or quadratic variation) defined as
t
IVt = [X]t =
σu2 du.
(2.6)
0
The preaveraging method as originally introduced by Podolskij
and Vetter (2009a) is based on certain local moving averages that
reduce the influence of the noise process U. Below, we briefly
review the asymptotic theory for the preaveraging method as
presented in JLMPV09.
To construct the estimator, we choose a sequence kn of integers, which satisfies
(2.7)
kn n1/2 = θ + o n1/4
for some θ > 0, and a nonzero real-valued function g : [0, 1] →
R, which is continuous, piecewise continuously differentiable
1
s
g(u)g(u − s)du,
A typical example of a function g : [0, 1] → R, which we use
in the empirical applications, is given by g(x) = x ∧ (1 − x). In
this case, the aforementioned constants are
ψ1 = 1,
22 =
ψ2 =
1
,
12
11 =
1
,
6
12 =
1
,
96
151
.
80,640
The preaveraged returns are given as
n
Zi
kn
j
ni+j Z,
g
=
k
n
j =1
ni Z = Zin − Z(i−1)n . (2.8)
Note that the latter performs a weighted averaging of the increments nj Z in the local window [in , (i + kn )n ]. It is intuitively clear that such an averaging diminishes the influence
of the noise to some extent. The window size kn is chosen to
−1/2
be of order n to find a balance between the two conflicting
convergence rates that are due to the diffusive part and the noise
part. This choice will lead later to optimal convergence rates.
Our main statistical instruments are the realized variance (RV)
[t/n ]
RVnt
=
n Z 2 ,
i
(2.9)
i=0
and the preaveraging estimator
is obviously càdlàg.
2.2 The Preaveraging Method and Asymptotic Results
φ2 (s) =
i, j = 1, 2.
0
where Ut ’s are iid with E(Ut ) = 0, E(Ut2 ) = ̟ 2 , and independent of the efficient price X, is obviously included in our
framework. (ii) The pure rounding model
at fixed level γ > 0 does not satisfy the first assumption of
(2.3). In fact, it is easy to observe that the integrated variance
cannot be estimated in the pure rounding model, because only
observations on a fixed grid γ Z are available.
(iii) In contrast to the pure rounding process in (ii), iid noise
plus rounding models may satisfy our assumptions. Consider
the model
s ∈ [0, 1],
1
ij =
φi (s)φj (s)ds,
Zt = Xt + Ut ,
Zt = γ [Xt /γ ]
167
[t/n ]−kn
V (Z, 2)nt =
i=0
n 2
Z ,
i
(2.10)
which is a direct analog of RV based on preaveraged returns
n
Z i . We remark that RVnt is not an appropriate estimator of the
integrated volatility in the noisy diffusion model (2.2). More
precisely, we have that
t
n n P
RVt −→
αu2 du,
(2.11)
2
0
that is, a normalized version of RVnt converges to the integrated conditional variance of the noise process U (see, e.g.,
JLMPV09).
Denote a mixed normal distribution with expectation 0
and (conditional) variance F 2 by MN(0, F 2 ) and assume that
the process Nt (2) is locally bounded. Then, as shown by
JLMPV09, we have
√
t
t
n
ψ1
2
n P
α 2 du, (2.12)
σu du + 2
V (Z, 2)t −→
θ ψ2
θ ψ2 0 u
0
168
Journal of Business & Economic Statistics, April 2013
where by (2.11) a consistent estimator of IVt is obtained by
Ctn :=
√
n
ψ1 n
P
V (Z, 2)nt − 2 RVnt −→ IVt =
θ ψ2
2θ ψ2
for given constants a, b, c. Following the arguments of remark
4 in JLMPV09, a consistent estimator of T is obtained by
t
σu2 du.
0
(2.13)
Ttn =
a
2
3θ ψ22
[t/n ]−kn
i=0
[t/n ]−2kn
Hence, the consistency of
is achieved by subtracting the
noise-induced bias term that is similar to Zhang, Mykland, and
Aı̈t-Sahalia (2005). Furthermore, under the assumption that the
process Nt (8) is locally bounded, JLMPV09 showed that
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
22 θ σu4
σ 2α2
α4
+ 212 u u + 11 u3
θ
θ
.
(2.15)
1/4
The estimator Ctn has the convergence rate n , which is
known to be the best attainable (see Gloter and Jacod 2001).
To obtain a feasible version of stable convergence in (2.14),
JLMPV09 suggested a consistent estimator of the conditional
variance Ŵt given by
Ŵtn =
422
3θ ψ24
+
+
[t/n ]−kn
i=0
4n
θ3
n
θ3
n 4
Z
i
[t/
n ]−2kn
12
22 ψ1
−
3
ψ24
ψ2
i=0
n 2
Z
11
12 ψ1
22 ψ1
−2
+
2
3
ψ2
ψ24
ψ2
i+2k
n
i
2
j =i+kn +1
P
n Z 2 n Z 2 −→
Ŵt .
i+2
i
n 2
Z
j
(2.16)
i=1
The statistic Ŵtn is obtained by separately estimating the three
integrals appearing in the definition of Ŵt . It has the appealing property of being positive as long as the weight function
g(x) = x ∧ (1 − x) is used. Then, by the properties of stable
convergence, we deduce the convergence in distribution
−1/4
n
Ctn − IVt
d
n
−→ N(0, 1).
Ŵt
(2.17)
Clearly, we are also
t able to estimate other random quantities
of similar form as 0 γu2 du. Consider a random variable of the
type
Tt = a
t
0
σu4 du
+b
i
i+2k
n
j =i+kn +1
n 2
Z
aψ12 − bθ 2 ψ1 ψ2
c+
θ 4 ψ22
j
P
n Z 2 n Z 2 −→
Tt .
i+2
i
×
(2.18)
i=1
For instance, when taking (a, b, c) = (1, 0, 0),
t we obtain a consistent estimator of the integrated quarticity 0 σu4 du.
To reduce the finite-sample bias in the case of too small
sampling frequencies, we need to slightly modify our statistics.
First, let us introduce the finite-sample analogs of the quantities
ψ1 , ψ2 , ij , φ1 (s), φ2 (s) as given by
ψ1kn = kn
ψ2kn =
2
kn
j +1
j
g
−g
,
k
k
n
n
j =1
kn −1
j
1
,
g2
kn j =1
kn
kn
i
i−1
−g
g
k
k
n
n
i=j +1
i−j
i−j −1
× g
−g
,
kn
kn
kn
i−j
i
φ2kn (j ) =
g
,
g
kn
kn
i=j +1
⎞
⎛
k
n −1
kn 2 1 kn 2
kn
11 = kn ⎝
φ1 (j ) − φ1 (0) ⎠ ,
2
j =0
φ1kn (j ) =
[t/n ]−2
×
n
+
4
(2.14)
where the convergence is stable in distribution [see, e.g.,
Renyi
(1963) for the definition of stable convergence], and
t
Ŵt = 0 γu2 du and γu2 are defined by
4
= 2
ψ2
i=0
n 2
Z
[t/n ]−2
st
n
Ct − IVt −→ MN(0, Ŵt ),
−1/4
n
γu2
×
Ctn
2
n 4
Z + n bθ ψ2 − 2aψ1
i
2θ 4 ψ22
0
t
σu2 αu2 du
+c
t
0
αu4 du
k22n =
k12n
⎛
k
n −1
1 ⎝
kn3 j =0
⎞
kn 2 1 kn 2
φ2 (j ) − φ2 (0) ⎠ ,
2
⎛
⎞
kn −1
1 ⎝
1
=
φ kn (j )φ2kn (j ) − φ1kn (0)φ2kn (0)⎠ .
kn j =0 1
2
Note that ψikn → ψi , kiln → il , and φikn (j ) − φi (j/kn ) → 0
(i, l = 1, 2), but the above approximations are the “correct”
quantities (i.e., these are the quantities that really appear in the
proof). Now, we redefine our main statistics as follows:
−1
ψ1kn n
n
Ct,a = 1 −
2θ 2 ψ2kn
√
[t/n ] n
ψ1kn n
n
n
V (Z, 2)t −
RVt
×
([t/n ] − kn + 2)θ ψ2kn
2θ 2 ψ2kn
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
Table 1. Summary statistics of the underlying data
and
n
Ŵt,a
= 1−
ψ1kn n
2θ 2 ψ2kn
−2
[t/n ]−kn
4
Z̄ n +
i
4k22n [t/n ]
4
3θ ψ2kn ([t/n ] − kn + 1)
4n [t/n ]
3 ([t/ ]−k +1)
θ
n
n
i=0
kn
k n ψ k n
× k123 − 22k 14
ψ2 n
ψ2 n
×
[t/n ]−kn
n 2
Z̄
i+2k
n
n [t/n ]
3 ([t/ ] − 2)
θ
n
j =i+kn +1
i=0
kn
2kn ψ kn
kn (ψ kn )2
× k112 − 12k 31 + 22 k 14
ψ2 n
ψ2 n
ψ2 n
[t/n ]−2
2
n 2 n
i Z i+2 Z
×
,
×
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
169
i
|nj Z|2 +
i=1
where the subscript a stands for adjusted. Note that the finitesample adjustments do not influence the asymptotic results.
To explain the aforementioned modifications, let us consider
n
. First, the factor [t/n ]/([t/n ] − kn + 1) in
the statistic Ct,a
n
the definition of Ct,a
is an adjustment for the true number of
summands in V (Z, 2)nt . On the other hand, we have that
t
n n
n
RVt ≈
IVt ,
αu2 du +
2
2
0
where the error of this approximation is of order n and has
expectation 0. This means that the original statistic Ctn actually
estimates
ψ1kn n
IVt .
1−
2θ 2 ψ2kn
This explains the appearance of (1 −
n
of Ct,a
.
ψ1kn n −1
)
2θ 2 ψ2kn
in the definition
2.3 Choosing θ in Practice
To analyze the empirical properties of the preaveraging estimators and their sensitivity to the choice of θ , we employ
transaction and quote data of the stocks Exxon (XOM), Homedepot (HD), and Sonoco Products Co. (SON) traded at the NYSE
between May and August 2006 corresponding to 5 months or
88 trading days. We see this 5-month period as a good compromise between choosing a sample that, on the one hand, is
sufficiently long to guarantee empirically valid results but, on the
other hand, prevents aggregation over possibly many different
states of volatility. XOM and HD represent highly liquid stocks
with approximately 12 (80) and 8 (60) trade (quote) arrivals per
minute, respectively. Conversely, SON is significantly less liquid with approximately 2 (14) trade (quote) arrivals per minute.
As illustrated below, these stocks represent different empirical features and thus allow us to gain valuable insights into the
empirical performance of volatility estimators. However, to provide even more cross-sectional evidence, all empirical findings
reported in the article are replicated for three additional stocks,
Avg. time between trades (in sec)
Avg. time between quote arrivals (in sec)
Avg. time between nonzero quote changes
(in sec)
Avg. number of trades
Avg. number of quotes
Avg. proportion of nonzero trade returns
Avg. proportion of nonzero MQ returns
2
Avg. α̂ · 1e7
Avg. ξ̂
XOM
HD
SON
5.28
0.74
3.16
7.19
1.07
4.60
36.29
4.33
14.29
4428 3255 644
31,76 21,96 5405
0.73 0.60 0.62
0.23 0.23 0.30
0.10 0.08 0.69
0.31 0.23 0.23
NOTE: The table reports (daily) averages of the time between trades, quote arrivals, and
nonzero quote changes, the number of trades and quotes, the proportions of nonzero trade
2
(or midquote) returns, the (long-run) noise variance α̂ , the (long-run) noise variance per
trade ξ̂ , and first-order autocorrelations of underlying sampled returns.
that is, Citigroup (C), Tektronik (TEK), and Zale Corporation
(ZLC). These results are reported in the online Appendix.
Table 1 gives descriptive statistics on the number of trades,
quote arrivals, and nonzero returns for all stocks. It also yields
information on the magnitude of market
microstructure noise as
t
represented by its variance α 2t = 0 αu2 du and estimated by
⎧
N
N
N N
⎪
1 N N
⎪
⎪
−
i Z i−1 Z < 0,
Z
Z
,
if
⎪
i
i−1
⎪ N −1
⎨
i=2
i=2
2
α̂ t =
N N 2
⎪
⎪
i Z
⎪
⎪
⎪
,
otherwise,
⎩
2N
i=1
(2.19)
N
i Z
denotes trade-to-trade returns and N is the number
where
of all transaction returns. Equation (2.19) corresponds to the estimator proposed by Oomen (2006b) based on TRTS employing
all transactions, that is, n = N and = 1. As it provides only
positive variance estimates as long as the first-order autocorrela (Ni Z)2
tions of trade returns are negative, we replace it by N
i=1 2N
N
N
whenever N
i=2 (i Z)(i−1 Z) ≥ 0. The latter estimator is motivated by (2.11) in the case of iid noise. As suggested by Oomen
2
t /N) gives the noise-to-signal ratio per
(2006a), ξ̂t := α̂ t /(IV
t is computed by the maximum likelihood estitrade, where IV
mator (henceforth MLRV) proposed by Aı̈t-Sahalia, Mykland,
and Zhang (2005) (see the Appendix for implementation de2
tails). As shown in Table 1, α̂ t is varying substantially across
the different stocks, with ξ̂t being quite stable ranging from 0.2
to 0.3. We focus on four sampling schemes:
(i) calendar time sampling (CTS) using transaction prices,
(ii) calendar time sampling (CTS) using midquotes,
(iii) transaction time sampling (TRTS) using transaction
prices,
(iv) tick change time sampling (TTS) using midquotes.
CTS is commonly used in the empirical literature (see, e.g.,
Andersen et al. 2003) and is a natural choice if the sampling
frequency is moderate (e.g., 10 min). However, it faces the
problem of sampling mainly zero return intervals if the sampling frequency is very high. In this case, TRTS is a more natural
sampling scheme since one samples only whenever a transaction
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
170
Journal of Business & Economic Statistics, April 2013
Figure 1. Autocorrelations (up to three lags) of transaction price and midquote returns for XOM. Upper panel: CTS based on 1 sec and 5 sec.
Lower panel: T(R)TS based on 1 tick and 2 ticks.
has occurred avoiding sampling artificial zero returns. However,
even in this case, a high number of zero returns can be sampled
since many trades do not necessarily imply a price or quote
change. As an alternative, TTS ignores any zero returns and
samples only whenever quotes have been changed. As stressed
by Andersen, Dobrev, and Schaumburg (2010), TTS is justified
if volatility is constant in tick time (and thus observation times
of price movements and the quadratic variation of the price process are perfectly correlated), whereas CTS or TTS is optimal
when observation times are exogenous to the price process and
inference can be conditioned on observation times. However,
empirically it is rather unclear which assumption is justifiable
and thus which sampling scheme should be preferred. This motivates us to focus on all commonly applied sampling schemes.
Figures 1–3 show empirical autocorrelations for trade and
midquote returns sampled based on the different sampling
schemes. In the case of TRTS and TTS, we generally observe
negative first-order autocorrelations. The negative sign is well
confirmed by the literature and is predominantly driven by the
bid–ask bounce effect; see Roll (1984). Interestingly, midquote
returns tend to induce higher (negative) autocorrelations than
transaction returns. The same patterns also emerge based on
individual ask or bid quote returns. This is induced by the fact
that in the hybrid trading system of NYSE, transaction prices
are often located inside of the bid–ask spread and thus are less
discrete than (mid-)quote returns. Moreover, the empirical evidence shows that quote processes reveal strong reversals that are
naturally most pronounced in the case of TTS. Corresponding
evidence is also confirmed by Hautsch and Huang (2012) who
analyzed the quote impact of order arrivals based on blue chip
stocks. These findings show that using (mid-)quote returns does
not necessarily allow reducing the discreteness (and thus noise)
in the data. Overall, the serial dependence in CTS returns is
lower. This is particularly true in the case of trade price returns
Figure 2. Autocorrelations (up to three lags) of transaction price and midquote returns for HD. Upper panel: CTS based on 1 sec and 5 sec.
Lower panel: T(R)TS based on 1 tick and 2 ticks.
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
171
Figure 3. Autocorrelations (up to three lags) of transaction price and midquote returns for SON. Upper panel: CTS based on 1 sec and 5 sec.
Lower panel: T(R)TS based on 1 tick and 2 ticks.
and if the sampling frequency is higher than the average trade
arrival frequency. As a result, consecutive returns are mostly
zero and trade-to-trade (or tick-to-tick) dynamics break down.
Interestingly, returns based on XOM trading are positively autocorrelated. This is in contrast to common findings and makes
this stock an interesting special case to analyze the properties
of preaveraging estimators under various conditions. See also
Hansen and Lunde (2006), Oomen (2006b), and Gatheral and
Oomen (2010) for an extensive discussion of empirical properties of market microstructure noise effects.
To gain insights into the performance of the proposed preaveraging estimators, we benchmark them against the most common
competing approaches employed in recent literature. In particular, we implement a subsampled RV estimator computed as
[t/n ]−kn
RVSSt =
kn−1
i=0
|Z(i+kn )n − Zin |2 .
(2.20)
Hence, RVSS corresponds to the subsampling counterpart to the
preaveraging estimator as it runs on the same underlying return
grid (ni ) and samples over intervals of the length kn . In fact, it
corresponds to a preaveraging estimator that equally weights all
return observations within the local window [in , (i + kn )n ]
and thus allows us to study the role of the implicit preaveraging
weighting scheme. Moreover, we use the maximum likelihood
estimator (henceforth MLRV) proposed by Aı̈t-Sahalia, Mykland, and Zhang (2005) and the realized kernel (RK) estimator (henceforth (MLRV)) introduced by BHLS08a based on the
Tukey-Hanning2 kernel with optimal bandwidth. For implementation details, see the Appendix. The preaveraging estimators
and their asymptotic variances are computed as described in the
previous section. As the estimators are not necessarily positive
in all cases, we bound them from below by zero. This happens,
however, quite rarely and only in cases where either θ or n is
chosen to be very large or in the case of an insufficient number
of intraday observations (as sometimes occurs in the case of
more illiquid stocks).
n
for different
Figures 4 and 5 show plots of Ctn and Ct,a
choices of θ based on 3-sec CTS as well as TRTS and TTS using
transaction prices and midquote changes. Note that in the case
of high values for θ , that is, high values of kn , the preaveraging
estimators cannot necessarily be computed when there are not
sufficient intraday observations. This might occur particularly
in the case of less liquid assets. Then, the corresponding figures
are based on only those trading days for which all estimators
can be computed for all values of θ avoiding any sample
selection effects. The range of analyzed realizations of θ is
chosen to guarantee using at least 90% of the overall sample.
Figure 4. Daily averages of Ctn (multiplied by 1e5) depending on θ . Based on T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks:
MLRV and RK.
172
Journal of Business & Economic Statistics, April 2013
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
n
Figure 5. Daily averages of Ct,a
(multiplied by 1e5) depending on θ. Based on T(R)TS and 3-sec CTS using prices and midquotes.
Benchmarks: RVSS, MLRV, and RK.
The estimators’ sensitivity to the choice of θ , and thus the width
of the local preaveraging window, is highest if θ is small. This
sensitivity is particularly strong in the case of T(R)TS inducing
a strongly negative bias (relative to the other estimators) if θ is
small. This bias is obviously induced by the fact that in finite
samples, the statistic Ctn estimates An · IVt with
ψ1kn n
An = 1 −
2θ 2 ψ2kn
(see Section 2.2). For θ → 0, An is downward biased. This
effect is strongest in the case of T(R)TS. For CTS, the bias
is significantly smaller. Since 3-sec CTS and T(R)TS employ
essentially very similar underlying price or quote sampling information, the different θ -sensitivity of both sampling schemes
is only explained by a higher sampling frequency −1
n in the
case of CTS resulting in 7800 sampling intervals per day and
scaling up An toward one. Hence, CTS based on possibly high
frequencies is suggested to remove finite-sample biases. Conversely, as shown in Table 1, even for very actively traded assets,
T(R)TS does not induce sufficiently many sampling intervals to
ensure An ≈ 1. Hence, in the case of T(R)TS, scaling by A−1
n
n
as in Ct,a
is essential to reduce the estimator’s negative bias and
ensures stabilizing the estimates for small values of θ . However,
even after finite-sample adjustments, we still observe a slightly
negative bias of the estimator in the case of T(R)TS. This is obviously induced by a relatively strong impact of the noise-induced
component ψ1 n RVnt /(2θ 2 ψ2 ) driving down the estimator if kn
is small. In this case, the width of the local window is not sufficient to diminish the influence of noise. For larger values of θ
and thus kn , this effect vanishes and we observe a stabilization
of the estimator.
For larger values of θ , the impact of the underlying sampling
scheme diminishes and the estimates tend to stabilize. Preaveraging and RK estimators are quite similar (on average) for
values of θ around 0.4–0.6. This choice of θ seems to widely
resemble the optimal bandwidth choice in the RK according
to BHLS08b. If, however, the preaveraging interval becomes
too high, we observe a downward trend of the estimates. In
these cases, the estimators tend to “oversmooth” and thus yield
downward biased estimates. These results suggest choosing θ
not too small, but rather conservatively, confirming the results
by BHLS08b. Plotting also the subsampled estimators, RVSS,
allows us to analyze the impact of the implicit preaveraging
weighting schemes. We observe that in the case of longer local
intervals, the subsampling estimator is downward biased relative
to the preaveraging estimator and thus tends to “oversmooth”
even more strongly. If the preaveraging interval shrinks, also
the subsampling estimator becomes quite sensitive to the choice
of θ and reveals significant biases. However, interestingly, it is
less biased than T(R)TS-based preaveraging estimators. In these
situations, the implicit preaveraging weighting scheme induces
an even higher θ -sensitivity.
To analyze the impact of the underlying sampling fren
based
quency, Figure 6 shows the θ -sensitivity of Ctn and Ct,a
on TRTS employing different sampling frequencies (n ∈
{1, 3, 5, 10, 20}) based on CTS and T(R)TS. As expected, the
finite-sample adjustment becomes even more important when
n is large, that is, the sampling frequency is small. In these
scenarios, the negative biases for small values of θ become substantial. The lower panel of Figure 6 shows that in these cases,
the finite-sample adjustment is still very effective and significantly reduces the bias (note that the figures use different scales
on the vertical axis). Moreover, we observe a downward bias of
the estimator with increasing n . Hence, preaveraging based on
sparse sampling implies an oversmoothing and thus downward
bias of volatility. This is particularly true for the subsampled
RV estimator. In this case, an implicit equal weighting of underlying returns is clearly inferior to the preaveraging weighting
scheme g(·) (see Section 2) putting less weights on more distant
observations. We also observe that these effects become more
extreme if the underlying liquidity of the stock declines.
Finally, note that we consider average values of θ across
the entire sample. However, recall from (2.15) that an optimal
choice of θ depends on the local signal-to-noise ratio and thus
should also depend on intraday variations of volatility. With the
latter following the typical U-shape intraday pattern, a constant
θ tends to oversmooth in the morning and undersmooth around
midday. This effect is naturally stronger based on CTS than
based on T(R)TS. However, our empirical findings above show
that oversmoothing (implied by a too high θ ) is less harmful
for the estimator’s bias than undersmoothing (implied by a too
small θ ). Consequently, as practical guidance in situations where
the intraday volatility strongly fluctuates, we suggest using θ
rather conservatively with the choices of θ recommended above
serving as lower bounds. Theoretically, however, the parameter
θ should be indeed chosen according to the local signal-tonoise ratio. The precise description of such an approach and its
efficiency gain compared with the case of a constant choice of θ
are investigated by Jacod and Mykland (2012). However, their
method is computationally clearly more involved.
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
173
n
Figure 6. Daily averages of Ctn and MLRV (top panel) as well as Ct,a
and RVSSt (bottom panel) for different values of θ and n ∈
{1, 3, 5, 10, 20} using TRTS. All estimators multiplied by 1e5.
Figure 7 depicts the average values of the approximate standard deviations of
the preaveraging estimates with finite-sample
1/4
Ŵt,a . For the more liquid assets, we observe
adjustments, n
that the standard deviation is monotonically increasing in θ regardless of the finite-sample adjustment. Hence, using a larger
preaveraging window ultimately increases the estimation error.
However, for less liquid stocks (see also the online Appendix),
the corresponding plots reveal slight nonmonotonicities with increasing variances if θ becomes too small. In these cases, the
local window is obviously too short to effectively remove market microstructure noise. Hence, choosing too small values of θ
in the case of less liquid assets can be harmful not only for the
estimator’s unbiasedness but also for its efficiency.
The relationships revealed by Figures 4–7 indicate an obvious
trade-off between bias and efficiency. Minimizing the estimator’s variance generally yields very small values of θ and thus
the length of the local interval. For these realizations, however,
the “θ -signature plots” in Figures 4 and 5 reveal the highest
sensitivity of the estimators with respect to θ and the underlying
sampling schemes. Moreover, as shown in Section 4, choosing
−1/4 √
Figure 7. Averages of nt
prices and midquotes.
θ too small turns out to be also quite harmful for jump-robust
estimations and jump detections. Hence, aiming at finding a
“global” choice of θ balancing biases (in the case of too short
local intervals) against inefficiency (in the case of too long local
intervals), θ should be chosen as the smallest possible value at
which the signature plots tend to stabilize and the divergence
between different sampling schemes tends to vanish. According to Figures 4 and 5, the optimal choice of θ is thus around
0.4 based on liquid assets. In the case of less liquid stocks, we
suggest θ ≈ 0.6.
3.
3.1
CONTINUOUS SEMIMARTINGALES WITH
DEPENDENT NOISE
Preaveraging With Dependent Noise
Model (2.2) implies that the noise process is uncorrelated,
which is not realistic at very high frequencies (see, e.g., Hansen
and Lunde 2006). In this section, we allow for serial correlation
in the noise process and derive the corresponding asymptotic
Ŵ t,a (multiplied by 1e5) for different choices of θ. Based on highest frequency T(R)TS and 3-sec CTS using
174
Journal of Business & Economic Statistics, April 2013
results for the preaveraging method. For an alternative approach
of dealing with dependent noise, see Nolte and Voev (2009). We
consider a model
Zin = Xin + Ui ,
(3.21)
where X and U are independent, and (Ui )i≥0 is a stationary qdependent sequence (for some known q > 0), that is, Ui and Uj
are independent for |i − j | > q. As we remarked before, model
(3.21) only makes sense in discrete time due to the dependence
structure of the noise. We define the covariance function by
ρ(k) := cov(U1 , U1+k ).
statistic V (Z, 2)nt . The asymptotic results are presented in the
following theorem.
Theorem 1. Assume that EU12 < ∞. Then we have
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
with
ρ 2 = ρ(0) + 2
q
ρ(k).
where Ŵt (q) =
t
γu2 (q) =
4
ψ22
ni Zni+k Z,
i=0
k = 0, . . . , q + 1. (3.23)
In the econometric literature, such estimators are called realized
autocovariances (see, e.g., BHLS08a). The following lemma
describes the asymptotic behavior of γtn (k).
γu2 (q)du and γu2 (q) is defined by
0
ρ 2 σu2
ρ4
22 θ σu4 + 212
+ 11 3 . (3.26)
θ
θ
As before, we need to estimate the conditional variance Ŵt (q)
to obtain a feasible version of (3.25). In fact, the estimation
procedure is somewhat easier than the one presented in Section
2, because the quantity ρ 2 is not time varying.
Proposition 1. Assume that EU18 < ∞. Then, we obtain
Ŵtn (q)
P
P
γtn (k) −→ t(2ρ(k) − ρ(k − 1) − ρ(k + 1)),
P
Note that Lemma 2 provides estimates ρn (0), . . . , ρn (q) of
ρ(0), . . . , ρ(q) that are obtained by a simple recursion:
2
Thus,
q we deduce a2 n -consistent estimator ρn = ρn (0) +
2 k=1 ρn (k) of ρ that can be used for bias correcting the
i=0
n 4
Z
i
1/4
n
P
γtn (q + 1) −→ −tρ(q).
γtn (q + 1)
,
t
γ n (q)
ρn (q − 1) = − t
+ 2ρn (q),
t
γ n (q − 1)
+ 2ρn (q − 1) − ρn (q).
ρn (q − 2) = − t
t
and it holds that
γtn (q) −→ t(2ρ(q) − ρ(q − 1)),
ρn (q) = −
[t/n ]−kn
+
γtn (0) −→ 2t(ρ(0) − ρ(1)),
k = 1, . . . , q − 1,
422
=
3θ ψ24
√
[t/
n ]−2kn
n 2
22 ψ1
8ρn2 n 12
Z
−
i
4
3
2
θ
ψ2
ψ2
i=0
4ρ 4 11
22 ψ12
12 ψ1
P
+ 3n
+
−→ Ŵt (q),
−
2
θ
ψ22
ψ24
ψ23
Lemma 2. Assume that EU12 < ∞. Then, we obtain
−1/2
(3.25)
(3.22)
[t/n ]
st
n1/4 Ctn (q) − IVt −→ MN(0, Ŵt (q)),
k=1
Similar to (2.12), we obtain a bias that has to be estimated from
the data. The estimation procedure is a bit more involved than
in the previous section. We introduce a class of estimators given
by
γtn (k) = n
(3.24)
Furthermore, when EU18 < ∞, we deduce the associated central
limit theorem
Similar to the convergence in (2.12), we obtain the following
results.
Lemma 1. Assume that EU12 < ∞. Then, we have
√
t
n
ψ1 t
n P
σu2 du + 2 ρ 2
V (Z, 2)t −→
θ ψ2
θ
ψ2
0
√
n
ψ1 t
P
:=
V (Z, 2)nt − 2 ρn2 −→ IVt .
θ ψ2
θ ψ2
Ctn (q)
Ctn (q) − IVt
d
n
−→ N(0, 1).
Ŵt (q)
The corresponding finite-sample adjustments are obtained
similarly as in Section 2.2 and are given by
n
(q)
Ct,a
: = 1−
ψ1kn n
−1
2θ 2 ψ2kn
ψ1kn t 2
−
ρn ,
θ 2 ψ2kn
√
[t/n ] n
([t/n ]−kn +1)θ ψ2kn
V (Z, 2)nt
(3.27)
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
Hautsch and Podolskij: Preaveraging-Based Estimation of Quadratic Variation
175
Figure 8. Daily averages of Ctn (1) (upper panel) and Ctn (2) (lower panel) for different values of θ. Estimators multiplied by 1e5. Based on
highest frequency T(R)TS and 3-sec CTS using prices and midquotes. Benchmarks: RVSS, MLRV, and RK implemented based on trade prices.
and
−2
4k22n [t/n ]
kn 4
3θ ψ2 ([t/n ] − kn + 1)
√
[t/n ]−kn
2
n 4
Z + 8ρn n [t/n ]
×
i
θ 2 ([t/n ] − kn + 1)
i=0
[t/ ]−2k
n
n
n 2
k12n
k22n ψ1
Z
× 3 − 4
i
ψ2kn
ψ2kn
i=0
n
(q) = 1 −
Ŵt,a
+
×
ψ1kn n
2θ 2 ψ2kn
4ρn4 [t/n ]
3
θ ([t/n ] − kn +
k11n
kn 2
ψ2
−
1)
k n ψ k n
2 12 13
ψ2kn
+
2
k22n ψ1kn
kn 4
ψ2
. (3.28)
n
n
Figure 8 shows daily averages of Ct,a
(1) and Ct,a
(2) depending on θ . Comparing these signature plots with Figures 4 and 5,
we observe that in the case of CTS, the dependent noise-robust
version of the preaveraging estimator provides similar shapes
as the iid-noise version. Conversely, for T(R)TS, we observe
n
n
(q) > Ct,a
. Comparing the expressions
that in most cases Ct,a
n
n
for Ct,a (q) and Ct,a (and standardizing t to one), this can be
induced only by
n RVnt
> ρn2 .
2
Obviously, if there is negative (positive) serial dependence in
the noise process {U }, the estimator n RVnt /2 is upward (downward) biased. As shown below, we indeed find evidence for
significant serial correlations in the noise process inducing an
upward bias of the expression n RVnt /2. These effects are most
distinct for TTS for which also the strongest serial dependence
in the return process is found (see Table 1). Similar but weaker
effects are observable in the case of TRTS. The figures show
that the corresponding finite-sample adjustments seem to imply overcorrections of the estimators’ negative biases inducing
n
(q)
even slight upward biases if θ is small. This indicates that Ct,a
n
tends to be more sensitive to the choice of θ than Ct,a . Overall,
n
the signature plots reveal similar shapes as for Ct,a
. Hence, the
θ -sensitivity is clearly reduced for values θ > 0.4.
3.2
A Test for Dependence
In the previous section, we assumed that the number of nonvanishing covariances q is known. In practice, however, we need
a decision rule on the choice of q based on the discrete observations of the price process. Formally speaking, we require a test
procedure to decide whether ρ(k) = 0 for some k ≥ 1. To illustrate the underlying idea, consider a 1-dependent noise model.
In this case, we aim at testing whether ρ(1) = 0, that is, whether
the noise process is actually iid. Thus, we obtain the following
hypothesis:
H0 : ρ(1) = 0,
H1 : ρ(1) = 0.
As we will observe from the proof of Lemma 2, the statistics γtn (k) are asymptotically unaffected by the presence of the
process X (thus, only the noise process U drives their asymptotic behavior). Hence, by a standard central limit theorem for
q-dependent variables, we deduce that
−1/2
n
γtn (2)
τ2
d
−
,
− ρ(1) −→ N 0,
t
t
176
Journal of Business & Economic Statistics, April 2013
Figure 9. Histogram of AR(1) test statistic Sn . Based on highest frequency T(R)TS and 3-sec CTS.
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:05 11 January 2016
where γtn (2) is defined by (3.23) and τ 2 is given by
τ 2 = τ (0) + 2
2
τ (k),
k=1
τ (k) = cov((U1 − U0 )(U3 − U2 ), (U1+k − Uk )(U3+k − U2+k )).
Obviously, we require a consistent estimator of τ (k), 0 ≤ k ≤ 2,
to construct a test statistic. Set Hin = ni Zni+2 Z. Then, by the
weak law of large numbers for q-dependent variables, we deduce
that
[t/ ]
n
n
− Hin Hi+3
n i=0 n Hin Hi+k
P
τn (k) =
−→ τ (k),
t
0 ≤ k ≤ 2.
The latter implies that τn2 = τn (0) + 2 2k=1 τn (k) is a consistent
estimator of τ 2 . Then, the test statistic is given by
Sn =
−1/2
−n γtn (2)
.
tτn2
Observe that under H0 : ρ(1) = 0, we obtain that
d
Sn −→ N(0, 1).
We reject H0 : ρ(1) = 0 at level α when
|Sn | > c1− α2 ,
where c1− α2 denotes the (1 − α2 ) quantile of N(0, 1) distribution. Note that this test is consistent against the alternative
H1 : ρ(1) = 0.
Figure 9 shows the (time series) distribution of the test statistic Sn based on highest frequency T(R)TS and 3-sec CTS. We
observe for liquid stocks that Sn takes highly negative values
supporting the evidence for significantly negative serial dependence in the noise process and confirming the results by Hansen
and Lunde (2006). The strength of the serial dependence is
decreasing if the underlying trading frequency of the stock becomes smaller. Correspondingly, the highest test statistics are
found for XOM, whereas for the less liquid stocks, the time
series distribution of the test statistic is virtually symmetrically
around zero. Moreover, the highest magnitudes are found for
TTS using midquotes for which we also observe the strongest
serial dependence in the underlying return series. Conversely,
3-sec CTS yields significant lower test statistics (in absolute
terms). Hence, we can conclude that the serial dependence in
underlying returns is mainly driven by a serial dependence in
the corresponding noise process. As reported in Section 2, these
autocorrelations and, thus, the test statistic Sn are obviously
strongly dependent on the underlying sampling scheme. This is
most evident for midquote-based sampling. Conversely, based
on CTS, Sn is close to zero on average in most cases. T