
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online)

Tests of Short Memory With Thick-Tailed Errors
Christine Amsler & Peter Schmidt
To cite this article: Christine Amsler & Peter Schmidt (2012), "Tests of Short Memory With Thick-Tailed Errors," Journal of Business & Economic Statistics, 30(3), 381-390. DOI: 10.1080/07350015.2012.669668


Tests of Short Memory With Thick-Tailed Errors
Christine AMSLER
Department of Economics, Michigan State University, East Lansing, MI 48824 (amsler@msu.edu)

Peter SCHMIDT


Department of Economics, Michigan State University, East Lansing, MI 48824, and College of Business and
Economics, Yonsei University, Seoul, Korea (schmidtp@msu.edu)
In this article, we consider the robustness to fat tails of four stationarity tests. We also consider their
sensitivity to the number of lags used in long-run variance estimation, and the power of the tests. Lo’s
modified rescaled range (MR/S) test is not very robust. Choi's Lagrange multiplier (LM) test has excellent robustness properties but is not generally as powerful as the Kwiatkowski–Phillips–Schmidt–Shin (KPSS)
test. As an analytical framework for fat tails, we suggest local-to-finite variance asymptotics, based on
a representation of the process as a weighted sum of a finite variance process and an infinite variance
process, where the weights depend on the sample size and a constant. The sensitivity of the asymptotic
distribution of a test to the weighting constant is a good indicator of its robustness to fat tails. This article
has supplementary material online.
KEY WORDS: Fat tails; Robustness; Stationarity.

1. INTRODUCTION

This article considers four tests of the null hypothesis that a
series is stationary and has short memory: the Kwiatkowski–
Phillips–Schmidt–Shin (KPSS) test of Kwiatkowski et al.
(1992), the modified rescaled range (MR/S) test of Lo (1991),
the rescaled variance (V/S) test of Giraitis et al. (2003), and the
Lagrange multiplier (LM, or ω1 ) test of Choi (1994). There are
a number of other tests that could have also been considered, including the tests of Leybourne and McCabe (1994, 1999); Xiao
(2001); Sul, Phillips, and Choi (2005); Harris, Leybourne, and

McCabe (2007, 2008); de Jong, Amsler, and Schmidt (2007);
and Pelagatti and Sen (2009). The tests that we do consider all
share the features that they depend on the cumulations of the
demeaned series and they use a nonparametric long-run variance estimate, so they are amenable to the same methods of
asymptotic analysis.
Let the observed series be Xt, t = 1, . . . , T. The assumed data-generating process under the null hypothesis is Xt = µ + εt,
where εt is a zero-mean, stationary, short-memory process.
Kwiatkowski et al. (1992) and Choi (1994) considered the alternative hypothesis to be that the series has a unit root, whereas Lo
(1991) and Giraitis et al. (2003) considered the alternative to be
stationary long memory. We will not distinguish these two cases
carefully because tests that have power against one alternative
will generally also have power against the other. For example,
Lee and Schmidt (1996) and Lee and Amsler (1997) showed
that the KPSS test is consistent against stationary and nonstationary I(d) alternatives. Similarly, the MR/S test and the V/S
test, which are intended to have power against I(d) alternatives,
will also have power against a unit root.
In this article, we are primarily concerned with the magnitude of possible finite sample size distortions caused by errors
with thick tails. The original draft of this article (Amsler and
Schmidt 1999) was motivated by the intuition that these size
distortions should be less for the KPSS test than for the MR/S

test. Both tests depend on the cumulations of the demeaned series. However, the KPSS test depends on the sum of squares of

the cumulations, while the MR/S test depends on the difference
between the maximal and the minimal value of the cumulations.
Intuitively, maximal and minimal values may be very sensitive
to tail thickness, so that the finite sample size distortions caused
by thick tails may be worse for the MR/S test than for the KPSS
test. Simulations reported in that paper and in Section 3 of this
article indicate that this intuition is correct. Our simulations also
give some less expected results, notably that Choi’s LM test is
much more robust to thick-tailed errors than the KPSS or the
V/S test. This robustness comes at the expense of some sacrifice
of power.
To move beyond a pure simulation study, some further analytical framework is needed. When the data distribution is in
the domain of attraction of a stable law, different limit theories
apply, based on the Lévy process. In this article, we are interested in fat tails without infinite variance, for which standard
Wiener-process asymptotics apply but may fail to be adequate
in finite samples when tails are thick. We therefore consider
local-to-finite variance asymptotics, based on a representation
of the series $X_t$ as follows: $X_t = X_{1t} + (c/g(T))X_{2t}$. Here, $X_{1t}$ is a short-memory process without thick tails (e.g., normal) and $X_{2t}$ has a symmetric stable distribution. The term $c$ is a constant, while the term $g(T)$ is chosen so that the cumulation of $X_t$ converges weakly to a weighted sum of the Wiener and Lévy processes, with the weight depending on $c$. For example, for the case that $X_{2t}$ is Cauchy, $g(T) = \sqrt{T}$. The asymptotic distributions of the various statistics under this representation of the
process depend on the parameter c, and we evaluate the asymptotic distributions to see which are more sensitive to the value
of c. The statistics for which the asymptotic distribution is more
sensitive to c are indeed those that have more substantial size
distortions with thick-tailed errors in our simulations.


2. NOTATION AND FURTHER DISCUSSION

As discussed above, let the observed series be Xt , t = 1, . . . ,
T, where Xt = µ + εt , and where εt is a zero-mean, stationary,
short-memory process. For the asymptotic analyses of this article, we need to require that εt has cumulations that satisfy a
functional central limit theorem (FCLT). For the moment, we
consider the case that εt has finite variance (we will discuss
the infinite variance case later), and we assume that εt satisfies
the following condition (which we borrow from Müller 2005),
which therefore defines the null hypothesis.


Condition 1. (a) $E(\varepsilon_t) = 0$. (b) $\varepsilon_t$ is stationary with finite covariances $\gamma_j = E(\varepsilon_t \varepsilon_{t-j})$. (c) $\sigma^2 = \sum_{j=-\infty}^{\infty} \gamma_j$ is finite and nonzero. (d) $T^{-1/2} \sum_{t=1}^{[rT]} \varepsilon_t \Rightarrow \sigma W(r)$, where $\Rightarrow$ indicates weak convergence, and $W(r)$ is a Wiener process.
Define the demeaned series as $e_t = X_t - \bar X$, and define the cumulation (partial sums) of $e_t$ to be $S_t = \sum_{j=1}^{t} e_j$. Define the estimated autocovariances as $\hat\gamma_j = T^{-1} \sum_{t=j+1}^{T} e_t e_{t-j}$ (for $j = 0, 1, 2, \ldots$), and define the long-run variance estimator using the Bartlett kernel and $\ell$ lags: for $\ell = 0$, $s^2(\ell) = \hat\gamma_0$, while for $\ell \ge 1$, $s^2(\ell) = \hat\gamma_0 + 2 \sum_{s=1}^{\ell} \bigl(1 - \tfrac{s}{\ell+1}\bigr)\hat\gamma_s$. Then, we can define the KPSS, MR/S, and V/S statistics as follows:

$$\mathrm{KPSS} = \hat\eta_\mu = T^{-2} \sum_{t=1}^{T} S_t^2 \Big/ s^2(\ell), \qquad (1)$$

$$\mathrm{MR/S} = T^{-1/2} Q_\mu = T^{-1/2}\bigl[\max_t S_t - \min_t S_t\bigr] \big/ s(\ell), \qquad (2)$$

$$\mathrm{V/S} = T^{-2} \sum_{t=1}^{T} (S_t - \bar S)^2 \Big/ s^2(\ell). \qquad (3)$$

Choi's test demeans differently. Define the cumulations of the original series as $C_t = \sum_{j=1}^{t} X_j$ and let $\hat S_t$ be the residuals when $C_t$ is regressed on $t$ (for $t = 1, \ldots, T$), without intercept. We define $\hat S_0 = 0$, $\hat\sigma_X^2 = T^{-1} \sum_{t=1}^{T} \hat S_t^2$, and $s^2(\ell)$ as above but with $\hat S_t$ used in place of $e_t$. Then, Choi's test statistic is

$$\omega_1 = T^{-1} \sum_{t=2}^{T} \hat S_{t-1} \hat S_t \Big/ s^2(\ell) - \tfrac{1}{2}\bigl(1 - \hat\sigma_X^2 \big/ s^2(\ell)\bigr). \qquad (4)$$
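To make the definitions concrete, here is a minimal Python sketch of Equations (1)-(4); the function names and array conventions are ours, not the authors', and the code simply follows the formulas as written above.

```python
import numpy as np

def bartlett_lrv(e, ell):
    """Long-run variance estimator s^2(l) with the Bartlett kernel."""
    T = len(e)
    gamma = [e[j:] @ e[:T - j] / T for j in range(ell + 1)]  # autocovariances
    return gamma[0] + 2 * sum((1 - s / (ell + 1)) * gamma[s]
                              for s in range(1, ell + 1))

def short_memory_tests(X, ell=0):
    """KPSS, MR/S, V/S, and Choi's LM statistics for a series X."""
    T = len(X)
    e = X - X.mean()                       # demeaned series e_t
    S = np.cumsum(e)                       # partial sums S_t
    s2 = bartlett_lrv(e, ell)
    kpss = (S @ S) / (T ** 2 * s2)
    mrs = (S.max() - S.min()) / (np.sqrt(T) * np.sqrt(s2))
    vs = ((S - S.mean()) ** 2).sum() / (T ** 2 * s2)
    C = np.cumsum(X)                       # Choi demeans differently:
    t = np.arange(1, T + 1)
    Shat = C - ((t @ C) / (t @ t)) * t     # residuals of C_t on t, no intercept
    s2c = bartlett_lrv(Shat, ell)
    sigX2 = (Shat @ Shat) / T
    lm = (Shat[:-1] @ Shat[1:]) / (T * s2c) - 0.5 * (1 - sigX2 / s2c)
    return kpss, mrs, vs, lm
```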

Under Condition 1, the cumulations $S_t$ converge to a multiple of a Brownian bridge: $T^{-1/2} S_{[rT]} \Rightarrow \sigma B(r)$ for $0 \le r \le 1$, where $B(r) = W(r) - rW(1)$. Then, if $s^2(\ell)$ is a consistent estimator of the long-run variance $\sigma^2$, the asymptotic distributions of KPSS, MR/S, and V/S do not depend on $\sigma^2$; they are functionals of $B(r)$ only. This will be true if the errors $\varepsilon_t$ are iid with mean zero and finite variance and $\ell$ is any fixed integer (including zero), or it will also be true if the errors satisfy Condition 1 and the number of lags grows at an appropriate (slow) rate. Basically, we require that as $T \to \infty$, $\ell \to \infty$ but $\ell/T \to 0$. General conditions under which $s^2(\ell)$ is consistent can be found in de Jong and Davidson (2000) and Jansson (2002).
We will hereafter refer to the asymptotics, just discussed,
as the “standard” asymptotics for these tests. They lead to 5%
upper tail critical values of 0.463, 1.733, and 0.187 for the KPSS,
MR/S, and V/S tests, respectively.
Similar remarks apply to Choi's test. See Choi (1994, p. 727) for the relevant asymptotic distribution when $s^2(\ell)$ is a consistent estimator of $\sigma^2$. The standard asymptotics lead to a 5% upper
tail critical value of approximately 0.2496 (his reported value)
for Choi’s test. An interesting and, apparently, not widely understood feature of the standard asymptotic distribution for this test
is the large amount of probability mass located near 0.25. While
we were programming our simulations, we rounded 0.2496 to
0.250 and got almost no rejections. Over 8% of the probability
in the asymptotic distribution turns out to lie between 0.249 and
0.250. We found (in simulations with T = 10,000 and 100,000
replications) that the probability of rejection was 0.083, with a
critical value of 0.249; 0.052, with a critical value of 0.2496;
0.027, with a critical value of 0.2499; and 0.002, with a critical
value of 0.2500. We ultimately used a critical value of 0.249636.
The point is that numerical accuracy is very, very important for
this test.
In practice, Kwiatkowski et al. (1992, p. 165) recommended
$\ell = o(T^{1/2})$. We will follow standard notation by defining (for integer $k$) $\ell_k = \mathrm{integer}[k(T/100)^{1/4}]$. A popular choice for the Bartlett kernel is $\ell = \ell_{12}$, which equals 10, 12, 14, 17, 21,
25, and 31 for T = 50, 100, 200, 500, 1000, 2000, and 5000,
respectively.
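In code, this lag rule is one line (a sketch; the name ell_k is ours):

```python
def ell_k(k, T):
    """Bandwidth rule l_k = integer[k (T/100)^(1/4)] used throughout the paper."""
    return int(k * (T / 100) ** 0.25)

# ell_k(12, T) gives 10, 12, 14, 17, 21, 25, 31 for the sample sizes listed above
```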
In this article, we also consider the use of critical values based
on the “fixed-b” asymptotics of Kiefer and Vogelsang (2005)
and Hashimzade and Vogelsang (2008). In this case, we are
interested in the asymptotic distribution of $s^2(\ell)$, and, therefore, of the various test statistics, under the assumption that $\ell = bT$,
where b ∈ (0, 1] is a fixed constant. The point is that in finite
samples, the fixed-b critical values will often lead to a better
finite sample approximation to the distribution of the statistics.
That is, no matter how one actually chooses the number of lags, if
it is positive, the fixed-b critical values with b = ℓ/T will give a
test of more accurate size than the standard critical values, which
correspond to b = 0. The fixed-b asymptotic distributions of the
KPSS statistic, under the null that Condition 1 holds and also
under the unit root alternative, are given by Amsler, Schmidt,
and Vogelsang (2009), and a similar analysis would apply to the
other three tests that we consider here. Table 1 gives the fixed-b
upper tail 5% level critical values for the four tests considered
in this article, calculated from a simulation with T = 10,000
and 50,000 replications. Note that these critical values are for
the case that the Bartlett kernel is used; different kernels lead to
different fixed-b critical values.

Table 1. Fixed-b 5% upper tail critical values for the four tests using Bartlett kernel

b      KPSS    MR/S    V/S     LM
0      0.463   1.733   0.187   0.249636
0.01   0.457   1.708   0.183   0.249622
0.02   0.454   1.691   0.178   0.249602
0.04   0.449   1.663   0.172   0.249578
0.06   0.441   1.645   0.166   0.249541
0.08   0.435   1.633   0.161   0.249530
0.10   0.427   1.634   0.156   0.249498
0.12   0.419   1.636   0.151   0.249466
0.14   0.413   1.654   0.147   0.249450
0.20   0.400   1.757   0.139   0.249360
0.25   0.391   1.878   0.142   0.249256


3. SIMULATIONS

In this section, we report the results of simulations designed to compare the finite sample size and the power of our
four tests. The data-generating process is very simple: the Xt ,
t = 1, . . . , T , are iid. The data are centered on zero (µ = 0).
We consider sample sizes T = 50, 100, 200, 500, 1000, and
2000. The following distributions for Xt are considered: standard normal; Student’s t with 10, 5, 3, and 2 degrees of freedom;
and standard Cauchy. All four tests are valid asymptotically
for the normal case and for the t-distributions with 10, 5, or 3
degrees of freedom. The standard and fixed-b asymptotics are
not valid for the Cauchy case, for which the mean does not exist
and the variance is infinite. The case of the t2 distribution is
more interesting. This distribution has an infinite variance but
its variance is “just barely infinite” in the sense of Kourogenis
and Pittis (2008) (it has finite absolute moments of order δ for
δ < 2). Abadir and Magnus (2004) showed that the t2 distribution is in the domain of attraction, but not in the normal domain
of attraction, of a normal distribution. Specifically, it follows a
central limit theorem but with a nonstandard normalization of
$(T \cdot \ln T)^{-1/2}$. The results of Kourogenis and Pittis also apply,
and so, an invariance principle holds. As a result, it is reasonable
to presume that all of the tests we consider will be asymptotically valid in this case. However, a proof of that result would be
a detour from the path of this article and we will not pursue it.
We have attempted to pick distributions for Xt that span the
empirically relevant range. Few studies report the kurtosis of their data, so this is a bit hard to know. One simple example, in de Jong,
Amsler, and Schmidt (2007, p. 321), reported kurtoses ranging
from 4.84 to 11.05 for five nominal exchange rate series. Since
the kurtosis for a t-distribution with k degrees of freedom is 3 +
6/(k – 4), these kurtoses are consistent with t-distributions with
k in the range of 5–7, that is, with moderately fat tails.
The data are generated by drawing pseudo-random normal
deviates and transforming them as appropriate. (For example,
the Cauchy is the ratio of two standard normals.) The number
of replications was 25,000 for T = 1000 and 2000, and 50,000
for the other sample sizes. We consider only upper tail 5% tests.
We consider the case of $\ell = 0$ and also $\ell = \ell_{12}$ (where $\ell_k = \mathrm{integer}[k(T/100)^{1/4}]$). The use of $\ell > 0$ is unnecessary with iid
data, but was considered because we want to see how sensitive
the various tests are to the choice of ℓ.
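A minimal sketch of this simulation design follows, reusing short_memory_tests from Section 2; the helper names are ours, while the distribution menu and the Cauchy-as-ratio-of-normals device are from the text.

```python
rng = np.random.default_rng(0)

def draw_errors(dist, T):
    """iid draws for the null DGP X_t = mu + eps_t with mu = 0."""
    if dist == "normal":
        return rng.standard_normal(T)
    if dist.startswith("t"):                 # Student's t with 10, 5, 3, or 2 df
        return rng.standard_t(int(dist[1:]), T)
    if dist == "cauchy":                     # ratio of two standard normals
        return rng.standard_normal(T) / rng.standard_normal(T)
    raise ValueError(dist)

# e.g., size of the 5% KPSS test (l = 0) under Cauchy errors, T = 200
size = np.mean([short_memory_tests(draw_errors("cauchy", 200))[0] > 0.463
                for _ in range(5000)])
```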
Table 2 gives the size of the various tests, using the standard
and the fixed-b critical values. There is an expanded version of
this table in the online Appendix to this article. We will discuss
first the case that ℓ = 0, in which case there is no distinction
between these two sets of critical values.
For the cases where the standard asymptotics apply (normal,
and t-distribution with three or more degrees of freedom), the
KPSS, V/S, and LM tests are all quite accurate (have size close
to 0.05) for all sample sizes, even T = 50. This is not true for the
MR/S test, however. For example, note its size, in the normal
case, of 0.014, 0.023, 0.032, and 0.038 for T = 50, 100, 200, and
500, respectively. Apparently, its convergence to its asymptotic
distribution is slower than that for the other three tests.
Next, consider the issue of robustness to fat tails. Here, there
is a very clear ranking: the LM test is the best, and the MR/S
test is the worst. The KPSS test is marginally better than the V/S

383

test. For example, note the size for T = 2000 in the Cauchy case:
0.026, 0.001, 0.017, and 0.042 for the KPSS, MR/S, V/S, and
LM tests, respectively. The MR/S test essentially does not reject
in the Cauchy case. These results are similar for smaller sample
sizes as well. A very general conclusion from these results is
that, except for the MR/S test, fat tails are not really a serious
problem. It takes a very extreme distribution, such as Cauchy,
before the size distortions are large enough to be worrisome.
Now consider the case that ℓ12 lags are used. When the
standard critical values are used, more lags cause size distortions
(too few rejections). The LM test is only minimally affected by
the number of lags, but for the other tests, these size distortions
can be large. For example, with T = 200 and ℓ12 ( = 14) lags,
size in the normal case is 0.039, 0.004, 0.020, and 0.045 for the
KPSS, MR/S, V/S, and LM tests. The size distortions caused by
a large number of lags are more or less completely eliminated
(except for the MR/S test at the smaller sample sizes) by using
the fixed-b critical values. This basically is the argument for
using the fixed-b critical values.
Amsler and Schmidt (2012) considered the robustness of the
four tests considered here to autocorrelation, namely first-order
autoregressive [AR(1)] errors with parameters 0.8 and 0.9. Their
results showed that the LM test is far more robust to autocorrelation than the other three tests. So, in terms of robustness to fat
tails, to the number of lags used, and to short-run dynamics, the
LM test appears to be the best of the four tests.
Table 3 gives the power of the tests against a unit root alternative. The online Appendix contains an expanded version
of this table. The parameterization of the alternative is a slight
generalization of the data-generating process (DGP) in KPSS, to
accommodate cases of infinite variance. We generate the series
as $X_t = \lambda^{1/2} r_t + \varepsilon_t$, $r_t = r_{t-1} + u_t$, where the $\varepsilon$ and $u$ processes
are iid and both are drawn from the same distribution (normal, t,
or Cauchy). The null hypothesis is λ = 0 and we give results for
the alternative that λ = 0.01. Amsler and Schmidt (1999) gave results
for some other values of λ, for KPSS and MR/S only, but the
results for λ = 0.01 are representative. We consider T = 50, 100,
200, 500, 1000, and 2000, as above, and ℓ = 0 and ℓ = ℓ12.
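The alternative DGP is equally short in code (again a sketch, built on draw_errors above):

```python
def draw_alternative(dist, T, lam=0.01):
    """X_t = lam^(1/2) r_t + eps_t, with r_t = r_{t-1} + u_t a random walk;
    eps and u are iid draws from the same distribution."""
    eps, u = draw_errors(dist, T), draw_errors(dist, T)
    return np.sqrt(lam) * np.cumsum(u) + eps
```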
Consider first the results for ℓ = 0, for which the fixed-b
critical values are the same as the standard critical values. The
most striking result is that the LM test is less powerful than the
other three tests. There is a tendency for the KPSS test to have
a higher power than the V/S test when power is low, and vice
versa, but the similarities outweigh the differences. The power
of the MR/S test is generally low, but it is not low in those cases
in which it did not suffer from large size distortions under the
null.
The conclusions for the case that ℓ = ℓ12 and we use the fixed-b critical values are very similar.
We also consider size-adjusted power. Apart from the Cauchy
case, the size distortions when fixed-b critical values are used
were small for all of the tests except the MR/S test. So, to save
space, we give size-adjusted power only for the KPSS and MR/S
tests. The results for the other two distributions are provided in
the online Appendix. There are some anomalous results for cases
with small T and large ℓ, but the power of the MR/S test is now
comparable with or often even slightly larger than the power of
the KPSS test. Apparently, its problem is not an intrinsic lack
of power, but just poor size control.


Table 2. Size of various tests

Standard 5% critical values, ℓ = 0

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.050   0.049  0.049  0.047  0.042  0.026
100    0.049   0.049  0.048  0.046  0.043  0.027
200    0.049   0.050  0.049  0.047  0.044  0.027
500    0.051   0.050  0.048  0.050  0.044  0.028
2000   0.049   0.050  0.049  0.049  0.046  0.026

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.014   0.013  0.011  0.007  0.004  0.001
100    0.023   0.021  0.019  0.014  0.008  0.001
200    0.032   0.030  0.028  0.022  0.012  0.001
500    0.038   0.039  0.036  0.029  0.017  0.001
2000   0.047   0.046  0.044  0.038  0.020  0.001

V/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.048   0.048  0.047  0.044  0.037  0.016
100    0.050   0.049  0.045  0.046  0.038  0.018
200    0.049   0.047  0.050  0.047  0.040  0.017
500    0.051   0.051  0.049  0.047  0.043  0.017
1000   0.051   0.051  0.050  0.050  0.043  0.018
2000   0.052   0.052  0.051  0.049  0.043  0.017

LM
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.050   0.048  0.047  0.047  0.049  0.043
100    0.049   0.049  0.048  0.049  0.048  0.044
200    0.050   0.050  0.050  0.049  0.046  0.043
500    0.049   0.049  0.051  0.048  0.047  0.042
1000   0.049   0.048  0.052  0.051  0.050  0.045
2000   0.052   0.050  0.048  0.050  0.045  0.040

Standard 5% critical values, ℓ = ℓ12 (ℓ = 10, 12, 14, 17, 21, 25 for T = 50, ..., 2000)

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.014   0.013  0.012  0.012  0.011  0.007
100    0.031   0.031  0.029  0.030  0.027  0.015
200    0.039   0.040  0.039  0.036  0.036  0.021
500    0.045   0.046  0.044  0.043  0.041  0.025
1000   0.047   0.045  0.046  0.048  0.042  0.026
2000   0.048   0.049  0.046  0.049  0.044  0.026

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.007   0.009  0.009  0.009  0.011  0.010
100    0.001   0.001  0.002  0.002  0.003  0.004
200    0.004   0.004  0.004  0.003  0.002  0.002
500    0.021   0.020  0.015  0.015  0.008  0.001
1000   0.032   0.030  0.028  0.024  0.013  0.001
2000   0.039   0.040  0.039  0.033  0.018  0.001

V/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.000   0.000  0.000  0.000  0.000  0.000
100    0.003   0.002  0.003  0.002  0.001  0.000
200    0.020   0.020  0.021  0.018  0.015  0.005
500    0.038   0.038  0.035  0.035  0.030  0.012
1000   0.045   0.042  0.041  0.040  0.036  0.014
2000   0.047   0.045  0.046  0.045  0.040  0.015

LM
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.037   0.037  0.039  0.037  0.038  0.034
100    0.041   0.042  0.042  0.042  0.039  0.038
200    0.045   0.045  0.045  0.044  0.042  0.039
500    0.046   0.046  0.048  0.046  0.045  0.039
1000   0.047   0.047  0.050  0.049  0.049  0.044
2000   0.049   0.049  0.050  0.047  0.047  0.039

Fixed-b critical values, ℓ = ℓ12

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.049   0.045  0.045  0.043  0.044  0.029
100    0.048   0.048  0.048  0.046  0.045  0.028
200    0.046   0.045  0.046  0.044  0.042  0.026
500    0.050   0.050  0.048  0.046  0.045  0.029
1000   0.051   0.048  0.048  0.046  0.045  0.028
2000   0.051   0.052  0.050  0.053  0.050  0.028

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.006   0.006  0.007  0.007  0.009  0.008
100    0.008   0.008  0.009  0.008  0.007  0.007
200    0.014   0.014  0.016  0.014  0.009  0.004
500    0.036   0.037  0.032  0.027  0.015  0.002
1000   0.043   0.041  0.040  0.034  0.019  0.002
2000   0.055   0.055  0.054  0.045  0.031  0.002

V/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.055   0.056  0.052  0.049  0.041  0.023
100    0.047   0.047  0.047  0.044  0.036  0.017
200    0.050   0.048  0.048  0.045  0.039  0.016
500    0.051   0.051  0.052  0.049  0.041  0.019
1000   0.053   0.052  0.050  0.050  0.045  0.019
2000   0.050   0.050  0.050  0.049  0.044  0.016

LM
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.049   0.050  0.049  0.051  0.050  0.045
100    0.049   0.051  0.050  0.049  0.050  0.043
200    0.050   0.051  0.051  0.050  0.047  0.044
500    0.049   0.050  0.048  0.049  0.048  0.041
1000   0.050   0.049  0.053  0.050  0.047  0.046
2000   0.050   0.050  0.052  0.048  0.048  0.041

4. LOCAL-TO-FINITE VARIANCE ASYMPTOTICS

We continue to assume the model Xt = µ + εt . However, we
now consider cases where εt may have infinite variance, which is
a violation of Condition 1. We make the following assumption.
Condition 2. (a) εt is iid. (b) εt is distributed symmetrically
around zero. (c) εt is in the normal domain of attraction of a
stable law with index α, with 0 < α < 2.
Define $[U_T(r), V_T(r)] = \bigl[\,T^{-1/\alpha} \sum_{j=1}^{[rT]} \varepsilon_j,\; T^{-2/\alpha} \sum_{j=1}^{[rT]} \varepsilon_j^2\,\bigr]$. Then, Chan and Tran (1989) showed the following result: if Condition 2 holds, $[U_T(r), V_T(r)] \Rightarrow [U_\alpha(r), V_\alpha(r)]$. Here, $U_\alpha(r)$ is a standard stable process (Lévy process) with index $\alpha$, and $V_\alpha(r) = \int_0^r (dU_\alpha(s))^2$. See Phillips (1990,

p. 46) for further discussion and definitions of terms. See
also Ahn, Fotopoulos, and He (2001) for a good expository
treatment of these results and their application to unit root
tests.
This result is easily extended to the demeaned data. Let $X_t^* = X_t - \bar X = \varepsilon_t - \bar\varepsilon \equiv \varepsilon_t^*$, let $U_T^*(r)$ and $V_T^*(r)$ be defined in the same way as $U_T(r)$ and $V_T(r)$, except with $X_j^*$ (or $\varepsilon_j^*$) replacing $\varepsilon_j$, and let $U_\alpha^*(r) = U_\alpha(r) - rU_\alpha(1)$, the analog of a Brownian bridge, and $V_\alpha^*(r) = \int_0^r (dU_\alpha^*(s))^2$. Then, the same convergence result that was given above for $U_T(r)$ and $V_T(r)$ still holds, with these replacements (i.e., $U_T^*(r)$ in place of $U_T(r)$, $U_\alpha^*(r)$ in place of $U_\alpha(r)$, etc.); see Phillips (1990, p. 55).
Amsler and Schmidt (1999) used these results to prove the
following result.


Table 3. Power of various tests, λ = 0.01

Fixed-b critical values, ℓ = 0 (for ℓ = 0 these coincide with the standard critical values)

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.293   0.294  0.300  0.307  0.327  0.396
100    0.593   0.591  0.583  0.588  0.587  0.566
200    0.848   0.851  0.848  0.846  0.822  0.727
500    0.988   0.988  0.987  0.986  0.973  0.873
1000   1.00    1.00   1.00   1.00   0.996  0.924
2000   1.00    1.00   1.00   1.00   0.999  0.966

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.126   0.125  0.129  0.138  0.167  0.275
100    0.517   0.512  0.506  0.504  0.495  0.482
200    0.875   0.874  0.874  0.860  0.815  0.679
500    0.997   0.997  0.996  0.995  0.979  0.852
1000   1.00    1.00   1.00   1.00   0.997  0.924
2000   1.00    1.00   1.00   1.00   0.999  0.961

Fixed-b critical values, ℓ = ℓ12

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.194   0.190  0.197  0.199  0.208  0.165
100    0.430   0.432  0.423  0.428  0.430  0.428
200    0.640   0.642  0.640  0.646  0.641  0.598
500    0.868   0.868  0.870  0.873  0.866  0.745
1000   0.953   0.954  0.954  0.955  0.949  0.896
2000   0.988   0.988  0.989  0.989  0.989  0.954

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.002   0.002  0.003  0.003  0.004  0.009
100    0.004   0.003  0.003  0.004  0.005  0.005
200    0.284   0.282  0.277  0.273  0.274  0.289
500    0.885   0.883  0.882  0.886  0.870  0.738
1000   0.974   0.975  0.976  0.972  0.969  0.883
2000   0.997   0.998  0.998  0.998  0.997  0.953

Size-adjusted power, ℓ = 0

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.296   0.295  0.301  0.316  0.344  0.426
100    0.546   0.593  0.596  0.596  0.603  0.598
200    0.851   0.851  0.848  0.850  0.831  0.752
500    0.987   0.988  0.987  0.986  0.975  0.886
1000   1.00    0.999  1.00   1.00   0.996  0.941
2000   1.00    1.00   1.00   1.00   1.00   0.970

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.242   0.246  0.255  0.330  0.334  0.441
100    0.597   0.598  0.606  0.618  0.628  0.617
200    0.891   0.896  0.895  0.896  0.875  0.771
500    0.997   0.997  0.997  0.996  0.985  0.899
1000   1.00    1.00   1.00   1.00   0.999  0.947
2000   1.00    1.00   1.00   1.00   1.00   0.974

Size-adjusted power, ℓ = ℓ12

KPSS
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.198   0.202  0.207  0.217  0.231  0.301
100    0.434   0.433  0.428  0.433  0.444  0.472
200    0.649   0.647  0.648  0.660  0.652  0.636
500    0.870   0.868  0.872  0.879  0.872  0.824
1000   0.955   0.955  0.956  0.954  0.955  0.910
2000   0.988   0.988  0.989  0.989  0.990  0.959

MR/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.025   0.025  0.026  0.027  0.027  0.029
100    0.032   0.036  0.034  0.042  0.054  0.185
200    0.477   0.486  0.499  0.515  0.555  0.601
500    0.900   0.899  0.905  0.913  0.919  0.857
1000   0.977   0.977  0.979  0.981  0.982  0.933
2000   0.997   0.998  0.997  0.998  0.998  0.969

Fixed-b critical values, ℓ = 0

V/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.241   0.243  0.250  0.262  0.285  0.361
100    0.598   0.592  0.592  0.593  0.585  0.553
200    0.887   0.887  0.884  0.880  0.848  0.723
500    0.995   0.996  0.995  0.995  0.982  0.877
1000   1.00    1.00   1.00   1.00   0.997  0.937
2000   1.00    1.00   1.00   1.00   0.999  0.967

LM
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.073   0.073  0.078  0.089  0.117  0.219
100    0.296   0.295  0.295  0.302  0.322  0.382
200    0.586   0.586  0.582  0.583  0.583  0.553
500    0.814   0.814  0.816  0.814  0.808  0.734
1000   0.897   0.898  0.900  0.898  0.892  0.831
2000   0.940   0.940  0.939  0.939  0.938  0.892

Fixed-b critical values, ℓ = ℓ12

V/S
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.059   0.060  0.057  0.057  0.050  0.033
100    0.305   0.301  0.303  0.300  0.299  0.287
200    0.677   0.676  0.679  0.671  0.655  0.565
500    0.920   0.919  0.919  0.922  0.910  0.809
1000   0.983   0.984  0.983  0.981  0.975  0.907
2000   0.998   0.998  0.997  0.998  0.997  0.956

LM
T      N(0,1)  t10    t5     t3     t2     Cauchy
50     0.031   0.029  0.030  0.030  0.028  0.030
100    0.020   0.021  0.021  0.019  0.022  0.020
200    0.070   0.070  0.067  0.065  0.054  0.070
500    0.521   0.521  0.527  0.524  0.492  0.524
1000   0.664   0.672  0.671  0.676  0.659  0.667
2000   0.758   0.762  0.760  0.766  0.762  0.760

Proposition 1. Suppose that Xt = µ + εt and that εt satisfies
Condition 2. Then,
$$\hat\eta_\mu(0) \Rightarrow \int_0^1 U_\alpha^*(r)^2\, dr \Big/ V_\alpha^*(1). \qquad (5)$$

The only interesting technical detail is the way the normalization of the sums is implicitly handled: $\hat\eta_\mu(0) = T^{-2}\sum_t S_t^2 \big/ T^{-1}\sum_t (X_t^*)^2 = T^{-1}\sum_t (T^{-1/\alpha} S_t)^2 \big/ T^{-2/\alpha}\sum_t (X_t^*)^2$. That is, the partial sums of the data converge at a different rate in the infinite variance case than in the finite variance case, but the statistic converges at the same rate.


Phillips (1990, p. 50) showed how to modify these results to allow certain forms of short-run dependence. Specifically, he allowed for a linear process: $\varepsilon_t = d(L)u_t = \sum_{j=0}^{\infty} d_j u_{t-j}$, where $u_t$ satisfies Condition 2 and $0 < d(1) < \infty$. Under this assumption, he showed that the semiparametric correction designed for the finite variance case works also in the infinite variance case.
Neither the infinite variance asymptotics nor the finite variance asymptotics may do a good job of approximating the distribution of the stationarity test statistics when the data have a finite
variance but fat tails. Basically, we want to find a way to move
continuously from the normal case to the Cauchy case, and the
t-distributions with varying degrees of freedom (which we used
in our earlier simulations) did not accomplish this. We therefore


consider a local-to-finite variance parameterization that may be
able to do so. This is based on the following representation of
the series $\varepsilon_t$:

$$\varepsilon_t = \varepsilon_{1t} + \bigl(c/T^{(1/\alpha)-(1/2)}\bigr)\,\varepsilon_{2t}, \qquad (6)$$

where ε1t satisfies Condition 1 (has finite variance) and ε2t satisfies Condition 2 (has infinite variance). Here, c is an arbitrary
constant that weights the two components of εt . The order in
probability of the cumulations of the infinite variance process
ε2t is larger than that of the cumulations of the finite variance
process ε1t , and the power of T in the representation (6) is chosen so that the cumulation of the series εt , suitably normalized,
converges to a stochastic process that is the weighted sum of a
Wiener process and a Lévy process.
We call this a local-to-finite variance representation because
εt has infinite variance for all T, but loosely speaking, it converges to a finite variance process as T → ∞. This is philosophically similar to other types of local representations, such
as local to an autoregressive unit root (e.g., Elliott, Rothenberg,
and Stock 1996) or local to a unit moving average root (Pantula
1991) or local to useless instruments (Staiger and Stock 1997),
in that it is an attempt to create asymptotics that can be used to
analyze features of the data that are relevant in finite samples
but not in the standard asymptotics.
The situation in the literature on the local-to-finite variance
parameterization and related convergence results is somewhat
unusual. The parameterization was first suggested and the proofs
of the asymptotic results given in the current article were presented by Amsler and Schmidt (1999), but that paper was never
published. These results were used, and some of them were
proved again, by Callegari, Cappuccio, and Lubian (2003) and
Cappuccio and Lubian (2007). Those articles quoted Amsler
and Schmidt (1999) but gave a somewhat self-contained presentation of the results. In the current article, we will simply
quote these results and indicate how we use them.
The basic convergence result is the following.
Proposition 2. Suppose that εt satisfies Equation (6), where
ε1t satisfies Condition 1 and ε2t satisfies Condition 2. Then,
$$T^{-1/2} \sum_{t=1}^{[rT]} \varepsilon_t = T^{-1/2} \sum_{t=1}^{[rT]} \varepsilon_{1t} + cT^{-1/\alpha} \sum_{t=1}^{[rT]} \varepsilon_{2t} \Rightarrow W(r) + cU_\alpha(r) \equiv G_{\alpha,c}(r). \qquad (7)$$

To allow for the fact that the data for the KPSS, MR/S, and
V/S tests are in deviations from means, we introduce the analog
of a Brownian bridge: $G^*_{\alpha,c}(r) = G_{\alpha,c}(r) - rG_{\alpha,c}(1)$. Amsler and Schmidt (1999) derived the asymptotic distribution of the KPSS statistic with no lags ($\ell = 0$), which is exactly as in Equation (5) except that $G^*_{\alpha,c}(r)$ replaces $U^*_\alpha(r)$ in the numerator of the expression for the asymptotic distribution. That is,

$$\hat\eta_\mu(0) \Rightarrow \frac{\int_0^1 G^*_{\alpha,c}(r)^2\, dr}{1 + c^2 V^*_\alpha(1)}. \qquad (8)$$
Cappuccio and Lubian (2007) also derived these results for
the KPSS and MR/S tests, as well as the LM test and several
other tests. We will therefore omit the proofs here.
We note that the same asymptotic distribution holds if the
number of lags is positive and grows at an appropriate rate. A precisely stated proposition and its proof are given in the online
Appendix.
5. SIMULATIONS WITH LOCAL-TO-FINITE VARIANCE DATA

We now return to the question of the relative robustness of the
KPSS, MR/S, V/S, and LM tests to thick tails. We use the local-to-finite variance asymptotics in a conceptually simple way. The
asymptotic distribution of each of the statistics depends on the
quantity “c” that determines the relative weight of the infinite
variance component. We will say that one statistic is more robust
than another if its asymptotic distribution is less sensitive to the
value of c.
Before we make such comparisons, we first present the results
of some simulations that investigate the finite sample accuracy
of the local-to-finite variance asymptotics. In these simulations,
we consider only the statistics with $\ell = 0$, and we consider the case that $\varepsilon_{1t}$ is standard normal while $\varepsilon_{2t}$ is Cauchy ($\alpha = 1$). Therefore, $\varepsilon_t = \varepsilon_{1t} + (c/\sqrt{T})\varepsilon_{2t}$. We consider $T$ between 50 and 2000, as before, and $c$ = 0, 0.1, 0.316, 1, 3.16, 10, and 31.6. (Here, 3.16 is shorthand for $\sqrt{10}$, etc.) Values of $c$ larger than 31.6 had only very minimal effects on the results; we have essentially reached the pure Cauchy case. The number of replications is 50,000, except that it is only 25,000 for $T$ = 1000 or 2000. These simulations are similar to those of Cappuccio and Lubian (2007, Table 2), who considered more values of $\alpha$ but fewer values of $T$.
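In code, this local-to-finite variance DGP is one line (a sketch, reusing rng and draw_errors from the Section 3 sketch):

```python
def draw_ltfv(T, c):
    """eps_t = eps1_t + (c / sqrt(T)) * eps2_t: standard normal plus a
    Cauchy component (alpha = 1) whose weight shrinks with T."""
    return rng.standard_normal(T) + (c / np.sqrt(T)) * draw_errors("cauchy", T)
```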
The results of these simulations are given in the top half
of Table 4, where we report the size (frequency of rejection)
of the 5% upper tail tests. For the KPSS, V/S, and LM tests,
the asymptotics are quite reliable, in the sense that, holding c
constant, changing T did not have much effect on the results.
For the MR/S test, the asymptotics are not very reliable for T ≤
200, but they seemed fairly reliable for T ≥ 500. Reading across
the table horizontally, we can observe a smooth increase in the
size distortions of all of the tests as we move from the normal
case (c = 0) to the near Cauchy case (c = 31.6). As in Table 2,
we can observe that the size distortions are biggest for the MR/S
test and smallest for the LM test.
We now return to the task of assessing the relative sensitivity
of the asymptotic distributions of the various test statistics to the
value of c. We have evaluated the distributions of the statistics by
simulation for T = 10,000, using 100,000 replications. Various
quantiles are presented in Table 5. There is a question of how
to measure the sensitivity of these distributions to the value of
c. Rather than use a summary statistic, we simply provide all
of the deciles of the distributions, as well as the 0.001, 0.025,
0.050, 0.950, 0.975, and 0.999 quantiles.
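As an illustration of how such quantiles can be approximated, one can simulate the ℓ = 0 statistics at large T under the DGP above; the replication count below is deliberately small, whereas the paper uses 100,000.

```python
stats = np.array([short_memory_tests(draw_ltfv(10_000, c=1.0))
                  for _ in range(2000)])
print(np.quantile(stats[:, 0], [0.5, 0.95]))  # KPSS median and 0.95 quantile at c = 1
```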
For the LM statistic, there are only minor changes in any of
the quantiles as c changes. Thus, the LM test is very robust to
fat tails, as we saw in Section 3.
For the KPSS statistic, the shifts in the asymptotic distribution
as c changes are relatively minor. Most of the distribution shifts
slightly right as c increases. For example, the median is 0.118 for
c = 0 and rises to 0.132 for c = 31.6. Most of the other quantiles
show a similar pattern. However, the upper tail quantiles are the
exception; they decrease as c increases. For example, the 0.950
quantile decreases from 0.476 to 0.395 as c increases from 0 to


Table 4. Size (frequency of rejection) of the 5% upper tail tests

Size, local-to-finite variance data, standard critical values

KPSS
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.050  0.048  0.045  0.038  0.032  0.027  0.029
100    0.050  0.048  0.043  0.038  0.030  0.028  0.027
200    0.049  0.047  0.044  0.036  0.030  0.027  0.027
500    0.049  0.048  0.044  0.038  0.031  0.028  0.028
1000   0.049  0.047  0.045  0.036  0.031  0.028  0.027
2000   0.048  0.047  0.045  0.038  0.030  0.028  0.026

MR/S
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.014  0.013  0.011  0.005  0.002  0.001  0.001
100    0.024  0.022  0.017  0.010  0.003  0.001  0.001
200    0.032  0.029  0.023  0.012  0.004  0.001  0.001
500    0.040  0.035  0.028  0.016  0.004  0.002  0.001
1000   0.046  0.041  0.033  0.016  0.005  0.002  0.001
2000   0.046  0.043  0.033  0.018  0.005  0.002  0.001

V/S
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.046  0.044  0.039  0.031  0.020  0.017  0.016
100    0.049  0.047  0.041  0.031  0.021  0.017  0.017
200    0.049  0.045  0.042  0.031  0.022  0.017  0.018
500    0.050  0.047  0.041  0.033  0.021  0.019  0.017
1000   0.051  0.049  0.042  0.031  0.022  0.018  0.018
2000   0.052  0.048  0.043  0.032  0.021  0.017  0.017

LM
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.049  0.048  0.047  0.045  0.044  0.041  0.043
100    0.048  0.049  0.047  0.046  0.044  0.042  0.043
200    0.050  0.049  0.048  0.046  0.043  0.041  0.043
500    0.049  0.049  0.046  0.046  0.044  0.042  0.042
1000   0.053  0.051  0.049  0.044  0.046  0.040  0.043
2000   0.051  0.051  0.049  0.046  0.043  0.043  0.041

Size, local-to-finite variance data, interpolated critical values

KPSS
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.050  0.049  0.048  0.044  0.044  0.045  0.050
100    0.050  0.049  0.046  0.045  0.042  0.046  0.050
200    0.050  0.048  0.047  0.044  0.041  0.046  0.047
500    0.049  0.049  0.048  0.045  0.044  0.046  0.047
1000   0.050  0.049  0.048  0.044  0.045  0.046  0.049
2000   0.049  0.048  0.048  0.046  0.044  0.045  0.048

MR/S
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.014  0.013  0.011  0.006  0.008  0.017  0.025
100    0.025  0.022  0.019  0.014  0.013  0.023  0.035
200    0.032  0.030  0.025  0.019  0.016  0.027  0.038
500    0.040  0.037  0.031  0.025  0.021  0.029  0.041
1000   0.046  0.042  0.037  0.026  0.023  0.030  0.042
2000   0.048  0.043  0.039  0.030  0.023  0.032  0.044

V/S
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.047  0.045  0.042  0.036  0.036  0.042  0.048
100    0.049  0.046  0.043  0.039  0.038  0.042  0.048
200    0.050  0.048  0.044  0.040  0.038  0.043  0.049
500    0.051  0.048  0.045  0.040  0.039  0.045  0.049
1000   0.054  0.048  0.045  0.042  0.040  0.044  0.049
2000   0.052  0.049  0.049  0.041  0.039  0.045  0.049

LM
T      c=0    0.1    0.316  1      3.16   10     31.6
50     0.048  0.049  0.047  0.046  0.047  0.047  0.050
100    0.048  0.048  0.048  0.046  0.047  0.047  0.049
200    0.049  0.049  0.044  0.048  0.047  0.045  0.048
500    0.049  0.049  0.045  0.045  0.045  0.047  0.049
1000   0.049  0.052  0.049  0.052  0.049  0.045  0.053
2000   0.051  0.051  0.051  0.049  0.045  0.048  0.049

31.6. This explains the smaller number of rejections of the null
when c increases, since we are using an upper tail test.
For the V/S statistic, the pattern is quite similar to that for
KPSS. However, the changes are bigger for V/S than for KPSS,
and indeed, in Section 3, we saw that the V/S test was less robust
to fat tails than the KPSS test.
For the MR/S statistic, things are rather different. The asymptotic distribution shifts left as c increases, and this is true across
the whole distribution. The shift is largest in the upper tail. For
example, as we move from c = 0 to c = 31.6, the median changes
from 1.216 to 0.999, while the 0.95 quantile changes from 1.739
to 1.361. These changes are larger for the MR/S statistic than
for the KPSS statistic, for example, and this at least partially
explains the lower degree of robustness of the MR/S test to fat
tails. However, the thickness of the tails of the distributions of
the test statistics under thick-tailed errors is also relevant. For
example, the change in the 0.95 quantile as we go from normal to Cauchy errors is larger for the MR/S statistic than for the KPSS statistic, but not terribly larger in percentage terms (1.739 becomes 1.361 for MR/S, while 0.476 becomes 0.395 for KPSS). But the implication of this change for rejections under the null is very different, because the longer tail of KPSS
(compared with MR/S) in the Cauchy case puts a much higher
fraction of observations above the standard critical value.
An interesting question is whether the size distortions that
arise when c is nonzero can be reduced by estimating c and then
using the appropriate critical value based on the estimated value
of $c$. Obviously, in some sense, this must depend on how well one can estimate $c$. To be more explicit, let $\varepsilon_t = \varepsilon_{1t} + (c/\sqrt{T})\varepsilon_{2t}$ as discussed above, and define $\gamma = c/\sqrt{T}$. As we will show, it is possible to find a $\sqrt{T}$-consistent estimate of $\gamma$, say $\hat\gamma$, so that $\hat\gamma - \gamma = O_p(T^{-1/2})$. Then, we define $\hat c = \sqrt{T}\hat\gamma$ so that $\hat c - c = O_p(1)$. That is, we can estimate $c$, though not consistently. As a practical matter, this does not indicate whether or not we can estimate it reasonably well; it just implies that any corrections to the critical values based on the estimated $c$ will not necessarily improve as the sample size increases.
To be more precise, we observe deviations from means of $[\sigma \cdot N(0,1) + \gamma \cdot \mathrm{Cauchy}]$, and we need to estimate the two parameters $\sigma$ and $\gamma$, and then calculate $\hat c = \sqrt{T}(\hat\gamma/\hat\sigma)$. (Division by an estimate of $\sigma$ is necessary because the discussion of the local-to-finite variance parameterization above had $\sigma = 1$,


Table 5. Quantiles of test statistics, local-to-finite variance data

KPSS
Quantile  c=0    0.1    0.316  1      3.16   10     31.6
0.010     0.024  0.024  0.025  0.026  0.027  0.026  0.026
0.025     0.029  0.030  0.031  0.032  0.035  0.034  0.034
0.050     0.036  0.036  0.037  0.040  0.043  0.044  0.044
0.100     0.045  0.046  0.048  0.051  0.054  0.056  0.056
0.200     0.061  0.062  0.065  0.069  0.074  0.076  0.076
0.300     0.077  0.079  0.082  0.086  0.090  0.091  0.091
0.400     0.095  0.097  0.101  0.103  0.108  0.109  0.109
0.500     0.118  0.119  0.123  0.126  0.130  0.132  0.132
0.600     0.146  0.147  0.152  0.153  0.157  0.160  0.160
0.700     0.184  0.185  0.189  0.191  0.193  0.196  0.196
0.800     0.241  0.243  0.245  0.242  0.243  0.244  0.244
0.900     0.346  0.347  0.338  0.328  0.320  0.320  0.318
0.950     0.458  0.455  0.439  0.424  0.405  0.398  0.398
0.975     0.564  0.573  0.552  0.524  0.494  0.480  0.475
0.990     0.721  0.722  0.705  0.663  0.607  0.586  0.584

MR/S
Quantile  c=0    0.1    0.316  1      3.16   10     31.6
0.010     0.740  0.744  0.746  0.756  0.746  0.719  0.716
0.025     0.796  0.797  0.802  0.804  0.789  0.769  0.762
0.050     0.847  0.853  0.854  0.852  0.830  0.812  0.804
0.100     0.910  0.918  0.918  0.909  0.882  0.863  0.856
0.200     1.005  0.998  0.993  0.974  0.942  0.924  0.919
0.300     1.077  1.059  1.033  0.999  0.980  0.964  0.959
0.400     1.144  1.125  1.094  1.036  0.998  0.990  0.986
0.500     1.212  1.192  1.158  1.090  1.020  0.999  0.999
0.600     1.283  1.262  1.229  1.115  1.070  1.037  1.029
0.700     1.362  1.346  1.308  1.231  1.134  1.093  1.084
0.800     1.463  1.444  1.408  1.325  1.216  1.169  1.159
0.900     1.611  1.591  1.556  1.465  1.335  1.280  1.269
0.950     1.738  1.719  1.687  1.582  1.441  1.372  1.357
0.975     1.855  1.837  1.802  1.693  1.539  1.458  1.439
0.990     1.985  1.978  1.937  1.830  1.674  1.558  1.540

V/S
Quantile  c=0    0.1    0.316  1      3.16   10     31.6
0.010     0.019  0.019  0.020  0.021  0.022  0.022  0.022
0.025     0.023  0.023  0.024  0.025  0.026  0.026  0.027
0.050     0.027  0.027  0.028  0.030  0.032  0.033  0.033
0.100     0.032  0.033  0.034  0.036  0.039  0.041  0.040
0.200     0.041  0.042  0.044  0.047  0.051  0.052  0.052
0.300     0.050  0.051  0.053  0.057  0.061  0.062  0.062
0.400     0.059  0.060  0.062  0.067  0.070  0.071  0.071
0.500     0.069  0.070  0.072  0.076  0.078  0.079  0.079
0.600     0.081  0.082  0.082  0.084  0.084  0.085  0.085
0.700     0.096  0.095  0.094  0.094  0.093  0.094  0.093
0.800     0.116  0.115  0.113  0.111  0.107  0.107  0.107
0.900     0.152  0.151  0.147  0.138  0.132  0.130  0.129
0.950     0.186  0.185  0.179  0.168  0.157  0.152  0.153
0.975     0.219  0.217  0.213  0.198  0.183  0.179  0.176
0.990     0.267  0.265  0.256  0.243  0.222  0.212  0.209

LM
Quantile  c=0       0.1       0.316     1         3.16      10        31.6
0.010     0.001     0.001     0.001     0.001     0.001     0.001     0.001
0.025     0.007     0.007     0.006     0.006     0.007     0.006     0.007
0.050     0.022     0.022     0.021     0.021     0.023     0.021     0.022
0.100     0.056     0.057     0.056     0.056     0.057     0.057     0.056
0.200     0.115     0.113     0.113     0.112     0.113     0.113     0.113
0.300     0.155     0.153     0.153     0.150     0.149     0.148     0.148
0.400     0.185     0.183     0.182     0.178     0.176     0.174     0.175
0.500     0.207     0.206     0.204     0.201     0.198     0.197     0.197
0.600     0.223     0.222     0.222     0.219     0.217     0.216     0.216
0.700     0.235     0.234     0.234     0.232     0.231     0.230     0.230
0.800     0.243     0.243     0.243     0.242     0.241     0.241     0.241
0.900     0.248     0.248     0.248     0.248     0.247     0.248     0.247
0.950     0.249628  0.249656  0.249563  0.249610  0.249556  0.249489  0.249477
0.975     0.249911  0.249916  0.249900  0.249914  0.249892  0.249863  0.249877
0.990     0.249990  0.249988  0.249986  0.249989  0.249984  0.249980  0.249981

which is empirically restrictive.) We will use a very simple consistent estimator of σ and γ obtained by matching the empirical
and theoretical characteristic functions of the deviations from
means of εt for two values (0.2 and 1.0) of its argument. Details
are given in the Appendix. We then find critical values for the
various tests, for the estimated value of c, by interpolating in
Table 5 in the rows corresponding to the 0.95 quantiles.
The results are given in the bottom half of Table 4. The interpolation procedure works reasonably (and perhaps surprisingly)
well. For all of the tests, the size distortions when the interpolated critical values are used are considerably smaller than when
the standard critical values are used. Of course, this is most true
when c is large, because that is when the size distortions were
biggest to begin with. Interestingly, we do not estimate c as well
when it is large as when it is small, but when c is large, the
critical values are much less sensitive to its value than when it
is small.
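Concretely, the interpolation step can be as simple as the sketch below; the c grid and the KPSS 0.950-quantile row are taken from Table 5, while the clipping at 31.6 is our own choice.

```python
c_grid   = [0.0, 0.1, 0.316, 1.0, 3.16, 10.0, 31.6]
kpss_q95 = [0.458, 0.455, 0.439, 0.424, 0.405, 0.398, 0.398]  # Table 5, KPSS, 0.950 row

def interpolated_cv(c_hat):
    """5% critical value for the KPSS test at the estimated c."""
    return np.interp(min(c_hat, 31.6), c_grid, kpss_q95)
```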
6. CONCLUDING REMARKS

In this article, we have considered four tests of the null hypothesis that a series is stationary and has short memory. We
are primarily interested in their robustness to fat tails. A very
general conclusion from our simulations is that, except for the
MR/S test, fat tails are not really a very serious problem. It
takes a very extreme distribution, such as Cauchy, before the
size distortions are large enough to be worrisome.
However, there were nontrivial differences between the various tests. The LM test was by far the most robust to fat tails.
This was true in our basic simulations, and it was also evident
in our analysis of the local-to-finite variance asymptotic distributions. It was true in the case of no lags (ℓ = 0) and also when
a large number of lags were used. [The LM test is also by far
the most robust to error autocorrelation, as reported by Amsler
and Schmidt (2012).] So, it had excellent size properties. Unfortunately, it was the least powerful of the four tests against unit
root alternatives. Perhaps unsurprisingly, excellent size control
comes at some price in power.
At the other extreme, Lo’s MR/S test was very nonrobust to
small sample sizes (it was slow to converge to its asymptotic
distribution) and also to fat tails. It was often substantially undersized and this carried over into low apparent power against
unit root alternatives. However, it did reasonably well in terms
of size-adjusted power. Our results do not support its use at this
time, but if its size could be corrected, it would be a viable
alternative to the other tests.
As a very general statement, neither the KPSS test nor the
V/S test clearly dominated the other. Both were more robust

Amsler and Schmidt: Tests of Short Memory With Thick-Tailed Errors

than the MR/S test, but less robust than the LM test. Both were
more powerful than the LM test. So, in choosing between either
one and the LM test, there is an obvious trade-off between size
control and power.
In addition to these practical conclusions, the article has
demonstrated the usefulness of fixed-b asymptotics in finding
accurate critical values for stationarity tests. It has also suggested
asymptotics based on a local-to-finite variance parameterization,
and shown how they can be useful in understanding the behavior of statistics in the presence of fat-tailed distributions and in
suggesting corrected critical values that yield better-sized tests
in the presence of fat tails.


APPENDIX: ESTIMATION OF σ AND γ
We have εt = σ ε1t + γ ε2t , where ε1t is standard normal and
ε2t is standard Cauchy. We will proceed through a series of
special cases because there are some efficiency issues (not necessarily relevant to this article, but nevertheless interesting) to
discuss that are tractable in those cases.


optimal value of $s$ ($= 0.797/\gamma$), we obtain $\mathrm{AV}(\hat\gamma) = 3.08\gamma^2/T$, so that the loss in asymptotic efficiency relative to the MLE is about 50%.

A.3

In this case, Xt = σ ε1t + γ ε2t , where ε1t is standard normal
and ε2t is standard Cauchy. Perhaps surprisingly, the density of
this sum is not known, and the integral that would define it appears to be intractable. (It is not in the many tables of thousands
of integrals that we looked in, for example, nor will standard
integral calculators such as Wolfram Mathematica calculate it.)
So, the MLE is not readily available, and neither is the asymptotic variance bound. However, we can estimate σ and γ by
matching the empirical and theoretical characteristic functions
at two values of its argument, say s1 and s2 .
The characteristic function of $X$ is $\phi(s) = \exp(-\tfrac{1}{2}\sigma^2 s^2 - \gamma s)$ for $s > 0$. Let $h(s) = \ln(\phi(s)) = -\tfrac{1}{2}\sigma^2 s^2 - \gamma s$.
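As a concluding illustration of the matching idea just described: evaluating $h(s)$ at $s_1$ and $s_2$ gives two linear equations in $\sigma^2$ and $\gamma$, which can be solved using the empirical characteristic function. The sketch below is our own code, with $s_1 = 0.2$ and $s_2 = 1.0$ as in Section 5 and with our own clamping of negative solutions; the published appendix may differ in its details.

```python
def estimate_sigma_gamma(x, s1=0.2, s2=1.0):
    """Match ln of the empirical characteristic function of the demeaned data
    at s1 and s2: h(s) = -0.5 sigma^2 s^2 - gamma s is linear in (sigma^2, gamma)."""
    x = x - x.mean()
    h1 = np.log(np.abs(np.mean(np.exp(1j * s1 * x))))
    h2 = np.log(np.abs(np.mean(np.exp(1j * s2 * x))))
    A = np.array([[-0.5 * s1 ** 2, -s1],
                  [-0.5 * s2 ** 2, -s2]])
    sigma2, gamma = np.linalg.solve(A, np.array([h1, h2]))
    return np.sqrt(max(sigma2, 0.0)), max(gamma, 0.0)

# c_hat = sqrt(T) * gamma_hat / sigma_hat, as in Section 5
```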