
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Automatic Specification Testing for Vector
Autoregressions and Multivariate Nonlinear Time
Series Models
Juan Carlos Escanciano , Ignacio N. Lobato & Lin Zhu
To cite this article: Juan Carlos Escanciano , Ignacio N. Lobato & Lin Zhu (2013)
Automatic Specification Testing for Vector Autoregressions and Multivariate Nonlinear
Time Series Models, Journal of Business & Economic Statistics, 31:4, 426-437, DOI:
10.1080/07350015.2013.803973
To link to this article: http://dx.doi.org/10.1080/07350015.2013.803973


Accepted author version posted online: 31
May 2013.


Supplementary materials for this article are available online. Please go to http://tandfonline.com/r/JBES

Automatic Specification Testing for Vector
Autoregressions and Multivariate Nonlinear Time
Series Models
Juan Carlos ESCANCIANO
Department of Economics, Indiana University, Bloomington, IN 47405 ([email protected])

Ignacio N. LOBATO
Instituto Tecnológico Autónomo de México, Av. Camino Sta. Teresa 930, Col. Héroes de Padierna, México D.F. 10700, México ([email protected])


Lin ZHU
School of Economics and Management, Tsinghua University, Beijing 100084, China ([email protected])
This article introduces an automatic test for the correct specification of a vector autoregression (VAR)
model. The proposed test statistic is a Portmanteau statistic with an automatic selection of the order of the residual serial correlation tested. The test presents several attractive characteristics: simplicity, robustness, and high power in finite samples. The test is simple to implement since the researcher does
not need to specify the order of the autocorrelation tested and the proposed critical values are simple to
approximate, without resorting to bootstrap procedures. In addition, the test is robust to the presence of
conditional heteroscedasticity of unknown form and accounts for estimation uncertainty without requiring the computation of large-dimensional inverses of near-to-singularity covariance matrices. The basic
methodology is extended to general nonlinear multivariate time series models. Simulations show that the
proposed test presents higher power than the existing ones for models commonly employed in empirical
macroeconomics and empirical finance. Finally, the test is applied to the classical bivariate VAR model
for GNP (gross national product) and unemployment of Blanchard and Quah (1989) and Evans (1989).
Online supplementary material includes proofs and additional details.
KEY WORDS: Akaike’s AIC; Autocorrelation; Diagnostic test; Model checking; Schwarz’s BIC.

1. INTRODUCTION

The vector autoregression (VAR) model has been one of the
most popular tools employed by macroeconomists in recent
years for the analysis of multivariate time series. The main reasons for this success are its flexibility and simplicity to implement, which has led to its intensive use in financial and macroeconomic applications, where it has proven to be useful for data
description and forecasting. VAR models, as simple extensions
of univariate autoregressions, have been known for many years,
but only in the last 30 years have they become widely used by
macroeconomists and policy-oriented researchers. VAR models
have been used in macroeconomics mainly for two purposes:
first, as a device to derive “stylized facts” of the effects of some
shocks (mainly policy shocks) on relevant economic variables
and, second, as a mechanism to evaluate economic theory models, see Christiano, Eichenbaum, and Evans (1999). For the
purpose of linking data to behavioral relations, short-term or
long-term restrictions are typically included, see, for instance,
Blanchard and Quah (1989). Starting with the seminal article
by Sims (1980), VAR models have often been used for structural, causal, and policy analysis, so that Granger-causality tests,
impulse response functions, and forecast error variance decompositions are nowadays standard macroeconomists’ tools.
For these inference procedures to be reliable, a critical aspect
is to test for the correct specification of the VAR model. If the
researcher employs a misspecified model, interesting dynamics
of the economic variables can be ignored and conclusions from
the impulse response functions can be misleading. A natural
way of validating the specification of the VAR model, as with
any other time series model, is to check if the residuals are white
noise, that is, uncorrelated.

Testing for serial correlation in residuals has a distinguished
history in statistics and economics, dating back to Quenouille
(1947) for univariate autoregressions. For recent proposals in
a univariate setting, see, for example, Delgado and Velasco
(2011) and Guay, Guerre, and Lazarova (2011). One of the most
popular tools has been the Portmanteau tests proposed by Box
and Pierce (1970, hereafter BP) for univariate autoregressive–
moving-average (ARMA) models. A multivariate version of the
BP’s test was proposed by Chitturi (1974) for VAR processes.
Hosking (1980, 1981a,b) gave several equivalent forms of this
statistic (see also Ahn 1988), and Poskitt and Tremayne (1982)
showed that BP’s test in a multivariate context can be interpreted
as a Lagrange multiplier (LM) test.
Inference related to the Portmanteau statistic for the dependent residual case varies according to the assumptions made on
the errors and the lag order h. For the univariate homoscedastic residual independent case, BP suggested comparing the


© 2013 American Statistical Association
Journal of Business & Economic Statistics

October 2013, Vol. 31, No. 4
DOI: 10.1080/07350015.2013.803973


Portmanteau statistic to the upper critical values from a $\chi^2$
distribution where the degrees of freedom are the number of
autocorrelations tested minus the number of estimated parameters. This result is motivated by the fact that the autocorrelations
obey as many linear restrictions as parameters have to be estimated, so when the number of autocorrelations tested is taken
“sufficiently large” (BP, p. 1517), the effect of parameter estimation is annihilated. For a framework similar to BP, Newbold
(1980) proposed a modified Portmanteau statistic that employed
a consistent estimator of the asymptotic covariance matrix of the
sample autocorrelations. For the univariate residual dependent
case, Francq, Roy, and Zakoian (2005) kept the Portmanteau
statistic and estimated the critical values.
Inference with the Portmanteau statistic in the context of
multivariate models has also been investigated for the case where
a fixed number of autocorrelations, say $h$, is tested. Similar to
the univariate case, the solutions have relied on either modifying
the BP test statistic and keeping the chi-squared critical value
or keeping the BP test statistic and modifying the critical value.
The first approach was followed by Chabot-Hallé and
Duchesne (2008), who modified the BP test statistic to take
into account the effect of estimated parameters in a setting with
heteroscedastic errors. In this case, the limiting distribution of
the modified test is a $\chi^2_{hm^2}$, where $m$ is the number of considered
series. A practical disadvantage that we have observed with this
approach is that the modified BP test statistic is not robust because it requires the estimation of the inverse of a covariance
matrix that can be close to singular or even singular in some situations; see Section 4 for an illustrative example. Hence, the problem with this approach is that in finite samples, it cannot control
the Type I error of the test even for very large sample sizes.
The second approach was followed by Francq and Raïssi
(2007), who noticed that for a fixed $h$, the limiting distribution of
the Portmanteau test statistic is a weighted sum of independent
$\chi^2_1$ random variables, where the weights are given by the eigenvalues of an appropriate covariance matrix. The critical values
are not readily available in this approach, but they can be easily
estimated from a suitable consistent estimator of a potentially
large-dimensional covariance matrix. As we will see in detail in
Section 4, in our experience, this approach works better in finite
samples.
Despite the previous generalizations of the classical Portmanteau statistic to the multivariate residual context, an important limitation of the multivariate Portmanteau statistics still
remains; namely, that inference can be rather sensitive to h, the
selected number of autocorrelations. Often, different values of
h lead to conflicting conclusions in empirical applications, and
there is little guidance about how to deal with this multiple testing problem. The objective of this article is to overcome this
important limitation by proposing a Portmanteau statistic where
the parameter h is not fixed but selected automatically from the
data.
Although our initial motivation focuses on residuals from
VAR models and proposes fully automatic model checks with
simple implementation, we also extend our approach to general nonlinear multivariate models and, in particular, to generalized autoregressive conditional heteroscedasticity (GARCH)
models. For the latter application, our construction leads to automatic asymptotic distribution-free versions of the Portmanteau test proposed by Ling and Li (1997). In this application,
the distribution-free property critically rests on the fact that the
asymptotic distribution of the automatic test exclusively depends
on the asymptotic variance of the first-order sample correlation;
a fact that leads to more robust and simpler to implement tests.
Finally, it is worth stressing that automatic order selection
has been used in the context of VAR models for identification
and estimation; see Akaike (1974) and Schwarz (1978) for the
Akaike information criterion (AIC) and Bayesian information
criterion (BIC) selection criteria, respectively, and Lütkepohl
(2005, chap. 4) for a survey of results. However, note that this
classical analysis focused on identification and estimation, not
on diagnostic testing. As far as we know, this article is the first
to address the automatic selection of h in the framework of
testing for serial correlation of residuals from VAR, GARCH,
and related multivariate models.
The plan of the article is as follows. Section 2 introduces notation and the new automatic test in detail for the VAR case. Then,
in Section 3, we extend the proposed procedure to a general multivariate nonlinear framework and consider in detail volatility
models. Section 4 studies the finite sample behavior through
simulations. Finally, Section 5 presents an empirical application to modeling GNP (gross national product) and unemployment, and Section 6 concludes. Proofs are gathered in the online
Appendix D. A word about notation: henceforth, for an $m \times p$
matrix $A := (a_{ij})_{1\le i\le m,\,1\le j\le p}$, $A'$ denotes its transpose, and $|A|$
and $|A|_\infty$ denote the Euclidean and sup norm, respectively, that
is, $|A|^2 := \mathrm{tr}(A'A)$ and $|A|_\infty := \sup_{1\le i\le m,\,1\le j\le p}|a_{ij}|$. Let $I_m$ denote the identity matrix of order $m$. In addition, $\otimes$ and $\odot$ denote
Kronecker and Hadamard products, respectively.
2. AUTOMATIC DIAGNOSTIC CHECKING FOR VAR MODELS

Consider an $m$-dimensional VAR($p$) process $Y_t$,

$$Y_t = \mu + \Phi_1 Y_{t-1} + \cdots + \Phi_p Y_{t-p} + \varepsilon_t, \qquad (1)$$

where $\varepsilon_t$ has zero mean and nonsingular covariance matrix
$E(\varepsilon_t\varepsilon_t') = \Gamma(0)$, $\mu$ is an $m$-dimensional vector, and the $\Phi_i$
are $m\times m$ matrices. Let $\theta := (\mu, \Phi_1,\ldots,\Phi_p,\Gamma(0))$ be the
unknown parameters of the model. We assume the process
$\{Y_t\}_{t\in\mathbb{Z}}$ is strictly stationary and ergodic, so that the roots of
$\det(I_m - \Phi_1 z - \cdots - \Phi_p z^p) = 0$ lie outside the complex unit
circle. Define $\Gamma(j) := E(\varepsilon_t\varepsilon_{t-j}')$, $j\in\mathbb{Z}$. In this article, we aim
to test the null hypothesis of correct specification of the autocorrelations of the VAR($p$) model, that is,

$$H_0 : \Gamma(j) = 0 \quad \text{for all } j \ne 0,$$

against the fixed alternative hypotheses, for $K \ge 1$,

$$H_1^K : \Gamma(K) \ne 0.$$

Notice that our null hypothesis is composite, as $\Gamma(j)$ depends on
$\theta$, although we do not make this dependence explicit to simplify
the notation.
We follow Lütkepohl (2005) in much of the notation that follows. Define the $m\times(mp+1)$ matrix $B := (\mu, \Phi_1,\ldots,\Phi_p)$
and the $(mp+1)$-dimensional vector $Z_t := (1, Y_{t-1}', \ldots, Y_{t-p}')'$.
Define also the $m\times n$ matrix $Y := (Y_1,\ldots,Y_n)$ and the
$(mp+1)\times n$ matrix $Z := (Z_0,\ldots,Z_{n-1})$. The parameter $B$

is estimated by the least squares (LS) estimator (we focus on
LS since it is the natural estimator, but other estimators could
be entertained as well, with obvious changes in the theory)

$$\hat B = Y Z'(ZZ')^{-1},$$

and a consistent estimator for $\Gamma(j)$ is given by the sample residual autocovariance matrix of order $j \ge 0$,

$$\hat\Gamma(j) := \frac{1}{n}\sum_{t=1+j}^{n} \hat\varepsilon_t\hat\varepsilon_{t-j}',$$

where $\hat\varepsilon_t$ denotes the vector of residuals $\hat\varepsilon_t := Y_t - \hat B Z_t = \varepsilon_t - (\hat B - B)Z_t$. For testing $H_0$, the Portmanteau test statistic with $h$
lags, $h \ge 1$, is given by

$$\mathrm{BP}(h) := n\sum_{j=1}^{h}\mathrm{tr}\bigl(\hat\Gamma(j)'\,\hat\Gamma^{-1}(0)\,\hat\Gamma(j)\,\hat\Gamma^{-1}(0)\bigr).$$

This test is often implemented in its Ljung–Box modification

$$Q(h) := n^2\sum_{j=1}^{h}\frac{1}{n-h}\,\mathrm{tr}\bigl(\hat\Gamma(j)'\hat\Gamma^{-1}(0)\hat\Gamma(j)\hat\Gamma^{-1}(0)\bigr) = \frac{n^2}{n-h}\,\hat\gamma_h'\bigl(I_h\otimes\hat\Gamma^{-1}(0)\otimes\hat\Gamma^{-1}(0)\bigr)\hat\gamma_h,$$

where $\hat\gamma_h := (\mathrm{vec}(\hat\Gamma(1))',\ldots,\mathrm{vec}(\hat\Gamma(h))')'$ is an $m^2h$-dimensional vector. The null hypothesis is tested by comparing
the value of any version of the multivariate Portmanteau statistic
with upper critical values from a $\chi^2_{(h-p)m^2}$ distribution; see, for instance,
Hosking (1980, 1981a) or Ahn (1988). Although the multivariate Portmanteau statistic has been employed repeatedly, notice
that the test presents two important practical drawbacks. First,
the use of the critical values from a $\chi^2_{(h-p)m^2}$ distribution is
questionable since the $\chi^2_{(h-p)m^2}$ is a good approximation only
when the number of autocorrelations $h$ is taken “sufficiently
large” (BP, p. 1517) so that the effect of parameter estimation is
annihilated. Second, inference can be sensitive to the selected
number $h$.
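The quantities defined so far can be sketched numerically. The following Python fragment is only an illustration under stated assumptions, not the authors' implementation: the function names `var_ls_residuals`, `residual_autocov`, and `portmanteau_q` are hypothetical, observations are stored one per row, and the first $p$ rows are used as presample values. It follows the $1/(n-h)$ weighting exactly as written in the display for $Q(h)$ above.

```python
import numpy as np

def var_ls_residuals(Y, p):
    """Least-squares residuals of a VAR(p) with intercept.

    Y is an (n_total, m) array with one observation per row. The first p
    rows are used as presample values, so n = n_total - p residuals result.
    """
    n_total, m = Y.shape
    # Regressor rows Z_t' = (1, Y_{t-1}', ..., Y_{t-p}') for each usable t.
    lags = [Y[p - k:n_total - k] for k in range(1, p + 1)]
    Z = np.hstack([np.ones((n_total - p, 1))] + lags)
    # lstsq solves Z @ B' ~ Y row-wise, i.e. it returns the transpose of B-hat.
    B_t, *_ = np.linalg.lstsq(Z, Y[p:], rcond=None)
    return Y[p:] - Z @ B_t

def residual_autocov(resid, j):
    """Sample residual autocovariance (1/n) * sum_{t=j+1}^{n} e_t e_{t-j}'."""
    n = resid.shape[0]
    return resid[j:].T @ resid[:n - j] / n

def portmanteau_q(resid, h):
    """Q(h) with the 1/(n - h) weighting shown in the text."""
    n = resid.shape[0]
    G0_inv = np.linalg.inv(residual_autocov(resid, 0))
    total = 0.0
    for j in range(1, h + 1):
        Gj = residual_autocov(resid, j)
        total += np.trace(Gj.T @ G0_inv @ Gj @ G0_inv)
    return n**2 / (n - h) * total
```

A typical call would be `portmanteau_q(var_ls_residuals(Y, p=1), h=5)`. The trace form and the quadratic form in $\hat\gamma_h$ agree when `vec` stacks columns, which in NumPy corresponds to `flatten(order="F")`.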
The limiting distribution for Q(h) for a fixed h has been established under different dependence restrictions on the innovations
{εt }. For strictly stationary and ergodic martingale difference
sequence (mds) innovations {εt }, this distribution is obtained
as follows. A standard Taylor expansion (see, for example,
Lemma 4.2 in Lütkepohl 2005) relates sample covariances
of residuals to sample covariances of true errors through the
equation

$$\sqrt{n}\,\hat\gamma_h = \sqrt{n}\,\tilde\gamma_h - G_h'\cdot\sqrt{n}\,\mathrm{vec}(\hat B - B) + o_P(1), \qquad (2)$$

where $\tilde\gamma_h$ is defined as $\hat\gamma_h$, but replacing $\hat\Gamma(j)$ with
$\tilde\Gamma_j := n^{-1}\sum_{t=1+j}^{n}\varepsilon_t\varepsilon_{t-j}'$, and $G_h$ is defined as $G_h := \tilde G_h\otimes I_m$, with
$\tilde G_h := (H_1,\ldots,H_h)$ and $H_j := E(Z_t\varepsilon_{t-j}')$, $j = 1,\ldots,h$.

From (2), a central limit theorem (CLT) for $\sqrt{n}\,\hat\gamma_h$ follows from
a CLT for the joint process $(\sqrt{n}\,\tilde\gamma_h', \sqrt{n}\,\mathrm{vec}(\hat B - B)')'$. Moreover,
under some mild moment conditions, the LS estimator satisfies

$$\sqrt{n}\,\mathrm{vec}(\hat B - B) = \frac{1}{\sqrt{n}}\sum_{t=1}^{n}v_t + o_P(1),$$

where $v_t := \Upsilon^{-1}Z_t\otimes\varepsilon_t$ and $\Upsilon := E[Z_tZ_t']$. Hence, the asymptotic distribution of $\sqrt{n}\,\hat\gamma_h$ will follow from the asymptotic joint
distribution of $J_n := n^{-1/2}\sum_{t=1}^{n}u_t$, where $u_t$ is an $(m^2p + m^2h + m)$-dimensional vector defined as $u_t := (w_t', v_t')'$, where
$w_t := \xi_t^h\otimes\varepsilon_t$ and $\xi_t^h := (\varepsilon_{t-1}',\ldots,\varepsilon_{t-h}')'$. The process $\{u_t\}$
is also a stationary and ergodic mds, so that the CLT in
Billingsley (1961) implies that $J_n \xrightarrow{d} J \sim N(0,\Sigma)$, provided
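$E[|u_t|^2] < \infty$, with

$$\Sigma := \begin{pmatrix}\Sigma_{\tilde c} & \Sigma_{\tilde c,B}\\ \Sigma_{\tilde c,B}' & \Sigma_B\end{pmatrix}, \qquad (3)$$

where $\Sigma_{\tilde c} := E[w_tw_t']$, $\Sigma_B := E[v_tv_t']$, and $\Sigma_{\tilde c,B} := E[w_tv_t']$.
Thus, under the null hypothesis and regularity conditions,
$\sqrt{n}\,\hat\gamma_h \xrightarrow{d} N(0, D_h)$, where

$$D_h = \Sigma_{\tilde c} + G_h'\Sigma_BG_h - \Sigma_{\tilde c,B}G_h - G_h'\Sigma_{\tilde c,B}'. \qquad (4)$$

Francq and Raïssi (2007) used the asymptotic result $J_n \xrightarrow{d} J$
to provide the asymptotic null distribution of $Q(h)$ for a fixed
$h \ge 1$ under weak dependence restrictions on $\{\varepsilon_t\}$.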
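In this article, we propose the automatic test statistic

$$AQ := Q(\tilde h),$$

where $\tilde h$ is chosen from the data as follows:

$$\tilde h := \min\{h : 1\le h\le d;\ L_h \ge L_z,\ z = 1,2,\ldots,d\},$$

where $L_h$ is defined as $L_h := Q(h) - \pi(h,n,q)$; $d$ is a fixed
upper bound, $d \ge 1$; and

$$\pi(h,n,q) := \begin{cases} hm^2\log n, & \text{if } \max_{1\le j\le d}\sqrt{n}\,|\hat\Gamma(j)\hat\Gamma^{-1}(0)|_\infty \le \sqrt{q\log n},\\[4pt] 2m^2h, & \text{if } \max_{1\le j\le d}\sqrt{n}\,|\hat\Gamma(j)\hat\Gamma^{-1}(0)|_\infty > \sqrt{q\log n}. \end{cases} \qquad (5)$$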

For the motivation of this selection rule for testing, see Inglot
and Ledwina (2006a,b) and Escanciano and Lobato (2009).
As explained in these references, the motivation behind this
procedure is to combine the advantages of AIC and BIC criteria.
On the one hand, tests constructed using the BIC criterion
are able to properly control the Type I error and are more
powerful when the serial correlation is present in the first-order
autocorrelations. On the other hand, tests based on the AIC
cannot properly control the Type I error, but they are more
powerful when the serial correlation is present in high-order
autocorrelations. Our selection for h allows the data to choose
the preferable criterion according to the data characteristics.
The theoretical results hold for any fixed $q$. Extensive simulations in Inglot and Ledwina (2006a,b), Escanciano
and Lobato (2009), and this article suggest that the choice
$q = 2.4$ works well in finite samples. Note that a small value
for q would lead to the use of the AIC criterion, while a large q
would lead to the choice of the BIC criterion. Moderate values,
such as 2.4, provide a “switching effect” in which one combines
the advantages of the two selection rules. For further motivation
for the choice of q, see the aforementioned references and, for
other choices of penalty terms and optimality properties of the
resulting tests, see Kallenberg (2002).
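The selection rule can be illustrated with a short Python sketch. It is a minimal example under assumptions: the function names `max_scaled_autocorr`, `penalty`, and `select_lag` are hypothetical, the values $Q(1),\ldots,Q(d)$ are taken as already computed, and the switching threshold is read as $\sqrt{q\log n}$ applied to $\max_j\sqrt{n}\,|\hat\Gamma(j)\hat\Gamma^{-1}(0)|_\infty$.

```python
import numpy as np

def max_scaled_autocorr(autocovs, n):
    """max_j sqrt(n) * |Gamma_hat(j) Gamma_hat(0)^{-1}|_inf for j = 1..d.

    autocovs[0] is Gamma_hat(0) and autocovs[j] is Gamma_hat(j).
    """
    G0_inv = np.linalg.inv(autocovs[0])
    return max(np.sqrt(n) * np.abs(Gj @ G0_inv).max() for Gj in autocovs[1:])

def penalty(h, n, m, max_stat, q=2.4):
    """pi(h, n, q): BIC-type penalty below the threshold, AIC-type above it."""
    if max_stat <= np.sqrt(q * np.log(n)):
        return h * m**2 * np.log(n)
    return 2.0 * m**2 * h

def select_lag(q_values, n, m, max_stat, q=2.4):
    """Smallest h in 1..d maximizing L_h = Q(h) - pi(h, n, q).

    q_values[h - 1] must hold Q(h) for h = 1, ..., d.
    """
    d = len(q_values)
    L = [q_values[h - 1] - penalty(h, n, m, max_stat, q) for h in range(1, d + 1)]
    # np.argmax returns the first maximizer, which matches the "min" in the rule.
    return int(np.argmax(L)) + 1
```

With small residual autocorrelations the heavier BIC-type penalty applies and the rule tends to return $h = 1$; once some scaled autocorrelation exceeds the threshold, the lighter AIC-type penalty lets a higher lag be selected.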
Our theoretical results in the following are proved under
the assumption that d is a fixed large number, but they can be
extended to the case where d ≡ d(n) grows slowly to infinity
with n under additional regularity conditions that include strong
mixing assumptions. A theoretical advantage of considering
d ≡ d(n) would be to achieve a consistent test against all alternatives H1K for all K ≥ 1. However, for all practical matters, a
theory with d finite suffices, as we can take d as large as desired,
as long as, of course, d ≤ n − 1. Notice that the resulting
data-driven test will be different from the BP test using d(n)
autocorrelations, as proposed in Hong (1996). We interpret d as
an upper bound on the number of correlations that the researcher
is interested in. That is, we only consider alternative hypotheses
for which K ≤ d. This is not a practical limitation, since in
applications, the first correlations are often the most significant
ones. Inference is far less sensitive to d than to the number
of correlations h. For empirical evidence, see Table 6 in our
simulations.
Before establishing the asymptotic theory for the automatic AQ test, denote by $\mathcal{F}_{t-1}$ the $\sigma$-field generated by
$(Y_{t-1}, Y_{t-2},\ldots)$, and define the matrix

$$\Omega_h := \bigl(I_h\otimes\Gamma^{-1/2}(0)\otimes\Gamma^{-1/2}(0)\bigr)\,D_h\,\bigl(I_h\otimes\Gamma^{-1/2}(0)\otimes\Gamma^{-1/2}(0)\bigr). \qquad (6)$$

Also, introduce the following assumption:
Assumption A1.
(a) The innovations $\{\varepsilon_t\}$ are a strictly stationary and ergodic
mds with respect to $\mathcal{F}_{t-1}$ with nonsingular variance $\Gamma(0)$
and such that $E[|u_t|^2] < \infty$.
(b) The roots of $\det(I_m - \Phi_1z - \cdots - \Phi_pz^p) = 0$ lie outside
the complex unit circle.
Our first results establish the null asymptotic distribution and
the behavior under the alternative for the automatic test statistic
AQ and prove that under the null the probability of $\tilde h$ being one
tends to one as $n$ tends to infinity.
Theorem 1. Under the null hypothesis and Assumption A1,
$AQ \xrightarrow{d} Z$, where $Z$ can be represented as

$$Z := \sum_{j=1}^{m^2}\lambda_jZ_j^2,$$

with $(\lambda_1,\ldots,\lambda_{m^2})$ the eigenvalues of $\Omega_1$ in (6) and $\{Z_j\}$
independent $N(0,1)$ variables.
Remark 1. The p-value of the weighted $\chi^2$ distribution is calculated with Imhof’s (1961) algorithm using the estimator

$$\hat\Omega_1 = \bigl(\hat\Gamma^{-1/2}(0)\otimes\hat\Gamma^{-1/2}(0)\bigr)\,\hat D_1\,\bigl(\hat\Gamma^{-1/2}(0)\otimes\hat\Gamma^{-1/2}(0)\bigr), \qquad (7)$$

where the expression for $\hat D_1$ can be found in Appendix A. Thus,
Theorem 1, the consistency of $\hat\Omega_1$, and the nonsingularity of $\Omega_1$
yield an asymptotic $\alpha$-level automatic test.
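As a rough illustration of how a p-value for the weighted $\chi^2$ limit in Theorem 1 might be obtained, the sketch below replaces Imhof's (1961) numerical inversion with plain Monte Carlo simulation, which is a swapped-in technique and not the algorithm used in the article. The function names `omega1_eigenvalues` and `weighted_chi2_pvalue` are hypothetical, and an estimate of $D_1$ is assumed to be available from elsewhere.

```python
import numpy as np

def omega1_eigenvalues(gamma0, D1):
    """Eigenvalues of (G^{-1/2} kron G^{-1/2}) D1 (G^{-1/2} kron G^{-1/2}).

    gamma0 is the (m, m) estimate of Gamma(0); D1 is a symmetric (m^2, m^2)
    estimate of D_1, assumed to be supplied by the user.
    """
    vals, vecs = np.linalg.eigh(gamma0)
    g_inv_half = vecs @ np.diag(vals ** -0.5) @ vecs.T  # symmetric inverse root
    K = np.kron(g_inv_half, g_inv_half)
    return np.linalg.eigvalsh(K @ D1 @ K)

def weighted_chi2_pvalue(stat, weights, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(sum_j w_j Z_j^2 >= stat), Z_j iid N(0, 1)."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    draws = rng.chisquare(1, size=(n_draws, w.size)) @ w
    return float(np.mean(draws >= stat))
```

For example, `weighted_chi2_pvalue(aq, omega1_eigenvalues(gamma0_hat, D1_hat))` would give an approximate p-value for an observed statistic `aq`. Simulation error shrinks at rate `n_draws ** -0.5`, so an exact inversion method is preferable when very small p-values matter.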

Theorem 2. Under Assumption A1, as n → ∞, the test based
on AQ is consistent against H1K , for K ≤ d.
Remark 2. Note that Assumption A1(a) allows for conditional heteroscedasticity of unknown form and other forms of
nonlinear dependence present in financial data. Alternative weak
dependence conditions, such as mixing assumptions, can be employed (see Francq and Raïssi 2007, sec. 3). For independent and
identically distributed (iid) innovations $\{\varepsilon_t\}$, our conditions and
some of the expressions can be simplified. For instance, a sufficient condition for $E[|u_t|^2] < \infty$ is that $E[|\varepsilon_{it}\varepsilon_{jt}\varepsilon_{kt}\varepsilon_{lt}|^4] < \infty$
for all $i,j,k,l = 1,\ldots,m$, where $\varepsilon_{it}$ is the $i$th component of
$\varepsilon_t$. Moreover, by Proposition 4.5 in Lütkepohl (2005), $D_h = (I_h\otimes\Gamma(0) - \tilde G_h'\Upsilon^{-1}\tilde G_h)\otimes\Gamma(0)$, with $\Upsilon = E[Z_tZ_t']$ as above, which suggests the simple estimator $\hat D_h = (I_h\otimes\hat\Gamma(0) - \hat{\tilde G}_h'\hat\Upsilon^{-1}\hat{\tilde G}_h)\otimes\hat\Gamma(0)$.
h

Remark 3. An alternative to the classical Portmanteau test
statistic is the following modified Portmanteau test statistic:

$$T(h) := n\,\hat\gamma_h'\hat D_h^{-1}\hat\gamma_h,$$

where $\hat D_h$ denotes the sample analog of $D_h$, whose expression
can be found in the online Appendix A. Several versions of the
test statistic $T(h)$ have appeared in the literature; see Poskitt and
Tremayne (1982) for the case of homoscedastic vector ARMA
(VARMA) models and Chabot-Hallé and Duchesne (2008) for
the case of heteroscedastic nonlinear VAR models.
By construction, it is straightforward to show that under the
null hypothesis $H_0$, and some further regularity conditions, for
a fixed $h$, $T(h)$ converges to a $\chi^2_{hm^2}$. Compared to the Portmanteau $Q(h)$, the Portmanteau $T(h)$ has the advantage that its
critical values are readily available. However, we have observed
in simulations that the test statistic $T(h)$ and its automatic version are highly unstable because $\hat D_h$ is close to singular for
some parameter values, especially when $h$ is large, a result
that could be expected in general since $D_h$ becomes singular as
$h \to \infty$.
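The near-singularity that destabilizes $T(h)$ can be inspected numerically. The sketch below is an illustration only: it uses the iid simplification of $D_h$ from Remark 2 rather than the general expression in the online Appendix A, it assumes that $H_j$ and $E[Z_tZ_t']$ are estimated by their plain sample analogs, and the function name `d_hat_iid` is hypothetical.

```python
import numpy as np

def d_hat_iid(resid, Z, h):
    """Sample analog of D_h under iid innovations (simplified form).

    resid : (n, m) array of residuals, one row per period.
    Z     : (n, k) array whose row t holds the regressors Z_t' that were
            used for the residual in row t of resid.
    """
    n, m = resid.shape
    gamma0 = resid.T @ resid / n            # estimate of Gamma(0)
    ups = Z.T @ Z / n                       # estimate of E[Z_t Z_t']
    # H_j estimated by (1/n) * sum_t Z_t e_{t-j}', stacked side by side.
    G = np.hstack([Z[j:].T @ resid[:n - j] / n for j in range(1, h + 1)])
    inner = np.kron(np.eye(h), gamma0) - G.T @ np.linalg.solve(ups, G)
    return np.kron(inner, gamma0)
```

Calling `np.linalg.cond(d_hat_iid(resid, Z, h))` for increasing `h` gives a quick diagnostic: a very large condition number signals that inverting $\hat D_h$, as $T(h)$ requires, is numerically fragile, whereas $Q(h)$ and AQ only need the eigenvalues of the matrix.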
Remark 4. Note that the singularity of $D_h$, and hence of $\Omega_h$,
can occur when $\Phi_p = 0$. This implies a discontinuity in the
asymptotic distribution of $\hat\gamma_h$, and hence, of those of $Q(h)$ and
AQ. In particular, singularity can even occur when $h = 1$, as in
the following example:
Example 1. Consider the bivariate VAR(1) model

$$Y_t = \theta I_2Y_{t-1} + \varepsilon_t,$$

where $\theta\in\mathbb{R}$, $I_2$ is the identity matrix, and $\{\varepsilon_t\}$ are iid standard
Gaussian. This example extends the simple example in Durbin
(1970, p. 419), who first noticed the discontinuity in the asymptotic distribution of the residual sample autocovariances for a
simple univariate AR(1).
Note that when $\theta = 0$, it can be easily shown that the eigenvalues of $\Omega_h$ in (6) are 0 with algebraic multiplicity $m^2$ and
1 with multiplicity $m^2(h-1)$. Hence, the modified Portmanteau test $T(h)$ cannot be applied in this case. However, Theorems 1 and 2 are still valid, but with $Z = 0$ almost surely
(a.s.; all eigenvalues are zero for $h = 1$) when $\theta = 0$. An analysis of the limiting size of our automatic test for situations
where $\mathrm{rank}(\Omega_1) = 0$ requires a case-by-case analysis, and it
is beyond the scope of this article. However, notice that most
existing specification tests that employ $h$ autocorrelations require $\Omega_h$ to be nonsingular; exceptions are the likelihood ratio
test of Durbin (1970) and the LM test of Godfrey (1976). In
contrast, the proposed AQ test is an asymptotic $\alpha$-level test
under the weaker condition $\mathrm{rank}(\Omega_1) > 0$. Furthermore, even
in situations where $\mathrm{rank}(\Omega_1) = 0$ but $\mathrm{rank}(\Omega_l) > 0$ for some $l$,
a simple modification of our automatic test delivers an asymptotic $\alpha$-level test, simply by changing the automatic lag choice to
$\tilde h := \min\{h : l\le h\le d;\ L_h\ge L_z,\ z = l,\ldots,d\}$. For instance,
we can take $l = 2$ in the current VAR(1) example if $\theta = 0$. Furthermore, in the simulations below, we show that our automatic
AQ test presents an empirical size that is accurate for values
of $\theta$ close to zero, such as 0.1, unlike alternative tests based on
$T(h)$.
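The eigenvalue claim in Example 1 can be checked by hand from the iid expression for $D_h$ in Remark 2. The derivation below is a sketch under two assumptions that are not spelled out in the example: the model is fitted without an intercept, so that $Z_t = Y_{t-1}$, and $\Upsilon$ denotes $E[Z_tZ_t']$.

```latex
\begin{aligned}
\theta = 0 \;&\Rightarrow\; Y_t = \varepsilon_t, \qquad
  Z_t = Y_{t-1} = \varepsilon_{t-1}, \qquad \Gamma(0) = I_2, \\
\Upsilon &= E[Z_t Z_t'] = I_2, \qquad
  H_j = E[Z_t \varepsilon_{t-j}']
      = E[\varepsilon_{t-1}\varepsilon_{t-j}']
      = \begin{cases} I_2, & j = 1, \\ 0, & j \ge 2, \end{cases} \\
\tilde G_h &= (I_2, 0, \ldots, 0), \qquad
  I_h \otimes \Gamma(0) - \tilde G_h' \Upsilon^{-1} \tilde G_h
      = \operatorname{diag}\bigl(0_{2\times 2},\, I_{2(h-1)}\bigr), \\
D_h &= \operatorname{diag}\bigl(0_{2\times 2},\, I_{2(h-1)}\bigr) \otimes I_2
     = \operatorname{diag}\bigl(0_{4\times 4},\, I_{4(h-1)}\bigr).
\end{aligned}
```

Since $\Gamma(0) = I_2$, the matrix in (6) coincides with $D_h$ here, so its eigenvalues are 0 with multiplicity $m^2 = 4$ and 1 with multiplicity $m^2(h-1) = 4(h-1)$. For $h = 1$ every eigenvalue is zero, which is the degenerate case $Z = 0$ a.s. noted in the example.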


3. FURTHER APPLICATIONS OF THE METHODOLOGY

The methodology introduced in the previous section can be
extended to constructing automatic specification tests in a variety of frameworks. This section discusses a generalization of
the automatic AQ test in a general multivariate nonlinear setting
and then examines in detail the leading application to volatility models, which will also be considered in our simulations
exercises.
We consider a generalized error process
$$\varepsilon_t = H_t(\theta_0) \equiv H(\mathcal{F}_t,\theta_0), \qquad (8)$$

where $H(\cdot)$ is a known vector function and $\theta_0\in\Theta\subset\mathbb{R}^l$ is
an unknown parameter vector to be estimated. The objective
is to test for correct model specification by testing whether
the process εt is uncorrelated. Applying the AQ test to the
estimated generalized errors is an automatic specification test
for two leading cases: general dynamic parametric models and
models defined by conditional moment restrictions, as we briefly
comment next.
In many dynamic parametric models, the distribution is assumed to belong to a parametric class of distributions, for instance, in Gaussian VAR-GARCH models, and so, the distribution of the stochastic process Yt conditional on (Yt−1 , Yt−2 , . . .)
is known. Denoting this known distribution by $F_t(y;\theta_0) \equiv F(y;\theta_0,Y_{t-1},Y_{t-2},\ldots)$, it is simple to show that under correct
specification, the generalized errors defined as $\varepsilon_t = F_t(Y_t;\theta_0)$,
for all $t$, are uncorrelated. Hence, the proposed AQ test can be
applied to the residuals $\hat\varepsilon_t = F_t(Y_t;\hat\theta)$, for a suitable estimate
$\hat\theta$ of $\theta_0$, for example, the conditional maximum likelihood
estimator. For this case, the asymptotic normality of $\sqrt{n}(\hat\gamma_h - \gamma_h)$
can be established under the conditions below. See also Bai and
Chen (2008) for an alternative testing approach.
In many other cases, the researcher does not assume that the
conditional distribution of the data is known, but the economic
model just establishes that there exists some generalized error
εt that behaves as an mds. Then, our AQ test can again be applied to test for the correct specification of these models since
εt is uncorrelated when the model is correctly specified. For instance, the first-order conditions of many Rational Expectations
and Asset Pricing Models are given in the form of conditional
moment restrictions, such as
$$E[\varepsilon_t\mid\mathcal{F}_{t-1}] = E[H_t(\theta_0)\mid\mathcal{F}_{t-1}] = 0 \quad \text{a.s.}$$
for a vector of moment functions $H_t(\cdot) = H(\mathcal{F}_t,\cdot)$. For instance,
for Asset Pricing models, $H_t(\theta_0) = m(\cdot,\theta_0)R_t^e$, where $m$ is a
parametric stochastic discount factor function and $R_t^e$ is a vector of excess returns; see Hansen and Singleton (1982). In this
context, the automatic AQ test can be employed since the correct

specification of the discount factor implies that the generalized
errors $\varepsilon_t = H(\mathcal{F}_t,\theta_0)$ are uncorrelated. In these cases, it is customary to estimate the parameter $\theta_0$ by the generalized method
of moments.
To develop a general theory, which covers the previous two
cases and many others, we introduce the following assumption
on $H_t(\theta)$, $\theta\in\Theta$.
Assumption A2.
(a) For each $\theta\in\Theta$, $H(\cdot,\theta)$ is an $\mathcal{F}_t$-measurable function
with values in $\mathbb{R}^s$. For each $t\ge1$, $H_t(\cdot)$ is a.s. continuously differentiable in a neighborhood of $\theta_0$, say $\Theta_0\subset\Theta$,
with derivative $\dot H_t(\theta) := \partial H_t(\theta)/\partial\theta'$. Furthermore,
$E[\sup_{\theta\in\Theta_0}|H_t(\theta)|^2] < \infty$ and $E[\sup_{\theta\in\Theta_0}|\dot H_t(\theta)|^2] < \infty$.
(b) The sequence $\{H_t(\theta)\}$ is strictly stationary and ergodic
for each $\theta\in\Theta_0\subset\Theta$.
(c) The innovations $\{H_t(\theta_0)\}$ are a strictly stationary and ergodic mds.
Since $\theta_0$ is unknown, we require the existence of an asymptotically linear estimator.
Assumption A3.
(a) There is an estimator of $\theta_0$, say $\hat\theta$, which satisfies the
asymptotic linear representation

$$\sqrt{n}(\hat\theta - \theta_0) = \frac{1}{\sqrt{n}}\sum_{t=1}^{n}v_t + o_P(1),$$

where $\{v_t\}$ is a stationary and ergodic mds and $\Sigma_B := E[v_tv_t']$ is well defined and positive definite;
(b) The parameter space $\Theta$ is compact.
Under Assumptions A2 and A3, we obtain the analogous
results of Section 2 for the automatic Portmanteau test based
on generalized residuals $\hat\varepsilon_t := H_t(\hat\theta)$, where $D_h$ is defined as
before but with $v_t$ as in Assumption A3(a) and

$$G_h' := -E[\xi_t^h\otimes\dot H_t(\theta_0)],$$

where recall $\xi_t^h := (\varepsilon_{t-1}',\ldots,\varepsilon_{t-h}')'$. The following result is
proved in the online Appendix D.

Theorem 3. If Assumptions A2 and A3 are satisfied, then the
conclusion of Theorem 1 holds for the generalized automatic
Portmanteau test. Moreover, if we drop Assumption A2(c), then
the conclusion of Theorem 2 also holds.
3.1 Volatility Models
Similar to the VAR models, there is an extensive empirical
literature on the estimation of multivariate GARCH models in
empirical finance; see Bauwens, Laurent, and Rombouts (2006)
and references therein. Since financial returns behave approximately as mds processes, for these data, the conditional mean is
not typically the object of interest, and research has shifted to
modeling the conditional higher moments of these returns. The
data are assumed to follow the model
$$Y_t = \mu(\mathcal{F}_{t-1},\theta_0) + \Sigma^{1/2}(\mathcal{F}_{t-1},\theta_0)\,\eta_t,$$

where $\mu_t(\theta_0)\equiv\mu(\mathcal{F}_{t-1},\theta_0)$ is the parametric conditional mean
vector, which can be zero; $\Sigma_t(\theta_0)\equiv\Sigma(\mathcal{F}_{t-1},\theta_0)$ is a parametric conditional covariance matrix, assumed to be nonsingular;
and $\{\eta_t\}$ are standardized iid innovations. The specification of
traditional multivariate GARCH-type models for $\Sigma_t(\cdot)$, such as
the one employed in Section 4, can be automatically tested by
checking whether the standardized residuals are white noise using the AQ test. In particular, Ling and Li (1997) suggested
looking at the transformation $\varepsilon_t\equiv H_t(\theta_0)$, where

$$H_t(\theta) = (Y_t-\mu_t(\theta))'\,\Sigma_t^{-1}(\theta)\,(Y_t-\mu_t(\theta)) - m.$$

In this context, a natural estimator for $\theta_0$ is the Gaussian quasi-maximum likelihood estimator $\hat\theta$. Primitive conditions
for Assumptions A2 and A3 in this example are straightforward to find and are standard in the literature; see Ling and Li
(1997) for details. In particular, these authors showed that the
expression for the $D_h$ matrix in (4) in this application is given
by

$$D_h = (cm)^2I_h - X(cB^{-1} - B^{-1}AB^{-1})X',$$

where $c = E[\eta_{jt}^4] - 1$, $j = 1,\ldots,m$; $X = (X_1,\ldots,X_h)'$; $X_l = E[\partial\Sigma_t(\theta_0)/\partial\theta\times\mathrm{vec}(\Sigma_t^{-1}(\eta_{t-l}'\eta_{t-l} - m))]$; $B = -E[n^{-1}\partial^2L/\partial\theta\,\partial\theta']$; $A = E[n^{-1}(\partial L/\partial\theta)(\partial L/\partial\theta')]$; and $L$ is the quasi-conditional log-likelihood. If the innovations $\{\eta_t\}$ are Gaussian,
then $A = B$, but in general, $A \ne B$. Ling and Li (1997)
proposed a modified Portmanteau test that corresponds to
$T(h)$ in the previous section with residuals $\hat\varepsilon_t := H_t(\hat\theta) = (Y_t-\mu_t(\hat\theta))'\,\Sigma_t^{-1}(\hat\theta)\,(Y_t-\mu_t(\hat\theta)) - m$. An alternative approach
is to use the automatic Portmanteau test AQ. In this application,
the estimation of the critical values is further simplified, since
$Z = \Omega_1Z_1^2$ a.s., where $\Omega_1$ is a positive number that can be
consistently estimated by $\hat\Omega_1 = \hat\Gamma^{-2}(0)\hat D_1$, with

$$\hat D_1 = \hat c^2m^2 - \hat X_1'(\hat c\hat B^{-1} - \hat B^{-1}\hat A\hat B^{-1})\hat X_1,$$

where we employ the natural sample analog consistent estimators for $c$, $X_1$, $A$, and $B$. Therefore, after an appropriate
standardization, the limit distribution of AQ is a simple $\chi^2_1$.
Alternative Portmanteau tests have been proposed in the literature (see, for example, Tse 2002), and our methods could
again be used to provide automatic versions of them; details are
omitted for the sake of space.

4. MONTE CARLO EXPERIMENTS

In this section, we compare the finite-sample performance
of the proposed automatic test AQ with the Portmanteau test,
$Q(h)$, for different choices of $h$. For some experiments, we
will also report the results for $Q_{BIC}$, which is the automatic
Portmanteau test statistic that uses the BIC criterion to select
the lag order $h$ of the error autocorrelation tested, that is, the
employed penalty term is $\pi(h,n,q) = hm^2\log n$. We do not
report results for the automatic test that employs the Akaike
criterion because this test is not able to control the Type I error,
as we comment later. In addition, for some cases, we also report
the test results for Q1, which is a robustified version of $Q(1)$ that
uses the estimated weighted $\chi^2$ critical values, and AT, which
is the automatic version of $T(h)$, that is, $AT := T(\tilde h)$, where $\tilde h$
is defined as before but with $Q(h)$ replaced by $T(h)$. Theorem 1
can be easily extended to show that the null limiting distribution
of AT is a $\chi^2_{m^2}$, so its critical values are readily available. Unlike
AQ, AT requires the matrix $D_h$ to be nonsingular, which, as we
will see, may lead to inaccurate empirical sizes in some cases.

4.1 VAR Examples: Size

We first illustrate the problems associated with AT, which
have led us to prefer the use of AQ. We examine the sensitivity
of the tests to the magnitude of the true parameter values in the
bivariate VAR(1) model

$$Y_t = \theta I_2Y_{t-1} + \varepsilon_t,$$

where $\theta$ ranges from 0.1 to 0.95. We consider two scenarios for
the process εt : a bivariate iid standard Gaussian sequence and a
bivariate Gaussian GARCH(1,1). The GARCH(1,1) innovations
1/2
can be written as εt = Ht ut , where ut follows a standard
1/2
bivariate iid Gaussian sequence, and Ht denotes the matrix
1/2 1/2

such that Ht = Ht Ht , where Ht = C + A ⊙ (εt−1 εt−1
)+
G ⊙ Ht−1 , with C = diag(1, 1), A = G = diag(0.3, 0.3).
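The size design just described can be sketched in a few lines of Python. The sketch simulates the bivariate VAR(1) with the diagonal GARCH(1,1) innovations; because C, A, and G are diagonal and the products are elementwise, the conditional covariance stays diagonal and its square root can be taken entry by entry. The function names, the burn-in length, the starting values, and the seed are illustrative assumptions, not part of the original experiment.

```python
import math
import random

C_DIAG = (1.0, 1.0)   # diagonal of C
A_DIAG = (0.3, 0.3)   # diagonal of A
G_DIAG = (0.3, 0.3)   # diagonal of G


def garch_step(h_prev, eps_prev):
    """Diagonal of H_t = C + A (.) (eps eps') + G (.) H_{t-1}, with (.) elementwise."""
    return tuple(
        C_DIAG[i] + A_DIAG[i] * eps_prev[i] ** 2 + G_DIAG[i] * h_prev[i]
        for i in range(2)
    )


def simulate_var1_garch(theta, n, burn_in=200, seed=0):
    """Simulate Y_t = theta * I_2 * Y_{t-1} + eps_t with eps_t = H_t^{1/2} u_t."""
    rng = random.Random(seed)
    # Assumed start: the unconditional variance C / (1 - A - G) of each component.
    h = tuple(C_DIAG[i] / (1.0 - A_DIAG[i] - G_DIAG[i]) for i in range(2))
    eps = (0.0, 0.0)
    y = (0.0, 0.0)
    path = []
    for t in range(n + burn_in):
        h = garch_step(h, eps)
        # H_t is diagonal here, so H_t^{1/2} is the elementwise square root.
        eps = tuple(math.sqrt(h[i]) * rng.gauss(0.0, 1.0) for i in range(2))
        y = tuple(theta * y[i] + eps[i] for i in range(2))
        if t >= burn_in:
            path.append(y)
    return path


sample = simulate_var1_garch(theta=0.5, n=300)
```

Dropping the burn-in observations is a common way to reduce the influence of the arbitrary starting values; the original text does not say how the recursion was initialized.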
Figures 1 and 2 plot the empirical rejection rates at the 5% nominal level for the AQ, AT, Q(2), Q(4), and Q(12) tests for the iid normal innovations case and for the GARCH(1,1) innovations case, respectively. The sample size is n = 300. Figure 1 conveys several messages. First, it indicates that the classical Portmanteau tests, especially Q(2), present large size distortions for high values of θ. Second, Figure 1 also shows that the AT test presents extremely high size distortions for low and moderate values of θ; for example, AT rejects about 90% of the time at the 5% nominal level when θ equals 0.1. This result resembles similar empirical findings by Ljung (1986, sec. 5), who noticed that Newbold's (1980) modified Portmanteau test could not control the Type I error because the asymptotic covariance matrix of the sample autocorrelations was close to singular. In our case, the asymptotic covariance matrix $D_h$ turns out to be already close to singular for moderate values of h and small θ. Finally, Figure 1 shows that AQ is able to control the Type I error irrespective of θ.
Figure 2 considers the case of conditional heteroscedasticity. The main difference from Figure 1 is that the classical Portmanteau tests do not control the Type I error, as the theory predicts. As in the homoscedastic case, AT presents severe size distortions for moderate and small values of θ. In contrast, the behavior of the data-driven test AQ is not affected by the value of θ in either the homoscedastic or the heteroscedastic case.
We have performed additional simulations for higher-order VAR processes, and all the results indicate that AT is unable to control the size for a range of parameter values even for large sample sizes. This has been the main reason for focusing on the AQ test rather than on the AT test.
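To see why a covariance matrix of residual autocorrelations can be nearly singular when the autoregressive coefficient is small, it may help to look at the textbook univariate AR(1) analogue. This is an assumption made purely for illustration; it is not the multivariate $D_h$ of this article. In that case the asymptotic covariance of the first h residual autocorrelations is $I_h - (1 - \phi^2) x x'$ with $x = (1, \phi, \ldots, \phi^{h-1})'$, and $x$ is an eigenvector with eigenvalue $\phi^{2h}$, which is also the smallest eigenvalue. The sketch below checks this numerically; the function names are hypothetical.

```python
def ar1_residual_acf_cov(phi, h):
    """Return I_h - (1 - phi^2) x x' with x = (1, phi, ..., phi^(h-1))'."""
    x = [phi ** k for k in range(h)]
    scale = 1.0 - phi ** 2
    return [
        [(1.0 if i == j else 0.0) - scale * x[i] * x[j] for j in range(h)]
        for i in range(h)
    ]


def eigenvalue_along_x(phi, h):
    """Rayleigh quotient x' D x / x' x; x is an eigenvector, so this is an eigenvalue."""
    x = [phi ** k for k in range(h)]
    cov = ar1_residual_acf_cov(phi, h)
    cov_x = [sum(cov[i][j] * x[j] for j in range(h)) for i in range(h)]
    return sum(a * b for a, b in zip(x, cov_x)) / sum(a * a for a in x)


# Smallest eigenvalue for h = 4 and several values of phi; it equals phi ** 8.
smallest = {phi: eigenvalue_along_x(phi, 4) for phi in (0.1, 0.5, 0.9)}
```

For a small coefficient and a moderate h the eigenvalue is tiny, so any statistic that inverts such a matrix is numerically and statistically fragile, which is in line with the behavior of AT reported above.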
Next, we consider additional size results to compare AQ to QBIC, Q1, Q(2), Q(4), Q(8), Q(12), and Q(24). We consider three VAR models with lag orders 1, 3, and 6. The coefficient matrices used in the data-generating processes (DGPs) are taken from Lütkepohl (2005). The VAR specifications used can be found in the online Appendix B. In these size results, the correct VAR order is fitted.
Figure 1. Size performance at 5% level: iid innovations (n = 300). [Plot of the empirical rejection rate against θ for Q(2), Q(4), Q(12), AT, and AQ.]

Figure 2. Size performance at 5% level: GARCH innovations (n = 300). [Plot of the empirical rejection rate against θ for Q(2), Q(4), Q(12), AT, and AQ.]

Table 1. Empirical rejection rates (percentage) of nominal 5% test: iid innovations

| Model | n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|---|
| VAR(1) | 100 | 5.9 | 5.6 | 5.6 | 5.8 | 4.7 | 4.3 | 4.8 | 5.4 |
| VAR(1) | 150 | 5.8 | 5.6 | 5.6 | 5.8 | 5.2 | 4.9 | 4.9 | 5.3 |
| VAR(1) | 300 | 5.3 | 5.2 | 5.2 | 6.3 | 4.7 | 4.7 | 4.5 | 5.1 |
| VAR(1) | 1000 | 5.3 | 5.3 | 5.3 | 6.3 | 5.4 | 5.0 | 5.8 | 5.5 |
| VAR(3) | 100 | 7.7 | 7.7 | 7.7 | NA | 9.5 | 4.9 | 3.9 | 3.3 |
| VAR(3) | 150 | 7.1 | 7.1 | 7.1 | NA | 8.7 | 5.0 | 4.1 | 4.0 |
| VAR(3) | 300 | 5.6 | 5.6 | 5.6 | NA | 7.9 | 4.8 | 4.8 | 4.6 |
| VAR(3) | 1000 | 4.8 | 4.8 | 4.8 | NA | 7.1 | 4.9 | 4.8 | 4.6 |
| VAR(6) | 100 | 8.7 | 8.7 | 8.7 | NA | NA | 19.7 | 8.7 | 4.3 |
| VAR(6) | 150 | 7.3 | 7.3 | 7.3 | NA | NA | 14.2 | 6.9 | 3.8 |
| VAR(6) | 300 | 6.0 | 6.0 | 6.0 | NA | NA | 11.2 | 6.6 | 3.9 |
| VAR(6) | 1000 | 5.8 | 5.8 | 5.8 | NA | NA | 9.4 | 6.3 | 4.7 |

We consider n = 100, 150, 300, and 1000 to accommodate empirically relevant sample sizes in macro and finance, and d = 25. As discussed in Escanciano and Lobato (2009), the choice of the upper bound d plays a secondary role in our testing procedure, a result also verified at the end of this section in Table 6. Tables 1 and 2 report the empirical rejection rates for the tests at the 5% nominal level, for the iid Gaussian and the GARCH innovations, respectively. The number of Monte Carlo replications is 10,000 for each sample size. We do not include the results for the automatic test that employs the AIC since it does not control the Type I error. For instance, for the VAR(1) case with iid innovations, its empirical rejection rate is 10.7% when n = 100 and 10.6% when n = 1000. For the GARCH innovations, the distortions even increase with the sample size, since the empirical rejection rate is 14.0% for n = 100 and 18.2% for n = 1000. Note that the Portmanteau tests are only applicable when h is larger than p. Table 1 shows that the Portmanteau tests are more accurate as the difference h − p increases. However, we observe quite large distortions when h is close to p; for example, in the VAR(6) case (p = 6), the Portmanteau test Q(8) rejects 19.7% of the time when n = 100 and 9.4% of the time even for a sample size as large as 1000.
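When reading the rejection rates in Tables 1 and 2, it may help to keep the simulation noise in mind. Assuming independent replications, an empirical rejection rate based on R replications has the binomial standard error sqrt(p(1 − p)/R). The short sketch below evaluates it for the 10,000 replications used here; the function name is hypothetical.

```python
import math


def mc_standard_error(rate_percent, replications):
    """Binomial standard error, in percentage points, of an empirical rejection rate."""
    p = rate_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / replications)


# Standard error around the nominal 5% level with 10,000 replications.
se_at_nominal = mc_standard_error(5.0, 10_000)
```

The result is roughly 0.22 percentage points, so a rate such as 5.3% is within about two standard errors of the nominal level, whereas 19.7% is far outside any plausible sampling noise.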
Table 2 indicates large size distortions of the Portmanteau tests for many choices of n and h. For instance, in the VAR(1) model, when the sample size is 1000, Q(12) rejects 10.7% of the time at the 5% nominal level. These distortions do not decrease as the sample size increases since these Portmanteau tests are not robust to conditional heteroscedasticity. In contrast, our data-driven test AQ exhibits only moderate size distortions even for small sample sizes common for macro data, such as n = 300. Tables 1 and 2 show a similar behavior for the AQ, QBIC, and Q1 tests. This fact could be expected since, under the null, the optimal value for h is one because the series are uncorrelated, and the penalty term (5) would choose the BIC criterion.
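The mechanics behind this similarity can be illustrated with a small sketch of BIC-penalized lag selection. It assumes, as a simplification, that the selected order is the smallest h that maximizes Q(h) − π(h, n, q) over 1 ≤ h ≤ d, with the BIC penalty π = h m² log n quoted in this section. The switching rule between penalties in (5) is not reproduced, and the function name and the input values are made up for illustration.

```python
import math


def select_lag_bic(q_stats, n, m):
    """q_stats[h - 1] holds Q(h) for h = 1..d; return the smallest maximizer
    of the penalized statistic Q(h) - h * m^2 * log(n)."""
    best_h, best_value = 1, -math.inf
    for h, q in enumerate(q_stats, start=1):
        value = q - h * m * m * math.log(n)
        if value > best_value:  # strict inequality keeps the smallest maximizer
            best_h, best_value = h, value
    return best_h


# Null-like case: each extra lag adds about m^2 = 4 to Q(h), far less than m^2 log n.
null_like = [4.0 * h for h in range(1, 6)]
# Alternative-like case: a large jump in Q(h) at lag 3.
jump_at_3 = [4.0, 8.0, 120.0, 124.0, 128.0]

h_null = select_lag_bic(null_like, n=300, m=2)
h_alt = select_lag_bic(jump_at_3, n=300, m=2)
```

In the null-like case the penalty dominates every increment and the rule stays at h = 1, which is why an automatic statistic of this kind behaves like a robustified Q(1) under the null.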

4.2 VAR Examples: Power

In this subsection, we consider the finite-sample behavior under the alternative. For the sake of space, we only consider Gaussian homoscedastic innovations and, given that Table 1 shows that the classical Portmanteau tests cannot control the Type I error on many occasions, we report size-corrected power.
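Size-corrected power is commonly computed by replacing the asymptotic critical value with an empirical quantile of the statistic simulated under the null. The sketch below shows that generic recipe. It is an assumption about the procedure, since the text does not spell out the exact correction, and the function names and toy numbers are hypothetical.

```python
def empirical_critical_value(null_stats, level=0.05):
    """Simulated critical value: the order statistic that leaves a share
    `level` of the null statistics strictly above it."""
    ordered = sorted(null_stats)
    allowed_exceedances = int(round(level * len(ordered)))
    return ordered[len(ordered) - allowed_exceedances - 1]


def size_corrected_power(null_stats, alt_stats, level=0.05):
    """Percentage of alternative-DGP statistics above the simulated critical value."""
    critical = empirical_critical_value(null_stats, level)
    rejections = sum(1 for s in alt_stats if s > critical)
    return 100.0 * rejections / len(alt_stats)


# Toy inputs: 100 null statistics and 4 statistics from an alternative.
toy_null = list(range(1, 101))
toy_alt = [90, 96, 97, 200]
toy_power = size_corrected_power(toy_null, toy_alt)
```

By construction, applying the same rule to the null draws themselves returns the nominal level, so differences in power across tests are not driven by differences in size.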
We first examine the performance of the AQ test in detecting the incorrect specification of the VAR model. In Table 3, we report the results for two experiments. In the first experiment, the data are generated from a VAR(2) model (the moduli of the roots of the characteristic polynomial are 2.98 and 1.12)

$$\mathrm{VAR}(2): \quad Y_t = \begin{pmatrix} 0.03 \\ 0.02 \end{pmatrix} + \begin{pmatrix} 0.5 & 0.4 \\ 0.1 & 0.5 \end{pmatrix} Y_{t-1} + \begin{pmatrix} 0 & 0 \\ 0.25 & 0 \end{pmatrix} Y_{t-2} + \varepsilon_t,$$

where $\varepsilon_t$ are iid $N(0, I)$, but the fitted model is a VAR(1). In the second experiment, the data are generated from a VAR(6) model (the moduli of the roots of the characteristic polynomial are 1.86, 1.29, 1.16, 1.08, 1.04, and 1.03)

$$\mathrm{VAR}(6): \quad Y_t = \begin{pmatrix} 0.03 \\ 0.02 \end{pmatrix} + \begin{pmatrix} 0.5 & 0.4 \\ 0.1 & 0.5 \end{pmatrix} Y_{t-5} + \begin{pmatrix} 0 & 0 \\ 0.25 & 0 \end{pmatrix} Y_{t-6} + \varepsilon_t,$$
but the fitted model is a VAR(3). Table 3 reports the size-corrected empirical rejection rates for three sample sizes, n = 100, 300, and 1000. In the first experiment, the empirical power of the classical Portmanteau tests, Q(h), decreases with h, whereas in the second experiment, it increases with h. Therefore, a researcher who employs the classical Portmanteau would prefer to employ a small value for h in the first case, and a large value for h in the second case. Hence, this table illustrates the importance of employing an automatic criterion to select h to render more powerful tests. As we can see, our data-driven test always performs well and is comparable to Q(h) for the best choice of h, which is in general unknown to the researcher.

Table 2. Empirical rejection rates (percentage) of nominal 5% test: GARCH innovations

| Model | n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|---|
| VAR(1) | 100 | 6.5 | 5.6 | 5.6 | 11.8 | 8.5 | 6.6 | 6.5 | 6.0 |
| VAR(1) | 150 | 6.5 | 5.8 | 5.7 | 12.1 | 10.6 | 8.5 | 7.6 | 6.7 |
| VAR(1) | 300 | 6.1 | 5.4 | 5.4 | 14.0 | 11.4 | 8.5 | 7.7 | 7.0 |
| VAR(1) | 1000 | 5.5 | 5.2 | 5.2 | 15.9 | 13.9 | 11.0 | 10.7 | 8.1 |
| VAR(3) | 100 | 8.4 | 8.4 | 8.4 | NA | 11.0 | 5.3 | 4.4 | 4.2 |
| VAR(3) | 150 | 7.5 | 7.5 | 7.5 | NA | 10.9 | 5.8 | 5.0 | 3.8 |
| VAR(3) | 300 | 6.2 | 6.2 | 6.2 | NA | 10.9 | 6.4 | 5.6 | 4.7 |
| VAR(3) | 1000 | 5.0 | 5.0 | 5.0 | NA | 10.6 | 7.4 | 7.0 | 6.0 |
| VAR(6) | 100 | 9.5 | 9.5 | 9.5 | NA | NA | 20.3 | 8.6 | 4.1 |
| VAR(6) | 150 | 8.3 | 8.3 | 8.3 | NA | NA | 15.5 | 7.4 | 4.0 |
| VAR(6) | 300 | 6.3 | 6.3 | 6.3 | NA | NA | 11.9 | 7.0 | 3.8 |
| VAR(6) | 1000 | 5.6 | 5.6 | 5.6 | NA | NA | 11.1 | 7.1 | 5.1 |

Table 3. Empirical size-corrected power (percentage) of nominal 5% test: iid innovations

DGP: VAR(2); fitted model: VAR(1)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 39.0 | 35.0 | 32.3 | 63.9 | 46.8 | 30.2 | 23.3 | 15.9 |
| 300 | 92.5 | 89.9 | 86.9 | 99.2 | 96.6 | 87.6 | 76.8 | 55.6 |
| 1000 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.9 |

DGP: VAR(6); fitted model: VAR(3)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 99.6 | 26.7 | 26.7 | NA | 34.3 | 100.0 | 99.9 | 99.7 |
| 300 | 100.0 | 49.9 | 49.9 | NA | 39.8 | 100.0 | 100.0 | 100.0 |
| 1000 | 100.0 | 100.0 | 80.5 | NA | 41.2 | 100.0 | 100.0 | 100.0 |
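The root moduli quoted for the VAR(2) design of the first power experiment can be checked numerically. The sketch below builds the polynomial det(I − A₁z − A₂z²) for 2 × 2 coefficient matrices and finds its roots with the Durand-Kerner iteration, a generic root finder chosen here only to keep the example free of external libraries. The helper names are hypothetical, and the coefficient matrices are the ones read from the VAR(2) display.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_sub(p, q):
    """Subtract q from p, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = p + [0.0] * (size - len(p))
    q = q + [0.0] * (size - len(q))
    return [a - b for a, b in zip(p, q)]


def char_poly_2x2(lag_matrices):
    """Ascending coefficients of det(I - sum_k A_k z^k); 2 x 2 matrices only.
    lag_matrices maps the lag k to the matrix A_k."""
    max_lag = max(lag_matrices)

    def entry(i, j):
        coeffs = [1.0 if i == j else 0.0] + [0.0] * max_lag
        for lag, matrix in lag_matrices.items():
            coeffs[lag] -= matrix[i][j]
        return coeffs

    det = poly_sub(poly_mul(entry(0, 0), entry(1, 1)),
                   poly_mul(entry(0, 1), entry(1, 0)))
    while abs(det[-1]) < 1e-12:  # drop vanishing leading coefficients
        det.pop()
    return det


def durand_kerner(coeffs, iterations=500):
    """Approximate all complex roots of a polynomial (ascending coefficients)."""
    monic = [c / coeffs[-1] for c in coeffs]
    degree = len(monic) - 1
    roots = [(0.4 + 0.9j) ** k for k in range(degree)]
    for _ in range(iterations):
        for i in range(degree):
            value = sum(c * roots[i] ** k for k, c in enumerate(monic))
            denom = 1.0
            for j in range(degree):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= value / denom
    return roots


A1 = [[0.5, 0.4], [0.1, 0.5]]
A2 = [[0.0, 0.0], [0.25, 0.0]]
coeffs = char_poly_2x2({1: A1, 2: A2})
moduli = sorted(abs(z) for z in durand_kerner(coeffs))
```

The polynomial is a cubic with one real root and a complex-conjugate pair, which is why only two distinct moduli are reported. All of them exceed one, so the design is stationary.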
Next, we simulate data from the vector moving average process MA(1) (the moduli of the roots of the characteristic polynomial are 2.22 and 1.82)

$$\mathrm{MA}(1): \quad Y_t = \varepsilon_t + \Theta \varepsilon_{t-1}, \quad \text{where } \Theta = \begin{pmatrix} -0.5 & 0.05 \\ 0.05 & -0.5 \end{pmatrix},$$

and fit two VAR(p) models, namely p = 1 and p = 3. Since an invertible MA(1) process can be written as a VAR(∞) process, we could expect that, as we increase the lag p, the power of detecting misspecification becomes smaller. Table 4 confirms this result and shows that, for all tests, the power when fitting a VAR(3) is considerably lower than when fitting a VAR(1). Table 4 shows that, in terms of empirical power, our proposed test AQ is comparable to the Portmanteau test for small values of h and performs better than the Portmanteau test for large values of h. Also note that the AQ, QBIC, and Q1 tests present a similar behavior.
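The loss of power when moving from a fitted VAR(1) to a fitted VAR(3) can be made concrete by writing out the VAR(∞) form of the MA(1) process. Inverting the MA polynomial gives autoregressive coefficient matrices −(−M)^j at lag j, where M is the MA coefficient matrix. The sketch below computes the first few of them for the matrix used in this experiment; the function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def var_infinity_coefficients(ma_matrix, max_lag):
    """Return [A_1, ..., A_max_lag] with A_j = -(-M)^j, the VAR(infinity)
    coefficients implied by Y_t = e_t + M e_{t-1}."""
    neg_m = [[-v for v in row] for row in ma_matrix]
    power = [[1.0, 0.0], [0.0, 1.0]]
    coefficients = []
    for _ in range(max_lag):
        power = mat_mul(power, neg_m)
        coefficients.append([[-v for v in row] for row in power])
    return coefficients


ma_matrix = [[-0.5, 0.05], [0.05, -0.5]]
var_coefficients = var_infinity_coefficients(ma_matrix, max_lag=4)
largest_entry_lag4 = max(abs(v) for row in var_coefficients[3] for v in row)
```

The entries shrink geometrically, and at lag 4 the largest is below 0.07, so a fitted VAR(3) leaves little serial correlation in the residuals for any test to detect.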
Table 5 reports the results for the case where the DGP is the VAR(12) (the modulus of the roots of the characteristic polynomial is 1.06)

$$\mathrm{VAR}(12): \quad Y_t = \Phi_{12} Y_{t-12} + \varepsilon_t, \quad \text{where } \Phi_{12} = \begin{pmatrix} 0.5 & -0.05 \\ 0.05 & 0.5 \end{pmatrix},$$

and the fitted models are VAR(p)'s with p = 1, 3, 5, 7, 9, 11, and 12. Note that the last case, p = 12, represents the empirical level. The last column of Table 5 indicates that, in the two cases where they can be computed, the classical Portmanteau tests Q(h) cannot control the Type I error, whereas the AQ, QBIC, and Q1 tests present a similar behavior under the null hypothesis. In terms of power, Table 5 shows that AQ strongly dominates QBIC, especially when using p = 3, 5, 7, and 9. The intuition for this result is clear: when the higher-order autocorrelations are significant, our test employs the AIC criterion, which tends to choose high values for h. Table 5 also shows the power sensitivity of the classical Portmanteau statistic. In particular, Q(h) rejects more often for larger h because, when the order of the fitted VAR is smaller, the residuals present more autocorrelation at higher lags.
Finally, we also examine the sensitivity of our test to the selection of d. Specifically, Table 6 reports the empirical rejection rates for one case under the null, in which the true model (a VAR(1) with iid innovations) is fitted, and one case under the alternative, in which a VAR(3) is fitted to data generated by the MA(1) process. For this experiment, the sample size is set at n = 1000 and seven values for d are used (d = 25, 50, 75, 100, 125, 150, and 300). The results indicate that our test is insensitive to the choice of d.

Table 4. Empirical size-corrected power (percentage) of nominal 5% test: iid innovations

DGP: MA(1); fitted model: VAR(1)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 53.2 | 52.8 | 52.1 | 66.2 | 45.5 | 33.5 | 29.8 | 24.7 |
| 300 | 98.8 | 98.8 | 98.8 | 99.6 | 97.2 | 88.1 | 79.9 | 65.4 |
| 1000 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.9 |

DGP: MA(1); fitted model: VAR(3)

| n | AQ | QBIC | Q1 | Q(2) | Q(4) | Q(8) | Q(12) | Q(24) |
|---|---|---|---|---|---|---|---|---|
| 100 | 6.2 | 6.2 | 6.2 | NA | 8.7 | 6.7 | 7.2 | 6.5 |
| 300 | 11.8 | 11.8 | 11.8 | NA | 17.9 | 11.1 | 10.0 | 8.3 |
| 1000 | 38.4 | 38.3 | 38.3 | NA | 56.9 | 28.7 | 22.1 | 15.9 |

Table 5. Empirical size-corrected power (percentage) of nominal 5% test. DGP: bivariate VAR(12) with iid Gaussian innovations; fitted model: bivariate VAR(p); sample size n = 300

| Test | p = 1 | p = 3 | p = 5 | p = 7 | p = 9 | p = 11 | p = 12 |
|---|---|---|---|---|---|---|---|
| AQ | 100.0 | 100.0 | 100.0 | 99.6 | 98.4 | 99.6 | 5.6 |
| QBIC | 25.4 | 17.4 | 24.7 | 19.5 | 19.6 | 97.7 | 5.6 |
| Q1 | 20.4 | 17.4 | 24.7 | 19.5 | 19.6 | 97.7 | 5.6 |
| Q(2) | 20.9 | NA | NA | NA | NA | NA | NA |
| Q(4) | 36.2 | 19.2 | NA | NA | NA | NA | NA |
| Q(8) | 60.9 | 53.6 | 19.0 | 30.8 | NA | NA | NA |
| Q(16) | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 54.0 |
| Q(24) | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 13.5 |

4.3 Multivariate GARCH Examples: Size

To check the finite-sample performance of our automatic test applied to a multivariate GARCH model, we simulate a bivariate GARCH(1,1) model following the Baba, Engle, Kraft, Kroner (BEKK) formulation:

$$Y_t = \Sigma_t^{1/2} \eta_t, \qquad \Sigma_t = C'C + A' Y_{t-1} Y_{t-1}' A + G' \Sigma_{t-1} G. \tag{9}$$

Table 7. Empirical rejection rates (percentage) of nominal 5% test: skewed t innovation (GARCH(1,1))

| Test | n = 100 | n = 200 | n = 300 | n = 500 | n = 1000 |
|---|---|---|---|---|---|
| AQ | 5.1 | 4.6 | 5.1 | 5.0 | 5.3 |
| T(3) | 3.9 | 5.2 | 5.4 | 5.3 | 4.2 |
| T(6) | 3.0 | 3.8 | 4.6 | 5.3 | 5.4 |
| T(9) | 3.9 | 4.3 | 6.0 | 5.5 | 6.8 |

4.4 Multivariate GARCH Examples: Power

We study the finite sample power performance in GARCH specifications by considering two DGPs: an asymmetric bivariate GARCH(1,1) model and a bivariate ARCH(2) model. The asymmetric GARCH(1,1) model is specified as

$$Y_t = \Sigma_t^{1/2} \eta_t, \qquad \Sigma_t = C'C + A' Y_{t-1} Y_{t-1}' A + B' \xi_{t-1} \xi_{t-1}' B + G' \Sigma_{t-1} G,$$

where $\xi_{ti} = Y_{ti} 1(Y_{ti} < 0)$, i = 1, 2. The ARCH(2) model is as follows:

$$Y_t = \Sigma_t^{1/2} \eta_t,$$


In (9), C, A, and G are 2 × 2 matrices, with C upper triangular. The innovations $\{\eta_{t,i} : i = 1, 2;\ t = 1, \ldots, n\}$ are iid skewed t distributed with 5 degrees of freedom. The parameter values are taken from Lütkepohl (2005, p. 573) and reported in the online Appendix B. Stationarity is guaranteed by the fact that the eigenvalues of $A \otimes A + G \otimes G$ are less than 1 in modulus (they are 0.94, 0.90, 0.89, and 0.86). The test statistic is formed by applying our automatic procedure to the residuals

$$\hat{\varepsilon}_t = Y_t' \hat{\Sigma}_t^{-1} Y_t - 2,$$

where the $\hat{\Sigma}_t$'s are the parametric estimators of $\Sigma_t$. We compare the finite sample performance of our automatic test with the standardized tests, T(h), proposed in Ling and Li (1997), where these tests are denoted by Q(h). Table 7 reports the empirical rejection rates with sample sizes n = 100, 200, 300, 500, and 1000 at the 5% nominal level. The number of replications is 1000 for the multivariate GARCH examples. Table 7 shows that our data-driven test can properly control the size even when the sample size is as small as n = 100. It also indicates that the finite sample size performance of the T(h) tests varies with the
choice of h, for example, T (3) exhibits less size disto