Journal of Econometrics 99 (2000) 195–223
Robust out-of-sample inference
Michael W. McCracken*
Department of Economics, Louisiana State University, 2107 CEBA, Baton Rouge, LA 70803-0306, USA
Received 25 September 1998; received in revised form 29 November 1999; accepted 13 March 2000
Abstract
This paper presents analytical, empirical and simulation results concerning inference
about the moments of nondifferentiable functions of out-of-sample forecasts and forecast
errors. Special attention is given to the measurement of a model's predictive ability using
the test of equal mean absolute error. Tests for equal mean absolute error and mean
square error are used to evaluate predictions of excess returns to the S & P 500
composite. Simulations indicate that appropriately constructed tests for equal mean
absolute error can provide more accurately sized and more powerful tests than inappropriately constructed tests for equal mean absolute error and mean square error. © 2000
Elsevier Science S.A. All rights reserved.
JEL classification: C52; C53; C32; C12
Keywords: Forecasting; Forecast evaluation; Hypothesis testing; Model comparison
1. Introduction
It is becoming common to evaluate a forecasting model's ability to predict
using out-of-sample methods. Meese and Rogoff (1983), in predicting exchange
rates, report the mean square error (MSE) of forecast errors. Akgiray (1989) uses
the mean absolute error (MAE) to evaluate volatility forecasts of stock returns.
Engel (1994) reports the number of times the direction of change in exchange
* Corresponding author. Tel.: +1-225-388-3782; fax: +1-225-388-3807.
E-mail address: [email protected] (M.W. McCracken).
0304-4076/00/$ - see front matter © 2000 Elsevier Science S.A. All rights reserved.
PII: S0304-4076(00)00022-1
rates is accurately predicted. Swanson and White (1995) report the Schwarz
information criterion as well as the out-of-sample $R^2$ that result when forward
interest rates are used to predict future spot rates.
These papers, and many others, evaluate predictive ability in one of two ways.
Most do so by simply constructing point estimates of some measure of predictive ability. The most common measure is MSE. A few others argue heuristically
that their tests of predictive ability are limiting normal and hence asymptotically
valid t-statistics can be used to test hypotheses. For example, Pagan and
Schwert (1990) and Fair and Shiller (1990) construct regression based tests for
efficiency and encompassing respectively. However, they do not provide a set of
sufficient conditions for their statistics to be asymptotically standard normal.
Recent theoretical work has attempted to provide those sufficient conditions.
When parametric forecasts and forecast errors are used to estimate moments
or conduct inference there are two sources of uncertainty. There is uncertainty
that exists even when we know the model parameters and there is uncertainty
due to the estimation of parameters. Diebold and Mariano (1995) show how to
construct asymptotically valid out-of-sample tests of predictive ability when
there is no parameter uncertainty, for example, when parameters are known.
Under this restriction, they are able to construct tests of hypotheses that involve
moments of differentiable and nondifferentiable functions such as those used to
construct tests for equal MSE and equal MAE between two predictive models.
When parameters are unknown, and must be estimated, parameter uncertainty can play a role in out-of-sample inference. West (1996) has shown how the
uncertainty due to parameter estimation can affect the asymptotic distribution
of moments of differentiable functions of out-of-sample forecasts and forecast
errors. Given a parametric forecasting model, this allows for inference concerning tests of serial correlation, efficiency, encompassing, zero mean prediction
error and equal MSE between two predictive models.
In this paper I close some of the gaps between the work by Diebold and
Mariano (1995) and West (1996). I extend the work by Diebold and Mariano
(1995) by showing that parameter uncertainty can affect out-of-sample inference
regarding moments of nondifferentiable functions. As in West (1996), the parameter uncertainty causes the limiting covariance structure to be nonstandard.
The limiting covariance matrix contains two components: a standard component that would exist if the parameters used to construct forecasts were known in
advance and a second component due to the fact that parameters are not known
and have to be estimated.
I extend the work of West (1996) in two ways. Firstly, I provide sufficient
conditions for asymptotic normality of sample averages of nondifferentiable
functions of parametric forecasts and forecast errors. In particular, the analytical
results provide conditions that are weak enough to construct tests that use the
absolute value and indicator functions. These conditions also allow the application of White's (2000) bootstrap of data snooping effects to instances when the
measure of predictive ability is nondifferentiable. Secondly, I allow model
parameters to be estimated using loss functions that are not differentiable. By
doing so I permit a greater degree of freedom in choosing the loss function used
to estimate the parameters to match the loss function used to evaluate the
forecasts. This may be beneficial in light of the discussion in Weiss (1996).
These extensions are potentially useful since nonsmooth measures of predictive ability have been used to evaluate parametric predictive models. Granger
(1969) provides an early theoretical discussion. Empirical examples are plentiful.
Gerlow et al. (1993) use MAE to evaluate predictive ability. Swanson and White
(1997) use mean absolute percentage error (MAPE) to measure predictive
ability. Stekler (1991) compares the predictive ability of two parametric models
using the test of percent better, or what Diebold and Mariano (1995) refer to as
the 'sign test'. Engel (1994) constructs a test for sign predictability based upon
the binomial distribution. Henriksson and Merton (1981) and Pesaran and
Timmermann (1992) construct tests for sign predictive ability using a standard
normal approximation.
Each of the measures of predictive ability mentioned above can be used to
construct tests of forecast accuracy. As presented though, most ignore the
possibility that the forecasts are generated parametrically and hence may be
affected by parameter uncertainty. The results of West and McCracken (1998),
concerning smooth measures of predictive ability, suggest that in many circumstances it is inappropriate to ignore the parameter uncertainty.
In this paper I provide analytical, empirical and simulation results indicating
that ignoring parameter uncertainty can be inappropriate when nonsmooth
measures of predictive ability are used. I focus on the test of equal MAE as an
example in which accounting for parameter uncertainty can be important.
Although I emphasize the absolute value function, the asymptotic results are
applicable to tests that use indicator functions.
For the results of this paper to hold, however, certain conditions must be met.
Perhaps the most important is Assumption 4. There I assume that the expectation of the function of interest must be continuously differentiable in the
parameters. This assumption is not very restrictive when the absolute value
function is being used and is the reason I use the test of equal MAE as a foil
throughout the paper. It can be a problem when indicator functions are used. In
particular it can be a problem for tests of sign predictability. See the discussion
following Assumption 4 for further detail.
The remainder of the paper proceeds as follows. Section 2 describes a general
environment and provides the asymptotic results. Section 3 tests the predictability of excess returns to the S & P 500 composite portfolio using tests of equal
MAE and equal MSE. Section 4 provides simulation evidence on the finite
sample size and power characteristics of the test statistics used in Section 3. The
paper concludes with a discussion of these results and some topics for future
research. The appendix presents proofs. An additional appendix available on
request from the author presents details of proofs omitted from the paper to save
space.
2. Theoretical results
This section presents sufficient conditions for asymptotic inference about the
moments of functions of out-of-sample forecasts and forecast errors. These
conditions will suffice to show, in Theorem 2.3.1, that out-of-sample averages
consistently estimate population means, and when appropriately scaled are
asymptotically normal. These conditions also suffice to show, in Theorem 2.3.2,
that the limiting covariance structure can be consistently estimated by
a straightforward application of Slutsky's theorem.
For any function $f$, $f_{t,\tau}(\hat\beta_t)$ will denote the parametric estimate of $f_{t+\tau}(\beta^*)$. Also,
in order to minimize notation, $f_{t+\tau}$ will denote $f_{t+\tau}(\beta^*)$.
2.1. Environment
Throughout it is assumed that $\{X_s\}_{s=1}^{T+\tau}$ is a given sample of observables. The
latter portion of that sample contains a continuous stream of $P$ $\tau$-step ahead
forecasts. The first forecast, $y_{R,\tau}(\hat\beta_R)$, is based upon a parameter vector estimated
using observations $s = 1, \ldots, R$. Further forecasts, $y_{t,\tau}(\hat\beta_t)$, are each constructed
using an estimated parameter vector that is based on observations
$s = 1, \ldots, t$, $R \le t \le T \equiv R + P - \tau$. The time period for which the $P$ forecasts
are generated will be referred to as the out-of-sample period.
As in West and McCracken (1998) I will allow for three different forecasting
schemes. The recursive, rolling and fixed forecasting schemes differ in how they
construct the sequence of parameter estimates used to construct the sequence of
forecasts and forecast errors. A brief description is given below.
Keim and Stambaugh (1986) use the recursive scheme. Under this scheme
a sequence of forecasts is generated using updated parameter estimates. At
each time $t = R, \ldots, T$ the parameter estimate $\hat\beta_t$ depends explicitly on all observables
from $s = 1, \ldots, t$. If OLS is used to estimate the parameters from
a scalar linear model with regressors $Z_s$ and predictand $y_s$ then
$\hat\beta_t = (t^{-1}\sum_{s=1}^{t} Z_s Z_s')^{-1}(t^{-1}\sum_{s=1}^{t} Z_s y_s)$. The first forecast is then of the form $y_{R,\tau}(\hat\beta_R)$.
The second forecast, $y_{R+1,\tau}(\hat\beta_{R+1})$, is constructed similarly using observations
$s = 1, \ldots, R+1$. This process is iterated $P$ times so that for each $t \in [R, T]$, the
parameter estimates use observations $s \in [1, t]$.
Chen and Swanson (1996) use the rolling scheme. Under this scheme the
sequence of parametric forecasts is constructed in much the same way as
the recursive scheme. The rolling scheme differs from the recursive in its treatment of observations from the distant past. The rolling scheme uses only a
fixed window of the past $R$ observations. As $t$ increases from $R$ to $T$, older
observations are not used in estimating the parameters. If OLS is used to
estimate the parameters from a scalar linear model with regressors $Z_s$ and
predictand $y_s$ then $\hat\beta_t = (R^{-1}\sum_{s=t-R+1}^{t} Z_s Z_s')^{-1}(R^{-1}\sum_{s=t-R+1}^{t} Z_s y_s)$. This implies
that the first rolling forecast, $y_{R,\tau}(\hat\beta_R)$, and forecast error are identical to
those for the recursive. The second rolling forecast, $y_{R+1,\tau}(\hat\beta_{R+1})$, is constructed
using only observations $s = 2, \ldots, R+1$ to estimate the model parameters. This
implies that the second rolling forecast and forecast error are distinct from those
using the recursive scheme. The process is iterated $P$ times such that for each
$t \in [R, T]$ the parameter estimates use observations $s \in [t-R+1, t]$.
Kuan and Liu (1995) use the fixed scheme. This method is distinct from the
previous two in that the parameters are not updated when new observations
become available. Since the parameter vector is estimated only once, each
of the $P$ forecasts, $y_{t,\tau}(\hat\beta_R)$, uses the same parameter estimate.$^1$ If OLS is
used to estimate the parameters using regressors $Z_s$ and predictand $y_s$ then
$\hat\beta_t = (R^{-1}\sum_{s=1}^{R} Z_s Z_s')^{-1}(R^{-1}\sum_{s=1}^{R} Z_s y_s)$. Hence for each forecast from time
$t \in [R, T]$, the parameter estimate only uses observations $s \in [1, R]$.
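The three schemes differ only in which observations enter each estimate. The following is a minimal sketch of that bookkeeping, not code from the paper: the function names `estimation_window` and `ols_estimates` are hypothetical, and it assumes that row $s-1$ of the arrays holds observation $s$ and that $T$ equals the number of rows supplied.

```python
import numpy as np

def estimation_window(scheme, t, R):
    """Return the (first, last) 1-based observation indices used for beta_hat_t."""
    if scheme == "recursive":
        return 1, t              # all observations s = 1, ..., t
    if scheme == "rolling":
        return t - R + 1, t      # only the most recent R observations
    if scheme == "fixed":
        return 1, R              # estimated once on s = 1, ..., R
    raise ValueError(f"unknown scheme: {scheme}")

def ols_estimates(Z, y, R, scheme):
    """OLS estimate beta_hat_t for each forecast origin t = R, ..., T.

    Z is (T x k) with rows Z_s', y is (T,). Assumption of this sketch:
    T is taken to be len(y), and row s-1 holds observation s.
    Returns an array of shape (T - R + 1, k).
    """
    T = len(y)
    betas = []
    for t in range(R, T + 1):
        first, last = estimation_window(scheme, t, R)
        Zw, yw = Z[first - 1:last], y[first - 1:last]
        beta, *_ = np.linalg.lstsq(Zw, yw, rcond=None)  # least-squares solution
        betas.append(beta)
    return np.array(betas)
```

With $R = 10$ and $t = 12$, for instance, the recursive window is observations 1 to 12, the rolling window is 3 to 12, and the fixed window is 1 to 10.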
Since we are ultimately interested in conducting inference concerning the
population moments of functions of parametric forecasts and forecast errors,
a description of these functions is in order. The function
$$f_{t,\tau}(\hat\beta_t) \equiv f(\tau, X_t, \hat\beta_t) \quad (l \times 1) \tag{1}$$
depends upon three arguments. The first is a finite forecast horizon, $\tau \ge 1$. The
second, $X_t$, is a finite dimensioned vector of observables. The dating of the
subscript $t$ is not meaningful. For example, if we are interested in the one-step
ahead MAE from a scalar linear regression model, $f_{t,1}(\hat\beta_t) = |y_{t+1} - Z_t'\hat\beta_t|$. Since
the realized scalar left-hand side variable is $y_{t+1}$, and the variables used for
prediction are $Z_t$, $X_t = (y_{t+1}, Z_t')'$.
The third argument, $\hat\beta_t$, is an estimate of a $(k \times 1)$ unknown parameter vector
$\beta^*$. When the inference to be conducted is simply a diagnostic of a single
parametric model, such as the test of zero median error for which
$f_{t,1}(\hat\beta_t) = 1\{y_{t+1} - Z_t'\hat\beta_t \le 0\}$, $\beta^*$ is the vector of parameters that index that
particular parametric model. On the other hand, if the inference to be conducted
is meant to detect which of two nonnested competing models is more accurate,
$\beta^*$ is formed by stacking the vector of parameters that index each of the two
models. For example, suppose that we are interested in comparing the one-step
ahead MAE from two scalar nonnested linear regression models. If we
let $i = 1, 2$ index the two models (along with their respective regressors and
parameter estimates), $f_{t,1}(\hat\beta_t) = |y_{t+1} - Z_{1,t}'\hat\beta_{1,t}| - |y_{t+1} - Z_{2,t}'\hat\beta_{2,t}|$ and hence
$\hat\beta_t = (\hat\beta_{1,t}', \hat\beta_{2,t}')'$.
1 Notice that the fixed and rolling parameter estimates should be subscripted both by $t$ and $R$. In
order to simplify the notation the subscript $R$ will be suppressed.
Given such a function, we are interested in testing (say) the scalar null
hypothesis $H_0: E f_{t+\tau} = \theta_0$ for some finite $\theta_0$. To do so, we will focus on test
statistics of the form $\hat\Omega^{-0.5}P^{-0.5}\sum_{t=R}^{T}(f_{t,\tau}(\hat\beta_t) - E f_{t+\tau})$ where $\hat\Omega$ is a consistent
estimate of the appropriate limiting variance. In Theorem 2.3.1 it is shown that
this statistic is asymptotically standard normal and hence asymptotically valid
inference can be conducted using standard normal tables.
2.2. Assumptions
Within the following, for any matrix $A$, $|A| = \max_{i,j}|a_{i,j}|$, $\|\cdot\|_Q$ is the $L^Q$ norm,
$\sup_t$ denotes $\sup_{R \le t \le T}$, and for $h_t(\beta)$ defined in Assumption 1,
$$g_t(\beta) = [(f_{t,\tau}(\beta) - E f_{t,\tau}(\beta))', h_t(\beta)']'. \tag{2}$$
Assumption 1. The estimate $\hat\beta_t$ satisfies $\hat\beta_t - \beta^* = B(t)H(t)$, where $B(t)$ is $(k \times q)$
and $H(t)$ is $(q \times 1)$, with (a) $B(t) \to_{a.s.} B$, $B$ a matrix of rank $k$, (b)
$H(t) = t^{-1}\sum_{s=1}^{t} h_s$, $R^{-1}\sum_{s=t-R+1}^{t} h_s$ and $R^{-1}\sum_{s=1}^{R} h_s$ for the recursive, rolling
and fixed schemes respectively; for the orthogonality condition $h_s \equiv h_s(\beta^*)$, and
(c) $E h_s = 0$.
Assumption 1 provides for a wide range of methods of estimating parameters.
In particular, it allows for maximum likelihood, nonlinear least squares and
a range of generalized method of moments estimators. It allows for linear and
nonlinear models as well as single and multiple equation systems.
As an example of the notation in Assumption 1 consider that our statistic is
used to test for equal MAE between two competing linear models. Suppose that
each of the two models, for $y_{t+1}$, has the representation $y_{t+1} = Z_{i,t}'\beta_i^* + u_{i,t+1}$
for $i = 1, 2$. Consider further that for each $i = 1, 2$ OLS provides a consistent
estimate of $\beta_i^*$ $(k_i \times 1)$. Since there are two sets of parameters needed to construct
this test, $\hat\beta_t = (\hat\beta_{1,t}', \hat\beta_{2,t}')'$ $(k_1 + k_2 = k \times 1)$, and hence $B$ ($k \times q$, $q = q_1 + q_2$,
$q_1 = k_1$, $q_2 = k_2$) and $h_s$ $(q \times 1)$ are
$$B = \begin{pmatrix} (E Z_{1,t}Z_{1,t}')^{-1} & 0_{k_1 \times q_2} \\ 0_{k_2 \times q_1} & (E Z_{2,t}Z_{2,t}')^{-1} \end{pmatrix}, \quad h_s = \begin{pmatrix} u_{1,s+1}Z_{1,s} \\ u_{2,s+1}Z_{2,s} \end{pmatrix}. \tag{3}$$
Assumption 2. $R, P \to \infty$ as $T \to \infty$, and $\lim_{T\to\infty} P/R = \pi$, $0 \le \pi < \infty$.
Assumption 2 provides our first insight as to how the asymptotic approximation is achieved. When asymptotically valid in-sample results are derived, one
allows $T \to \infty$. I do so here, but impose the stronger condition that both the
out-of-sample ($P$) and in-sample ($R$) sample sizes become arbitrarily large
simultaneously. This assumption eventually allows application of central limit
theorem results to both the parameters estimated in-sample and the out-of-sample average of the function $f_{t+\tau}$.
Assumption 3. For some $d > 1$, (a) $E f_{t+\tau} = \theta_0$, (b) $X_t$ is strong mixing with
coefficients of size $-2d/(d-1)$, (c) $g_t(\beta^*)$ is covariance stationary, (d) for an
open neighborhood $N$ of $\beta^*$, $\sup_t \|\sup_{\beta \in N} g_t(\beta)\|_{2d} < \infty$, and (e) $\Omega$ is p.d.
Assumption 3 is similar to that in West and McCracken (1998) with two
important distinctions. The first is that I weaken the moment conditions so that
only moments of order $2d$ rather than $4d$ need exist. This may prove helpful in the context of
forecasting excess returns for which there is evidence of leptokurtosis. The
second difference is that I reduce the order of the mixing coefficients from
$-3d/(d-1)$ to $-2d/(d-1)$. The covariance stationarity assumption is primarily for simplifying the algebra when constructing a consistent estimate of the
asymptotic covariance matrix in Theorem 2.3.2.
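Assumption 4. For each $i \in \{1, \ldots, l+q\}$: (a) $E g_{i,t}(\beta)$ is continuously differentiable
in the neighborhood $N$ (from Assumption 3) of $\beta^*$ admitting a mean value
expansion $E g_{i,t}(\beta) = E g_{i,t} + (\partial E g_{i,t}(\tilde\beta)/\partial\beta)(\beta - \beta^*)$ where $g_{i,t}$ is a scalar, $\beta$
is $(k \times 1)$ and $\tilde\beta$ is on the line between $\beta$ and $\beta^*$, (b) there exists a finite constant $D$ such that $\sup_t \sup_{\beta \in N} |\partial E g_{i,t}(\beta)/\partial\beta| < D$, and (c) for all $t$, $G = G_t \equiv
\partial E h_t(\beta)/\partial\beta|_{\beta=\beta^*}$ and $F = F_t \equiv \partial E f_{t,\tau}(\beta)/\partial\beta|_{\beta=\beta^*}$.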
As we will see in Lemma 2.3.2 I separate the parameter uncertainty from the
sampling uncertainty by taking a mean value expansion of $E f_{t,\tau}(\beta)|_{\beta=\hat\beta_t}$ as in
Randles (1982), rather than of $f_{t,\tau}(\beta)|_{\beta=\hat\beta_t}$ as in West (1996). The bound provided
by $D$ suffices to show certain terms are $o_p(1)$.
Although Assumption 4(a) is weaker than the differentiability condition in
West (1996), it is not always satisfied. A simple counterexample can be constructed that is relevant for tests of sign predictability.$^2$ Suppose that
$y_t = \beta^* y_{t-1} + u_t$ with $u_t \sim$ i.i.d. $N(0, 1)$ and $|\beta^*| < 1$. Let one-step ahead forecasts
of the form $y_t\hat\beta_t$ be used to predict $y_{t+1}$. Consider the function
$f_{t,1}(\beta) = 1\{y_t\beta \ge 0\}$. There are two cases. If $\beta^* = 0$ then $E f_{t+1}(\beta^*) =
E 1\{y_t\beta^* \ge 0\} = E 1\{0 \ge 0\} = 1$ for all $t$. But in every open neighborhood of
$\beta^* = 0$ there exists a $\beta$ such that $E f_{t,1}(\beta) = E 1\{y_t\beta \ge 0\} = 0.5$. In this case
Assumption 4(a) fails. On the other hand, if $\beta^* \ne 0$ then there exists an open
neighborhood of $\beta^*$ such that for all $\beta$ in that neighborhood, $E f_{t,1}(\beta) =
E 1\{y_t\beta \ge 0\} = 0.5$. In this case Assumption 4(a) holds. In the former case the
results of this paper cannot be applied to determine the limiting distribution of
2 The same type of problem exists for the Henriksson–Merton test (1981) and the
Pesaran–Timmermann test (1992). I use this example to simplify the presentation.
our test statistic. In the latter case, not only can the results be applied, it is clear
that $F = 0$.
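The discontinuity in the counterexample can be checked numerically. The snippet below is only an illustrative sketch under the stated model with $\beta^* = 0$ (so that $y_t = u_t$); the helper name `mean_sign_indicator`, the sample size and the seed are assumptions made here, not part of the paper.

```python
import numpy as np

def mean_sign_indicator(beta, y):
    """Sample analogue of E 1{y_t * beta >= 0} for a given beta."""
    return float(np.mean(y * beta >= 0))

# Assumed setup: beta* = 0, so y_t = u_t with u_t ~ i.i.d. N(0, 1).
rng = np.random.default_rng(0)
y = rng.standard_normal(20000)

at_zero = mean_sign_indicator(0.0, y)     # indicator is 1{0 >= 0} for every t
just_above = mean_sign_indicator(0.1, y)  # share of draws with y_t >= 0
just_below = mean_sign_indicator(-0.1, y) # share of draws with y_t <= 0
```

At $\beta = 0$ the sample mean equals one exactly, while for any nonzero $\beta$ it is close to one half, so the expectation is not continuous in $\beta$ at $\beta^* = 0$.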
This type of problem does not occur for all tests that use indicator functions.
Consider the test of zero median error. Using the same environment as in the
preceding example we have $f_{t,1}(\beta) = 1\{y_{t+1} - y_t\beta \le 0\} = 1\{u_{t+1} \le y_t(\beta - \beta^*)\}$.
Taking expectations, and letting $\Phi$ denote the standard normal c.d.f., we have
$E f_{t,1}(\beta) = E\Phi(y_t(\beta - \beta^*))$. Since $\Phi$ is continuously differentiable Assumption
4(a) holds regardless of the value of $\beta^*$. Once again, not only can the results of
this paper be applied, it is clear that $F = 0$. See Kim and Pollard (1990, p. 205) for
a set of conditions sufficient for continuous differentiability of expectations of
indicator functions.
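Assumption 5. Let $N(\varepsilon) = N(\beta^*, \varepsilon) \equiv \{\beta \in \mathbb{R}^k : |\beta - \beta^*| < \varepsilon\}$. There exist finite
constants $C$, $\varphi > 0$ and $Q \ge 2d$ such that for all $N(\varepsilon) \subseteq N$ (from Assumption 3),
$\sup_t \|\sup_{\beta \in N(\varepsilon)} (g_t(\beta) - g_t)\|_Q \le C\varepsilon^{\varphi}$.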
In some circumstances it is straightforward to verify the $L^Q$ continuity
condition in Assumption 5. For example, if the parametric model is linear and
$f_{t,\tau}$ is Lipschitz (as is the case for the absolute value function), this assumption is
automatically satisfied. When indicator functions are used, verifying the condition is more difficult. It will frequently be the case that Assumption 4 and
reasonable assumptions on the continuity of the p.d.f. of $X_t$ will be needed to
verify the condition.
2.3. Results
In this section I utilize the assumptions of Section 2.2 to show that
$P^{-0.5}\sum_{t=R}^{T}(f_{t,\tau}(\hat\beta_t) - E f_{t+\tau})$ is asymptotically normal with a positive-definite
covariance matrix $\Omega$ which will usually depend on $\pi = \lim_{T\to\infty} P/R$. In order to
construct an asymptotically valid test statistic, I then show that there exists
a straightforward and consistent estimator $\hat\Omega$ of $\Omega$.
For the first step in the derivation, I borrow a decomposition used by Randles
(1982). Let
$$\xi_{0,P} = P^{-0.5}\sum_{t=R}^{T}\left(f_{t,\tau}(\hat\beta_t) - E f_{t,\tau}(\beta)|_{\beta=\hat\beta_t} - f_{t+\tau} + E f_{t+\tau}\right)$$
and
$$\xi_{1,P} = P^{-0.5}\sum_{t=R}^{T}\left(f_{t+\tau} - E f_{t+\tau} + E f_{t,\tau}(\beta)|_{\beta=\hat\beta_t} - E f_{t+\tau}\right) \tag{4}$$
such that
$$P^{-0.5}\sum_{t=R}^{T}\left(f_{t,\tau}(\hat\beta_t) - E f_{t+\tau}\right) = \xi_{0,P} + \xi_{1,P}.$$
This decomposition leads to the following two lemmas upon which limiting
normality is based.
Lemma 2.3.1. Given Assumptions 1–5, $\xi_{0,P} = o_p(1)$.
Lemma 2.3.2. Given Assumptions 1–5, $\xi_{1,P} = [P^{-0.5}\sum_{t=R}^{T}(f_{t+\tau} - E f_{t+\tau}) +
FBP^{-0.5}\sum_{t=R}^{T} H(t)] + o_p(1)$.
It is now clear how both types of uncertainty are present. Sampling uncertainty is the first term and parameter uncertainty is the second term in the expansion
of Lemma 2.3.2. It is important to note that Lemma 2.3.2 provides the
same decomposition as in West and McCracken (1998) with $F$ suitably redefined. If we define $\Gamma_{ff}(j) = E(f_{t+\tau} - E f_{t+\tau})(f_{t+\tau-j} - E f_{t+\tau})'$,
$\Gamma_{fh}(j) = E(f_{t+\tau} - E f_{t+\tau})h_{t+\tau-j}'$, $\Gamma_{hh}(j) = E h_{t+\tau}h_{t+\tau-j}'$,
$S_{ff} = \sum_{j=-\infty}^{\infty}\Gamma_{ff}(j)$, $S_{fh} = \sum_{j=-\infty}^{\infty}\Gamma_{fh}(j)$ and $S_{hh} = \sum_{j=-\infty}^{\infty}\Gamma_{hh}(j)$
we immediately know that the limiting variance
of the bracketed term on the right-hand side of Lemma 2.3.2 is
$$\Omega = S_{ff} + \lambda_{fh}(FBS_{fh}' + S_{fh}B'F') + \lambda_{hh}FBS_{hh}B'F', \tag{5}$$
where
$$\begin{array}{lll}
\text{Scheme} & \lambda_{fh} & \lambda_{hh} \\
\text{Recursive} & 1 - \pi^{-1}\ln(1+\pi) & 2[1 - \pi^{-1}\ln(1+\pi)] \\
\text{Rolling, } \pi \le 1 & \pi/2 & \pi - \pi^2/3 \\
\text{Rolling, } 1 < \pi < \infty & 1 - (2\pi)^{-1} & 1 - (3\pi)^{-1} \\
\text{Fixed} & 0 & \pi
\end{array} \tag{6}$$
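The weights in (6) are simple functions of $\pi$ and the forecasting scheme. A minimal sketch of how they might be computed follows; the function name `lambdas` and the string labels for the schemes are hypothetical, and the value returned at $\pi = 0$ under the recursive scheme is the limit of the tabulated expression rather than something stated in the table.

```python
import math

def lambdas(scheme, pi):
    """Return (lambda_fh, lambda_hh) from (6) for pi = lim P/R >= 0."""
    if pi < 0:
        raise ValueError("pi must be nonnegative")
    if scheme == "recursive":
        if pi == 0:
            return 0.0, 0.0  # limit of 1 - ln(1 + pi)/pi as pi -> 0
        lam_fh = 1.0 - math.log1p(pi) / pi
        return lam_fh, 2.0 * lam_fh
    if scheme == "rolling":
        if pi <= 1:
            return pi / 2.0, pi - pi ** 2 / 3.0
        return 1.0 - 1.0 / (2.0 * pi), 1.0 - 1.0 / (3.0 * pi)
    if scheme == "fixed":
        return 0.0, pi
    raise ValueError(f"unknown scheme: {scheme}")
```

In practice one would plug in $\hat\pi = P/R$, for example `lambdas("rolling", P / R)`. Note that the two rolling branches agree at $\pi = 1$.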
Theorem 2.3.1. Given Assumptions 1–5, (a) $P^{-0.5}\sum_{t=R}^{T}(f_{t,\tau}(\hat\beta_t) - E f_{t+\tau})
\to_d N(0, \Omega)$ for $\Omega$ defined in (5), (b) if either $F = 0$ or $\pi = 0$ then
$P^{-0.5}\sum_{t=R}^{T}(f_{t,\tau}(\hat\beta_t) - E f_{t+\tau}) \to_d N(0, S_{ff})$, and (c) $P^{-1}\sum_{t=R}^{T} f_{t,\tau}(\hat\beta_t) \to_p E f_{t+\tau}$.
Theorem 2.3.1 shows that the statistic is limiting normal and that out-of-sample averages provide consistent estimates of population moments. The
distinction between parts (a) and (b) is exclusively whether or not parameter
uncertainty is relevant to the asymptotic covariance. To make this more clear,
notice that for all sampling schemes $\Omega = S_{ff}$ when either $F = 0$ or $\pi = 0$. Since
the latter covariance terms are those which result from parameter uncertainty,
parameter uncertainty is said to be asymptotically irrelevant when either $F = 0$
or $\pi = 0$.$^3$ If this is the case then the results of Diebold and Mariano (1995) are
applicable even though parameters have been estimated.
The final step is to construct a consistent estimate of the covariance matrix $\Omega$.
To do so, we need to design consistent estimates of $S_{ff}$, $S_{fh}$, $B$, $F$, $BS_{hh}B'$, $\lambda_{fh}$
and $\lambda_{hh}$. One can estimate $B$ consistently by simply using the in-sample information
from the final parameter estimates. Since $\hat\pi = P/R$ is a consistent estimate of
$\pi$ and both $\lambda_{fh}$ and $\lambda_{hh}$ are continuous in $\pi$, we can use $\hat\lambda_{fh} \equiv \lambda_{fh}(\hat\pi)$ and
$\hat\lambda_{hh} \equiv \lambda_{hh}(\hat\pi)$ to consistently estimate $\lambda_{fh}$ and $\lambda_{hh}$. The term $BS_{hh}B'$ is the
asymptotic covariance matrix of the parameter estimates. Since most software
packages automatically provide a consistent estimate of this matrix, an estimator of $BS_{hh}B'$ is immediate from the final parameter estimates. If this
estimator is unavailable, another option is presented in Theorem 2.3.2.
The matrix $F$ is a bit more difficult to estimate. Since $F$ varies with $f_{t,\tau}$, so will
its estimator. In any case, $F$ is an expectation and hence Theorem 2.3.1(c) can
be used to estimate it.
To clarify the issues in estimating $F$, I will briefly present $F$ for the test of
equal MAE. For the sequel let $\psi_x(x)$ and $\Psi_x(x)$ denote the marginal p.d.f. and
c.d.f. of a random variable $x$, and let $\psi_x(x|z)$ and $\Psi_x(x|z)$ denote the conditional
p.d.f. and c.d.f. of a random variable $x$ given the value of another random
variable $z$. Assume that each p.d.f. is continuous and has a bounded density in an
open neighborhood of the origin.
The test for equal MAE with $\tau = 1$ involves the null hypothesis
$H_0: E(|u_{1,t+1}| - |u_{2,t+1}|) = 0$. If the two potential models for the predictand
$y_{t+1}$ are scalar linear regression models with regressors $Z_{1,t}$ and $Z_{2,t}$ then
the relevant test statistic is $\hat\Omega^{-0.5}P^{-0.5}\sum_{t=R}^{T}(f_{t,1}(\hat\beta_t) - 0)$ with $f_{t,1}(\hat\beta_t) = |y_{t+1} -
Z_{1,t}'\hat\beta_{1,t}| - |y_{t+1} - Z_{2,t}'\hat\beta_{2,t}|$ and $\hat\beta_t = (\hat\beta_{1,t}', \hat\beta_{2,t}')'$. To present $F$ it is convenient to
define $F = (F_1, F_2)$ relative to the partition of the parameter vector
$\beta^* = (\beta_1^{*\prime}, \beta_2^{*\prime})'$. Since both components are similar, consider $F_1$.
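Letting $F_{1,t}(\beta) \equiv \partial E f_{t,1}(\beta)/\partial\beta_1$, we have
$$\begin{aligned}
F_{1,t}(\beta) &= \partial E\left[\,|u_{1,t+1} - Z_{1,t}'(\beta_1 - \beta_1^*)| - |u_{2,t+1} - Z_{2,t}'(\beta_2 - \beta_2^*)|\,\right]/\partial\beta_1 \\
&= \partial E\left[\,|u_{1,t+1} - Z_{1,t}'(\beta_1 - \beta_1^*)|\,\right]/\partial\beta_1 \\
&= \int_{Z_{1,t}}\left[\int_{Z_{1,t}'(\beta_1 - \beta_1^*)}^{\infty} -Z_{1,t}'\,\psi_{u_{1,t+1}}(x|Z_{1,t})\,dx + \int_{-\infty}^{Z_{1,t}'(\beta_1 - \beta_1^*)} Z_{1,t}'\,\psi_{u_{1,t+1}}(x|Z_{1,t})\,dx\right] d\Psi_{Z_{1,t}}.
\end{aligned} \tag{7}$$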
3 West (1996) notes that under the recursive scheme, parameter uncertainty is also irrelevant when
$\lambda_{fh}(FBS_{fh}' + S_{fh}B'F') + \lambda_{hh}FBS_{hh}B'F' = 0$. Also, West and McCracken (1998) show that augmented
regression-based tests can remove parameter uncertainty.
If we evaluate at the true parameter $\beta^*$ we have
$$\begin{aligned}
F_{1,t}(\beta^*) &= -\int_{Z_{1,t}}\left[\int_{0}^{\infty} Z_{1,t}'\,\psi_{u_{1,t+1}}(x|Z_{1,t})\,dx + \int_{-\infty}^{0} -Z_{1,t}'\,\psi_{u_{1,t+1}}(x|Z_{1,t})\,dx\right] d\Psi_{Z_{1,t}} \\
&= -E\,\mathrm{sgn}(u_{1,t+1})Z_{1,t}'
\end{aligned} \tag{8}$$
for a function $\mathrm{sgn}(x)$ that takes the value 1 if $x$ is nonnegative and $-1$ if $x$ is
negative. If we impose the condition that $X_t$ is strictly stationary, $F = F_t$ for all $t$.
For the test of equal MAE, $F_1$ in (8) can be consistently estimated by
$$\hat F_1 = -P^{-1}\sum_{t=R}^{T}\mathrm{sgn}(y_{t+1} - Z_{1,t}'\hat\beta_{1,t})Z_{1,t}'. \tag{9}$$
Reintroducing $F_2$ into the discussion (and noticing that there is an extra minus
sign introduced), we can estimate $F$ consistently using $\hat F = (\hat F_1, \hat F_2)$ with
$\hat F_2$ defined as
$$\hat F_2 = P^{-1}\sum_{t=R}^{T}\mathrm{sgn}(y_{t+1} - Z_{2,t}'\hat\beta_{2,t})Z_{2,t}'. \tag{10}$$
If we are willing to impose the stronger assumption that for each $i = 1, 2$, $u_{i,t+1}$
is independent of $Z_{i,t}$ then $F$ can also be consistently estimated by $\hat F = (\hat F_1, \hat F_2)$
where
$$\hat F_i = (-1)^i\left(P^{-1}\sum_{t=R}^{T}\mathrm{sgn}(y_{t+1} - Z_{i,t}'\hat\beta_{i,t})\right)\left(P^{-1}\sum_{t=R}^{T}Z_{i,t}'\right). \tag{11}$$
Similar arguments can be used to derive $F$ and a consistent estimator $\hat F$ for other
test statistics. Rather than do so, for the remainder of the paper I will assume
that such an estimator $\hat F$ exists.
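As an illustration of (9) and (10), the estimator $\hat F = (\hat F_1, \hat F_2)$ for the equal-MAE test might be computed as follows. This is a sketch under assumed array layouts, not the author's code: the names `sgn` and `F_hat_mae` are hypothetical, and each input is assumed to have one row per out-of-sample period $t = R, \ldots, T$.

```python
import numpy as np

def sgn(x):
    """sgn as defined in the text: 1 if x is nonnegative, -1 if x is negative."""
    return np.where(x >= 0, 1.0, -1.0)

def F_hat_mae(y_next, Z1, Z2, b1, b2):
    """Estimate F = (F_1, F_2) for the equal-MAE test via (9) and (10).

    y_next : (P,)    realized y_{t+1}
    Z1, Z2 : (P, k1), (P, k2)  rows are Z_{1,t}' and Z_{2,t}'
    b1, b2 : (P, k1), (P, k2)  rows are beta_hat_{1,t}' and beta_hat_{2,t}'
    """
    e1 = y_next - np.sum(Z1 * b1, axis=1)  # forecast errors of model 1
    e2 = y_next - np.sum(Z2 * b2, axis=1)  # forecast errors of model 2
    F1 = -np.mean(sgn(e1)[:, None] * Z1, axis=0)  # equation (9)
    F2 = np.mean(sgn(e2)[:, None] * Z2, axis=0)   # equation (10)
    return np.concatenate([F1, F2])
```

The result is a vector of length $k_1 + k_2$, matching the partition of $\beta^*$.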
To complete the construction of a consistent estimate of $\Omega$, we need to
generate consistent estimates of $S_{ff}$, $S_{fh}$ and possibly $S_{hh}$. If $f_{t+\tau}$ and $h_t$ are
m-dependent of known order then Assumptions 1–5 suffice for constructing
consistent estimates of $S_{ff}$, $S_{fh}$ and $S_{hh}$ (as we will see in Theorem 2.3.2(a)). For
example, when evaluating the $\tau$-step ahead predictive ability of two models,
Swanson and White (1997) estimate $S_{ff}$ using the first $\tau - 1$ sample autocorrelations
of $f_{t,\tau}$.$^4$ However, if $f_{t+\tau}$ and $h_t$ are autocorrelated of infinite order I suggest
4 It should be noted that $\tau - 1$ dependence in the levels of a forecast error does not imply $\tau - 1$
dependence of a function of those forecast errors. For example, a one-step ahead forecast error may
form a martingale difference sequence but still exhibit serial correlation in its square. See Harvey et
al. (1998) for a discussion.
using a kernel-based estimator. Such an estimator requires imposing conditions
on a kernel, $K(x)$, as well as stronger moment and mixing conditions on $g_t(\beta)$.
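Assumption 6. (a) Let $K(x)$ be a kernel such that for all
$x$, $|K(x)| \le 1$, $K(x) = K(-x)$, $K(0) = 1$, $K(x)$ is continuous, and
$\int_{-\infty}^{\infty}|K(x)|\,dx < \infty$, (b) for $\varphi$ defined in Assumption 5, some bandwidth $M$ and
constant $\iota$, $\iota \in (0, \min(\varphi, 0.5))$, $M = O(P^{\iota})$, and (c) there exists $\bar r \in (1, 2]$ such that
$(1 - \iota)^{-1} < \bar r < d$ and $\sum_{j=1}^{\infty}\alpha_j^{(\bar r^{-1} - d^{-1})} < \infty$.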
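Throughout the following, and for fixed $j \ge 0$,
$\hat\Gamma_{ff}(j) = P^{-1}\sum_{t=R+j}^{T}(f_{t,\tau}(\hat\beta_t) - \bar f)(f_{t-j,\tau}(\hat\beta_{t-j}) - \bar f)'$,
$\hat\Gamma_{fh}(j) = P^{-1}\sum_{t=R+j}^{T}(f_{t,\tau}(\hat\beta_t) - \bar f)h_{t+\tau-j}'(\hat\beta_{t-j})$ and
$\hat\Gamma_{hh}(j) = P^{-1}\sum_{t=R+j}^{T}h_{t+\tau}(\hat\beta_t)h_{t+\tau-j}'(\hat\beta_{t-j})$
where $\bar f = P^{-1}\sum_{t=R}^{T}f_{t,\tau}(\hat\beta_t)$. Furthermore, for
$j < 0$, $\hat\Gamma_{ff}(j) = \hat\Gamma_{ff}(-j)'$, $\hat\Gamma_{fh}(j) = \hat\Gamma_{fh}(-j)'$, and $\hat\Gamma_{hh}(j) = \hat\Gamma_{hh}(-j)'$.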
Theorem 2.3.2. (a) Under Assumptions 1–5, $\hat\Gamma_{ff}(j) \to_p \Gamma_{ff}(j)$, $\hat\Gamma_{fh}(j) \to_p \Gamma_{fh}(j)$,
and $\hat\Gamma_{hh}(j) \to_p \Gamma_{hh}(j)$. (b) Under Assumptions 1–6,
$\hat S_{ff} = \sum_{j=-P+1}^{P-1}K(j/M)\hat\Gamma_{ff}(j) \to_p S_{ff}$,
$\hat S_{fh} = \sum_{j=-P+1}^{P-1}K(j/M)\hat\Gamma_{fh}(j) \to_p S_{fh}$ and
$\hat S_{hh} = \sum_{j=-P+1}^{P-1}K(j/M)\hat\Gamma_{hh}(j) \to_p S_{hh}$.
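A kernel-weighted sum of sample autocovariances of the kind in Theorem 2.3.2(b) can be sketched as below. Several choices here are assumptions of the sketch rather than of the paper: the Bartlett kernel is used only because it satisfies the conditions in Assumption 6(a) (the paper does not prescribe a kernel), every column of the input is demeaned, and the names `bartlett` and `long_run_cov` are hypothetical. Stacking $f$ and $h$ column-wise, as in $g_t$ of (2), yields the blocks $\hat S_{ff}$, $\hat S_{fh}$ and $\hat S_{hh}$ from a single call.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel: K(x) = 1 - |x| for |x| <= 1, and 0 otherwise."""
    return max(0.0, 1.0 - abs(x))

def long_run_cov(g, M, kernel=bartlett):
    """Kernel estimate sum_{j=-P+1}^{P-1} K(j/M) Gamma_hat(j) for a (P x m) series.

    Assumption of this sketch: all columns are demeaned, and
    Gamma_hat(-j) is taken to be Gamma_hat(j)'.
    """
    g = np.asarray(g, dtype=float)
    if g.ndim == 1:
        g = g[:, None]
    P = g.shape[0]
    gc = g - g.mean(axis=0)
    S = gc.T @ gc / P                       # Gamma_hat(0)
    for j in range(1, P):
        w = kernel(j / M)
        if w == 0.0:
            continue                        # lag receives zero weight
        G = gc[j:].T @ gc[:P - j] / P       # Gamma_hat(j), j > 0
        S += w * (G + G.T)                  # add lags j and -j together
    return S
```

For a scalar $f$ and a $q$-vector $h$, calling `long_run_cov(np.column_stack([f, h]), M)` returns a $(1+q) \times (1+q)$ matrix whose top-left entry is $\hat S_{ff}$, whose remaining first row is $\hat S_{fh}$, and whose lower-right block is $\hat S_{hh}$.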
We now have all the tools necessary to conduct asymptotically valid out-of-sample inference concerning the moments of nonsmooth functions of parametric
forecasts and forecast errors. For example, given $\hat F$, $\hat B$ and $\hat\pi$ such that
$\hat F \to_p F$, $\hat B \to_p B$ and $\hat\pi \to \pi$ we can use Theorem 2.3.2 to create $\hat S_{ff}$, $\hat S_{fh}$, $\hat S_{hh}$ such
that $\hat\Omega = \hat S_{ff} + \hat\lambda_{fh}(\hat F\hat B\hat S_{fh}' + \hat S_{fh}\hat B'\hat F') + \hat\lambda_{hh}\hat F\hat B\hat S_{hh}\hat B'\hat F' \to_p \Omega$. Then, using Theorem 2.3.1,
we know that $s_T \equiv \hat\Omega^{-0.5}P^{-0.5}\sum_{t=R}^{T}(f_{t,\tau}(\hat\beta_t) - \theta_0) \to_d N(0, I_l)$. If
$l = 1$ we can use standard normal tables to test the null. If $l > 1$ we can use the
fact that $s_T's_T \to_d \chi^2(l)$ and hence chi-square tables can be used to test the null.
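Putting the pieces together, the estimate $\hat\Omega$ and the scalar ($l = 1$) statistic could be assembled roughly as follows. This is a sketch under stated assumptions: the function names `omega_hat` and `oos_statistic` are hypothetical, all inputs are taken as already estimated, and only the scalar case of the statistic is covered.

```python
import numpy as np

def omega_hat(S_ff, S_fh, S_hh, F, B, lam_fh, lam_hh):
    """Omega = S_ff + lam_fh (F B S_fh' + S_fh B' F') + lam_hh F B S_hh B' F'.

    Shapes: S_ff (l x l), S_fh (l x q), S_hh (q x q), F (l x k), B (k x q).
    """
    S_ff, S_fh, S_hh, F, B = (np.atleast_2d(np.asarray(a, dtype=float))
                              for a in (S_ff, S_fh, S_hh, F, B))
    FB = F @ B                      # (l x q)
    cross = FB @ S_fh.T             # F B S_fh', (l x l)
    return S_ff + lam_fh * (cross + cross.T) + lam_hh * (FB @ S_hh @ FB.T)

def oos_statistic(f_vals, theta0, omega):
    """s_T = Omega^{-0.5} P^{-0.5} sum_t (f_t - theta0), scalar f only (l = 1)."""
    f_vals = np.asarray(f_vals, dtype=float)
    P = f_vals.shape[0]
    omega = np.asarray(omega, dtype=float).item()   # 1 x 1 matrix to scalar
    return np.sum(f_vals - theta0) / np.sqrt(P * omega)
```

When `F` is zero or both weights are zero, `omega_hat` collapses to `S_ff`, which mirrors part (b) of Theorem 2.3.1. The resulting scalar statistic would be compared with standard normal critical values.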
3. Empirical evidence
In this forecasting exercise I apply the theory developed in Section 2 to the
test of equal MAE in the prediction of excess returns to the S & P 500 composite.
For simplicity, I choose to compare the predictive ability of two simple linear
regressions. For the first model I follow Fama and French (1988) and use the
dividend yield and a constant to predict excess returns. For the second model
I follow Campbell and Shiller (1988) and use the earnings–price ratio and
a constant to predict excess returns. Shiller (1984) also compares the predictability of dividend yields to the predictability of earnings–price ratios for the
S & P 500 composite. See Fama (1991) for a review of these and other results.
For the sake of comparison, I construct three distinct tests of equal predictive
ability. First I construct a test for equal MAE using the methods discussed in
Section 2 that account for effects due to parameter uncertainty. I then construct
the test for equal MAE ignoring parameter uncertainty using the statistic
proposed in Diebold and Mariano (1995). Finally, I construct the test for equal
MSE ignoring the potential effects of parameter uncertainty. Under the assumption that OLS provides consistent estimates of the parameters, West (1996) has
shown that one can ignore parameter uncertainty when testing for equal MSE.
3.1. Data and sources
The sample period includes 519 monthly observations from 1954:01 to
1997:03. The starting point 1954:01 is chosen to avoid the Treasury-Fed Accord
to peg interest rates. It is also the first month for which monthly frequency
observations for dividend yield exist for the S & P 500 composite.
I use the closing value of the S&P 500 composite as of the final Wednesday of
the month as the stock price ($P_t$). These are obtained from Standard and Poor's
Current Statistics (1997) and Security Price Index Record (1997). The one-month
risk-free rate ($I_t$), used to construct excess returns, is the US Treasury Bill
series obtained from Ibbotson Associates (1997). Using these two series I construct
excess returns as $\mathrm{Return}_t = (P_t + D_t - P_{t-1})/P_{t-1} - I_{t-1}$. Standard and
Poor's Statistical Service does not publish the monthly dividend series ($D_t$).
I construct one by summing the present and previous three quarter aggregate
dividends and dividing by 12. Pesaran and Timmermann (1995) also use this
technique.
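The return construction above can be sketched in a few lines. The function names and the numerical inputs below are hypothetical and chosen only for illustration; they are not data from the paper.

```python
def monthly_dividend(quarterly_dividends):
    """Monthly dividend proxy: the present and previous three quarterly
    aggregate dividends, summed and divided by 12."""
    if len(quarterly_dividends) != 4:
        raise ValueError("expected the present and previous three quarters")
    return sum(quarterly_dividends) / 12.0

def excess_return(p_t, d_t, p_prev, i_prev):
    """Return_t = (P_t + D_t - P_{t-1}) / P_{t-1} - I_{t-1}."""
    return (p_t + d_t - p_prev) / p_prev - i_prev

# Hypothetical inputs, for illustration only.
d_t = monthly_dividend([3.0, 3.0, 3.0, 3.0])
r_t = excess_return(p_t=101.0, d_t=d_t, p_prev=100.0, i_prev=0.004)
print(round(r_t, 4))
```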
The two predictors are dividend yield ($DY_{t-1}$) and the earnings-price ratio
($EP_{t-1}$). To ensure that the predictors are truly ex ante I do not use the dividend
series ($D_t$) constructed above since it includes information through the end of the
present quarter. Instead, I use the dividend yield as reported in the Standard and
Poor's Security Price Index Record at the end of each month. For the same
reasons, I use the inverse of the price-earnings ratio rather than construct an
earnings-price ratio using quarterly information on earnings.
Table 1 reports standard descriptive statistics regarding OLS regressions that
use the dividend yield or the earnings-price ratio as predictors. Each regression
exhibits little linear predictability. The residuals in each regression have distributions that are skewed and heavy tailed. The residuals exhibit little serial
correlation but are conditionally heteroskedastic in the regressors and exhibit
ARCH-type behavior.
3.2. Methodology and results
Let the scalar $y_{t+1}$ denote $\mathrm{Return}_{t+1}$ and let $Z_{1,t}$ and $Z_{2,t}$ denote the $(2 \times 1)$
vectors $(1, DY_t)'$ and $(1, EP_t)'$, respectively. We are interested in comparing the
predictive ability of the two simple linear regression models
$$y_{t+1} = Z_{1,t}'\beta_1^* + u_{1,t+1} \quad \text{and} \quad y_{t+1} = Z_{2,t}'\beta_2^* + u_{2,t+1}. \qquad (12)$$
Table 1
Summary statistics for full sample regressions of excess returns to S&P 500 composite

Panel A: Unrestricted linear regression using both dividend yield and earnings-price ratio

| Predictors | Constant | DY | EP |
|---|---|---|---|
| Coefficient | -0.0115 | 0.0099 | -2.6716 |
| (S.E.) | (0.0084) | (0.0054) | (2.1461) |

$R^2 = 0.0079$; DW = 1.9416; skewness coefficient = -0.3539; kurtosis coefficient = 2.0389

| LM test | Statistic | p-value |
|---|---|---|
| Heteroskedasticity in residuals | $\chi^2(5) = 10.7396$ | 0.0567 |
| Serial correlation in residuals | $\chi^2(12) = 13.0227$ | 0.3674 |
| Serial correlation in squared residuals | $\chi^2(12) = 26.6380$ | 0.0087 |

Panel B: Restricted linear regression using dividend yield

| Predictors | Constant | DY |
|---|---|---|
| Coefficient | -0.0058 | 0.0032 |
| (S.E.) | (0.0079) | (0.0022) |

$R^2 = 0.0048$; DW = 1.9399; skewness coefficient = -0.3714; kurtosis coefficient = 1.9942

| LM test | Statistic | p-value |
|---|---|---|
| Heteroskedasticity in residuals | $\chi^2(2) = 6.0936$ | 0.0475 |
| Serial correlation in residuals | $\chi^2(12) = 12.8804$ | 0.3778 |
| Serial correlation in squared residuals | $\chi^2(12) = 27.5580$ | 0.0064 |

Panel C: Restricted linear regression using earnings-price ratio

| Predictors | Constant | EP |
|---|---|---|
| Coefficient | 0.0052 | 0.7589 |
| (S.E.) | (0.0061) | (0.8570) |

$R^2 = 0.0020$; DW = 1.9406; skewness coefficient = -0.3568; kurtosis coefficient = 2.0280

| LM test | Statistic | p-value |
|---|---|---|
| Heteroskedasticity in residuals | $\chi^2(2) = 8.4639$ | 0.0145 |
| Serial correlation in residuals | $\chi^2(12) = 12.6658$ | 0.3938 |
| Serial correlation in squared residuals | $\chi^2(12) = 27.3262$ | 0.0069 |

Notes: The data consist of monthly observations from 1954:01 to 1997:03 ($T = 519$). See Section
3 of the text for a description of the data. Standard errors are constructed using a heteroskedasticity
robust covariance matrix. The skewness and kurtosis coefficients are constructed using the regression residuals.
The parameters are estimated using OLS and then the parameter estimates
$\hat{\beta}_{i,t}$ are used to construct the forecasts $Z_{i,t}'\hat{\beta}_{i,t}$.
In this exercise I construct each of the three test statistics nine different
ways corresponding to three different forecasting schemes (recursive, rolling
and fixed) and three different splits of the data. I use the three sample
splits (54:01-89:12, 90:01-97:03), (54:01-79:12, 80:01-89:12) and (54:01-79:12,
80:01-97:03). Given these splits, the corresponding values of $\hat{\pi} = P/R$ are 0.20,
0.38 and 0.66.
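The three forecasting schemes differ only in which observations enter each parameter estimate: the recursive scheme uses every observation available at the forecast origin, the rolling scheme uses the most recent $R$, and the fixed scheme uses the first $R$ throughout. A minimal sketch under stated assumptions follows; the function names, the use of `numpy.linalg.lstsq`, and the row-alignment convention are choices made for this example rather than details taken from the paper.

```python
import numpy as np

def ols(Z, y):
    """OLS coefficients from regressing y on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

def out_of_sample_errors(Z, y, R, scheme):
    """One-step-ahead forecast errors for rows R, ..., len(y) - 1.

    Assumption: row s of Z holds the predictors available when forecasting
    y[s], so the estimate behind the forecast of y[t] may use rows s < t.
    """
    errors = []
    for t in range(R, len(y)):
        if scheme == "recursive":      # all observations available so far
            rows = slice(0, t)
        elif scheme == "rolling":      # only the most recent R observations
            rows = slice(t - R, t)
        elif scheme == "fixed":        # the first R observations, never updated
            rows = slice(0, R)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        beta_hat = ols(Z[rows], y[rows])
        errors.append(y[t] - Z[t] @ beta_hat)
    return np.array(errors)

# Sizes of the three sample splits, counted in months (519 in total).
for R, P in ((432, 87), (312, 120), (312, 207)):
    print(f"R = {R}, P = {P}, P/R = {P / R:.2f}")
```

The month counts in the final loop are my own arithmetic from the stated date ranges; they reproduce the reported ratios 0.20, 0.38 and 0.66.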
To construct the test for equal MAE that accounts for parameter uncertainty,
an asymptotically valid variance, $\hat{\Omega}$ (from Theorem 2.3.2), is constructed. When
estimating the variance, I presume no knowledge regarding the existence
of heteroskedasticity or serial correlation. I use a Newey-West (1987) serial
correlation consistent covariance estimator of $S_{ff}$, $S_{fh}$ and $S_{hh}$.⁵ I use the
out-of-sample forecast errors and out-of-sample values of $y_{t+1}$, $Z_{1,t}$ and $Z_{2,t}$
in the construction of
$$f_{t,1}(\hat{\beta}_t) = |y_{t+1} - Z_{1,t}'\hat{\beta}_{1,t}| - |y_{t+1} - Z_{2,t}'\hat{\beta}_{2,t}|$$
and
$$h_{t,1}(\hat{\beta}_t) = [(y_{t+1} - Z_{1,t}'\hat{\beta}_{1,t})Z_{1,t}', \; (y_{t+1} - Z_{2,t}'\hat{\beta}_{2,t})Z_{2,t}']'.$$
I estimate $B$ using the out-of-sample observations on $Z_{1,t}$ and $Z_{2,t}$ to form the
$(4 \times 4)$ block diagonal matrix with $(P^{-1}\sum_{t=R}^{T} Z_{1,t} Z_{1,t}')^{-1}$ in the upper $(2 \times 2)$
diagonal position and $(P^{-1}\sum_{t=R}^{T} Z_{2,t} Z_{2,t}')^{-1}$ in the lower $(2 \times 2)$ diagonal position.
To estimate $F$ I use (9) and (10) directly.
The test of equal MAE is constructed a second time ignoring parameter
uncertainty. This time the variance is estimated only using an estimate of $S_{ff}$.
The estimate was identical to the one used above.
For the sake of comparison, the test of equal MSE was also constructed. For
this test, $f_{t,1}(\hat{\beta}_t) = (y_{t+1} - Z_{1,t}'\hat{\beta}_{1,t})^2 - (y_{t+1} - Z_{2,t}'\hat{\beta}_{2,t})^2$. Under the
assumption that OLS provides consistent estimates of the parameters, $F = 0$. Using the
results in West (1996) we then know that we can ignore parameter uncertainty
when estimating the asymptotic variance. In estimating $S_{ff}$, I presume no
knowledge regarding the existence of serial correlation. Once again I use the
Newey-West (1987) estimator.
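To make the unadjusted statistics concrete, the sketch below forms the loss differential for either MAE or MSE, estimates its long-run variance with a Bartlett-kernel (Newey-West) estimator, and returns $P^{0.5}\bar{f} / \hat{S}_{ff}^{0.5}$. It deliberately ignores parameter uncertainty, so it does not reproduce the adjusted statistic. The function names are hypothetical, and reading the "window width" of footnote 5 as the number of lags in the kernel is an assumption.

```python
import numpy as np

def bartlett_window(P):
    """Integer part of P**(1/3), guarded against floating-point rounding."""
    m = int(round(P ** (1.0 / 3.0)))
    while m ** 3 > P:
        m -= 1
    while (m + 1) ** 3 <= P:
        m += 1
    return m

def newey_west_variance(f, lags):
    """Bartlett-kernel long-run variance of the scalar series f."""
    f = np.asarray(f, dtype=float)
    P = f.size
    d = f - f.mean()
    s = d @ d / P                              # lag-0 autocovariance
    for j in range(1, lags + 1):
        gamma_j = d[j:] @ d[:-j] / P           # lag-j autocovariance
        s += 2.0 * (1.0 - j / (lags + 1.0)) * gamma_j
    return s

def unadjusted_statistic(e1, e2, loss="mae"):
    """sqrt(P) * mean(f) / sqrt(S_ff estimate), ignoring parameter uncertainty."""
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    if loss == "mae":
        f = np.abs(e1) - np.abs(e2)
    elif loss == "mse":
        f = e1 ** 2 - e2 ** 2
    else:
        raise ValueError(f"unknown loss: {loss}")
    P = f.size
    s_ff = newey_west_variance(f, bartlett_window(P))
    return np.sqrt(P) * f.mean() / np.sqrt(s_ff)
```

Here `e1` and `e2` stand for the out-of-sample forecast errors of models 1 and 2; a positive value indicates that model 1 has the larger average loss over the forecast period.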
Table 2 reports the results of the tests. Each subpanel corresponds to one of
the sample splits. The first four columns report the raw out-of-sample MAE and
MSE associated with each of the two predictive models. The MAE values are
scaled by 100 and the MSE values are scaled by 1000. Note that in every
instance, the MAE and MSE are larger for model 1 than for model 2.
Column 5 reports the test for equal MAE that accounts for parameter
uncertainty. Column 6 reports the test for equal MAE that ignores parameter
uncertainty. In every instance, accounting for parameter uncertainty increases
the magnitude of the estimated variance. This causes the statistics that account
for parameter uncertainty to be uniformly smaller than the ones that do not.
This effect can also be seen in the p-values reported in columns 8 and 9. Because
of these changes, there are instances in which accounting for parameter uncertainty can affect the decision to reject or fail to reject the null of equal MAE.
During the 1980s, there does not appear to be any difference in the predictive
ability using either of the two models. This holds whether we use MAE or MSE
as the measure of predictive ability. For this time frame, accounting for parameter uncertainty made little difference in the tests for equal MAE.
The same cannot be said during the 1990s. Ignoring parameter uncertainty,
one would find that model two has a lower MAE at (approximately) the 1%
level regardless of which forecasting scheme is used. Accounting for parameter
uncertainty, we fail to reject at the 5% level using either the rolling or fixed
forecasting schemes. We do reject at the 5% level when the recursive scheme is used
but the evidence is weaker than when parameter uncertainty is ignored. Notice
that during the 1990s the test for equal MSE fails to reject the null of equal
predictive ability at the 5% level for any of the sampling schemes. The null can be
rejected at the 10% level when either the recursive or fixed schemes are used.

⁵ I use the integer part of $P^{1/3}$ as the window width.

Table 2
Testing for relative predictive ability of predictions of excess returns to S&P 500 composite

Columns 1-4 are raw values, columns 5-7 are test statistics, and columns 8-10 are two-sided p-values.

| Forecast period | Scheme | MAE-1 | MAE-2 | MSE-1 | MSE-2 | Adj. MAE | UnAdj. MAE | UnAdj. MSE | p: Adj. MAE | p: UnAdj. MAE | p: UnAdj. MSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 90:01-97:03 ($\hat{\pi} = 0.20$) | Recursive | 2.691 | 2.614 | 1.183 | 1.153 | 1.990 | 2.565 | 1.726 | 0.047 | 0.010 | 0.084 |
| | Rolling | 2.688 | 2.634 | 1.183 | 1.163 | 1.541 | 2.354 | 1.455 | 0.123 | 0.019 | 0.146 |
| | Fixed | 2.724 | 2.619 | 1.199 | 1.154 | 1.899 | 2.540 | 1.765 | 0.058 | 0.011 | 0.078 |
| 80:01-89:12 ($\hat{\pi} = 0.38$) | Recursive | 3.551 | 3.536 | 2.280 | 2.271 | 0.506 | 0.515 | 0.339 | 0.613 | 0.606 | 0.734 |
| | Rolling | 3.552 | 3.551 | 2.284 | 2.282 | 0.046 | 0.048 | 0.115 | 0.963 | 0.962 | 0.909 |
| | Fixed | 3.557 | 3.534 | 2.270 | 2.253 | 0.539 | 0.590 | 0.432 | 0.590 | 0.555 | 0.666 |
| 80:01-97:03 ($\hat{\pi} = 0.66$) | Recursive | 3.189 | 3.148 | 1.819 | 1.801 | 1.724 | 1.844 | 0.987 | 0.085 | 0.065 | 0.324 |
| | Rolling | 3.203 | 3.180 | 1.828 | 1.817 | 1.154 | 1.326 | 0.768 | 0.249 | 0.185 | 0.442 |
| | Fixed | 3.228 | 3.153 | 1.831 | 1.792 | 1.610 | 2.223 | 1.364 | 0.107 | 0.026 | 0.173 |

Notes: Table 2 reports empirical results relevant to testing for equal MAE and equal MSE between two
models used to predict the S&P 500 composite portfolio. Model 1 is an OLS estimated linear regression
with an intercept and once lagged dividend yield. Model 2 is the same but uses the earnings-price ratio. The
first four columns report the realized out-of-sample values of the MAE and MSE associated with each
model during three different forecast periods. Column 5 reports the values of the test for equal MAE
adjusted (Adj.) for parameter uncertainty. Columns 6 and 7 report the values of the statistics used to
construct the tests for equal MAE and equal MSE both ignoring parameter uncertainty (UnAdj.). Columns
8-10 report the p-values (2-sided, from the standard normal distribution) associated with the statistics in
columns 5-7. MAEs are scaled by 100. MSEs are scaled by 1000.
Similar observations can be made regarding the tests for equal MAE throughout both the 1980s and 1990s. Ignoring parameter uncertainty, the fixed scheme
rejects the null at the 5% level. When parameter uncertainty is accounted for we
fail to reject at even the 10% level. When the rolling scheme is used we fail to
reject the null at the 10% level regardless of parameter uncertainty. When the
recursive scheme is used we reject the null at the 10% level regardless of
parameter uncertainty. Over the same time frame, we fail to reject the null for
equal MSE when any of the forecasting schemes are used.
Clearly, there are instances where accounting for parameter uncertainty
affects the decision to reject or fail to reject the null of equal MAE. What is not
clear is whether it was necessary to account for parameter uncertainty in the first
place. Recall that if $F = 0$ then parameter uncertainty is asymptotically irrelevant and hence $S_{ff}$ is the relevant asymptotic variance. For the test of equal
MAE, $F$ can be zero if the disturbances have a zero median conditional on the
values of the regressors. In this application it seems reasonable to reject that
assertion. Using the skewness coefficients reported in Table 1, the null of zero
skewness is rejected at the 1% level for each of the three sets of residuals.
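The text does not state which skewness test is used. One common large-sample approximation, assumed here purely for illustration, treats the sample skewness of $T$ observations as approximately $N(0, 6/T)$ under a symmetric (normal) null. Applying it to the skewness coefficients in Table 1 with $T = 519$ gives a rough check:

```python
import math

T = 519
critical_1pct = 2.576          # two-sided 1% critical value of N(0, 1)
skewness = {"Panel A": -0.3539, "Panel B": -0.3714, "Panel C": -0.3568}

def skewness_z(skew, n):
    """z-score of a skewness coefficient using the approximate s.e. sqrt(6/n)."""
    return skew / math.sqrt(6.0 / n)

for panel, skew in skewness.items():
    z = skewness_z(skew, T)
    print(f"{panel}: z = {z:.2f}, reject at 1%: {abs(z) > critical_1pct}")
```

Under this approximation each absolute z-score exceeds 2.576, which is consistent with the rejection reported above. Because the residuals are heavy tailed and conditionally heteroskedastic, the approximation should be read as indicative only.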
4. Simulation evidence
The asymptotic results of Section 2 need only be appropriate for large
in-sample sizes R and out-of-sample sizes P. It is not clear how well the
asymptotic approximation will perform in sample sizes commonly used in
empirical work. To examine this problem, I present simulations of the three tests
of either equal MAE or equal MSE between the two simple linear regressions
$$y_{t+1} = Z_{1,t}'\beta_1^* + u_{1,t+1} \quad \text{and} \quad y_{t+1} = Z_{2,t}'\beta_2^* + u_{2,t+1}, \qquad (12')$$
where $Z_{1,t}$ and $Z_{2,t}$ denote the $(2 \times 1)$ vectors $(1, z_{1,t})'$ and $(1, z_{2,t})'$, respectively.
Each statistic is constructed in precisely the same manner as in Section 3. For
each statistic I report the size and size-adjusted power of the test in samples of
the size used in Section 3.
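For concreteness, the two summary measures can be computed from simulated statistics roughly as follows. This is a sketch of the usual Monte Carlo definitions, not the author's code; in particular, taking the size-adjusted critical value to be an empirical quantile of the absolute null statistics is an assumption, and the function names are illustrative.

```python
import numpy as np

def actual_size(null_stats, critical_value):
    """Share of null-hypothesis replications with |statistic| > critical value."""
    return float(np.mean(np.abs(null_stats) > critical_value))

def size_adjusted_power(null_stats, alt_stats, nominal_size):
    """Rejection rate under the alternative, using a critical value chosen so
    that the test has (approximately) the nominal size under the null."""
    adjusted_cv = np.quantile(np.abs(null_stats), 1.0 - nominal_size)
    return float(np.mean(np.abs(alt_stats) > adjusted_cv))

# Tiny made-up inputs, for illustration only.
null_stats = np.array([-3.0, -1.0, 0.0, 1.0, 2.0])
alt_stats = np.array([2.5, -2.1, 4.0, 0.3])
print(actual_size(null_stats, 1.96))
print(size_adjusted_power(null_stats, alt_stats, 0.20))
```

In a full experiment `null_stats` would hold one statistic per replication generated under the null and `alt_stats` one per replication under a given alternative.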
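First, I simulate a hypothetical data generating process that is stylized to the
empirical results of Section 3. The data generating process I have chosen has the
representation
$$
\begin{aligned}
y_{t+1} &= \beta_{2,2}^* z_{2,t+1} + u_{t+1}, \qquad u_{t+1} = \gamma (x_{1,t+1} + x_{2,t+1}) + \eta_{t+1}, \\
x_{i,t+1} &= 2^{-0.5}[(1 - \alpha^2) z_{i,t+1}^2 - 1], \qquad z_{i,t+1} = \alpha z_{i,t} + \varepsilon_{i,t+1}, \\
\varepsilon_{i,t+1} &\sim \text{i.i.d. } N(0, 1), \qquad \eta_{t+1} \sim \text{i.i.d. } t(6), \qquad \varepsilon_{1,t+1} \perp \varepsilon_{2,t+1} \perp \eta_{t+1}, \\
\gamma &= 0.25, \qquad \alpha = 0.9. \qquad (13)
\end{aligned}
$$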
The parameter $\beta_{2,2}^*$ (the second component of $\beta_2^*$) is a tuning parameter used to
distinguish between the null and alternative. When $\beta_{2,2}^* = 0$ the null of either
equal MAE or equal MSE is satisfied. When $\beta_{2,2}^* \neq 0$ the alternative holds;
model two has both a lower MAE and lower MSE than does model 1. I allow
this parameter to vary across the range 0, 0.10, 0.25, 0.50 and 1.00. By doing so
I am better able to determine how accounting for parameter uncertainty affects
the power of the test. I am also better able to determine whether tests for equal
MAE or tests for equal MSE are more powerful for detecting small deviations
from the null.
The initial conditions for the $z_{i,t}$ are drawn from their unconditional distribution. The $y_{t+1}$ are then constructed using (13). Each series is of length
$500 + 519 = 500 + (T + 1) = 1019$. The initial 500 observations are generated
to burn out the effects of initial conditions.
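A minimal simulation sketch of a process of the form (13) is given below. The function name, the seed handling and the use of NumPy's `Generator` are assumptions for this example. The code follows the dating of (13) as printed, with $y$, $x$ and $z$ sharing the same time index; how each $y$ is then paired with lagged regressors for forecasting is left open and should be checked against the published equation.

```python
import numpy as np

def simulate_dgp(beta22, n_obs=519, burn=500, alpha=0.9, gamma=0.25, seed=0):
    """Simulate one sample of n_obs observations after discarding a burn-in."""
    rng = np.random.default_rng(seed)
    n = burn + n_obs                                   # 500 + 519 = 1019
    eps = rng.standard_normal((n, 2))                  # innovations of z_1 and z_2
    eta = rng.standard_t(6, size=n)                    # t(6) disturbance
    z = np.empty((n, 2))
    # Initial condition from the unconditional N(0, 1 / (1 - alpha^2)) law.
    z[0] = eps[0] / np.sqrt(1.0 - alpha ** 2)
    for t in range(1, n):
        z[t] = alpha * z[t - 1] + eps[t]
    x = ((1.0 - alpha ** 2) * z ** 2 - 1.0) / np.sqrt(2.0)
    u = gamma * x.sum(axis=1) + eta
    y = beta22 * z[:, 1] + u
    return y[burn:], z[burn:]

# Same seed, different tuning parameter: the regressors and disturbances are
# identical across the two calls, so only the signal term changes.
y_null, z_null = simulate_dgp(beta22=0.0, seed=1)
y_alt, z_alt = simulate_dgp(beta22=0.25, seed=1)
print(y_null.shape, z_null.shape)
```

Fixing the seed in this way mirrors the idea of reusing the same simulated draws across tests and designs, although the exact seeding scheme used in the paper is not described.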
The results are based upon 5000 replications. Note that the same simulated
data is used for each sampling scheme and each (P, R) combination in order to
facilitate the different small sample comparisons. To make comparisons possible
across the three hypothesis tests the random number generator is seeded so that
the three sets of 5000 separate samples are the same.
I chose this data generating process for two basic reasons. The first is that it
exhibits many of the characteristics of the data used in Section 3. The regressors
exhibit strong serial dependence. The distribution of the predictand, $y_t$, has
heavy tails and is skewed. The residuals from the two predictive models will
exhibit conditional heteroskedasticity in the regressors. Also, since the regressors are serially correlated, the squares of the residuals from the two
predictive models will be serially correlated (i.e. GARCH(1, 1)-like effects).
The second reason is that I wanted the two linear models to have little, if any,
predictive ability in order to match the very small $R^2$ values commonly observed
in the literature. Here, both predictive models have a population $R^2$ of zero. This
occurs because both $\beta_1^*$ and $\beta_2^*$ are zero under the null. This implies that both
models have the same predictive ability and hence the null is satisfied.
But it does more than that. It implies that $f_{t+1}$ and $S_{ff}$ are equal to zero when
either MAE or MSE is used to measure predictive ability. This does not imply
that an asymptotically standard normal test for equal predictive ability cannot
be constructed. For there to be asymptotic normality the limiting variance, $\Omega$,
must be positive definite. If parameter uncertainty is irrelevant then $S_{ff}$ must be
positive definite. On the other hand, if $f_{t+1}$ is zero for all $t$, and hence
$S_{ff} = F B S_{fh}' = 0$, then $F B S_{hh} B' F'$ must be positive definite.
For the test of equal MAE in this exercise, and when parameter uncertainty is accounted for, this is not a problem. This occurs because the disturbances are skewed for each predictive model and hence
$F = (-E\,\mathrm{sgn}(u_{1,t+1}) Z_{1,t}', \; E\,\mathrm{sgn}(u_{2,t+1}) Z_{2,t}')' \neq 0$. It is a problem for the test of equal
MSE since consistent estimation of the parameters by OLS implies
$F = (-E\,u_{1,t+1} Z_{1,t}', \; E\,u_{2,t+1} Z_{2,t}')' = 0$. Hence, a priori we expect the test for
equal MAE, corrected for parameter uncertainty, to be reasonably sized. We
also expect the test for equal MAE, without the correction for parameter
uncertainty, and the test for equal MSE to be missized.
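The contrast between the two cases can be illustrated with sample analogues. The sketch below is not the estimator in (9) and (10), which are defined elsewhere in the paper; it simply averages $u Z$ and $\mathrm{sgn}(u) Z$ over OLS residuals from simulated data with a skewed disturbance. The data, seed and function name are hypothetical.

```python
import numpy as np

def moment_analogues(Z, y):
    """Sample averages of u*Z and sgn(u)*Z for OLS residuals u."""
    beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
    u = y - Z @ beta_hat
    mean_uZ = (u[:, None] * Z).mean(axis=0)               # MSE case
    mean_sgn_uZ = (np.sign(u)[:, None] * Z).mean(axis=0)  # MAE case
    return mean_uZ, mean_sgn_uZ

rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal(n)
Z = np.column_stack([np.ones(n), z])
# Skewed, mean-zero disturbance: a centred chi-square(1) draw (illustrative).
u = rng.chisquare(1, size=n) - 1.0
y = u                                   # population slope on z is zero
mean_uZ, mean_sgn_uZ = moment_analogues(Z, y)
print("mean of u * Z:      ", mean_uZ)
print("mean of sgn(u) * Z: ", mean_sgn_uZ)
```

The first average is zero up to rounding because OLS residuals are orthogonal to the regressors by construction. The second is generally not, since a skewed mean-zero disturbance has a nonzero median.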
Table 3 reports the actual size of the three tests when the critical values
$\pm 2.576$, $\pm 1.96$ and $\pm 1.645$ are used. When $\hat{\pi} = 0.20$ the test for equal MAE
that accounts for parameter uncertainty is reasonably well sized. It is also
uniformly more accurate than both the test for equal MAE without the correction for parameter uncertainty and the test for equal MSE. When $\hat{\pi}$ is either 0.38
or 0.66 it is harder to make such a uniform statement. In every case the latter two
tests are seriously oversized. At the same time the test of equal MAE that
accounts for parameter uncertainty tends to be undersized. This is particularly
Table 3
Actual size of out-of-sample tests

| $\hat{\pi}$ | Scheme | Valid MAE 1% | Valid MAE 5% | Valid MAE 10% | Invalid MAE 1% | Invalid MAE 5% | Invalid MAE 10% | Invalid MSE 1% | Invalid MSE 5% | Invalid MSE 10% |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.20 | R | 0.0118 | 0.0754 | 0.1416 | 0.0758 | 0.2048 | 0.3032 | 0.0642 | 0.2220 | 0.3382 |
| | L | 0.0120 | 0.0784 | 0.1484 | 0.0670 | 0.1974 | 0.2986 | 0.0578 | 0.2116 | 0.3336 |
| | F | 0.0134 | 0.0720 | 0.1504 | 0.1130 | 0.2556 | 0.3542 | 0.1048 | 0.2798 | 0.3916 |
| 0.38 | R | 0.0072 | 0.0418 | 0.0968 | 0.0434 | 0.1624 | 0.2604 | 0.0344 | 0.1704 | 0.2910 |
| | L | 0.0082 | 0.0476 | 0.1064 | 0.0434 | 0.1536 | 0.2484 | 0.0344 | 0.1604 | 0.2722 |
| | F | 0.0046 | 0.0362 | 0.0898 | 0.1046 | 0.2522 | 0.3462 | 0.0908 | 0.2632 | 0.3796 |
| 0.66 | R | 0.0036 | 0.0328 | 0.0768 | 0.0386 | 0.1382 | 0.2224 | 0.0324 | 0.1388 | 0.2376 |
| | L | 0.0056 | 0.0458 | 0.0976 | 0.0330 | 0.1188 | 0.1984 | 0.0250 | 0.1200 | 0.2142 |
| | F | 0.0012 | 0.0194 | 0.0596 | 0.1074 | 0.2626 | 0.3524 | 0.0904 | 0.2618 | 0.3770 |

Notes: Subpanels denoted $\hat{\pi} = 0.20$, 0.38 and 0.66 indicate sample sizes and splits corresponding to
those used in the empirical results reported in Table 2. Columns denoted 1%, 5% and 10% present
the actual size of the test when the critical values $\pm 2.576$, $\pm 1.96$ and $\pm 1.645$ are used,
respectively. Rows denoted R, L and F signify the use of the Recursive, roLLing and Fixed schemes,
respectively. The results are based upon 5000 replications.
Robust out-of-sample inference
Michael W. McCracken*
Department of Economics, Louisiana State University, 2107 CEBA, Baton Rouge, LA 70803-0306, USA
Received 25 September 1998; received in revised form 29 November 1999; accepted 13 March 2000
Abstract
This paper presents analytical, empirical and simulation results concerning inference
about the moments of nondi!erentiable functions of out-of-sample forecasts and forecast
errors. Special attention is given to the measurement of a model's predictive ability using
the test of equal mean absolute error. Tests for equal mean absolute error and mean
square error are used to evaluate predictions of excess returns to the S & P 500
composite. Simulations indicate that appropriately constructed tests for equal mean
absolute error can provide more accurately sized and more powerful tests than inappropriately constructed tests for equal mean absolute error and mean square error. ( 2000
Elsevier Science S.A. All rights reserved.
JEL classixcation: C52; C53; C32; C12
Keywords: Forecasting; Forecast evaluation; Hypothesis testing; Model comparison
1. Introduction
It is becoming common to evaluate a forecasting model's ability to predict
using out-of-sample methods. Meese and Rogo! (1983), in predicting exchange
rates, report the mean square error (MSE) of forecast errors. Akgiray (1989) uses
the mean absolute error (MAE) to evaluate volatility forecasts of stock returns.
Engel (1994) reports the number of times the direction of change in exchange
* Corresponding author. Tel.: #1-225-388-3782; fax: #1-225-388-3807.
E-mail address: [email protected] (M.W. McCracken).
0304-4076/00/$ - see front matter ( 2000 Elsevier Science S.A. All rights reserved.
PII: S 0 3 0 4 - 4 0 7 6 ( 0 0 ) 0 0 0 2 2 - 1
196
M.W. McCracken / Journal of Econometrics 99 (2000) 195}223
rates is accurately predicted. Swanson and White (1995) report the Schwarz
information criterion as well as the out-of-sample R2 that result when forward
interest rates are used to predict future spot rates.
These papers, and many others, evaluate predictive ability in one of two ways.
Most do so by simply constructing point estimates of some measure of predictive ability. The most common measure is MSE. A few others argue heuristically
that their tests of predictive ability are limiting normal and hence asymptotically
valid t-statistics can be used to test hypotheses. For example, Pagan and
Schwert (1990) and Fair and Shiller (1990) construct regression based tests for
e$ciency and encompassing respectively. However, they do not provide a set of
su$cient conditions for their statistics to be asymptotically standard normal.
Recent theoretical work has attempted to provide those su$cient conditions.
When parametric forecasts and forecast errors are used to estimate moments
or conduct inference there are two sources of uncertainty. There is uncertainty
that exists even when we know the model parameters and there is uncertainty
due to the estimation of parameters. Diebold and Mariano (1995) show how to
construct asymptotically valid out-of-sample tests of predictive ability when
there is no parameter uncertainty, for example, when parameters are known.
Under this restriction, they are able to construct tests of hypotheses that involve
moments of di!erentiable and nondi!erentiable functions such as those used to
construct tests for equal MSE and equal MAE between two predictive models.
When parameters are unknown, and must be estimated, parameter uncertainty can play a role in out-of-sample inference. West (1996) has shown how the
uncertainty due to parameter estimation can a!ect the asymptotic distribution
of moments of di!erentiable functions of out-of-sample forecasts and forecast
errors. Given a parametric forecasting model, this allows for inference concerning tests of serial correlation, e$ciency, encompassing, zero mean prediction
error and equal MSE between two predictive models.
In this paper I close some of the gaps between the work by Diebold and
Mariano (1995) and West (1996). I extend the work by Diebold and Mariano
(1995) by showing that parameter uncertainty can a!ect out-of-sample inference
regarding moments of nondi!erentiable functions. As in West (1996), the parameter uncertainty causes the limiting covariance structure to be nonstandard.
The limiting covariance matrix contains two components: a standard component that would exist if the parameters used to construct forecasts were known in
advance and a second component due to the fact that parameters are not known
and have to be estimated.
I extend the work of West (1996) in two ways. Firstly, I provide su$cient
conditions for asymptotic normality of sample averages of nondi!erentiable
functions of parametric forecasts and forecast errors. In particular, the analytical
results provide conditions that are weak enough to construct tests that use the
absolute value and indicator functions. These conditions also allow the application of White's (2000) bootstrap of data snooping e!ects to instances when the
M.W. McCracken / Journal of Econometrics 99 (2000) 195}223
197
measure of predictive ability is nondi!erentiable. Secondly, I allow model
parameters to be estimated using loss functions that are not di!erentiable. By
doing so I permit a greater degree of freedom in choosing the loss function used
to estimate the parameters to match the loss function used to evaluate the
forecasts. This may be bene"cial in light of the discussion in Weiss (1996).
These extensions are potentially useful since nonsmooth measures of predictive ability have been used to evaluate parametric predictive models. Granger
(1969) provides an early theoretical discussion. Empirical examples are plentiful.
Gerlow et al. (1993) use MAE to evaluate predictive ability. Swanson and White
(1997) use mean absolute percentage error (MAPE) to measure predictive
ability. Stekler (1991) compares the predictive ability of two parametric models
using the test of percent better, or what Diebold and Mariano (1995) refer to as
the &sign test'. Engel (1994) constructs a test for sign predictability based upon
the binomial distribution. Henriksson and Merton (1981) and Pesaran and
Timmermann (1992) construct tests for sign predictive ability using a standard
normal approximation.
Each of the measures of predictive ability mentioned above can be used to
construct tests of forecast accuracy. As presented though, most ignore the
possibility that the forecasts are generated parametrically and hence may be
a!ected by parameter uncertainty. The results of West and McCracken (1998),
concerning smooth measures of predictive ability, suggest that in many circumstances it is inappropriate to ignore the parameter uncertainty.
In this paper I provide analytical, empirical and simulation results indicating
that ignoring parameter uncertainty can be inappropriate when nonsmooth
measures of predictive ability are used. I focus on the test of equal MAE as an
example in which accounting for parameter uncertainty can be important.
Although I emphasize the absolute value function, the asymptotic results are
applicable to tests that use indicator functions.
For the results of this paper to hold, however, certain conditions must be met.
Perhaps the most important is Assumption 4. There I assume that the expectation of the function of interest must be continuously di!erentiable in the
parameters. This assumption is not very restrictive when the absolute value
function is being used and is the reason I use the test of equal MAE as a foil
throughout the paper. It can be a problem when indicator functions are used. In
particular it can be a problem for tests of sign predictability. See the discussion
following Assumption 4 for further detail.
The remainder of the paper proceeds as follows. Section 2 describes a general
environment and provides the asymptotic results. Section 3 tests the predictability of excess returns to the S & P 500 composite portfolio using tests of equal
MAE and equal MSE. Section 4 provides simulation evidence on the "nite
sample size and power characteristics of the test statistics used in Section 3. The
paper concludes with a discussion of these results and some topics for future
research. The appendix presents proofs. An additional appendix available on
198
M.W. McCracken / Journal of Econometrics 99 (2000) 195}223
request from the author presents details of proofs omitted from the paper to save
space.
2. Theoretical results
This section presents su$cient conditions for asymptotic inference about the
moments of functions of out-of-sample forecasts and forecast errors. These
conditions will su$ce to show, in Theorem 2.3.1, that out-of-sample averages
consistently estimate population means, and when appropriately scaled are
asymptotically normal. These conditions also su$ce to show, in Theorem 2.3.2,
that the limiting covariance structure can be consistently estimated by
a straightforward application of Slutsky's theorem.
For any function f, f (bK ) will denote the parametric estimate of f (bH). Also,
t,q t
t`q
in order to minimize notation, f
will denote f (bH).
t`q
t`q
2.1. Environment
Throughout it is assumed that MX NT`q is a given sample of observables. The
s s/1
latter portion of that sample contains a continuous stream of P s-step ahead
forecasts. The "rst forecast, y (bK ), is based upon a parameter vector estimated
R,q R
using observations s"1,2, R. Further forecasts, y (bK ), are each constructed
t,q t
using an estimated parameter vector that is based on observations
s"1,2, t, R)t)¹,R#P!q. The time period for which the P forecasts
are generated will be referred to as the out-of-sample period.
As in West and McCracken (1998) I will allow for three di!erent forecasting
schemes. The recursive, rolling and "xed forecasting schemes di!er in how they
construct the sequence of parameter estimates used to construct the sequence of
forecasts and forecast errors. A brief description is given below.
Keim and Stambaugh (1986) use the recursive scheme. Under this scheme
a sequence of forecasts is generated using updated parameter estimates. At
each time t"R,2, ¹ the parameter estimate bK depends explicitly on all obsert
vables from s"1,2, t. If OLS is used to estimate the parameters from
a scalar linear model with regressors Z and predictand y then bK "
s
s
t
(t~1+t Z Z@ )~1(t~1+t Z y ). The "rst forecast is then of the form y (bK ).
s/1 s s
R,q R
s/1 s s
The second forecast, y
(bK
) is constructed similarly using observations
R`1,q R`1
s"1,2, R#1. This process is iterated P times so that for each t3[R, ¹], the
parameter estimates use observations s3[1, t].
Chen and Swanson (1996) use the rolling scheme. Under this scheme the
sequence of parametric forecasts is constructed in much the same way as
the recursive scheme. The rolling scheme di!ers from the recursive in its treatment of observations from the distant past. The rolling scheme uses only a
"xed window of the past R observations. As t increases from R to ¹, older
M.W. McCracken / Journal of Econometrics 99 (2000) 195}223
199
observations are not used in estimating the parameters. If OLS is used to
estimate the parameters from a scalar linear model with regressors Z and
s
Z y ). This imZ Z@ )~1(R~1+t
predictand y then bK "(R~1+t
s/t~R`1 s s
s/t~R`1 s s
s
t
plies that the "rst rolling forecast, y (bK ), and forecast error are identical to
R, q R
those for the recursive. The second rolling forecast, y
(bK
), is constructed
R`1,q R`1
using only observations s"2,2, R#1 to estimate the model parameters. This
implies that the second rolling forecast and forecast error are distinct from those
using the recursive scheme. The process is iterated P times such that for each
t3[R, ¹] the parameter estimates use observations s3[t!R#1, t].
Kuan and Liu (1995) use the "xed scheme. This method is distinct from the
previous two in that the parameters are not updated when new observations
become available. Since the parameter vector is estimated only once, each
of the P forecasts, y (bK ), uses the same parameter estimate.1 If OLS is
t,q R
used to estimate the parameters using regressors Z and predictand y then
s
s
bK "(R~1+R Z Z@ )~1(R~1+R Z y ). Hence for each forecast from time
s/1 s s
s/1 s s
t
t3[R, ¹], the parameter estimate only uses observations s3[1, R].
Since, we are ultimately interested in conducting inference concerning the
population moments of functions of parametric forecasts and forecast errors,
a description of these functions is in order. The function
f (bK ),f (q, X , bK ) (l]1)
(1)
t,q t
t t
depends upon three arguments. The "rst is a "nite forecast horizon, q*1. The
second, X , is a "nite dimensioned vector of observables. The dating of the
t
subscript t is not meaningful. For example, if we are interested in the one-step
ahead MAE from a scalar linear regression model, f (bK )"Dy !Z@ bK D. Since
t t
t,1 t
t`1
the realized scalar left-hand side variable is y , and the variables used for
t`1
prediction are Z , X "(y , Z@ )@.
t t
t`1 t
The third argument, bK , is an estimate of a (k]1) unknown parameter vector
t
bH. When the inference to be conducted is simply a diagnostic of a single
parametric model, such as the test of zero median error for which
f (bK )"1My !Z@ bK )0N, bH is the vector of parameters that index that
t t
t,1 t
t`1
particular parametric model. On the other hand, if the inference to be conducted
is meant to detect which of two nonnested competing models is more accurate,
bH is formed by stacking the vector of parameters that index each of the two
models. For example, suppose that we are interested in comparing the one-step
ahead MAE from two scalar nonnested linear regression models. If we
let i"1, 2 index the two models (along with their respective regressors and
parameter estimates), f (bK )"Dy !Z@ bK D!Dy !Z@ bK D and hence
2,t 2,t
1,t 1,t
t`1
t,1 t
t`1
bK "(bK @ , bK @ )@.
1,t 2,t
t
1 Notice that the "xed and rolling parameter estimates should be subscripted both by t and R. In
order to simplify the notation the subscript R will be suppressed.
200
M.W. McCracken / Journal of Econometrics 99 (2000) 195}223
Given such a function, we are interested in testing (say) the scalar null
hypothesis $H_0: E f_{t+\tau} = \theta_0$ for some finite $\theta_0$. To do so, we will focus on test
statistics of the form $\hat\Omega^{-0.5} P^{-0.5} \sum_{t=R}^{T} (f_{t,\tau}(\hat\beta_t) - E f_{t+\tau})$ where $\hat\Omega$ is a consistent
estimate of the appropriate limiting variance. In Theorem 2.3.1 it is shown that
this statistic is asymptotically standard normal and hence asymptotically valid
inference can be conducted using standard normal tables.
2.2. Assumptions
Within the following, for any matrix $A$, $|A| = \max_{i,j} |a_{i,j}|$, $\|\cdot\|_Q$ is the $L^Q$ norm,
$\sup_t$ denotes $\sup_{R \le t \le T}$, and for $h_t(\beta)$ defined in Assumption 1,

$$g_t(\beta) = [(f_{t,\tau}(\beta) - E f_{t,\tau}(\beta))', h_t(\beta)']'. \tag{2}$$
Assumption 1. The estimate $\hat\beta_t$ satisfies $\hat\beta_t - \beta^* = B(t)H(t)$, where $B(t)$ is $(k \times q)$
and $H(t)$ is $(q \times 1)$, with (a) $B(t) \to_{a.s.} B$, $B$ a matrix of rank $k$, (b)
$H(t) = t^{-1}\sum_{s=1}^{t} h_s$, $R^{-1}\sum_{s=t-R+1}^{t} h_s$ and $R^{-1}\sum_{s=1}^{R} h_s$ for the recursive, rolling
and fixed schemes respectively; for the orthogonality condition $h_s \equiv h_s(\beta^*)$, and
(c) $E h_s = 0$.
Assumption 1 provides for a wide range of methods of estimating parameters.
In particular, it allows for maximum likelihood, nonlinear least squares and
a range of generalized method of moments estimators. It allows for linear and
nonlinear models as well as single and multiple equation systems.
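
The three sampling schemes named in Assumption 1 differ only in which observations feed each parameter estimate. The following is a minimal sketch for the OLS case, not the paper's own code. The function names `estimation_rows` and `oos_forecast_errors` are hypothetical, and the sketch assumes that row `i` of `X` holds the predictors used to forecast `y[i]`, so that every row before `i` is available when that forecast is made.

```python
import numpy as np

def estimation_rows(i, R, scheme):
    """Rows used to estimate the parameters behind the forecast of row i (i >= R)."""
    if scheme == "recursive":   # every observation available before the forecast
        return slice(0, i)
    if scheme == "rolling":     # only the most recent R observations
        return slice(i - R, i)
    if scheme == "fixed":       # always the first R observations
        return slice(0, R)
    raise ValueError(f"unknown scheme: {scheme}")

def oos_forecast_errors(y, X, R, scheme):
    """Out-of-sample OLS forecast errors for rows R, ..., len(y) - 1."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    errors = []
    for i in range(R, len(y)):
        rows = estimation_rows(i, R, scheme)
        # Least-squares fit on the estimation window (assumed alignment, see above).
        beta_hat, *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
        errors.append(y[i] - X[i] @ beta_hat)
    return np.array(errors)
```

With `n` rows in total the sketch produces `P = n - R` forecast errors. Under the fixed scheme the loop re-estimates the same coefficients each time, which is wasteful but keeps the three cases visibly parallel.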
As an example of the notation in Assumption 1 consider that our statistic is
used to test for equal MAE between two competing linear models. Suppose that
each of the two models, for $y_{t+1}$, has the representation $y_{t+1} = Z_{i,t}'\beta_i^* + u_{i,t+1}$
for $i = 1, 2$. Consider further that for each $i = 1, 2$ OLS provides a consistent
estimate of $\beta_i^*$ $(k_i \times 1)$. Since there are two sets of parameters needed to construct
this test, $\hat\beta_t = (\hat\beta_{1,t}', \hat\beta_{2,t}')'$ $(k_1 + k_2 = k \times 1)$, and hence $B$ ($k \times q$, $q = q_1 + q_2$,
$q_1 = k_1$, $q_2 = k_2$) and $h_s$ $(q \times 1)$ are

$$
B = \begin{pmatrix} (E Z_{1,t} Z_{1,t}')^{-1} & 0_{k_1 \times q_2} \\ 0_{k_2 \times q_1} & (E Z_{2,t} Z_{2,t}')^{-1} \end{pmatrix},
\qquad
h_s = \begin{pmatrix} u_{1,s+1} Z_{1,s} \\ u_{2,s+1} Z_{2,s} \end{pmatrix}.
\tag{3}
$$
Assumption 2. $R, P \to \infty$ as $T \to \infty$, and $\lim_{T \to \infty} P/R = \pi$, $0 \le \pi < \infty$.
Assumption 2 provides our first insight as to how the asymptotic approximation is achieved.
When asymptotically valid in-sample results are derived, one
allows $T \to \infty$. I do so here, but impose the stronger condition that both the
out-of-sample ($P$) and in-sample ($R$) sample sizes become arbitrarily large
simultaneously. This assumption eventually allows application of central limit
theorem results to both the parameters estimated in-sample and the out-of-sample
average of the function $f_{t+\tau}$.
Assumption 3. For some $d > 1$, (a) $E f_{t+\tau} = \theta_0$, (b) $X_t$ is strong mixing with
coefficients of size $-2d/(d-1)$, (c) $g_t(\beta^*)$ is covariance stationary, and (d) for an
open neighborhood $N$ of $\beta^*$, $\sup_t \| \sup_{\beta \in N} g_t(\beta) \|_{2d} < \infty$, (e) $\Omega$ is p.d.
Assumption 3 is similar to that in West and McCracken (1998) with two
important distinctions. The first is that I weaken the moment conditions so that
only $2d$ rather than $4d$ need exist. This may prove helpful in the context of
forecasting excess returns for which there is evidence of leptokurtosis. The
second difference is that I reduce the order of the mixing coefficients from
$-3d/(d-1)$ to $-2d/(d-1)$. The covariance stationarity assumption is primarily for
simplifying the algebra when constructing a consistent estimate of the
asymptotic covariance matrix in Theorem 2.3.2.
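Assumption 4. For each $i \in \{1, \ldots, l + q\}$: (a) $E g_{i,t}(\beta)$ is continuously differentiable
in the neighborhood $N$ (from Assumption 3) of $\beta^*$ admitting a mean value
expansion $E g_{i,t}(\beta) = E g_{i,t} + (\partial E g_{i,t}(\tilde\beta)/\partial\beta)(\beta - \beta^*)$ where $g_{i,t}$ is a scalar, $\beta$
is $(k \times 1)$ and $\tilde\beta$ is on the line between $\beta$ and $\beta^*$, (b) there exists a finite
constant $D$ such that $\sup_t \sup_{\beta \in N} |\partial E g_{i,t}(\beta)/\partial\beta| < D$, and (c) for all $t$,
$G = G_t \equiv \partial E h_t(\beta)/\partial\beta|_{\beta=\beta^*}$ and $F = F_t \equiv \partial E f_{t,\tau}(\beta)/\partial\beta|_{\beta=\beta^*}$.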
As we will see in Lemma 2.3.2 I separate the parameter uncertainty from the
sampling uncertainty by taking a mean value expansion of $E f_{t,\tau}(\beta)|_{\beta=\hat\beta_t}$ as in
Randles (1982), rather than of $f_{t,\tau}(\beta)|_{\beta=\hat\beta_t}$ as in West (1996). The bound provided
by $D$ suffices to show certain terms are $o_p(1)$.
Although Assumption 4(a) is weaker than the differentiability condition in
West (1996), it is not always satisfied. A simple counterexample can be constructed
that is relevant for tests of sign predictability.² Suppose that
$y_t = \beta^* y_{t-1} + u_t$ with $u_t \sim$ i.i.d. $N(0, 1)$ and $|\beta^*| < 1$. Let one-step ahead forecasts
of the form $y_t\hat\beta_t$ be used to predict $y_{t+1}$. Consider the function
$f_{t,1}(\beta) = 1\{y_t\beta \ge 0\}$. There are two cases. If $\beta^* = 0$ then $E f_{t,1}(\beta^*) =
E 1\{y_t\beta^* \ge 0\} = E 1\{0 \ge 0\} = 1$ for all $t$. But in every open neighborhood of
$\beta^* = 0$ there exists a $\beta$ such that $E f_{t,1}(\beta) = E 1\{y_t\beta \ge 0\} = 0.5$. In this case
Assumption 4(a) fails. On the other hand, if $\beta^* \ne 0$ then there exists an open
neighborhood of $\beta^*$ such that for all $\beta$ in that neighborhood, $E f_{t,1}(\beta) =
E 1\{y_t\beta \ge 0\} = 0.5$. In this case Assumption 4(a) holds. In the former case the
results of this paper cannot be applied to determine the limiting distribution of
2 The same type of problem exists for the Henriksson-Merton test (1981) and the
Pesaran-Timmermann test (1992). I use this example to simplify the presentation.
our test statistic. In the latter case, not only can the results be applied, it is also clear
that $F = 0$.
This type of problem does not occur for all tests that use indicator functions.
Consider the test of zero median error. Using the same environment as in the
preceding example we have $f_{t,1}(\beta) = 1\{y_{t+1} - y_t\beta \le 0\} = 1\{u_{t+1} \le y_t(\beta - \beta^*)\}$.
Taking expectations, and letting $\Phi$ denote the standard normal c.d.f., we have
$E f_{t,1}(\beta) = E\Phi(y_t(\beta - \beta^*))$. Since $\Phi$ is continuously differentiable Assumption
4(a) holds regardless of the value of $\beta^*$. Once again, not only can the results of
this paper be applied, it is also clear that $F = 0$. See Kim and Pollard (1990, p. 205) for
a set of conditions sufficient for continuous differentiability of expectations of
indicator functions.
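
The claim that the derivative of the expected indicator vanishes in the zero median example can be checked numerically. The sketch below is illustrative only and uses hypothetical helper names. It assumes the Gaussian AR(1) environment of the example, so that the regressor is normal with mean zero and variance 1/(1 - b0^2), where `beta_star` plays the role of the true coefficient b0, and it evaluates the expectation of the normal c.d.f. by Gauss-Hermite quadrature.

```python
import math
import numpy as np

def std_normal_cdf(x):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_indicator(beta, beta_star, n_nodes=80):
    """E[Phi(y * (beta - beta_star))] for y ~ N(0, 1 / (1 - beta_star**2))."""
    sigma = 1.0 / math.sqrt(1.0 - beta_star ** 2)
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Gauss-Hermite integrates against exp(-x^2), so substitute y = sqrt(2) * sigma * x.
    values = [std_normal_cdf(math.sqrt(2.0) * sigma * x * (beta - beta_star))
              for x in nodes]
    return float(np.dot(weights, values) / math.sqrt(math.pi))

def numerical_F(beta_star, step=1e-4):
    """Central-difference derivative of the expectation at the true parameter."""
    up = expected_indicator(beta_star + step, beta_star)
    down = expected_indicator(beta_star - step, beta_star)
    return (up - down) / (2.0 * step)
```

Because the regressor is symmetric about zero, the expectation equals one half for every value of the coefficient, so the finite-difference derivative is zero up to rounding error.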
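Assumption 5. Let $N(\varepsilon) = N(\beta^*, \varepsilon) \equiv \{\beta \in \mathbb{R}^k : |\beta - \beta^*| < \varepsilon\}$. There exist finite
constants $C$, $\varphi > 0$ and $Q \ge 2d$ such that for all $N(\varepsilon) \subseteq N$ (from Assumption 3),
$\sup_t \| \sup_{\beta \in N(\varepsilon)} (g_t(\beta) - g_t) \|_Q \le C\varepsilon^{\varphi}$.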
In some circumstances it is straightforward to verify the $L^Q$ continuity
condition in Assumption 5. For example, if the parametric model is linear and
$f_{t,\tau}$ is Lipschitz (as is the case for the absolute value function), this assumption is
automatically satisfied. When indicator functions are used, verifying the condition
is more difficult. It will frequently be the case that Assumption 4 and
reasonable assumptions on the continuity of the p.d.f. of $X_t$ will be needed to
verify the condition.
2.3. Results
In this section I utilize the assumptions of Section 2.2 to show that
$P^{-0.5}\sum_{t=R}^{T} (f_{t,\tau}(\hat\beta_t) - E f_{t+\tau})$ is asymptotically normal with a positive-definite
covariance matrix $\Omega$ which will usually depend on $\pi = \lim_{T \to \infty} P/R$. In order to
construct an asymptotically valid test statistic, I then show that there exists
a straightforward and consistent estimator $\hat\Omega$ of $\Omega$.
For the first step in the derivation, I borrow a decomposition used by Randles
(1982). Let

$$
\begin{aligned}
\xi_{0,P} &= P^{-0.5} \sum_{t=R}^{T} \big( f_{t,\tau}(\hat\beta_t) - E f_{t,\tau}(\beta)|_{\beta=\hat\beta_t} - f_{t+\tau} + E f_{t+\tau} \big) \\
\text{and} \quad
\xi_{1,P} &= P^{-0.5} \sum_{t=R}^{T} \big( f_{t+\tau} - E f_{t+\tau} + E f_{t,\tau}(\beta)|_{\beta=\hat\beta_t} - E f_{t+\tau} \big)
\end{aligned}
\tag{4}
$$

such that

$$P^{-0.5} \sum_{t=R}^{T} ( f_{t,\tau}(\hat\beta_t) - E f_{t+\tau} ) = \xi_{0,P} + \xi_{1,P}.$$
This decomposition leads to the following two lemmas upon which limiting
normality is based.

Lemma 2.3.1. Given Assumptions 1-5, $\xi_{0,P} = o_p(1)$.

Lemma 2.3.2. Given Assumptions 1-5, $\xi_{1,P} = [P^{-0.5}\sum_{t=R}^{T} (f_{t+\tau} - E f_{t+\tau}) +
F B P^{-0.5}\sum_{t=R}^{T} H(t)] + o_p(1)$.
It is now clear how both types of uncertainty are present. Sampling uncertainty
is the first term and parameter uncertainty is the second term in the expansion
of Lemma 2.3.2. It is important to note that Lemma 2.3.2 provides the
same decomposition as in West and McCracken (1998) with $F$ suitably redefined.
If we define $\Gamma_{ff}(j) = E(f_{t+\tau} - E f_{t+\tau})(f_{t+\tau-j} - E f_{t+\tau})'$,
$\Gamma_{fh}(j) = E(f_{t+\tau} - E f_{t+\tau})h_{t+\tau-j}'$, $\Gamma_{hh}(j) = E h_{t+\tau}h_{t+\tau-j}'$,
$S_{ff} = \sum_{j=-\infty}^{\infty} \Gamma_{ff}(j)$, $S_{fh} = \sum_{j=-\infty}^{\infty} \Gamma_{fh}(j)$ and
$S_{hh} = \sum_{j=-\infty}^{\infty} \Gamma_{hh}(j)$ we immediately know that the limiting variance
of the bracketed term on the right-hand side of Lemma 2.3.2 is

$$\Omega = S_{ff} + \lambda_{fh}(F B S_{fh}' + S_{fh} B'F') + \lambda_{hh} F B S_{hh} B'F', \tag{5}$$

where

Scheme                          $\lambda_{fh}$                    $\lambda_{hh}$
Recursive                       $1 - \pi^{-1}\ln(1+\pi)$          $2[1 - \pi^{-1}\ln(1+\pi)]$
Rolling, $\pi \le 1$            $\pi/2$                           $\pi - \pi^2/3$
Rolling, $1 < \pi < \infty$     $1 - (2\pi)^{-1}$                 $1 - (3\pi)^{-1}$
Fixed                           $0$                               $\pi$
(6)
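
The scheme-dependent weights tabulated in (6) are simple enough to code directly. The sketch below is illustrative. The function name `lambdas` is hypothetical, and the value returned for the recursive scheme at a ratio of exactly zero is the limit of the tabulated expression, which is an assumption of the sketch rather than an entry in the table.

```python
import math

def lambdas(pi, scheme):
    """Return the pair of weights from table (6) for pi = lim P/R >= 0."""
    if pi < 0:
        raise ValueError("pi must be nonnegative")
    if scheme == "recursive":
        if pi == 0:
            return 0.0, 0.0          # limiting value as pi -> 0 (assumed here)
        lam_fh = 1.0 - math.log(1.0 + pi) / pi
        return lam_fh, 2.0 * lam_fh
    if scheme == "rolling":
        if pi <= 1.0:
            return pi / 2.0, pi - pi ** 2 / 3.0
        return 1.0 - 1.0 / (2.0 * pi), 1.0 - 1.0 / (3.0 * pi)
    if scheme == "fixed":
        return 0.0, pi
    raise ValueError(f"unknown scheme: {scheme}")
```

The two rolling branches agree at a ratio of one, and every pair of weights shrinks to zero with the ratio, which is the sense in which parameter uncertainty becomes irrelevant when the out-of-sample period is short relative to the estimation sample.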
Theorem 2.3.1. Given Assumptions 1-5, (a) $P^{-0.5}\sum_{t=R}^{T} (f_{t,\tau}(\hat\beta_t) - E f_{t+\tau}) \to_d N(0, \Omega)$
for $\Omega$ defined in (5), (b) if either $F = 0$ or $\pi = 0$ then
$P^{-0.5}\sum_{t=R}^{T} (f_{t,\tau}(\hat\beta_t) - E f_{t+\tau}) \to_d N(0, S_{ff})$, and (c) $P^{-1}\sum_{t=R}^{T} f_{t,\tau}(\hat\beta_t) \to_p E f_{t+\tau}$.
Theorem 2.3.1 shows that the statistic is limiting normal and that out-of-sample
averages provide consistent estimates of population moments. The
distinction between parts (a) and (b) is exclusively whether or not parameter
uncertainty is relevant to the asymptotic covariance. To make this more clear,
notice that for all sampling schemes $\Omega = S_{ff}$ when either $F = 0$ or $\pi = 0$. Since
the second and third terms in (5) are those which result from parameter uncertainty,
parameter uncertainty is said to be asymptotically irrelevant when either $F = 0$
or $\pi = 0$.³ If this is the case then the results of Diebold and Mariano (1995) are
applicable even though parameters have been estimated.
The final step is to construct a consistent estimate of the covariance matrix $\Omega$.
To do so, we need to design consistent estimates of $S_{ff}$, $S_{fh}$, $B$, $F$, $B S_{hh} B'$, $\lambda_{fh}$
and $\lambda_{hh}$. One can estimate $B$ consistently by simply using the in-sample information
from the final parameter estimates. Since $\hat\pi = P/R$ is a consistent estimate of
$\pi$ and both $\lambda_{fh}$ and $\lambda_{hh}$ are continuous in $\pi$, we can use $\hat\lambda_{fh} \equiv \lambda_{fh}(\hat\pi)$ and
$\hat\lambda_{hh} \equiv \lambda_{hh}(\hat\pi)$ to consistently estimate $\lambda_{fh}$ and $\lambda_{hh}$. The term $B S_{hh} B'$ is the
asymptotic covariance matrix of the parameter estimates. Since most software
packages automatically provide a consistent estimate of this matrix, an estimator
of $B S_{hh} B'$ is immediate from the final parameter estimates. If this
estimator is unavailable, another option is presented in Theorem 2.3.2.
The matrix $F$ is a bit more difficult to estimate. Since $F$ varies with $f_{t,\tau}$, so will
its estimator. In any case, $F$ is an expectation and hence Theorem 2.3.1(c) can
be used to estimate it.
To clarify the issues in estimating $F$, I will briefly present $F$ for the test of
equal MAE. For the sequel let $\psi_x(x)$ and $\Psi_x(x)$ denote the marginal p.d.f. and
c.d.f. of a random variable $x$, and let $\psi_x(x|z)$ and $\Psi_x(x|z)$ denote the conditional
p.d.f. and c.d.f. of a random variable $x$ given the value of another random
variable $z$. Assume that each p.d.f. is continuous and has a bounded density in an
open neighborhood of the origin.
The test for equal MAE with $\tau = 1$ involves the null hypothesis
$H_0: E(|u_{1,t+1}| - |u_{2,t+1}|) = 0$. If the two potential models for the predictand
$y_{t+1}$ are scalar linear regression models with regressors $Z_{1,t}$ and $Z_{2,t}$ then
the relevant test statistic is $\hat\Omega^{-0.5}P^{-0.5}\sum_{t=R}^{T} (f_{t,1}(\hat\beta_t) - 0)$ with
$f_{t,1}(\hat\beta_t) = |y_{t+1} - Z_{1,t}'\hat\beta_{1,t}| - |y_{t+1} - Z_{2,t}'\hat\beta_{2,t}|$ and $\hat\beta_t = (\hat\beta_{1,t}', \hat\beta_{2,t}')'$.
To present $F$ it is convenient to define $F = (F_1, F_2)$ relative to the partition of the
parameter vector $\beta^* = (\beta_1^{*\prime}, \beta_2^{*\prime})'$. Since both components are similar, consider $F_1$.
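Letting $F_{1,t}(\beta) \equiv \partial E f_{t,1}(\beta)/\partial\beta_1$, we have

$$
\begin{aligned}
F_{1,t}(\beta) &= \partial E\big[\, |u_{1,t+1} - Z_{1,t}'(\beta_1 - \beta_1^*)| - |u_{2,t+1} - Z_{2,t}'(\beta_2 - \beta_2^*)| \,\big]/\partial\beta_1 \\
&= \partial E\big[\, |u_{1,t+1} - Z_{1,t}'(\beta_1 - \beta_1^*)| \,\big]/\partial\beta_1 \\
&= \int_{Z_{1,t}} \Big[ \int_{Z_{1,t}'(\beta_1 - \beta_1^*)}^{\infty} -Z_{1,t}'\, \psi_{u_{1,t+1}}(x|Z_{1,t})\, dx
 + \int_{-\infty}^{Z_{1,t}'(\beta_1 - \beta_1^*)} Z_{1,t}'\, \psi_{u_{1,t+1}}(x|Z_{1,t})\, dx \Big]\, d\Psi_{Z_{1,t}}.
\end{aligned}
\tag{7}
$$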
3 West (1996) notes that under the recursive scheme, parameter uncertainty is also irrelevant when
$\lambda_{fh}(F B S_{fh}' + S_{fh} B'F') + \lambda_{hh} F B S_{hh} B'F' = 0$. Also, West and McCracken (1998) show that augmented
regression-based tests can remove parameter uncertainty.
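If we evaluate at the true parameter $\beta^*$ we have

$$
\begin{aligned}
F_{1,t}(\beta^*) &= -\int_{Z_{1,t}} \Big[ \int_{0}^{\infty} Z_{1,t}'\, \psi_{u_{1,t+1}}(x|Z_{1,t})\, dx
 + \int_{-\infty}^{0} -Z_{1,t}'\, \psi_{u_{1,t+1}}(x|Z_{1,t})\, dx \Big]\, d\Psi_{Z_{1,t}} \\
&= -E\, \mathrm{sgn}(u_{1,t+1}) Z_{1,t}'
\end{aligned}
\tag{8}
$$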
for a function $\mathrm{sgn}(x)$ that takes the value 1 if $x$ is nonnegative and $-1$ if $x$ is
negative. If we impose the condition that $X_t$ is strictly stationary, $F = F_t$ for all $t$.
For the test of equal MAE, $F_1$ in (8) can be consistently estimated by

$$\hat F_1 = -P^{-1} \sum_{t=R}^{T} \mathrm{sgn}(y_{t+1} - Z_{1,t}'\hat\beta_{1,t}) Z_{1,t}'. \tag{9}$$

Reintroducing $F_2$ into the discussion (and noticing that there is an extra minus
sign introduced), we can estimate $F$ consistently using $\hat F = (\hat F_1, \hat F_2)$ with
$\hat F_2$ defined as

$$\hat F_2 = P^{-1} \sum_{t=R}^{T} \mathrm{sgn}(y_{t+1} - Z_{2,t}'\hat\beta_{2,t}) Z_{2,t}'. \tag{10}$$

If we are willing to impose the stronger assumption that for each $i = 1, 2$, $u_{i,t+1}$
is independent of $Z_{i,t}$ then $F$ can also be consistently estimated by $\hat F = (\hat F_1, \hat F_2)$
where

$$\hat F_i = (-1)^i \Big( P^{-1} \sum_{t=R}^{T} \mathrm{sgn}(y_{t+1} - Z_{i,t}'\hat\beta_{i,t}) \Big) \Big( P^{-1} \sum_{t=R}^{T} Z_{i,t}' \Big). \tag{11}$$

Similar arguments can be used to derive $F$ and a consistent estimator $\hat F$ for other
test statistics. Rather than do so, for the remainder of the paper I will assume
that such an estimator $\hat F$ exists.
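
The estimators in (9), (10) and (11) are sample averages and translate directly into array operations. The sketch below is illustrative and not the paper's code. The function names are hypothetical, `e1` and `e2` are assumed to hold the out-of-sample forecast errors of the two models, and `Z1` and `Z2` are assumed to hold the matching regressor rows.

```python
import numpy as np

def sgn(x):
    """Sign function of the text: 1 for nonnegative values, -1 otherwise."""
    return np.where(np.asarray(x, dtype=float) >= 0, 1.0, -1.0)

def f_hat_mae(e1, e2, Z1, Z2):
    """Stacked estimate (F_1, F_2) following (9) and (10)."""
    Z1 = np.asarray(Z1, dtype=float)
    Z2 = np.asarray(Z2, dtype=float)
    F1 = -(sgn(e1)[:, None] * Z1).mean(axis=0)
    F2 = (sgn(e2)[:, None] * Z2).mean(axis=0)
    return np.concatenate([F1, F2])

def f_hat_mae_independent(e1, e2, Z1, Z2):
    """Alternative estimate following (11), valid under the independence assumption."""
    Z1 = np.asarray(Z1, dtype=float)
    Z2 = np.asarray(Z2, dtype=float)
    F1 = -sgn(e1).mean() * Z1.mean(axis=0)
    F2 = sgn(e2).mean() * Z2.mean(axis=0)
    return np.concatenate([F1, F2])
```

The second version replaces the average of a product with a product of averages, which is only appropriate when the disturbances are independent of the regressors.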
To complete the construction of a consistent estimate of $\Omega$, we need to
generate consistent estimates of $S_{ff}$, $S_{fh}$ and possibly $S_{hh}$. If $f_{t+\tau}$ and $h_t$ are
m-dependent of known order then Assumptions 1-5 suffice for constructing
consistent estimates of $S_{ff}$, $S_{fh}$ and $S_{hh}$ (as we will see in Theorem 2.3.2(a)). For
example, when evaluating the $\tau$-step ahead predictive ability of two models,
Swanson and White (1997) estimate $S_{ff}$ using the first $\tau - 1$ sample autocorrelations
of $f_{t,\tau}$.⁴ However, if $f_{t+\tau}$ and $h_t$ are autocorrelated of infinite order I suggest
4 It should be noted that $\tau - 1$ dependence in the levels of a forecast error does not imply $\tau - 1$
dependence of a function of those forecast errors. For example, a one-step ahead forecast error may
form a martingale difference sequence but still exhibit serial correlation in its square. See Harvey et
al. (1998) for a discussion.
using a kernel-based estimator. Such an estimator requires imposing conditions
on a kernel, $K(x)$, as well as stronger moment and mixing conditions on $g_t(\beta)$.
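Assumption 6. (a) Let $K(x)$ be a kernel such that for all $x$, $|K(x)| \le 1$, $K(x) = K(-x)$,
$K(0) = 1$, $K(x)$ is continuous, and $\int_{-\infty}^{\infty} |K(x)|\, dx < \infty$, (b) for $\varphi$ defined in
Assumption 5, some bandwidth $M$ and constant $\kappa$, $\kappa \in (0, \min(\varphi, 0.5))$, $M = O(P^{\kappa})$, and
(c) there exists $\bar r \in (1, 2]$ such that $(1 - \kappa)^{-1} < \bar r < d$ and
$\sum_{j=1}^{\infty} \alpha_j^{(\bar r^{-1} - d^{-1})} < \infty$.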
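Throughout the following, and for fixed $j \ge 0$,
$\hat\Gamma_{ff}(j) = P^{-1}\sum_{t=R+j}^{T} (f_{t,\tau}(\hat\beta_t) - \bar f)(f_{t-j,\tau}(\hat\beta_{t-j}) - \bar f)'$,
$\hat\Gamma_{fh}(j) = P^{-1}\sum_{t=R+j}^{T} (f_{t,\tau}(\hat\beta_t) - \bar f)h_{t+\tau-j}'(\hat\beta_{t-j})$ and
$\hat\Gamma_{hh}(j) = P^{-1}\sum_{t=R+j}^{T} h_{t+\tau}(\hat\beta_t)h_{t+\tau-j}'(\hat\beta_{t-j})$
where $\bar f = P^{-1}\sum_{t=R}^{T} f_{t,\tau}(\hat\beta_t)$. Furthermore, for
$j < 0$, $\hat\Gamma_{ff}(j) = \hat\Gamma_{ff}(-j)'$, $\hat\Gamma_{fh}(j) = \hat\Gamma_{fh}(-j)'$, and $\hat\Gamma_{hh}(j) = \hat\Gamma_{hh}(-j)'$.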
Theorem 2.3.2. (a) Under Assumptions 1-5, $\hat\Gamma_{ff}(j) \to_p \Gamma_{ff}(j)$, $\hat\Gamma_{fh}(j) \to_p \Gamma_{fh}(j)$,
and $\hat\Gamma_{hh}(j) \to_p \Gamma_{hh}(j)$. (b) Under Assumptions 1-6,
$\hat S_{ff} = \sum_{j=-P+1}^{P-1} K(j/M)\hat\Gamma_{ff}(j) \to_p S_{ff}$,
$\hat S_{fh} = \sum_{j=-P+1}^{P-1} K(j/M)\hat\Gamma_{fh}(j) \to_p S_{fh}$ and
$\hat S_{hh} = \sum_{j=-P+1}^{P-1} K(j/M)\hat\Gamma_{hh}(j) \to_p S_{hh}$.
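
A kernel-weighted sum of sample autocovariances of the kind in Theorem 2.3.2(b) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation. The function names are hypothetical, the Bartlett kernel is one choice that satisfies Assumption 6(a), and the sketch covers only the case where a series is paired with itself, as for the two "own" long-run variances.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel: 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))

def kernel_long_run_variance(x, M, kernel=bartlett, demean=True):
    """Sum over lags j of K(j / M) times the lag-j sample autocovariance of x.

    x is a (P,) or (P, n) array of out-of-sample observations and M is the bandwidth.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if demean:                      # subtract the out-of-sample mean
        x = x - x.mean(axis=0)
    P = x.shape[0]
    S = x.T @ x / P                 # lag 0 term
    for j in range(1, P):
        w = kernel(j / M)
        if w == 0.0:
            continue
        G = x[j:].T @ x[:P - j] / P
        S += w * (G + G.T)          # lags j and -j together
    return S
```

Passing `demean=False` matches the treatment of the orthogonality conditions, which are not centred in the definitions above. With this kernel the weight reaches zero at lag `M`, which differs slightly from implementations that divide by `M + 1`.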
We now have all the tools necessary to conduct asymptotically valid out-of-sample
inference concerning the moments of nonsmooth functions of parametric
forecasts and forecast errors. For example, given $\hat F$, $\hat B$ and $\hat\pi$ such that
$\hat F \to_p F$, $\hat B \to_p B$ and $\hat\pi \to \pi$ we can use Theorem 2.3.2 to create $\hat S_{ff}$, $\hat S_{fh}$, $\hat S_{hh}$ such
that $\hat\Omega = \hat S_{ff} + \hat\lambda_{fh}(\hat F\hat B\hat S_{fh}' + \hat S_{fh}\hat B'\hat F') + \hat\lambda_{hh}\hat F\hat B\hat S_{hh}\hat B'\hat F' \to_p \Omega$. Then, using
Theorem 2.3.1, we know that $s_T \equiv \hat\Omega^{-0.5}P^{-0.5}\sum_{t=R}^{T} (f_{t,\tau}(\hat\beta_t) - \theta_0) \to_d N(0, I_l)$. If
$l = 1$ we can use standard normal tables to test the null. If $l > 1$ we can use the
fact that $s_T's_T \to_d \chi^2(l)$ and hence chi-square tables can be used to test the null.
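
Putting the pieces together for a scalar loss differential might look like the following sketch. It is illustrative only. The function names are hypothetical, and the inputs are assumed to be estimates already in hand: the three long-run covariance blocks, the derivative matrix, the matrix from Assumption 1, and the two scheme weights.

```python
import math
import numpy as np

def omega_hat(S_ff, S_fh, S_hh, F, B, lam_fh, lam_hh):
    """Variance estimate with the same three-term structure as (5)."""
    S_ff, S_fh, S_hh, F, B = (np.atleast_2d(np.asarray(a, dtype=float))
                              for a in (S_ff, S_fh, S_hh, F, B))
    FB = F @ B                                  # (l x q)
    cross = FB @ S_fh.T + S_fh @ FB.T           # symmetric (l x l) cross term
    return S_ff + lam_fh * cross + lam_hh * (FB @ S_hh @ FB.T)

def scalar_test(f, theta0, omega):
    """Standardised statistic and two-sided normal p-value when l = 1."""
    f = np.asarray(f, dtype=float)
    P = f.shape[0]
    # P^{-1/2} * sum(f - theta0) equals sqrt(P) * (mean(f) - theta0).
    stat = math.sqrt(P) * (f.mean() - theta0) / math.sqrt(omega)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

Setting both weights to zero reduces the variance to its first term, which corresponds to ignoring parameter uncertainty.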
3. Empirical evidence
In this forecasting exercise I apply the theory developed in Section 2 to the
test of equal MAE in the prediction of excess returns to the S&P 500 composite.
For simplicity, I choose to compare the predictive ability of two simple linear
regressions. For the first model I follow Fama and French (1988) and use the
dividend yield and a constant to predict excess returns. For the second model
I follow Campbell and Shiller (1988) and use the earnings-price ratio and
a constant to predict excess returns. Shiller (1984) also compares the predictability
of dividend yields to the predictability of earnings-price ratios for the
S&P 500 composite. See Fama (1991) for a review of these and other results.
For the sake of comparison, I construct three distinct tests of equal predictive
ability. First I construct a test for equal MAE using the methods discussed in
Section 2 that account for effects due to parameter uncertainty. I then construct
the test for equal MAE ignoring parameter uncertainty using the statistic
proposed in Diebold and Mariano (1995). Finally, I construct the test for equal
MSE ignoring the potential effects of parameter uncertainty. Under the assumption
that OLS provides consistent estimates of the parameters, West (1996) has
shown that one can ignore parameter uncertainty when testing for equal MSE.
3.1. Data and sources
The sample period includes 519 monthly observations from 1954:01 to
1997:03. The starting point 1954:01 is chosen to avoid the Treasury-Fed Accord
to peg interest rates. It is also the first month for which monthly frequency
observations for dividend yield exist for the S&P 500 composite.
I use the closing value of the S&P 500 composite as of the final Wednesday of
the month as the stock price ($P_t$). These are obtained from Standard and Poor's
Current Statistics (1997) and Security Price Index Record (1997). The one-month
risk-free rate ($I_t$), used to construct excess returns, is the US Treasury Bill
series obtained from Ibbotson Associates (1997). Using these two series I construct
excess returns as $\mathrm{Return}_t = (P_t + D_t - P_{t-1})/P_{t-1} - I_{t-1}$. Standard and
Poor's Statistical Service does not publish the monthly dividend series ($D_t$).
I construct one by summing the present and previous three quarter aggregate
dividends and dividing by 12. Pesaran and Timmermann (1995) also use this
technique.
The two predictors are dividend yield ($DY_{t-1}$) and the earnings-price ratio
($EP_{t-1}$). To ensure that the predictors are truly ex ante I do not use the dividend
series ($D_t$) constructed above since it includes information through the end of the
present quarter. Instead, I use the dividend yield as reported in the Standard and
Poor's Security Price Index Record at the end of each month. For the same
reasons, I use the inverse of the price-earnings ratio rather than construct an
earnings-price ratio using quarterly information on earnings.
Table 1 reports standard descriptive statistics regarding OLS regressions that
use the dividend yield or the earnings-price ratio as predictors. Each regression
exhibits little linear predictability. The residuals in each regression have
distributions that are skewed and heavy tailed. The residuals exhibit little serial
correlation but are conditionally heteroskedastic in the regressors and exhibit
ARCH-type behavior.
3.2. Methodology and results
Let the scalar $y_{t+1}$ denote $\mathrm{Return}_{t+1}$ and let $Z_{1,t}$ and $Z_{2,t}$ denote the $(2 \times 1)$
vectors $(1, DY_t)'$ and $(1, EP_t)'$, respectively. We are interested in comparing the
predictive ability of the two simple linear regression models

$$y_{t+1} = Z_{1,t}'\beta_1^* + u_{1,t+1} \quad \text{and} \quad y_{t+1} = Z_{2,t}'\beta_2^* + u_{2,t+1}. \tag{12}$$
Table 1
Summary statistics for full sample regressions of excess returns to S&P 500 composite

Panel A: Unrestricted linear regression using both dividend yield and earnings-price ratio

Predictors      Constant     DY           EP
Coefficient     -0.0115      0.0099       -2.6716
(S.E.)          (0.0084)     (0.0054)     (2.1461)

$R^2$ = 0.0079    DW = 1.9416    Skewness coefficient = -0.3539    Kurtosis coefficient = 2.0389
LM test for heteroskedasticity in residuals:        $\chi^2(5)$ = 10.7396 with p-value = 0.0567
LM test for serial correlation in residuals:        $\chi^2(12)$ = 13.0227 with p-value = 0.3674
LM test for serial correlation in squared residuals: $\chi^2(12)$ = 26.6380 with p-value = 0.0087

Panel B: Restricted linear regression using dividend yield

Predictors      Constant     DY
Coefficient     -0.0058      0.0032
(S.E.)          (0.0079)     (0.0022)

$R^2$ = 0.0048    DW = 1.9399    Skewness coefficient = -0.3714    Kurtosis coefficient = 1.9942
LM test for heteroskedasticity in residuals:        $\chi^2(2)$ = 6.0936 with p-value = 0.0475
LM test for serial correlation in residuals:        $\chi^2(12)$ = 12.8804 with p-value = 0.3778
LM test for serial correlation in squared residuals: $\chi^2(12)$ = 27.5580 with p-value = 0.0064

Panel C: Restricted linear regression using earnings-price ratio

Predictors      Constant     EP
Coefficient     0.0052       0.7589
(S.E.)          (0.0061)     (0.8570)

$R^2$ = 0.0020    DW = 1.9406    Skewness coefficient = -0.3568    Kurtosis coefficient = 2.0280
LM test for heteroskedasticity in residuals:        $\chi^2(2)$ = 8.4639 with p-value = 0.0145
LM test for serial correlation in residuals:        $\chi^2(12)$ = 12.6658 with p-value = 0.3938
LM test for serial correlation in squared residuals: $\chi^2(12)$ = 27.3262 with p-value = 0.0069

Notes: The data consist of monthly observations from 1954:01 to 1997:03 ($T = 519$). See Section
3 of the text for a description of the data. Standard errors are constructed using a heteroskedasticity
robust covariance matrix. The skewness and kurtosis coefficients are constructed using the
regression residuals.
The parameters are estimated using OLS and then the parameter estimates
$\hat\beta_{i,t}$ are used to construct the forecasts $Z_{i,t}'\hat\beta_{i,t}$.
In this exercise I construct each of the three test statistics nine different
ways corresponding to three different forecasting schemes (recursive, rolling
and fixed) and three different splits of the data. I use the three sample
splits (54:01-89:12, 90:01-97:03), (54:01-79:12, 80:01-89:12) and (54:01-79:12,
80:01-97:03). Given these splits, the corresponding values of $\hat\pi = P/R$ are 0.20,
0.38 and 0.66.
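
The three reported ratios can be reproduced by counting months in each window. This is a rough check with hypothetical helper names, and it assumes that the in-sample and out-of-sample sizes equal the number of calendar months in each window, ignoring any observation lost to lagging the predictors.

```python
def months_between(start, end):
    """Inclusive number of months between two (year, month) pairs."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1]) + 1

def pi_hat(in_sample, out_of_sample):
    """Ratio P / R for a pair of (start, end) windows."""
    R = months_between(*in_sample)
    P = months_between(*out_of_sample)
    return P / R

splits = [
    (((1954, 1), (1989, 12)), ((1990, 1), (1997, 3))),
    (((1954, 1), (1979, 12)), ((1980, 1), (1989, 12))),
    (((1954, 1), (1979, 12)), ((1980, 1), (1997, 3))),
]
ratios = [round(pi_hat(r, p), 2) for r, p in splits]
```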
To construct the test for equal MAE that accounts for parameter uncertainty,
an asymptotically valid variance, $\hat\Omega$ (from Theorem 2.3.2), is constructed. When
estimating the variance, I presume no knowledge regarding the existence
of heteroskedasticity or serial correlation. I use a Newey-West (1987) serial
correlation consistent covariance estimator of $S_{ff}$, $S_{fh}$ and $S_{hh}$.⁵ I use the
out-of-sample forecast errors and out-of-sample values of $y_{t+1}$, $Z_{1,t}$ and $Z_{2,t}$
in the construction of $f_{t,1}(\hat\beta_t) = |y_{t+1} - Z_{1,t}'\hat\beta_{1,t}| - |y_{t+1} - Z_{2,t}'\hat\beta_{2,t}|$ and
$h_{t+1}(\hat\beta_t) = [(y_{t+1} - Z_{1,t}'\hat\beta_{1,t})Z_{1,t}', (y_{t+1} - Z_{2,t}'\hat\beta_{2,t})Z_{2,t}']'$. I estimate $B$ using the
out-of-sample observations on $Z_{1,t}$ and $Z_{2,t}$ to form the $(4 \times 4)$ block diagonal
matrix with $(P^{-1}\sum_{t=R}^{T} Z_{1,t}Z_{1,t}')^{-1}$ in the upper $(2 \times 2)$ diagonal position and
$(P^{-1}\sum_{t=R}^{T} Z_{2,t}Z_{2,t}')^{-1}$ in the lower $(2 \times 2)$ diagonal position. To estimate $F$ I use
(9) and (10) directly.
The test of equal MAE is constructed a second time ignoring parameter
uncertainty. This time the variance is estimated only using an estimate of $S_{ff}$.
The estimate was identical to the one used above.
For the sake of comparison, the test of equal MSE was also constructed. For
this test, $f_{t,1}(\hat\beta_t) = (y_{t+1} - Z_{1,t}'\hat\beta_{1,t})^2 - (y_{t+1} - Z_{2,t}'\hat\beta_{2,t})^2$. Under the assumption
that OLS provides consistent estimates of the parameters, $F = 0$. Using the
results in West (1996) we then know that we can ignore parameter uncertainty
when estimating the asymptotic variance. In estimating $S_{ff}$, I presume no
knowledge regarding the existence of serial correlation. Once again I use the
Newey-West (1987) estimator.
Table 2 reports the results of the tests. Each subpanel corresponds to one of
the sample splits. The first four columns report the raw out-of-sample MAE and
MSE associated with each of the two predictive models. The MAE values are
scaled by 100 and the MSE values are scaled by 1000. Note that in every
instance, the MAE and MSE are larger for model 1 than for model 2.
Column 5 reports the test for equal MAE that accounts for parameter
uncertainty. Column 6 reports the test for equal MAE that ignores parameter
uncertainty. In every instance, accounting for parameter uncertainty increases
the magnitude of the estimated variance. This causes the statistics that account
for parameter uncertainty to be uniformly smaller than the ones that do not.
This effect can also be seen in the p-values reported in columns 8 and 9. Because
of these changes, there are instances in which accounting for parameter uncertainty
can affect the decision to reject or fail to reject the null of equal MAE.
During the 1980s, there does not appear to be any difference in the predictive
ability using either of the two models. This holds whether we use MAE or MSE
as the measure of predictive ability. For this time frame, accounting for parameter
uncertainty made little difference in the tests for equal MAE.
The same cannot be said during the 1990s. Ignoring parameter uncertainty,
one would find that model two has a lower MAE at (approximately) the 1%
level regardless of which forecasting scheme is used. Accounting for parameter
uncertainty, we fail to reject at the 5% level using either the rolling or fixed
5 I use the integer part of $P^{1/3}$ as the window width.
Table 2
Testing for relative predictive ability of predictions of excess returns to S&P 500 composite

                        Raw values                       Statistics                P-values (2-sided)
                 MAE-1   MAE-2   MSE-1   MSE-2     Adj.    UnAdj.  UnAdj.     Adj.    UnAdj.  UnAdj.
                                                   MAE     MAE     MSE        MAE     MAE     MSE
90:01-97:03 ($\hat\pi = 0.20$)
  Recursive      2.691   2.614   1.183   1.153     1.990   2.565   1.726      0.047   0.010   0.084
  Rolling        2.688   2.634   1.183   1.163     1.541   2.354   1.455      0.123   0.019   0.146
  Fixed          2.724   2.619   1.199   1.154     1.899   2.540   1.765      0.058   0.011   0.078
80:01-89:12 ($\hat\pi = 0.38$)
  Recursive      3.551   3.536   2.280   2.271     0.506   0.515   0.339      0.613   0.606   0.734
  Rolling        3.552   3.551   2.284   2.282     0.046   0.048   0.115      0.963   0.962   0.909
  Fixed          3.557   3.534   2.270   2.253     0.539   0.590   0.432      0.590   0.555   0.666
80:01-97:03 ($\hat\pi = 0.66$)
  Recursive      3.189   3.148   1.819   1.801     1.724   1.844   0.987      0.085   0.065   0.324
  Rolling        3.203   3.180   1.828   1.817     1.154   1.326   0.768      0.249   0.185   0.442
  Fixed          3.228   3.153   1.831   1.792     1.610   2.223   1.364      0.107   0.026   0.173

Notes: Table 2 reports empirical results relevant to testing for equal MAE and equal MSE between two
models used to predict the S&P 500 composite portfolio. Model 1 is an OLS estimated linear regression
with an intercept and once lagged dividend yield. Model 2 is the same but uses the earnings-price ratio. The
first four columns report the realized out-of-sample values of the MAE and MSE associated with each
model during three different forecast periods. Column 5 reports the values of the test for equal MAE
adjusted (Adj.) for parameter uncertainty. Columns 6 and 7 report the values of the statistics used to
construct the tests for equal MAE and equal MSE both ignoring parameter uncertainty (UnAdj.). Columns
8-10 report the p-values (2-sided, from the standard normal distribution) associated with the statistics in
columns 5-7. MAEs are scaled by 100. MSEs are scaled by 1000.
forecasting schemes. We do reject at the 5% level when the recursive scheme is used
but the evidence is weaker than when parameter uncertainty is ignored. Notice
that during the 1990s the test for equal MSE fails to reject the null of equal
predictive ability at the 5% level for any of the sampling schemes. The null can be
rejected at the 10% level when either the recursive or fixed schemes are used.
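
The two-sided p-values in Table 2 follow from the reported statistics under the standard normal distribution. The short sketch below, with a hypothetical helper name, shows the calculation for the 1990s statistics that account for parameter uncertainty.

```python
import math

def two_sided_p_value(stat):
    """Two-sided p-value of a statistic under the standard normal distribution."""
    return math.erfc(abs(stat) / math.sqrt(2.0))

# Statistics for 90:01-97:03 from Table 2, adjusted for parameter uncertainty.
adjusted_mae_stats = {"Recursive": 1.990, "Rolling": 1.541, "Fixed": 1.899}
p_values = {scheme: two_sided_p_value(s) for scheme, s in adjusted_mae_stats.items()}
```

Only the recursive scheme falls below the 5% threshold, which matches the discussion above.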
Similar observations can be made regarding the tests for equal MAE throughout
both the 1980s and 1990s. Ignoring parameter uncertainty, the fixed scheme
rejects the null at the 5% level. When parameter uncertainty is accounted for we
fail to reject at even the 10% level. When the rolling scheme is used we fail to
reject the null at the 10% level regardless of parameter uncertainty. When the
recursive scheme is used we reject the null at the 10% level regardless of
parameter uncertainty. Over the same time frame, we fail to reject the null for
equal MSE when any of the forecasting schemes are used.
Clearly, there are instances where accounting for parameter uncertainty
affects the decision to reject or fail to reject the null of equal MAE. What is not
clear is whether it was necessary to account for parameter uncertainty in the first
place. Recall that if $F = 0$ then parameter uncertainty is asymptotically irrelevant
and hence $S_{ff}$ is the relevant asymptotic variance. For the test of equal
MAE, $F$ can be zero if the disturbances have a zero median conditional on the
values of the regressors. In this application it seems reasonable to reject that
assertion. Using the skewness coefficients reported in Table 1, the null of zero
skewness is rejected at the 1% level for each of the three sets of residuals.
4. Simulation evidence
The asymptotic results of Section 2 need only be appropriate for large
in-sample sizes $R$ and out-of-sample sizes $P$. It is not clear how well the
asymptotic approximation will perform in sample sizes commonly used in
empirical work. To examine this problem, I present simulations of the three tests
of either equal MAE or equal MSE between the two simple linear regressions

$$y_{t+1} = Z_{1,t}'\beta_1^* + u_{1,t+1} \quad \text{and} \quad y_{t+1} = Z_{2,t}'\beta_2^* + u_{2,t+1}, \tag{12$'$}$$

where $Z_{1,t}$ and $Z_{2,t}$ denote the $(2 \times 1)$ vectors $(1, z_{1,t})'$ and $(1, z_{2,t})'$, respectively.
Each statistic is constructed in precisely the same manner as in Section 3. For
each statistic I report the size and size-adjusted power of the test in samples of
the size used in Section 3.
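First, I simulate a hypothetical data generating process that is stylized to the
empirical results of Section 3. The data generating process I have chosen has the
representation

$$
\begin{aligned}
y_{t+1} &= \beta_{2,2}^* z_{2,t+1} + u_{t+1}, \qquad u_{t+1} = \gamma(x_{1,t+1} + x_{2,t+1}) + \eta_{t+1}, \\
x_{i,t+1} &= 2^{-0.5}[(1 - \alpha^2) z_{i,t+1}^2 - 1], \qquad z_{i,t+1} = \alpha z_{i,t} + \varepsilon_{i,t+1}, \\
\varepsilon_{i,t+1} &\sim \text{i.i.d. } N(0, 1), \quad \eta_{t+1} \sim \text{i.i.d. } t(6), \quad \varepsilon_{1,t+1} \perp \varepsilon_{2,t+1} \perp \eta_{t+1}, \\
\gamma &= 0.25, \quad \alpha = 0.9.
\end{aligned}
\tag{13}
$$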
The parameter $\beta_{2,2}^*$ (the second component of $\beta_2^*$) is a tuning parameter used to
distinguish between the null and alternative. When $\beta_{2,2}^* = 0$ the null of either
equal MAE or equal MSE is satisfied. When $\beta_{2,2}^* \ne 0$ the alternative holds;
model 2 has both a lower MAE and lower MSE than does model 1. I allow
this parameter to vary across the range 0, 0.10, 0.25, 0.50 and 1.00. By doing so
I am better able to determine how accounting for parameter uncertainty affects
the power of the test. I am also better able to determine whether tests for equal
MAE or tests for equal MSE are more powerful for detecting small deviations
from the null.
The initial conditions for the $z_{i,t}$ are drawn from their unconditional distribution.
The $y_{t+1}$ are then constructed using (13). Each series is of length
$500 + 519 = 500 + (T + 1) = 1019$. The initial 500 observations are generated
to burn out the effects of initial conditions.
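
The regressor and disturbance parts of (13) can be simulated along the following lines. This is a sketch under assumptions rather than the paper's code. The function name, the seed handling and the array layout are hypothetical, the default values 0.9 and 0.25 are the autoregressive coefficient and the loading stated in (13), and the final step that builds the predictand from the regressors is deliberately left out.

```python
import numpy as np

def simulate_regressors_and_disturbance(n, alpha=0.9, gamma=0.25, burn_in=500, seed=0):
    """Return (z, x, u): two AR(1) regressors, their centred squares, and the disturbance."""
    rng = np.random.default_rng(seed)
    total = n + burn_in
    eps = rng.standard_normal((total, 2))          # independent N(0, 1) innovations
    eta = rng.standard_t(6, size=total)            # independent t(6) noise
    z = np.empty((total, 2))
    # Initial condition drawn from the unconditional N(0, 1 / (1 - alpha^2)) distribution.
    z[0] = rng.standard_normal(2) / np.sqrt(1.0 - alpha ** 2)
    for t in range(1, total):
        z[t] = alpha * z[t - 1] + eps[t]
    x = ((1.0 - alpha ** 2) * z ** 2 - 1.0) / np.sqrt(2.0)   # mean zero, skewed
    u = gamma * x.sum(axis=1) + eta
    return z[burn_in:], x[burn_in:], u[burn_in:]
```

Each centred square is a rescaled chi-square variate with mean zero, which is where the skewness of the disturbance comes from, and the persistence of the regressors carries over to the squared disturbance.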
The results are based upon 5000 replications. Note that the same simulated
data is used for each sampling scheme and each ($P$, $R$) combination in order to
facilitate the different small sample comparisons. To make comparisons possible
across the three hypothesis tests the random number generator is seeded so that
the three sets of 5000 separate samples are the same.
I chose this data generating process for two basic reasons. The "rst is that it
exhibits many of the characteristics of the data used in Section 3. The regressors
exhibit strong serial dependence. The distribution of the predictand, y , has
t
heavy tails and is skewed. The residuals from the two predictive models will
exhibit conditional heteroskedasticity in the regressors. Also, since the regressors are serially correlated, the squares of the residuals from the two
predictive models will be serially correlated (i.e. GARCH(1, 1)-like e!ects).
The second reason is that I wanted the two linear models to have little, if any, predictive ability in order to match the very small $R^2$ values commonly observed in the literature. Here, both predictive models have a population $R^2$ of zero. This occurs because both $\beta^*_1$ and $\beta^*_2$ are zero under the null. This implies that both models have the same predictive ability and hence the null is satisfied.
But it does more than that. It implies that $f_{t+1}$ and $S_{ff}$ are equal to zero when either MAE or MSE is used to measure predictive ability. This does not imply that an asymptotically standard normal test for equal predictive ability cannot be constructed. For there to be asymptotic normality the limiting variance, $\Omega$, must be positive definite. If parameter uncertainty is irrelevant then $S_{ff}$ must be positive definite. On the other hand, if $f_{t+1}$ is zero for all $t$, and hence $S_{ff} = F B S'_{fh} = 0$, then $F B S_{hh} B' F'$ must be positive definite.
For the test of equal MAE in this exercise, and when parameter uncertainty is accounted for, this is not a problem. This occurs because the disturbances are skewed for each predictive model and hence $F = (-E\,\mathrm{sgn}(u_{1,t+1}) Z'_{1,t},\; E\,\mathrm{sgn}(u_{2,t+1}) Z'_{2,t})' \neq 0$. It is a problem for the test of equal MSE since consistent estimation of the parameters by OLS implies $F = (-E\,u_{1,t+1} Z'_{1,t},\; E\,u_{2,t+1} Z'_{2,t})' = 0$. Hence, a priori we expect the test for equal MAE, corrected for parameter uncertainty, to be reasonably sized. We also expect the test for equal MAE, without the correction for parameter uncertainty, and the test for equal MSE to be missized.
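The contrast between the two forms of $F$ can be checked numerically. The sketch below is illustrative only and does not use the design in (13): it assumes a constant and a single standard normal regressor, disturbances drawn as a centered chi-square(3) variable (mean zero, right-skewed), and population coefficients of zero. The names `Z`, `u_hat`, `moment_mse` and `moment_mae` are hypothetical. Because OLS residuals are orthogonal to the regressors by construction, the sample analogue of $E\,u Z'$ is numerically zero, while the sample analogue of $E\,\mathrm{sgn}(u) Z'$ is not, since a skewed mean-zero disturbance has a nonzero median.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Regressors: a constant and one standard normal variable (assumed design).
z = rng.standard_normal(n)
Z = np.column_stack([np.ones(n), z])

# Mean-zero, right-skewed disturbance: chi-square(3) minus its mean of 3.
u = rng.chisquare(3, size=n) - 3.0
y = u  # population coefficients are zero, so y is pure disturbance

# OLS fit and residuals.
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
u_hat = y - Z @ beta_hat

# Sample analogues of E[u Z'] (squared-error loss) and E[sgn(u) Z'] (absolute-error loss).
moment_mse = (u_hat[:, None] * Z).mean(axis=0)
moment_mae = (np.sign(u_hat)[:, None] * Z).mean(axis=0)

print("E[u Z']      ~", moment_mse)
print("E[sgn(u) Z'] ~", moment_mae)
```

In this assumed design the nonzero element of the MAE moment is the one attached to the constant, which equals the probability of a positive disturbance minus the probability of a negative one.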
Table 3 reports the actual size of the three tests when the critical values $\pm 2.576$, $\pm 1.96$ and $\pm 1.645$ are used. When $\hat{\pi} = 0.20$ the test for equal MAE that accounts for parameter uncertainty is reasonably well sized. It is also uniformly more accurate than both the test for equal MAE without the correction for parameter uncertainty and the test for equal MSE. When $\hat{\pi}$ is either 0.38 or 0.66 it is harder to make such a uniform statement. In every case the latter two tests are seriously oversized. At the same time the test of equal MAE that accounts for parameter uncertainty tends to be undersized. This is particularly
Table 3
Actual size of out-of-sample tests^a

                 Valid MAE                 Invalid MAE                Invalid MSE
           1%      5%      10%       1%      5%      10%       1%      5%      10%

$\hat{\pi} = 0.20$
  R      0.0118  0.0754  0.1416    0.0758  0.2048  0.3032    0.0642  0.2220  0.3382
  L      0.0120  0.0784  0.1484    0.0670  0.1974  0.2986    0.0578  0.2116  0.3336
  F      0.0134  0.0720  0.1504    0.1130  0.2556  0.3542    0.1048  0.2798  0.3916

$\hat{\pi} = 0.38$
  R      0.0072  0.0418  0.0968    0.0434  0.1624  0.2604    0.0344  0.1704  0.2910
  L      0.0082  0.0476  0.1064    0.0434  0.1536  0.2484    0.0344  0.1604  0.2722
  F      0.0046  0.0362  0.0898    0.1046  0.2522  0.3462    0.0908  0.2632  0.3796

$\hat{\pi} = 0.66$
  R      0.0036  0.0328  0.0768    0.0386  0.1382  0.2224    0.0324  0.1388  0.2376
  L      0.0056  0.0458  0.0976    0.0330  0.1188  0.1984    0.0250  0.1200  0.2142
  F      0.0012  0.0194  0.0596    0.1074  0.2626  0.3524    0.0904  0.2618  0.3770

^a Notes: Subpanels denoted $\hat{\pi} = 0.20$, 0.38 and 0.66 indicate sample sizes and splits corresponding to
those used in the empirical results reported in Table 2. Columns denoted 1%, 5% and 10% present
the actual size of the test when the critical values $\pm 2.576$, $\pm 1.96$ and $\pm 1.645$ are used,
respectively. Rows denoted R, L and F signify the use of the Recursive, roLLing and Fixed schemes,
respectively. The results are based upon 5000 replication