07350015%2E2012%2E727721

Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Markov-Switching MIDAS Models
Pierre Guérin & Massimiliano Marcellino
To cite this article: Pierre Guérin & Massimiliano Marcellino (2013) MarkovSwitching MIDAS Models, Journal of Business & Economic Statistics, 31:1, 45-56, DOI:
10.1080/07350015.2012.727721
To link to this article: http://dx.doi.org/10.1080/07350015.2012.727721

View supplementary material

Accepted author version posted online: 12
Sep 2012.

Submit your article to this journal

Article views: 626

View related articles


Citing articles: 1 View citing articles

Full Terms & Conditions of access and use can be found at
http://www.tandfonline.com/action/journalInformation?journalCode=ubes20
Download by: [Universitas Maritim Raja Ali Haji]

Date: 11 January 2016, At: 21:59

Markov-Switching MIDAS Models
Pierre GUÉRIN
Bank of Canada, Ottawa, ON K1A 0G9 (pguerin@bankofcanada.ca)

Massimiliano MARCELLINO

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

European University Institute, Florence, Italy; Bocconi University, Milan, Italy; and CEPR, London, United Kingdom
(massimiliano.marcellino@eui.eu)
This article introduces a new regression model—Markov-switching mixed data sampling (MS-MIDAS)—
that incorporates regime changes in the parameters of the mixed data sampling (MIDAS) models and

allows for the use of mixed-frequency data in Markov-switching models. After a discussion of estimation
and inference for MS-MIDAS and a small sample simulation-based evaluation, the MS-MIDAS model
is applied to the prediction of the U.S. economic activity, in terms of both quantitative forecasts of
the aggregate economic activity and the prediction of the business cycle regimes. Both simulation and
empirical results indicate that MS-MIDAS is a very useful specification.
KEY WORDS: Business cycle; Forecasting; Mixed-frequency data; Nonlinear models; Nowcasting.

1.

INTRODUCTION

Marcellino, and Schumacher (2011) compared the forecasting
performance of MIDAS and MF-VAR models for the prediction
of the quarterly GDP growth in the Euro area. They find that
MIDAS models outperform MF-VAR for short horizons (up to
5 months), while MF-VAR tend to perform better for longer
horizons. A similar comparison is provided by Bai, Ghysels,
and Wright (in press).
An issue that has so far attracted limited attention in the
MIDAS literature is the stability of the relationship between the

high- and low-frequency variables. Time-variation in MIDAS
models has been only introduced by Galvao (2009) via a smooth
transition function governing the change in some parameters
of the model. This Smooth Transition MIDAS is applied to
the prediction of quarterly U.S. GDP using weekly and daily
financial variables.
In this article, we propose an alternative way to allow for
time-variation in the MIDAS model, introducing the Markovswitching MIDAS (MS-MIDAS) model. Regime changes may
result from asymmetries in the process of the mean or variance.
From an economic point of view, the predicting ability of the
higher-frequency variables could change across regimes following, for example, changes in market conditions or business cycle
phases. For example, the slope of the yield curve is often considered as a strong predictor of U.S. recessions, an inverted yield
curve signaling a forthcoming recession. However, Galbraith
and Tkacz (2000) argued that the predictive power of the slope
of the yield curve is limited in normal times. Therefore, it could
be important to permit time-variation in the predictive ability
of the high-frequency data. Indeed, our empirical applications
show that in general the predictions from MS-MIDAS models
are more accurate than those from simple MIDAS models.
An additional attractive feature of Markov-switching models

is the possibility of estimating and predicting the probabilities
of being in a given regime. The literature (see e.g., Estrella

The econometrician often faces a dilemma when observations
are sampled at different frequencies. One solution consists of
estimating the model at the lowest frequency, temporally aggregating the high-frequency data. However, this solution is not
fully satisfactory since important information can be discarded
in the aggregation process. A second solution is to temporally
disaggregate (interpolate) the low frequency variables. However, there is no agreement on the proper interpolation method,
and the resulting high-frequency variables would be affected by
measurement error.
The third option is represented by regression models that
combine variables sampled at different frequencies. They are
particularly attractive since they can use the information of
high-frequency variables to explain variables sampled at a lower
frequency without any prior aggregation or interpolation. In this
context, the MIDAS (Mixed Data Sampling) model by Ghysels,
Santa-Clara, and Valkanov (2004) has recently gained considerable attention. A crucial feature of this class of models is the
parsimonious way of including explanatory variables through a
weighting function, which can take various shapes depending

on the value of its parameters.
MIDAS models have been applied for predicting both
macroeconomic and financial variables. Ghysels, Santa-Clara,
and Valkanov (2006) used the MIDAS framework to predict
the volatility of equity returns, while Clements and Galvao
(2008) successfully applied MIDAS models to the prediction of
quarterly U.S. GDP growth using monthly indicators as highfrequency variables. Andreou, Ghysels, and Kourtellos (2010)
exploited the informational content of daily financial variables
to predict quarterly GDP and inflation in the United States. In
particular, they extend the standard MIDAS model to include
factors in a dynamic framework, along the lines of Marcellino
and Schumacher (2010).
MIDAS models are generally used as single-equation models
where the dynamics of the indicator is not modeled. By contrast,
system-based models such as the mixed-frequency VAR (MFVAR) explicitly model the dynamics of the indicator. Kuzin,

© 2013 American Statistical Association
Journal of Business & Economic Statistics
January 2013, Vol. 31, No. 1
DOI: 10.1080/07350015.2012.727721

45

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

46

Journal of Business & Economic Statistics, January 2013

and Mishkin 1998; Birchenhall et al. 1999) often uses binary
response models to predict the state of the economy using the
NBER dating of expansions and contractions as a dependent
variable. However, this method can be problematic since the
announcements of turning points may be published up to 20
months after the turning point has actually occurred. Our MSMIDAS model instead allows for real-time evaluation and forecasting of the probability of being in a given regime.
Finally, MS-MIDAS is also a convenient approach to allow
for the use of mixed frequency information in standard Markovswitching models. Hamilton (2011) pointed out the importance
of using models with mixed frequency data for predicting recessions in real time. In our applications, the forecasting performance of standard MS models is indeed improved by the use of
higher frequency information. In addition, MS-MIDAS models
clearly outperform the linear and nonlinear benchmark models we consider at the nowcasting horizon in our two empirical
applications.

The article is organized as follows. Section 2 reviews the MIDAS approach, introduces the MS-MIDAS, and discusses the
estimation method. Section 3 presents Monte Carlo simulations
to assess the accuracy of the proposed estimation method in finite samples and its forecasting accuracy. Section 4 presents two
empirical applications on modeling and forecasting the levels
and turning points of quarterly GDP growth in the United States
using monthly indicators, and monthly U.S. IP growth using
weekly indicators. Section 5 concludes. An online Appendix
with an in-sample Monte Carlo experiment and an additional
empirical application with UK data completes the article.
2.

MARKOV-SWITCHING MIDAS

mial of lagged coefficients b(j ; θ ). A popular choice for the
weighting scheme is the exponential Almon lag:
exp(θ1 j + · · · + θQ j Q )
.
b(j ; θ ) = K
Q
j =1 exp(θ1 j + · · · + θQ j )


(2)

Note that the weighting function of the exponential Almon lag
implies that the
weights are always positive. Also, we impose
the restriction K
j =1 b(j ; θ ) = 1 to ensure that the parameters β1
and θ are identified. In the empirical applications, we employ
the exponential Almon lag scheme with two parameters θ =
{θ1 , θ2 }.
2.1.2 Autoregressive MIDAS. Introducing an autoregressive lag in the MIDAS specification is not straightforward, as
pointed out by Clements and Galvao (2008) who showed that a
seasonal response of y to x can appear. However, they find that
this can be done without generating any seasonal patterns if autoregressive dynamics is introduced through a common factor,
so that Equation (1) becomes
(m)
+ ǫt .
yt = β0 + λyt−d + β1 B(L1/m ; θ )(1 − λLd )xt−h


(3)

Andreou, Ghysels, and Kourtellos (2011) instead reported that
this specification has undesirable properties since the lag polynomial is characterized by geometrically declining spikes at
the low frequency due to the interaction of the high-frequency
polynomial with the low-frequency polynomial. Hence, we also
consider the ADL-MIDAS specification described by Andreou,
Ghysels, and Kourtellos (2010). This specification can be written as
(m)
yt = β0 + λyt−d + β1 B(L1/m ; θ )xt−h
+ ǫt .

(4)

2.1 MIDAS Approach
2.1.1 Basic MIDAS. The MIDAS approach by Ghysels,
Santa-Clara, and Valkanov (2004) involved the regression of
variables sampled at different frequencies. Following the notation by Clements and Galvao (2008), and assuming that the
model is specified for h-step ahead forecasting, the basic univariate MIDAS model is given by
(m)

+ ǫt ,
(1)
yt = β0 + β1 B(L1/m ; θ )xt−h

(m)
K
where B(L1/m ; θ ) = j =1 b(j ; θ )L(j −1)/m and Ls/m xt−1 =
(m)
. Note that t refers to the time unit of the dependent
xt−1−s/m
variable yt and m to the time unit of the higher-frequency vari(m)
ables xt−h
.
The forecasts of the MIDAS regression are computed directly
so that no forecasts for the explanatory variables are required.
However, unlike iterated forecasts, direct forecasts require to reestimate the model when the forecasting horizon changes (see
Chevillon and Hendry 2005; Marcellino, Stock, and Watson
2006, for a comparison of the relative merits of iterated and
direct forecasts).
The crucial difference between MIDAS and Autoregressive

Distributed Lag models is that the content of the higher frequency variable is exploited in a parsimonious way through
the polynomial b(j ; θ ), which allows to have a rich variety of
shapes with a limited number of parameters. Ghysels, Sinko, and
Valkanov (2007) detailed various specifications for the polyno-

2.2 Markov-Switching MIDAS
2.2.1 The Model. The basic idea behind Markovswitching models is that the parameters of the underlying data
generating process (DGP) depend on an unobservable discrete
variable St , which represents the probability of being in a different state of the world (see Hamilton 1989). The basic version of
the Markov-switching MIDAS (MS-MIDAS) regression model
we propose is
(m)
yt = β0 (St ) + β1 (St )B(L1/m ; θ )xt−h
+ ǫt (St ),

(5)

where ǫt |St ∼ NID(0, σ 2 (St )).
The MS-MIDAS that includes autoregressive dynamics as
in the literature by Clements and Galvao (2008) is instead
defined as
(m)
yt = β0 (St )+λyt−d +β1 (St )B(L1/m ; θ )(1−λLd )xt−h
+ ǫt (St ).

(6)
The MS-MIDAS with autoregressive dynamics without a
common factor is instead written as
(m)
yt = β0 (St ) + λyt−d + β1 (St )B(L1/m ; θ )xt−h
+ ǫt (St ).

The regime-generating process is an ergodic Markov-chain
with a finite number of states St = {1, . . . , M} defined by the

Guérin and Marcellino: Markov-Switching MIDAS Models

47

Table 1. Classification of the models

following transition probabilities:
pij = Pr(St+1 = j |St = i),
M


pij = 1∀i, j ǫ{1, . . . , M}.

(7)
(8)

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

j =1

Here, the transition probabilities are constant. This assumption was originally relaxed by Filardo (1994), who used timevarying transition probabilities modeled as a logistic function,
while Kim, Piger, and Startz (2008) modeled them as a probit function. However, we stick to the assumption of constant
transition probabilities to keep the model tractable.
The parameters that can switch are the intercept of the equation, β0 , the parameter entering before the weighting scheme,
β1 , and the variance of the disturbances, σ 2 . Changes in the
intercept β0 are important since they are one of the most common sources of forecast failure (see e.g., Clements and Hendry
2001). The switch in the parameter β1 allows the predictive
ability of the higher-frequency variable to change across the
different states of the world. Besides, we also allow the variance
of the disturbances σ 2 to change across regimes. This proves to
be useful not only for modeling financial variables but also for
applications with macroeconomic variables.
Note that we do not consider changes in all parameters of
the model (i.e., we keep the autoregressive parameter λ and
the MIDAS parameters θ constant). A switch in the MIDAS
parameters would lead to a different shape of the weighting
function across regimes. However, this increases the number
of parameters to be estimated and makes the estimation more
difficult since the weighting function becomes highly nonlinear.
As a result, our algorithm encountered convergence problems
and we often obtained outlier forecasts for the dependent variable. Besides, we already model a change in the forecasting
ability of the high-frequency indicator through a switch in the
parameter β1 .
Another attractive feature of the Markov-switching models is
that they allow the estimation of the probabilities of being in
a given regime. This is relevant, for example, when one wants
to predict business cycle regimes. Indeed, studies about the
identification and prediction of the state of the economy have
gained attention over the last decade (see, e.g., Estrella and
Mishkin 1998; Stock and Watson 2010; Berge and Jorda 2011;
and the literature review in Marcellino 2006).
Table 1 reports the classification of the models we use in the
empirical experiments.
2.2.2 Estimation and Model Selection. In the literature,
MIDAS models are usually estimated by nonlinear least squares
(NLS). However, for implementing the filtering procedure described by Hamilton (1989), we estimate the MS-MIDAS via
(pseudo) maximum likelihood. We thus need to make a normality assumption about the distribution of the disturbances, which
is not required with the NLS estimation. We aim at maximizing
the log-likelihood function given by
L=

T


log f (yt |t−1 ),

(9)

t=1

where f (yt |t−1 ) is the conditional density of yt given the information available up to time t − 1, t−1 . Note that f (yt |t−1 )

Model

Regime changes in

MSIH(M)-MIDAS
MSH(M)-MIDAS
MSIHAR(M)-MIDAS
MSHAR(M)-MIDAS
MSIHADL(M)-MIDAS
MSHADL(M)-MIDAS

2

β0 and σ
β0 , β1 and σ 2
β0 and σ 2
β0 , β1 and σ 2
β0 and σ 2
β0 , β1 and σ 2

AR component
No
No
Yes
Yes
Yes
Yes

NOTE: The suffix ”H” refers to models with a switch in the variance of the shocks. The
suffix ”I” refers to models with a switch in the intercept β0 . The suffix ”AR” means that we
include an AR component in the model through a common factor (see Equation (3)). The
suffix ”ADL” means that we include an AR component in the model without a common
factor (see Equation (4)).

can be rewritten as
f (yt |t−1 ) =

M


P (St = j |t−1 )f (yt |St = j, t−1 ). (10)

j =1

The computations are carried out with the optimization package
OPTMUM of GAUSS 7.0 using the BFGS algorithm. Appendix
A (provided online) provides more details about the estimation
method we use.
Choosing the number of regimes for Markov-switching models is a tricky problem. Indeed, the econometrician has to deal
with two problems: first, some parameters are not identified
under the null hypothesis and, second, the scores are identically equal to zero under the null. Hansen (1992) considered
the likelihood function as a function of unknown parameters
and uses empirical processes to bound the asymptotic distribution of a standardized likelihood ratio test statistic. Garcia
(1998) pointed out that the test is computationally expensive if
the number of parameters and regimes is high. Carrasco, Hu,
and Ploberger (2009) recently proposed a new method for testing the constancy of parameters in Markov-switching models.
Their procedure is attractive since it only requires to estimate
the model under the null hypothesis of constant parameters.
However, this testing procedure does not allow one to discriminate between Markov-switching models with different number
of regimes since the parameters must be constant under the null
hypothesis.
Psaradakis and Spagnolo (2006) studied the performance of
information criteria based on the optimization of complexitypenalized likelihood for model selection. They find that the
AIC, SIC, and HQ criteria perform well for selecting the correct
number of regimes and lags as long as the sample size and the parameter changes are large enough. Smith, Naik, and Tsai (2006)
proposed a new information criterion for selecting simultaneously the number of variables and lags of the Markov-switching
models. Note that both studies run their analysis with models
where all parameters switch across regimes, which might not
always be desirable. For example, in Equations (5) and (6), we
do not consider switches in the vector of parameters θ since
we encountered serious convergence problems in the empirical
applications due to the relatively small size of our sample
(T = 200). However, with larger sample sizes, the MS-MIDAS
model could easily accommodate changes in the θ vector. In
addition, Driffill et al. (2009) showed that a careful study of the

48

Journal of Business & Economic Statistics, January 2013

parameters that can switch is crucial for forecasting accurately
bond prices with the CIR model for the term structure.
In the empirical part, we will follow Psaradakis and Spagnolo
(2006) and use the Schwarz information criterion for selecting
the number of regimes and deciding whether the variance of the
disturbances should also change across regimes. We will also
report results for different parameterizations of the Markovswitching models.

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

3.

MONTE CARLO EXPERIMENTS

The DGP used in the Monte Carlo experiments is the
MSHAR(2)-MIDAS model defined by Equations (6)–(8), that
is, it is a model with two regimes and switches in the intercept
β0 , in the parameter entering before the weighting function β1
and in the variance σ 2 , since models with two regimes are often
used in the literature. We consider two sample sizes for the simulated series T = 200 and T = 500. The matrix of explanatory
variables includes a constant and the process for xt(m) is an AR(1)
with a large autoregressive coefficient (0.95) and a small drift
(0.025). We are primarily interested in the predictive content
of monthly variables for forecasting quarterly variables, so we
set K = 3 and K = 13. We use the following true parameter
values:
(β0,1 , β0,2 ) = (−1, 1), (β1,1 , β1,2 ) = (0.6, 0.2),
(σ1 , σ2 ) = (1, 0.67),

(11)

(θ1 , θ2 ) = (2 ∗ 10−1 , −3 ∗ 10−2 ).

(12)

These parameter values are similar to those used by Kim,
Piger, and Startz (2008) and closely match the in-sample parameter estimates of our empirical applications when (i) predicting
monthly industrial production with weekly variables as highfrequency predictors, and (ii) the UK quarterly GDP growth
with monthly indicators (see Tables B and C in the online Appendix). The transition probabilities are first set such that both
regimes are equally persistent (p11 = 0.95, p22 = 0.95). We
also consider another set of transition probabilities: p11 = 0.85
and p22 = 0.95. Indeed, with these transition probabilities, if
one thinks of yt as quarterly observations, the duration of the
first regime (6.67 quarters) is lower than the duration of the second regime (20 quarters), which roughly corresponds to the average duration of recessions and expansions experienced by the
United States.
We first simulate a Markov chain with two regimes using
one of the two sets of transition probabilities. The dependent
variable yt is then constructed depending on the outcome of the
simulated Markov chain using the above parameter values and
the simulated series for xt .
The sample size T is split between an estimation sample and
an evaluation sample. We choose two different sizes H for the
evaluation sample, H = {50, 100}. We then run the following
out-of-sample forecasting experiment: we use the first T − H
observations and compute one-step ahead forecasts . We recursively expand the estimation sample until we reach the end of
the sample T so that we compute H forecasts. The design of this
forecasting experiment is very close to the empirical application
we run later in the article.

We use 10 different models to compute the forecasts: the
MSHAR(2)-MIDAS model (i.e., the true model), the MSH(2)MIDAS model (i.e., the true model without an autoregressive
lag), a standard MIDAS, ADL-MIDAS, and AR-MIDAS models. We also include as competitors MS-MIDAS models with AR
dynamics without a common factor: the MSHADL-MIDAS and
MSIHADL-MIDAS models. We also consider an AR(1) model
and a standard Markov-switching model with two regimes, a
switch in the intercept and in the variance of the disturbances, as
well as one autoregressive lag. Besides, we also show the results
for an MSIHAR(2)-MIDAS model (i.e., a model with a constant
β1 ). Finally, we also consider as an additional benchmark the
ADL(1,1) model, which is a usual benchmark in the forecasting literature (see e.g., Baffigi, Golinelli, and Parigi 2004). We
always use the true number of lags (K = 13) for the highfrequency variable xtm when the models have mixed-frequency
data. Also, all nonlinear models have innovations with a regime
switching variance.
We repeat the forecasting experiment N times for each evaluation sample and report in Table 2 the average of the mean
square forecast error over the number of replications for the
10 models under consideration. We also report the quadratic
probability score (QPS) and log probability score (LPS) for the
models with Markov-switching features to check how well these
models can predict the true regimes. Here, the MSFE is a criterion that allows us to assess the quantitative forecasting abilities
of the model under scrutiny, whereas QPS and LPS criteria evaluate their qualitative forecasting abilities, that is, to what extent
the true regimes are predicted.
Note that the QPS is bounded between 0 and 2 and the range
of LPS is 0 to ∞. LPS penalizes large forecast errors more than
QPS. LPS and QPS are calculated as follows:
QPS =

H
2 
(P (St+1 = 1) − St+1 )2 ,
H t=1

LPS = −

(13)

H
1 
(1 − St+1 )log(1 − P (St+1 = 1))
H t=1

+ St+1 log(P (St+1 = 1)),

(14)

where St+1 is a dummy variable that takes on a value of 1 if the
true regime is the first regime and P (St+1 = 1) is the predicted
probability of being in the first regime in period t + 1.
Table 2 presents the results for the two sample sizes we consider (i.e., T = 200 and T = 500). The number of Monte Carlo
experiments N is set to 1000. The first panel reports the results
when the full estimation sample size is T = 200, while the second panel reports results when the full estimation sample size
is T = 500. The results for the true model (the MSHAR(2)MIDAS) are reported in the top row of each panel. We then
report in the following rows of each panel the results for each
model, as well as the average loss functions across the linear
MIDAS models and the nonlinear MIDAS models.
For the first estimation sample size T = 200 (reported in the
top panel of Table 2), the true model obtains the best continuous
forecasts when the evaluation sample is H = 100. However,
the MSHADL-MIDAS model—that is, the model equivalent
to the true model except that the autoregressive component is

Guérin and Marcellino: Markov-Switching MIDAS Models

49

Table 2. Monte Carlo results: forecasting exercise

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

50

100

Number of out-sample
forecasts H:

QPS

LPS

MSFE

QPS

LPS

MSFE

First estimation sample size
True model
Standard MS
MSH(2)-MIDAS
MSIHAR(2)-MIDAS
MSIHADL-MIDAS
MSHADL-MIDAS
ADL-MIDAS
AR-MIDAS
MIDAS
AR(1)
ADL(1,1)
Av. Linear MIDAS
Av. Nonlinear MIDAS

0.423
0.608
0.413
0.484
0.468
0.380






0.433

0.655
1.015
0.623
0.603
0.729
0.574






0.637

1.796
1.972
1.819
1.812
1.850
1.779
1.869
1.885
2.525
1.932
1.811
2.093
1.811

0.483
0.598
0.416
0.432
0.487
0.394






0.443

0.721
1.004
0.619
0.673
0.762
0.597






0.674

1.736
1.925
1.784
1.785
1.746
1.770
1.739
1.789
2.434
1.749
1.784
1.987
1.764

Second estimation sample size
True model
Standard MS
MSH(2)-MIDAS
MSIHAR(2)-MIDAS
MSIHADL-MIDAS
MSHADL-MIDAS
ADL-MIDAS
AR-MIDAS
MIDAS
AR(1)
ADL(1,1)
Av. Linear MIDAS
Av. Nonlinear MIDAS

0.345
0.684
0.361
0.390
0.416
0.318






0.366

0.528
1.181
0.559
0.592
0.638
0.500






0.563

1.721
1.868
1.794
1.934
1.766
1.732
1.805
1.848
2.351
1.889
1.793
2.001
1.789

0.378
0.653
0.407
0.396
0.434
0.319






0.387

0.578
1.141
0.570
0.606
0.676
0.507






0.588

1.624
1.830
1.739
1.913
1.717
1.611
1.792
1.745
2.167
1.722
1.754
1.901
1.721

NOTE: This table reports the average QPS, LPS, and MSFE over 1000 Monte Carlo replications. In the first estimation sample, the initial estimation sample size is T − H where T = 200.
In the second estimation sample, the initial estimation sample size is T − H where T = 500. Both estimation samples are recursively expanded until the end of the sample is reached.
The true model is the MSHAR(2)-MIDAS model. A classification of the models is provided in Table 1.

not included through a common factor—yields the best results
when the evaluation sample size is H = 50 for both discrete and
continuous forecasts. In addition, this model produces the best
discrete forecasts with the evaluation sample H = 100. The
simple MIDAS model has a poor forecasting performance as
compared to the AR-MIDAS model, and the nonlinear MIDAS
models perform on average better than the linear MIDAS models
for both evaluation samples H = {50, 100}.
Turning our attention to the second, longer, estimation sample
size (i.e., the bottom panel of Table 2), the true model obtains
the best continuous forecasts in the first evaluation sample while
the MSHADL-MIDAS model obtains the best discrete forecasts. Furthermore, in the larger evaluation sample H = 100,
the MSHADL-MIDAS models obtain the best forecasts for both
discrete and continuous forecasts. In addition, on average, nonlinear MIDAS models outperform the linear MIDAS models.
Finally, the standard MS model and AR(1) model yield rather
inaccurate continuous forecasts.
Overall, the true MS-MIDAS model is either ranked first or
has a forecasting performance very close to the best model for
discrete forecasts. In addition, our simulation experiments suggest that including an autoregressive component through a common factor can be detrimental for both discrete and continuous

forecasts. Indeed, the true model is often outperformed by the
MSHADL-MIDAS, which is equivalent to the true model except that the autoregressive component is not included through
a common factor.
4.
4.1

EMPIRICAL APPLICATIONS

U.S. Quarterly GDP Growth and Monthly Indicators

4.1.1 Design of the Real-Time Forecasting Exercise. We
analyze quarterly data for the U.S. GDP, taken from the real-time
dataset of the Philadelphia Federal Reserve, available at http://
www.philadelphiafed.org/research-and-data/real-time-center/
real-time-data/data-files/ROUTPUT/, which originates from
the work by Croushore and Stark (2001). The quarterly vintages
we use reflect the information available in the middle month of
each quarter. The dependent variable is taken as 100 times the
quarterly change in the log of the United States’ real GDP from t
= 1959:Q1 to 2009:Q4. The sample is then split into an estimation sample and an evaluation sample. The evaluation sample
consists of quarterly GDP growth in the quarters 1998:Q1–
2009:Q4. For each of these quarters, we generate forecasts
with horizons h = {0, 1/3, 2/3, 1}. The initial estimation

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

50

Journal of Business & Economic Statistics, January 2013

sample goes from 1959:Q1 to 1997:Q4 and is recursively
expanded over time until we reach the end of the sample.
The forecasting model is the MS-MIDAS model with three
regimes (MSIHAR(3)-MIDAS), which an in-sample analysis
reported in Appendix C (provided online) indicates as a proper
specification for describing quarterly U.S. GDP growth, and
consider alternative monthly indicators. First, we use the slope
of the yield curve since its predictive power for GDP growth
has been widely documented (Estrella and Hardouvelis 1991;
Galvao 2006; Rudebusch and Williams 2009). We take the difference between the monthly yields on 10-year Treasury bond
and the monthly yields on the 3-month Treasury-bill as a proxy
for the monthly slope of the yield curve. We also consider stock
returns as a monthly indicator for forecasting quarterly aggregate economic activity. Stock returns are taken as 100 times the
monthly change in the log of the S&P500 index. We then consider the monthly Federal Funds as a monthly indicator to take
into account the stance of the monetary policy, which is often
considered as an important determinant of economic activity.
We take the first difference for both the slope of the yield curve
and the Federal Funds since we achieve better forecasting results
with this transformation. We also use industrial production and
one survey of economic activity, the ISM manufacturing. The
data for the 10-year Treasury bond yields, the 3-month Treasury
bill, and the Federal Funds are taken from the Federal Reserve
web site, while the data for the S&P500 index are downloaded
from Yahoo Finance. The real-time data for monthly industrial
production are downloaded from the Federal Reserve Bank of
Saint Louis web site, while the data for the monthly ISM manufacturing are downloaded from Haver Analytics. These last two
variables are taken as 100 times the monthly change in the log
of the variable in level.
The design of the exercise is similar to the one described in
Section 3.1.1 of the article by Clements and Galvao (2008). We
denote yτ,ν as output growth in period τ released in the vintage
ν dataset. We aim at forecasting output growth as defined in the
second vintage data releases of GDP. Note that for GDP, the
vintage dataset released in quarter t + 1 contains data up to
quarter t, and the quarterly vintages we use reflect information
available in the middle month of each quarter.
A few additional comments are required. First, forecasts for
the regime probabilities k quarters ahead are computed recursively as

P (St+k = j ) =

M


pij P (St+k−1 = i).

(15)

i=1

Note that the predicted probabilities only depend on the transition probabilities and on the filtered probabilities.

Second, forecasts with a horizon h = 0 (i.e., nowcasts) imply that we want to forecast output growth for the current
quarter knowing the values of the monthly indicators for all
months of the current quarter. The nowcasts are computed as
follows: we first regress yt|t+1 on B(L(1/3) ; θ )xt|t+1 and yt−1|t+1 ,
where yt|t+1 = [y1|t+1 , y2|t+1 , . . . , yt−1|t+1 , yt|t+1 ] and xt|t+1 =
[x1|t+1 , . . . , xt−1|t+1 , xt|t+1 ]. We then use these estimates, the
forecasts for the regime probabilities P (St+1 = j |xt ), yt|t+1 , and
xt+1|t+1 to compute the forecasts ŷt+1|t+1 .
Forecasts with a horizon h = 1/3 imply that we only
know the values for the first 2 months of the monthly indicator. To obtain these forecasts, we first regress yt|t+1
on B(L(1/3) ; θ )xt−1/3|t+1 and yt−1|t+1 , where xt−1/3|t+1 =
[x1−1/3|t+1 , . . . , xt−4/3|t+1 , xt−1/3|t+1 ]. We then use these estimates, the forecasts for the regime probabilities P (St+1 =
j |xt ), yt|t+1 and xt+2/3|t+1 to obtain forecasts for yt+1 , which
is conditioned on xt+2/3|t+1 and yt|t+1 .
Similarly, forecasts with a horizon h = 2/3 imply that
we only know the values for the first month of the
monthly indicator. To obtain these forecasts, we first regress
yt|t+1 on B(L(1/3) ; θ )xt−2/3|t+1 and yt−1|t+1 where xt−2/3|t+1 =
[x1−2/3|t+1 , . . . , xt−5/3|t+1 , xt−2/3|t+1 ]. We then use these estimates, the forecasts for the regime probabilities P (St+1 = j |xt )
and xt+1/3|t+1 to obtain forecasts for yt+1 , which is conditioned
on xt+1/3|t+1 .
Finally, forecasts with a horizon h = 1 are computed from the
regression of yt|t+1 on B(L(1/3) ; θ )xt−1|t+1 and yt−1|t+1 . Hence,
for example, forecasts for the first quarter Q1 of a given year
are generated as described in Table 3.
4.1.2 Out-of-Sample Results. Table 4 reports our out-ofsample forecasting results for forecasting quarterly GDP growth
with monthly variables. We organize our results by reporting in the first column of each panel, for each indicator, the
MSFE of the average of the forecasts across the six nonlinear
MIDAS models we consider. We then calculate three relative
mean square forecast errors (RMSFE) in the next three columns
of each panel, using three different benchmarks: (i) an AR(1)
model, (ii) a standard ADL(1,1) model that regresses quarterly
GDP on its first lag and a quarterly average of the monthly indicator, and (iii) the MSFE of the forecasts across the three linear
MIDAS models we consider. Each of these three benchmarks
has its own virtues: the AR(1) model is a standard benchmark
in the forecasting literature; the ADL(1,1) model allows us to
assess the ability of the nonlinear MIDAS models to outperform a regression model that uses a simple rule of thumb for
aggregating the high-frequency variable; and the last benchmark
(the average of the forecasts across the linear MIDAS models)
permits us to directly evaluate the benefits of using Markovswitching parameters in the context of MIDAS models. The
complete results for each individual model we estimate at each

Table 3. Forecasting scheme for Q1
Forecast horizon h
Data up to month
Data up to month

Readily available variables
Macroeconomic variables

0

1/3

2/3

1

Marcht
Febt

Febt
Jant

Jant
Dect

Dect−1
Novt−1

NOTE: The variables available without a publishing lag are stock prices, the slope of the yield curve, Federal Funds, and ISM Manufacturing. The industrial production series are
published with a 1-month lag.

Guérin and Marcellino: Markov-Switching MIDAS Models

51

Table 4. Forecasting U.S. quarterly GDP growth with monthly variables
Slope of the yield curve

Federal Funds

Ratio to

Horizon

Nonlinear
MIDAS
(MSFE)

Linear
MIDAS

AR

0
1/3
2/3
1
Total Av.

0.336
0.328
0.323
0.308
0.324

0.984
0.975
0.945
0.901
0.951

1.107
1.082
1.064
1.013
1.014

ADL

Linear
MIDAS

AR

1.056
1.032
1.015
0.967
1.018

0.289a
0.280b
0.296
0.311
0.294

1.037
0.955
0.956
0.980
0.982

0.952
0.923
0.975
1.023
0.968

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

0
1/3
2/3
1
Total Av.

Linear
MIDAS

AR

0.209b
0.193a
0.243a
0.275b
0.230

0.827
0.796
0.999
1.085
0.927

0.688
0.636
0.800
0.905
0.757

Ratio to

ADL

Nonlinear
MIDAS
(MSFE)

Linear
MIDAS

AR

ADL

0.991
0.960
1.015
1.065
1.008

0.234a
0.239b
0.189b
0.241c
0.226

1.042
1.063
.894
1.195
1.049

0.770
0.789
0.623
0.793
0.744

1.001
1.025
0.810
1.031
0.967

ISM manufacturing

Ratio to

Nonlinear
MIDAS
(MSFE)

Ratio to

Nonlinear
MIDAS
(MSFE)

Industrial production

S&P500

Ratio to

ADL

Nonlinear
MIDAS
(MSFE)

Linear
MIDAS

AR

ADL

Standard MS

0.801
0.741
0.931
1.054
0.882

0.219b
0.223a
0.249b
0.266a
0.239

0.758
0.778
0.926
0.999
0.865

0.720
0.734
0.820
0.876
0.788

0.780
0.795
0.888
0.949
0.853

1.120

NOTE: The first column of each panel reports the MSFE for the average of the forecasts across the nonlinear MIDAS models for each indicator at different forecasting horizons. The next
three columns of each panel report the ratio of the MSFE for the average of the forecasts across the nonlinear MIDAS models using three benchmarks: the MSFE of the average across
the three linear MIDAS models, the AR(1) model, and the ADL(1,1) model. Total Av. is the average over all forecasting horizons h = {0, 1/3, 2/3, 1}. The superscripts a,b,c indicate that
a given model forecast at a given forecasting horizon outperforms the AR(1) model at 10%, 5%, and 1%, respectively, according to the Clark and West (2007) test of equal out-of-sample
forecast accuracy. The larger model is the one that takes the average of the forecasts over each of the nonlinear MIDAS models, which nests the AR(1) model.

forecasting horizon are reported in Table F of the online Appendix. Table F also reports the results of the Clark and West
(2007) tests of equal out-of-sample accuracy for nested models to assess the forecasting performance of the models under
scrutiny with respects to the forecasts from the AR(1) model.
Note that the asymptotic distribution of this test is not known in
the case of MS-MIDAS models, although it is probably reasonably approximated by the normal distribution.
Our results go as follows. On average, nonlinear MIDAS
models yield sizeable gains over the linear MIDAS models for
all monthly indicators, except the S&P500, and all forecast horizons. The best indicator is the S&P 500 index. The second best
indicator is industrial production, and in this case the gains of
MS-MIDAS over standard MIDAS are particularly large. In addition, the forecasts from the average of the nonlinear MIDAS
models using the ISM indicator yields better forecasts than its
three benchmarks. Moreover, again on average, the nonlinear
MIDAS models also produce better forecasts than the AR(1) and
ADL(1,1) models. Overall, this confirms the results by Kuzin,
Marcellino, and Schumacher (2012), who showed that pooling
mixed-frequency forecasts produce in general good and robust
results. The improvement in forecast accuracy is statistically
significant in most cases.
In addition to predicting quarterly GDP growth, Markovswitching MIDAS models can endogenously generate probabilities of being in a given regime. As a result, we report in
Table 5 the QPS and LPS for the models with Markov-switching
features to check how well these models can predict the true
regimes. The first column of each panel reports the average of
the QPS (or LPS) over each of the nonlinear MIDAS models,
while the second column of each panel reports the results for

the nonlinear MIDAS model with the best forecasts (i.e., with
the lowest QPS (or LPS)). We use the classification of the economic activity from the NBER so that St is a dummy variable
that takes on a value of 1 if the economy is in recession in quarter t according to the NBER, while P (St = 1) is the probability
of being in the recession regime in period t. Forecasts with a
horizon h = {0, 1/3, 2/3, 1} predict the regime of the economy
one quarter ahead.
In contrast to forecasting the level of GDP growth, the slope
of the yield curve and the Federal Funds tend to better predict
the state of the economy than stock prices and IP. This is in line
with the results from binary recession models that emphasize the
predictive power of the slope of the yield curve (see e.g., Estrella
and Mishkin 1998). ISM manufacturing, industrial production,
and stock prices have a relatively poor forecasting performance
in terms of discrete forecasts. In general, using information from
the monthly indicators produces better regime forecasts than a
standard MS model for quarterly GDP growth only when using
the monthly slope of the yield curve as high-frequency indicator.
Additional evidence on the predictive ability of the Markovswitching MIDAS specification is presented in Figure 1, where
we report the nowcasted real-time probability of being in a
recession with the slope of the yield curve as a monthly indicator
using the MSH(3)-MIDAS model with h = 0. These probabilities are generated from the recursive exercise and correspond to
the filtered probabilities for the last observation T, where T is
recursively expanded over time from t = 1997:Q4 to 2009:Q4.
This figure also shows three additional sets of real-time probabilities of recession: from a standard univariate regime-switching
model that only uses GDP, the Econbrowser recession probabilities that also only uses real GDP, and the recession probabilities

52

Journal of Business & Economic Statistics, January 2013

Table 5. Quadratic probability score and log probability score
Quadratic probability score
Slope of the yield
curve

Federal funds

Industrial
production

S&P500

ISM manufacturing

Horizon

Av.

Best
model

Av.

Best
model

Av.

Best
model

Av.

Best
model

Av.

Best
model

Standard
MS

0
1/3
2/3
1
Total av.

0.328
0.335
0.351
0.384
0.350

0.270
0.287
0.306
0.370
0.308

0.409
0.393
0.397
0.379
0.394

0.391
0.375
0.388
0.391
0.389

0.405
0.384
0.422
0.412
0.406

0.384
0.282
0.385
0.402
0.363

0.445
0.419
0.436
0.415
0.429

0.439
0.406
0.444
0.364
0.413

0.470
0.451
0.509
0.420
0.462

0.382
0.419
0.333
0.366
0.375

0.357

Log probability score

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

Slope of the yield
curve

Federal funds

Industrial
production

S&P500

ISM manufacturing

Horizon

Av.

Best
model

Av.

Best
model

Av.

Best
model

Av.

Best
model

Av.

Best
model

Standard
MS

0
1/3
2/3
1
Total av.

0.579
0.575
0.614
0.728
0.624

0.437
0.473
0.465
0.690
0.516

0.841
0.707
0.745
0.666
0.739

0.734
0.621
0.722
0.722
0.700

0.750
0.715
0.825
0.787
0.769

0.741
0.442
0.714
0.558
0.614

0.809
0.787
0.784
0.781
0.790

0.782
0.792
0.734
0.575
0.721

0.720
0.685
0.880
0.720
0.753

0.571
0.624
0.531
0.647
0.593

0.594

NOTE: QPS is calculated as
QPS =

F
2 
(P (St+h = 1) − NBERt+h )2 .
F t=1

LPS is instead calculated as
LPS = −

F
1 
(1 − NBERt+h )log(1 − P (St+h = 1)) + NBERt+h log(P (St+h = 1)),
F t=1

where F is the number of forecasts, P (St+h ) are the predicted regime probabilities of being in the first regime, and NBERt+h is a dummy variable that takes on a value of 1 if the U.S.
economy is in recession in quarter t + h according to the NBER. For h = {0,1/3,2/3,1}, we predict business cycle regimes one quarter ahead.

from a model using both GDP and GDI (see Nalewaik
2012).
Figure 1 shows that the MS-MIDAS model with the slope of
the yield curve gives a first signal of recession in the second
quarter of 2001 using information up to August 2001. The probability of recession then rises above 0.90 in the third and fourth
quarters of 2001. Interestingly, the probability of recession stays
above 0.35 until the second quarter of 2003 and only fall below
0.10 in the third quarter of 2003. This illustrates the slow economic recovery that followed the 2001 recession. Figure 1 also
shows that there is a first peak in the probability of recession
in the fourth quarter of 2007 using information up to February 2008. The probability of recession then jumps above 0.85
from the end of the first quarter of 2008 until the second quarter of 2009 and it starts declining in the third quarter of 2009.
This confirms the NBER dating of the end of the last recession in
June 2009. Note that our model gives the first signal of recession
well before the announcement of the recession by the NBER that
occurred in December 2008. In addition, Figure 1 shows that
this nonlinear MIDAS model outperforms the standard regime
switching model as well as the Hamilton’s Econbrowser recession indicator index for detecting the beginning of the last two
recessions (available at http://www.econbrowser.com/archives/

recind/description.html). Moreover, the real-time probability of
recession from the MS-MIDAS declines at a slower pace at the
end of the last two recessions than the standard MS model and
the Econbrowser recession indicator index.
A crucial point of the MS-MIDAS specification is that unlike
the probabilities of recession from the alternative models reported in Figure 1, the quarterly probabilities of recession from
MS-MIDAS models can be updated on a monthly basis (i.e.,
at the frequency of the xtm variable). This makes this class of
models very attractive for real-time estimation of business cycle conditions. Indeed, Table D in the online Appendix reports
the nowcasted probability of recession for the MSH(3)-MIDAS
model with the slope of the yield curve as an indicator for three
different forecast horizons h = {0,1/3,2/3}. The table confirms
that using the slope of the yield curve as a monthly indicator
provides strong calls of recession in the first quarter of 2008
since the probability of recession gradually increased in the first
quarter of 2008 to reach 0.87 in March 2008 (using information
available up to May 2008). Table D also shows that the probability of recession is decreasing in the third quarter of 2009
in line with the NBER dating of the end of the last recession.
However, the probability of recession remains fairly high in the
third and fourth quarters of 2009.

Guérin and Marcellino: Markov-Switching MIDAS Models

53

1
0.9
0.8
0.7
0.6
0.5
0.4

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

0.3
0.2
0.1
0
1997

1998

1999

2000

2001

2002

2003

2004

2005

2006

2007

2008

2009

Markov-switching MIDAS model
Standard Markov-switching model
Econbrowser recession indicator index
ProbabiliƟes inferred from the joint behaviour of GDP and GDI
Figure 1. Quarterly real-time probabilities of a U.S. recession. Note: The dotted line plots the probabilities of recession from a standard
regime-switching model. The solid line plots the probabilities of recession from the MS-MIDAS model that uses GDP and the monthly slope of
the yield curve. The line with triangles plots the probabilities of recession from the Econbrowser web site that only uses real GDP. The line with
squares plots the probabilities of recession inferred from the joint behavior of GDP and GDI. The shaded areas represent NBER recessions. The
online version of this figure is in color.

We also conducted an empirical application for forecasting UK quarterly GDP growth with monthly indicators. Our
results—reported in the online Appendix—show that MSMIDAS models outperform linear MIDAS models as well as
the AR(1) and ADL(1,1) benchmarks for continuous forecasts
of the economic activity. Besides, MS-MIDAS models also provide relevant timely updates of the UK business cycle regimes.
In summary, Markov-switching MIDAS models not only generate good forecasting results for the level of GDP growth, but
they also provide relevant information about the state of the
economy. In particular, the combination of high-frequency information and parameter switching performs better than using
each of these two features separately, as in standard MIDAS and
MS models, respectively.
4.2 Monthly U.S. Industrial Production Growth
and Weekly Indicators
In this subsection, we run another experiment with another
mix of frequencies, that is, we forecast monthly industrial production (IP) using variables sampled at weekly frequency. IP
is taken as 100 times the monthly change in the level of in-

dustrial production from 1970:M01 to 2009:M12. We consider
the following three explanatory variables taken at the weekly
frequency: the slope of the yield curve, the Federal Funds,
and the S&P 500 index. We use the same transformation for
these explanatory variables as in the previous analysis of GDP
growth.
We use a model with two regimes and a switch in the variance
of the innovations since this model captures well the U.S.
business cycle regimes (see Appendix D provided online). The
design of the real-time forecasting exercise is similar to the one
described above although the frequency mix we consider here is
different. The forecasting horizon h refers to forecasts h-weekahead. For example, for h = 0, the forecasts are made using full
information on the weekly indicator in the current month, while
for h = 4, forecasts have a 4-week horizon. The actual values for
IP are taken from the vintage of data T = 2010 : M1. The first
estimation sample goes from t = 1970 : 01 to t = 1997 : 12
and is recursively expanded until we reach the end of the
sample. As a result, our evaluation sample covers the same
time period as in the previous application when forecasting
quarterly GDP growth with monthly indicators. This permits
us to have a fair comparison of the forecasting performance of

54

Journal of Business & Economic Statistics, January 2013

Table 6. Forecasting U.S. monthly industrial production growth with weekly variables

Downloaded by [Universitas Maritim Raja Ali Haji] at 21:59 11 January 2016

Slope of the yield curve

Federal Funds

Ratio to

Horizon

Nonlinear
MIDAS
(MSFE)

Linear
MIDAS

AR

0
1
2
3
4
Total Av.

0.502a
0.508a
0.520a
0.516a
0.533
0.516

0.957
0.978
0.978
0.966
0.963
0.968

0.944
0.956
0.978
0.971
1.003
0.970

S&P500

Ratio to

ADL

Nonlinear
MIDAS
(MSFE