
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Bayesian Analysis of the Output Gap
Christophe Planas, Alessandro Rossi & Gabriele Fiorentini
To cite this article: Christophe Planas, Alessandro Rossi & Gabriele Fiorentini (2008) Bayesian
Analysis of the Output Gap, Journal of Business & Economic Statistics, 26:1, 18-32, DOI:
10.1198/073500106000000576
To link to this article: http://dx.doi.org/10.1198/073500106000000576

Published online: 01 Jan 2012.

Download by: [Universitas Maritim Raja Ali Haji]

Date: 12 January 2016, At: 17:42

Bayesian Analysis of the Output Gap
Christophe PLANAS
Joint Research Centre, European Commission, 21020 Ispra (VA), Italy (christophe.planas@jrc.it)

Alessandro ROSSI
Joint Research Centre, European Commission, 21020 Ispra (VA), Italy (alessandro.rossi@jrc.it)

Gabriele FIORENTINI
Department of Statistics, University of Florence, 50134 Florence (FI), Italy (fiorentini@ds.unifi.it)
Our objective is to build output gap estimates that benefit from information provided by Phillips curve theory and business cycle studies. For this, we develop a Bayesian analysis of the bivariate Phillips curve model proposed by Kuttner for estimating potential output. Given our priors, we obtain samples from the joint posterior distribution of parameters and state variables following a Gibbs sampling strategy. We sample the state variables given parameters using the Carter–Kohn procedure, and we exploit a likelihood factorization to draw parameters given the state. A Metropolis–Hastings step is used to remove the conditioning on starting values. To accommodate the variance moderation that has been observed in U.S. gross domestic product, Kuttner’s model is extended to allow for a change in variance parameters. We apply this methodology to the analysis of the output gap in the United States and in the European Monetary Union. Finally, some important extensions to the original Kuttner model are discussed.
KEY WORDS: Business cycle; Gibbs sampling; Kalman filter and smoothing; Markov chain Monte
Carlo; Unobserved components.

© 2008 American Statistical Association
Journal of Business & Economic Statistics
January 2008, Vol. 26, No. 1
DOI 10.1198/073500106000000576

1. INTRODUCTION

In this article we develop a Bayesian analysis of the Phillips curve bivariate model put forward by Kuttner (1994) for estimating the potential output and the output gap. Potential output and output gap are two concepts that are essential to macroeconomic analysis because they are related to different economic forces: potential output measures long-term movements associated with steady economic growth, whereas the output gap captures all short-term fluctuations (see, for instance, Hall and Taylor 1991). The concurrent output gap is, hence, subject to careful monitoring by institutions responsible for stabilization policy and inflation control. Taylor (1993) acknowledged this when building his famous rule for describing the Federal Reserve monetary policy.
Kuttner’s (1994) original model relates output gap to inflation through the Phillips curve. Although somewhat atypical because it involves a regression on an unobserved variable, this model has enjoyed some success. For instance, Kichian (1999) applied it to the G7 countries, Gerlach and Smets (1999) emphasized its appeal for the European Central Bank policymaking process, and Apel and Jansson (1999a, b) further extended it to include unemployment. The European Commission uses this framework for estimating structural unemployment (see Planas, Roeger, and Rossi 2007), whereas the NAIRU estimates of the Organization of Economic Cooperation and Development (OECD) are obtained from a closely related model (see OECD 2000).
In this particular context of cycle estimation embedded into a Phillips curve regression, a substantial amount of macroeconomic theory and business cycle knowledge is available. It is, thus, natural to develop a Bayesian analysis in order to incorporate this information into the decomposition. This yields several benefits. First, because Bayesian methods deliver samples from posterior distributions, the finite-sample uncertainty around any quantity of interest can be precisely delineated. This is important information: Output gap measurements have indeed been strongly criticized for lacking reliability (see Orphanides and van Norden 2002), and knowing the uncertainty has become imperative for policy. Second, the insertion of additional knowledge into the decomposition is expected to make Bayesian gap estimates more precise than maximum likelihood ones. In any case, researchers seeking to reduce the uncertainty can use the Bayesian framework to assess the added value of any extra information. Third, the Bayesian setting helps in understanding the salient features of Phillips curve regressions: For instance, the sharpness of the response of inflation to different proxies for detrended output can be compared (see Gali, Gertler, and Lopez-Salido 2001). Finally, by properly tuning the prior distribution of variance parameters, the pile-up problem sometimes faced in classical analysis can be avoided. The pile-up problem, that is, obtaining zero variance estimates for the innovations in unobservables even though the true variance is strictly positive (see Stock and Watson 1998), is generally undesirable because the related variable turns out to be deterministic and, hence, observed.
Given priors on model parameters, we implement a Gibbs sampling scheme for drawing model parameters and state variables from their joint posterior distribution (see, for instance, Casella and George 1992 and the general discussion in Geweke 1999). We sample the state conditionally on parameters on the basis of the Carter–Kohn (1994) sampler with de Jong’s (1991) Kalman filter initialization. We draw the state in its first time period using the results in Koopman (1997). A likelihood factorization allows us to sample the parameters given the state in three blocks. We also introduce Metropolis–Hastings steps (see, for instance, Chib and Greenberg 1995) for removing the conditioning on the first observations. We reparameterize the traditional cyclical AR(2) model in terms of polar coordinates of the characteristic equation roots. We find this necessary mainly because specifying diffuse priors for autoregressive parameters implies informative priors on periodicity and amplitude. We then resort to the adaptive rejection Metropolis method proposed by Gilks, Best, and Tan (1995) for sampling the polar coordinates (see also the corrigendum by Gilks, Neal, Best, and Tan 1997). We also reparameterize the covariance between shocks in inflation and in the gap. Finally, in order to accommodate the variance moderation that has been observed in U.S. gross domestic product (GDP) (see Kim and Nelson 1999; McConnell and Perez-Quiros 2000; Stock and Watson 2002), we extend Kuttner’s original model to allow for a break in the variance parameters. Our empirical results suggest that the sampling scheme we put forward produces low correlations in the chain, for a greater efficiency of posterior density estimates.
Section 2 discusses the model structure, the parameterization adopted, and the prior distributions. Section 3 describes the
procedure we propose for sampling from the joint posterior distribution of both model parameters and unobserved variables
in Kuttner’s model, with the possibility of a variance break.
Section 4 presents an application to the 12 countries of the
European Monetary Union (EMU) and to the United States.
Several model extensions together with their Bayesian treatment are discussed in Section 5. Section 6 concludes.
2. MODEL SPECIFICATION
Let $y_t$ denote the logarithm of real output. Like Watson (1986) and Clark (1987), we assume that it is made up of a trend, $p_t$, and of a cycle, $c_t$, according to
\[
\begin{aligned}
y_t &= p_t + c_t, \\
\Delta p_t &= \mu_p + a_{pt}, \\
\phi_c(L)\, c_t &= a_{ct},
\end{aligned}
\tag{2.1}
\]
where $L$ is the lag operator, $\Delta \equiv 1 - L$ represents the first difference, $\mu_p$ is a constant drift, and $\phi_c(L) = 1 - \phi_{c1} L - \phi_{c2} L^2$ is an AR(2) polynomial with stationary and complex roots. The permanent and transitory shocks, $a_{pt}$ and $a_{ct}$, are orthogonal Gaussian white noises with variances $V_p$ and $V_c$. The long-term and short-term components, $p_t$ and $c_t$, are interpreted as potential output and output gap.
Kuttner (1994) complemented (2.1) with an equation that dynamically links the change in inflation, say $\Delta\pi_t$, to the output gap as in
\[
\phi_\pi(L)\,\Delta\pi_t = \mu_\pi + \beta c_{t-1} + \lambda\,\Delta y_{t-1} + a^*_{\pi t},
\tag{2.2}
\]
where $a^*_{\pi t}$ is a Gaussian white noise and the roots of the AR(2) polynomial $\phi_\pi(L) = 1 - \phi_{\pi 1} L - \phi_{\pi 2} L^2$ are assumed to be stationary. The original model included a moving average instead of autoregressive terms: We make this slight modification in order to simplify the statistical analysis. Equation (2.2) introduces a Phillips curve effect by relating the change in inflation to the real output growth and to the latent cycle, both with one lag. As in Kuttner’s model, the innovations in inflation and in the gap are assumed contemporaneously correlated, with correlation coefficient $\rho_{c\pi}$. Let $\kappa_\pi$ denote a real parameter and let $a_{\pi t}$ be a Gaussian white noise with variance $V_\pi$ orthogonal to $a_{ct}$. We reparameterize $\rho_{c\pi}$ by writing the innovations $a^*_{\pi t}$ as $a^*_{\pi t} \equiv \kappa_\pi a_{ct} + a_{\pi t}$, so (2.2) becomes
\[
\phi_\pi(L)\,\Delta\pi_t = \mu_\pi + \beta c_{t-1} + \lambda\,\Delta y_{t-1} + \kappa_\pi a_{ct} + a_{\pi t},
\tag{2.3}
\]
and $\rho_{c\pi} = \kappa_\pi \sqrt{V_c} \big/ \sqrt{\kappa_\pi^2 V_c + V_\pi}$. This parameterization makes the Bayesian treatment of all parameters homogeneous. The shocks $a_{\pi t}$ and $a_{pt}$ are assumed to be independent.
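The mapping from $\kappa_\pi$ to $\rho_{c\pi}$ can be checked numerically. The following is a minimal sketch, not part of the original article: the function names and the parameter values in the usage line are hypothetical, and the simulation simply draws the two orthogonal Gaussian shocks and compares the sample correlation with the closed-form expression.

```python
import math
import random


def rho_c_pi(kappa_pi, v_c, v_pi):
    """Closed-form correlation between a*_pi = kappa_pi * a_c + a_pi and a_c."""
    return kappa_pi * math.sqrt(v_c) / math.sqrt(kappa_pi ** 2 * v_c + v_pi)


def simulated_rho(kappa_pi, v_c, v_pi, n=200_000, seed=0):
    """Monte Carlo estimate of the same correlation (zero-mean shocks)."""
    rng = random.Random(seed)
    sum_xy = sum_xx = sum_yy = 0.0
    for _ in range(n):
        a_c = rng.gauss(0.0, math.sqrt(v_c))      # gap innovation
        a_pi = rng.gauss(0.0, math.sqrt(v_pi))    # orthogonal inflation shock
        a_star = kappa_pi * a_c + a_pi            # composite innovation in (2.3)
        sum_xy += a_star * a_c
        sum_xx += a_star * a_star
        sum_yy += a_c * a_c
    return sum_xy / math.sqrt(sum_xx * sum_yy)


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from the article.
    print(rho_c_pi(0.5, 1.0, 1.0), simulated_rho(0.5, 1.0, 1.0))
```

Because $\kappa_\pi$ ranges over the whole real line while $\rho_{c\pi}$ is confined to $(-1, 1)$, working with $\kappa_\pi$ lets it be treated as one more regression coefficient.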
In the Bayesian framework, incorporating the available prior information often requires a careful model parameterization. For instance, thanks to a large body of empirical studies, it is generally accepted that business cycles in G7 countries typically last from 2 to 10 years. Hence, we reparameterize the model for the cycle as in
\[
\left[1 - 2A\cos(2\pi/\tau)\,L + A^2 L^2\right] c_t = a_{ct},
\tag{2.4}
\]
where the parameters $A$ and $\tau$ represent the amplitude and periodicity of the cyclical movements, respectively. By construction, these parameters describe cycles much more naturally than AR coefficients. Indeed, assuming a normal prior distribution for $(\phi_{c1}, \phi_{c2})$, we found it difficult to reproduce our prior knowledge by tuning the mean and the covariance matrix of the autoregressive parameters. And in some cases the implied distribution for the periodicity and amplitude can be counterintuitive. Let us consider, for instance, a flat prior for the AR parameters over the region of stationary and complex roots, that is, $\phi_{c1}^2 < -4\phi_{c2}$ and $\phi_{c2} \in (-1, 0)$. The half of the complex region where $\phi_{c1}$ is negative yields roots with periodicity in $[2, 4]$, whereas the other half yields roots with periodicity in $[4, \infty)$. Therefore, the associated distribution for $\tau$ puts equal weight on the intervals $[2, 4]$ and $[4, \infty)$ and 4 is the median value. A similar reasoning suggests that the flat prior on the complex roots overweights close-to-one amplitudes. Hence, being noninformative on the coefficients of AR polynomials with complex and stationary roots amounts to emphasizing short-term and persistent fluctuations. Putting the prior on the AR coefficients as, for instance, in Chib (1993), Chib and Greenberg (1994), or McCulloch and Tsay (1994), or on the partial autocorrelations as in Barnett, Kohn, and Sheather (1996) and in Billio, Monfort, and Robert (1999) is probably inadequate for cyclical analysis. The trigonometric specification proposed by Harvey (1981, pp. 182–183) and used in a Bayesian setting in Harvey, Trimbur, and van Dijk (2002) appears instead better suited. Here we stick to Kuttner’s original model by considering the polar coordinate setting (2.4). This parameterization actually simplifies Harvey’s specification by excluding a moving-average term. For a similar discussion about the link between prior and parameterizations in the local level model, see appendix C of Koop and van Dijk (2000).
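The claim about the flat prior can be illustrated by simulation. The sketch below is an illustration under stated assumptions rather than code from the article: it draws $(\phi_{c1}, \phi_{c2})$ uniformly over the complex-root stationary region by rejection from the enclosing box, then maps each draw to $(A, \tau)$ through the identities $\phi_{c1} = 2A\cos(2\pi/\tau)$ and $\phi_{c2} = -A^2$ implied by (2.4). The function names are hypothetical.

```python
import math
import random


def ar2_to_polar(phi1, phi2):
    """Map AR(2) coefficients with complex roots to (A, tau) as in (2.4)."""
    amplitude = math.sqrt(-phi2)                      # phi2 = -A^2
    frequency = math.acos(phi1 / (2.0 * amplitude))   # 2*pi/tau, in (0, pi)
    return amplitude, 2.0 * math.pi / frequency


def sample_flat_prior(n, seed=0):
    """Uniform draws over {phi1^2 < -4*phi2, -1 < phi2 < 0}, returned as (A, tau)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        phi1 = rng.uniform(-2.0, 2.0)
        phi2 = rng.uniform(-1.0, 0.0)
        if phi1 ** 2 < -4.0 * phi2:                   # keep complex, stationary roots
            draws.append(ar2_to_polar(phi1, phi2))
    return draws


if __name__ == "__main__":
    draws = sample_flat_prior(50_000)
    share_short = sum(1 for _, tau in draws if tau < 4.0) / len(draws)
    share_persistent = sum(1 for a, _ in draws if a > 0.5) / len(draws)
    print("share of draws with periodicity below 4:", share_short)
    print("share of draws with amplitude above 0.5:", share_persistent)
```

Under this flat prior the width of the admissible $\phi_{c1}$ interval grows with $\sqrt{-\phi_{c2}}$, so the implied density of the amplitude is proportional to $A^2$ on $(0, 1)$. This is one way to see why amplitudes close to one are overweighted.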
We now complete the model by specifying the prior distribution of all parameters. Let $\delta = (\mu_\pi, \beta, \lambda, \phi_{\pi 1}, \phi_{\pi 2}, \kappa_\pi)$ denote the vector of conditional mean parameters in (2.3). We shall consider the block independence structure
\[
p(A, \tau, V_c, \delta, V_\pi, \mu_p, V_p) = p(A)\,p(\tau)\,p(V_c)\,p(\delta, V_\pi)\,p(\mu_p, V_p),
\]
with
\[
\begin{aligned}
p(A) &= \mathrm{Beta}(a_A, b_A), \\
p\!\left(\frac{\tau - \tau_l}{\tau_u - \tau_l}\right) &= \mathrm{Beta}(a_\tau, b_\tau), \\
p(V_c) &= \mathrm{IG}(s_{c0}, v_{c0}), \\
p(\delta, V_\pi) &= \mathrm{NIG}(\delta_0, M_\delta^{-1}, s_{\pi 0}, v_{\pi 0})\, I_\delta, \\
p(\mu_p, V_p) &= \mathrm{NIG}(\mu_{p0}, M_{p0}^{-1}, s_{p0}, v_{p0}),
\end{aligned}
\tag{2.5}
\]
where $\mathrm{Beta}(\cdot, \cdot)$ is the Beta distribution, $\tau_l$ and $\tau_u$ are the lower and upper bounds of $\tau$’s support, $\mathrm{IG}(\cdot)$ is the inverted Gamma distribution, $\mathrm{NIG}(\cdot)$ is the Normal-inverted Gamma distribution, and $I_\delta$ is an index set for imposing constraints on the parameter support. The hyperparameters $\tau_l$, $\tau_u$, $a_A$, $b_A$, $a_\tau$, $b_\tau$, $\delta_0$ $(6 \times 1)$, $M_\delta$ $(6 \times 6)$, $\mu_{p0}$, $M_{p0}$, $s_{c0}$, $v_{c0}$, $s_{\pi 0}$, $v_{\pi 0}$, $s_{p0}$, and $v_{p0}$ are assumed to be given; we shall discuss the setting of some of them in the empirical application. In (2.5), the prior distributions for $V_c$, $\delta$, $V_\pi$, $\mu_p$, and $V_p$ are naturally conjugate for the full conditionals of interest. Computational convenience is the main reason for this choice. The framework that ensues, however, remains quite flexible because we can be as (non)informative as desired by properly tuning the hyperparameters: For instance, setting $M_{p0}$, $s_{p0}$, and $v_{p0}$ to small quantities leads to $p(\mu_p, V_p) \propto 1/V_p$. The conjugate property is instead lost for the parameters $A$ and $\tau$. We will pay this price for the parameterization given in (2.4) in terms of computational complexity.
Let $\theta$ denote the full set of parameters, that is, $\theta \equiv (A, \tau, V_c, \delta, V_\pi, \mu_p, V_p)$. Our objective is to characterize the joint posterior distribution of the potential output, the output gap, and the parameters conditionally on the data, that is, $p(c^T, p^T, \theta | Y^T)$, where $x_k^T \equiv \{x_k, \ldots, x_T\}$, $x^T \equiv x_1^T$, and $Y^T \equiv \{y^T, \pi^T\}$. Given our model, no closed-form expression for this posterior is available, but draws from $p(c^T, p^T, \theta | Y^T)$ can be obtained using a Gibbs sampling scheme. The full conditionals of interest are
• $p(c^T, p^T | \theta, Y^T)$;
• $p(\theta | c^T, p^T, Y^T)$.
We explain how to sample from these two distributions in the next section.
3. JOINT POSTERIOR DISTRIBUTION OF STATES AND MODEL PARAMETERS
3.1 Sampling the State Variables Given Model Parameters
We first focus on simulating the unobservable components conditionally on model parameters. It will be useful to cast equations (2.1), (2.3), and (2.4) into a state space format (see, for instance, Durbin and Koopman 2001) such that
\[
\begin{aligned}
Y_t &= H\xi_t, \\
\xi_{t+1} &= D + F\xi_t + w_{t+1},
\end{aligned}
\]
where $Y_t = (y_t, \pi_t)'$ is the vector of observations, $\xi_t = (p_t, c_t, c_{t-1}, \kappa_\pi a_{ct} + a_{\pi t})'$ is the state vector, and $w_t = (a_{pt}, a_{ct}, 0, \kappa_\pi a_{ct} + a_{\pi t})'$ is a Gaussian error vector with zero mean and singular variance matrix $Q$. The time-invariant matrices $H$, $D$, $F$, and $Q$ can be straightforwardly recovered. As usual, $\xi_{t|k}$ and $P_{t|k}$ denote the conditional expectation $E(\xi_t | Y^k)$ and variance $V(\xi_t | Y^k)$. Samples from $p(c^T, p^T | \theta, Y^T)$ will be obtained through $p(\xi^T | \theta, Y^T)$. We make use of the following identity that gives the basis of the Carter–Kohn (1994) state sampler:
\[
p(\xi^T | \theta, Y^T) = p(\xi_T | \theta, Y^T) \prod_{t=2}^{T-1} p(\xi_t | \theta, Y^t, \xi_{t+1})\; p(\xi_1 | \theta, Y_1, \xi_2),
\]
where the last term is isolated because initial conditions need a special treatment. A draw from $p(\xi^T | \theta, Y^T)$ can be obtained as follows:
1. Compute $\xi_{t|t}$ and $P_{t|t}$, $t = 2, \ldots, T$, via the diffuse Kalman filter (de Jong 1991).
2. Given $\xi_{T|T}$ and $P_{T|T}$, sample $\xi_T$ from $p(\xi_T | \theta, Y^T) = N(\xi_{T|T}, P_{T|T})$.
3. For $t = T - 1$ to $t = 2$, sample backward $\xi_t$ from $p(\xi_t | \theta, \xi_{t+1}, Y^t) = N(E[\xi_t | \theta, \xi_{t+1}, Y^t], V[\xi_t | \theta, \xi_{t+1}, Y^t])$.
4. Sample $\xi_1$ from $p(\xi_1 | \theta, \xi_2, Y_1) = N(E[\xi_1 | \theta, \xi_2, Y_1], V[\xi_1 | \theta, \xi_2, Y_1])$.
Steps 1 and 2 only involve classical results. Step 3 needs the conditional moments $E[\xi_t | \theta, \xi_{t+1}, Y^t]$ and $V[\xi_t | \theta, \xi_{t+1}, Y^t]$. From the joint distribution of $\xi_t$ and $\xi_{t+1}$ conditional on $\theta$ and $Y^t$, we get
\[
\begin{aligned}
E[\xi_t | \theta, \xi_{t+1}, Y^t] &= \xi_{t|t} + P_{t|t} F' P_{t+1|t}^{-1} (\xi_{t+1} - F\xi_{t|t}), \\
V[\xi_t | \theta, \xi_{t+1}, Y^t] &= P_{t|t} - P_{t|t} F' P_{t+1|t}^{-1} F P_{t|t}.
\end{aligned}
\]
Step 4 is more complicated. For $t = 1$, the preceding formula involves $\xi_{1|1}$ and $P_{1|1}$, but neither of them is available in our model. This occurs because if $d$ is the state integration order, that is, $d = 1$ in our case, the first state estimate that de Jong’s algorithm yields is $\xi_{d+1|d}$. A procedure based on Koopman (1997) for obtaining $\xi_{1|1}$ and $P_{1|1}$ is detailed in Appendix A. In our particular context, use could be made of the fact that $\xi_2$ contains $c_1$ for skipping the sampling of $\xi_{1|1}$, but such a simplification only holds when the trend integration order is 1, an assumption that will be relaxed in Section 5. The algorithm in Appendix A has the advantage of generality.
Because of the model structure, not all elements of the state need to be simulated. Trivially, given $y_t$, knowledge of $c_t$ determines $p_t$. Also, given model parameters and observations up to time $t$, sampling $c_{t-1}$ yields the last element of the state, $\kappa_\pi a_{ct} + a_{\pi t}$, through (2.3). We, thus, end up with $c_{t-1}$ as the only random element to simulate at every time period. In this situation, using a more efficient simulation smoother as proposed by de Jong and Shephard (1995) and Durbin and Koopman (2002) instead of a state sampler should not bring significant advantages.
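The forward-filtering, backward-sampling logic of the four steps in Section 3.1 can be sketched on a deliberately simplified model. The code below is an illustration only and differs from the article’s setting in several assumed ways: the state is scalar, the observation equation carries measurement noise with variance `r`, there is no drift term, and the initial state has a proper Normal prior, so the special treatment of the first period (de Jong’s diffuse filter and the Koopman-based step 4) is not needed. All names are hypothetical.

```python
import math
import random


def kalman_filter(y, f, q, r, x0_mean, x0_var):
    """Filtered means x_{t|t} and variances P_{t|t} for the assumed model
    y_t = x_t + e_t (variance r),  x_{t+1} = f * x_t + w_{t+1} (variance q)."""
    means, variances = [], []
    pred_mean, pred_var = x0_mean, x0_var
    for obs in y:
        gain = pred_var / (pred_var + r)
        filt_mean = pred_mean + gain * (obs - pred_mean)
        filt_var = (1.0 - gain) * pred_var
        means.append(filt_mean)
        variances.append(filt_var)
        pred_mean, pred_var = f * filt_mean, f * f * filt_var + q
    return means, variances


def carter_kohn_draw(y, f, q, r, x0_mean, x0_var, rng):
    """One joint draw of the state path given the parameters and the data."""
    means, variances = kalman_filter(y, f, q, r, x0_mean, x0_var)
    n = len(y)
    draw = [0.0] * n
    # Last period: sample from N(x_{T|T}, P_{T|T}).
    draw[-1] = rng.gauss(means[-1], math.sqrt(variances[-1]))
    # Backward pass: scalar version of the conditional moments in Section 3.1.
    for t in range(n - 2, -1, -1):
        pred_var = f * f * variances[t] + q            # P_{t+1|t}
        gain = variances[t] * f / pred_var             # P_{t|t} F' P_{t+1|t}^{-1}
        cond_mean = means[t] + gain * (draw[t + 1] - f * means[t])
        cond_var = variances[t] - gain * f * variances[t]
        draw[t] = rng.gauss(cond_mean, math.sqrt(cond_var))
    return draw


if __name__ == "__main__":
    rng = random.Random(0)
    data = [0.3, -0.1, 0.5, 0.2, 0.0]                  # illustrative numbers
    print(carter_kohn_draw(data, f=0.8, q=1.0, r=0.5,
                           x0_mean=0.0, x0_var=1.0, rng=rng))
```

In the article’s four-dimensional state the same recursions apply with matrices in place of scalars; only one element per period is genuinely random, as noted above.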
3.2 Sampling Model Parameters Given the State
We now turn to the second full conditional distribution, $p(\theta | c^T, p^T, Y^T) = p(\theta | c^T, p^T, \pi^T)$. Several approaches are possible: The strategy we put forward exploits model parameterization, prior block independence, and likelihood factorization in order to build three parameter blocks. Indeed, the structure of the model implies that the density $p(p^T, c^T, \pi^T | \theta)$ can be factorized as
\[
\begin{aligned}
p(p^T, c^T, \pi^T | \mu_p, V_p, A, \tau, V_c, \delta, V_\pi)
&= p(p^T | \mu_p, V_p)\, p(c^T | A, \tau, V_c) \\
&\quad \times p(\pi^T | c^T, p^T, A, \tau, V_c, \delta, V_\pi, \mu_p, V_p).
\end{aligned}
\]
Then the block independence assumption about the parameter priors implies that the full conditional $p(\theta | p^T, c^T, Y^T)$ verifies
\[
\begin{aligned}
p(\mu_p, V_p, A, \tau, V_c, \delta, V_\pi | c^T, p^T, \pi^T)
&= p(\mu_p, V_p | p^T)\, p(A, \tau, V_c | c^T) \\
&\quad \times p(\delta, V_\pi | c^T, p^T, \pi^T, A, \tau, V_c, \mu_p, V_p).
\end{aligned}
\]
We first consider the conditional distribution $p(\mu_p, V_p | p^T)$. As detailed, for instance, in Box and Tiao (1973, p. 99) and Zellner (1996, chap. 3), choosing the conjugate Normal-inverse Gamma prior yields
\[
p(\mu_p, V_p | p^T) = \mathrm{NIG}(\mu_{p*}, M_{p*}^{-1}, s_{p*}, v_{p*}),
\]
where
\[
\begin{aligned}
M_{p*} &= M_{p0} + T - 1, \\
\mu_{p*} &= M_{p*}^{-1}\left[M_{p0}\,\mu_{p0} + (T - 1)\,\hat\mu_p\right], \\
v_{p*} &= v_{p0} + T - 1, \\
s_{p*} &= s_{p0} + \left(\frac{1}{M_{p0}} + \frac{1}{T - 1}\right)^{-1} (\mu_{p0} - \hat\mu_p)^2 + \sum_{t=2}^{T} (\Delta p_t)^2 - (T - 1)\,\hat\mu_p^2,
\end{aligned}
\]
and $\hat\mu_p = \frac{1}{T-1} \sum_{t=2}^{T} \Delta p_t$.
Focusing next on $p(A, \tau, V_c | c^T)$, we consider the full conditionals $p(A | \tau, V_c, c^T)$, $p(\tau | A, V_c, c^T)$, and $p(V_c | A, \tau, c^T)$. The first two verify
\[
p(A | \tau, V_c, c^T) \propto p(c_1, c_2 | A, \tau, V_c) \prod_{t=3}^{T} p(c_t | c^{t-1}, A, \tau, V_c)\; p(A)
\]
and
\[
p(\tau | A, V_c, c^T) \propto p(c_1, c_2 | A, \tau, V_c) \prod_{t=3}^{T} p(c_t | c^{t-1}, A, \tau, V_c)\; p(\tau).
\]
Sampling directly from the preceding conditionals is not possible, but both densities are straightforward to evaluate. Given that the log-concavity of the full conditionals cannot be ensured, we use the adaptive rejection Metropolis scheme (ARMS) proposed by Gilks et al. (1995, 1997). For the distribution $p(V_c | A, \tau, c^T)$, the NIG framework implies (see Bauwens, Lubrano, and Richard 1999, p. 304)
\[
p(V_c | A, \tau, c^T) = \mathrm{IG}(s_{c*}, v_{c*}),
\]
with
\[
\begin{aligned}
s_{c*} &= s_{c0} + (c_1, c_2)\, \Sigma_c^{-1} (c_1, c_2)' + \sum_{t=3}^{T} a_{ct}^2, \\
v_{c*} &= v_{c0} + T,
\end{aligned}
\]
where $\Sigma_c$ is the variance–covariance matrix of $(c_1, c_2)$ given $A$ and $\tau$ rescaled by the innovation variance $V_c$, that is, $\Sigma_c \equiv V[(c_1, c_2) | A, \tau, V_c] / V_c$.
It remains to sample from the full conditional distribution $p(\delta, V_\pi | c^T, p^T, \pi^T, A, \tau, V_c, \mu_p, V_p)$. We can write
\[
p(\delta, V_\pi | c^T, p^T, \pi^T, A, \tau, V_c, \mu_p, V_p) \propto p(\pi_1, \pi_2 | c^T, p^T, \theta) \prod_{t=3}^{T} p(\pi_t | \pi^{t-1}, c^T, p^T, \theta)\; p(\delta, V_\pi).
\]
Let $Z$ denote the $(T - 2) \times 6$ matrix of regressors in (2.2) and let $\hat\delta$ be the ordinary least squares (OLS) estimate of $\delta$. Given our priors, standard results in Bayesian regression analysis yield
\[
\prod_{t=3}^{T} p(\pi_t | \pi^{t-1}, c^T, p^T, \theta)\; p(\delta, V_\pi) \propto \mathrm{NIG}(\delta_*, s_{\pi*}, M_{\pi*}, v_{\pi*})\, I_\delta,
\]
where
\[
\begin{aligned}
M_{\pi*} &= M_{\pi 0} + Z'Z, \\
\delta_* &= M_{\pi*}^{-1}\left[M_{\pi 0}\,\delta_0 + Z'Z\,\hat\delta\right], \\
v_{\pi*} &= v_{\pi 0} + T - 2, \\
s_{\pi*} &= s_{\pi 0} + (\pi_3^T)'\left[I_{T-2} - Z(Z'Z)^{-1}Z'\right]\pi_3^T + (\delta_0 - \hat\delta)'\left[M_{\pi 0}^{-1} + (Z'Z)^{-1}\right]^{-1}(\delta_0 - \hat\delta).
\end{aligned}
\tag{3.1}
\]
This result enables us to get draws from $p(\delta, V_\pi | c^T, p^T, \pi^T, A, \tau, V_c, V_p, \mu_p)$ through a Metropolis–Hastings scheme with the $\mathrm{NIG}(\delta_*, s_{\pi*}, M_{\pi*}, v_{\pi*})\, I_\delta$ as proposal. For each candidate, say $(\tilde\delta, \tilde V_\pi)$, the acceptance probability is given by
\[
\min\left\{1,\; \frac{p(\pi_1, \pi_2 | \tilde\delta, \tilde V_\pi, c^T, p^T, A, \tau, V_c, \mu_p, V_p)}{p(\pi_1, \pi_2 | \delta, V_\pi, c^T, p^T, A, \tau, V_c, \mu_p, V_p)}\right\}.
\tag{3.2}
\]
The Metropolis–Hastings step (3.2) removes the conditioning on the starting values $\pi_1$ and $\pi_2$. Notice that because the innovations $a_{\pi t}$ are assumed to be Gaussian, only the first two moments of $(\pi_1, \pi_2)$ conditional on $(c_1, c_2)$ and the model parameters are needed for evaluating the acceptance probability. We detail in Appendix B a Yule–Walker procedure for computing them in the bivariate model (2.1)–(2.3).
This closes the circle of simulations, the full sequence consisting of samples successively drawn from $p(\xi^T | \theta, Y^T)$, $p(\mu_p, V_p | p^T)$, $p(A | \tau, V_c, c^T)$, $p(\tau | A, V_c, c^T)$, $p(V_c | A, \tau, c^T)$, and $p(\delta, V_\pi | c^T, p^T, \pi^T, A, \tau, V_c, \mu_p, V_p)$. The Markov chain properties discussed in Tierney (1994) ensure convergence to the joint posterior $p(\xi^T, \theta | Y^T)$.
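Of the blocks in Section 3.2, the trend block $p(\mu_p, V_p | p^T)$ has a fully closed-form update, which can be sketched as follows. This is a minimal illustration, not the authors’ code. The function names are hypothetical, and the drawing step assumes the convention that the inverted Gamma with scale $s$ and $v$ degrees of freedom corresponds to $s$ divided by a chi-square variate with $v$ degrees of freedom. That convention should be checked against Bauwens, Lubrano, and Richard (1999) before reuse.

```python
import math
import random


def nig_posterior_trend(dp, mu0, m0, s0, v0):
    """Posterior hyperparameters of (mu_p, V_p) given the trend increments.

    dp holds the T-1 values Delta p_t for t = 2, ..., T.
    Returns (mu_star, m_star, s_star, v_star)."""
    n = len(dp)                                        # n = T - 1
    mu_hat = sum(dp) / n                               # sample mean of increments
    m_star = m0 + n
    mu_star = (m0 * mu0 + n * mu_hat) / m_star
    v_star = v0 + n
    s_star = (s0
              + (mu0 - mu_hat) ** 2 / (1.0 / m0 + 1.0 / n)
              + sum(x * x for x in dp) - n * mu_hat ** 2)
    return mu_star, m_star, s_star, v_star


def draw_nig(mu_star, m_star, s_star, v_star, rng):
    """Draw (mu_p, V_p) under the assumed convention V ~ s / chi2(v),
    followed by mu | V ~ N(mu_star, V / m_star)."""
    chi2 = rng.gammavariate(v_star / 2.0, 2.0)         # chi-square, v_star d.o.f.
    variance = s_star / chi2
    mean = rng.gauss(mu_star, math.sqrt(variance / m_star))
    return mean, variance


if __name__ == "__main__":
    rng = random.Random(0)
    increments = [0.8, 0.6, 1.1, 0.4, 0.9]             # illustrative numbers only
    post = nig_posterior_trend(increments, mu0=0.75, m0=1.0, s0=1.0, v0=3.0)
    print(post, draw_nig(*post, rng))
```

Within the Gibbs sampler this update would be applied at each sweep to the increments of the most recently sampled trend path.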
3.3 The Variance Moderation in U.S. GDP Growth Rate
A few years after the publication of Kuttner’s article in 1994,
several researchers pointed out a substantial reduction in the
U.S. output growth volatility during the mid-1980s; see, for instance, Kim and Nelson (1999), McConnell and Perez-Quiros
(2000), and the discussion in Stock and Watson (2002). These
authors reported evidence of a decrease in the magnitude of
shocks to the U.S. economy of about one half to two thirds
compared to the previous two decades. The dynamic properties of the cyclical movements were unaffected. Blanchard and
Simon (2001) also observed a concurrent reduction in inflation
volatility, in a strong relationship with output fluctuations. Several explanations for this shrinking of the U.S. business cycle have been proposed; interested readers are referred to Stock and Watson (2002). The variance moderation is worth inserting into the modeling because it eventually yields gap estimates with a reduced uncertainty. This only needs a relatively simple amendment to Kuttner’s original model. In particular, we let the innovation variances verify
\[
V_{mt} =
\begin{cases}
V_m, & t < t_B, \\
\alpha_m V_m, & t \ge t_B,
\end{cases}
\qquad m = p, c, \pi.
\tag{3.3}
\]
The time index $t_B$ refers to the date of the change in the variance parameters. For the U.S. economy, the break is typically deemed to have occurred during the first quarter of 1984. As this date is the subject of a broad consensus, we do not discuss here the dating issue. Rather, we focus on the Bayesian analysis of Kuttner’s model with variance shifts. Equation (3.3) introduces three additional parameters, $\alpha_p$, $\alpha_c$, and $\alpha_\pi$, for which we assume the independent priors:
\[
\frac{\alpha_m - \alpha_m^l}{\alpha_m^u - \alpha_m^l} \sim \mathrm{Beta}(a_{\alpha_m}, b_{\alpha_m}), \qquad m = p, c, \pi.
\]
Because we expect a variance reduction, the bounds of the support of the $\alpha_m$ distribution verify $0 < \alpha_m^l < \alpha_m^u \le 1$. For drawing the state given the parameters, the Carter–Kohn state sampler described in Section 3.1 still applies with the shock variance–covariance matrix $Q$ appropriately corrected. For sampling parameters given the state, the independence assumption on the $\alpha$’s prior distribution preserves the three-block structure, the blocks of interest becoming $p(\mu_p, V_p, \alpha_p | p^T)$, $p(A, \tau, V_c, \alpha_c | c^T)$, and $p(\delta, V_\pi, \alpha_\pi | p^T, c^T, \pi^T, \mu_p, V_p, A, \tau, \alpha_p, \alpha_c)$. Let us first focus on the trend parameters. The full conditional distribution $p(\mu_p, V_p, \alpha_p | p^T)$ can be factorized according to
\[
p(\mu_p, V_p, \alpha_p | p^T) = p(\mu_p, V_p | \alpha_p, p^T) \times p(\alpha_p | p^T),
\]
where the second term verifies
\[
p(\alpha_p | p^T) \propto p\!\left(\Delta p_{t_B}^{T} \,\middle|\, \Delta p_2^{t_B-1}, \alpha_p\right) p(\alpha_p)
\]

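because $\Delta p_2^{t_B-1}$ does not depend on $\alpha_p$. Let $I_t$ denote the identity matrix of dimension $t$, $\mathbf{1}_{t,k}$ a $(t \times k)$ matrix of 1’s, and, for simplifying exposition, $\mathbf{1}_t \equiv \mathbf{1}_{t,t}$. The NIG prior assumption (2.5) implies
\[
\begin{pmatrix} \Delta p_2^{t_B-1} \\ \Delta p_{t_B}^{T} \end{pmatrix} \Bigg|\; \alpha_p, V_p
\sim N\!\left(
\begin{pmatrix} \mu_{p0}\,\mathbf{1}_{t_B-2,1} \\ \mu_{p0}\,\mathbf{1}_{T-t_B+1,1} \end{pmatrix},\;
V_p \begin{pmatrix}
I_{t_B-2} + \dfrac{\mathbf{1}_{t_B-2}}{M_{p0}} & \dfrac{\mathbf{1}_{t_B-2,\,T-t_B+1}}{M_{p0}} \\[1.5ex]
\dfrac{\mathbf{1}_{T-t_B+1,\,t_B-2}}{M_{p0}} & \alpha_p I_{T-t_B+1} + \dfrac{\mathbf{1}_{T-t_B+1}}{M_{p0}}
\end{pmatrix}
\right).
\tag{3.4}
\]
Conditioning over $\Delta p_2^{t_B-1}$ and $\alpha_p$ and marginalizing with respect to $V_p$ yields a multivariate Student distribution with parameters (see Bauwens et al. 1999, p. 304):
\[
\Delta p_{t_B}^{T} \,\big|\, \Delta p_2^{t_B-1}, \alpha_p
\sim t\!\left( E\!\left[\Delta p_{t_B}^{T} | \Delta p_2^{t_B-1}, \alpha_p\right],\;
V\!\left[\Delta p_{t_B}^{T} | \Delta p_2^{t_B-1}, \alpha_p\right]^{-1},\;
s_{p*},\; v_{p0} + t_B - 2 \right),
\]
where
\[
s_{p*} = s_{p0} + \left(\Delta p_2^{t_B-1} - \mu_{p0}\,\mathbf{1}_{t_B-2,1}\right)' \left[I_{t_B-2} + M_{p0}^{-1}\,\mathbf{1}_{t_B-2}\right]^{-1} \left(\Delta p_2^{t_B-1} - \mu_{p0}\,\mathbf{1}_{t_B-2,1}\right),
\]
and, in this context, $E[\Delta p_{t_B}^{T} | \Delta p_2^{t_B-1}, \alpha_p] = E[\Delta p_{t_B}^{T} | \Delta p_2^{t_B-1}, \alpha_p, V_p]$ and $V[\Delta p_{t_B}^{T} | \Delta p_2^{t_B-1}, \alpha_p] = V[\Delta p_{t_B}^{T} | \Delta p_2^{t_B-1}, \alpha_p, V_p] / V_p$. Both moments are straightforwardly available from the joint Normal distribution given in (3.4). Samples of $\alpha_p$ given $p^T$ can be obtained using ARMS with the kernel of $p(\alpha_p | p^T)$ evaluated as the product of this multivariate Student times the Beta prior on $\alpha_p$. Next, given $\alpha_p$ and $p^T$, samples of $\mu_p$ and $V_p$ can be obtained similarly to Section 3.2 after rescaling the trend equation (2.1) by $\sqrt{\alpha_p}$ in order to make the residuals homoscedastic.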
For the cycle parameters, we consider the Gibbs scheme $p(A | \tau, V_c, \alpha_c, c^T)$, $p(\tau | A, V_c, \alpha_c, c^T)$, and $p(V_c, \alpha_c | A, \tau, c^T)$. Samples of $A$ and $\tau$ from the first two conditionals can be obtained as in Section 3.2 with a correction of the likelihood of $c^T$ to account for the variance change. The factorization
\[
p(V_c, \alpha_c | A, \tau, c^T) = p(V_c | \alpha_c, A, \tau, c^T)\, p(\alpha_c | A, \tau, c^T)
\]
shows that $V_c$ can also be sampled as in Section 3.2 using residuals rescaled by $\sqrt{\alpha_c}$. The sampling of $\alpha_c$ requires instead more attention. A simple solution is to use ARMS with the kernel of $p(\alpha_c | \tau, A, c^T)$ evaluated as $p(\alpha_c | \tau, A, c_1, c_2, a_{c,3}^{T}) \propto p(a_{c,t_B}^{T} | c_1, c_2, a_{c,3}^{t_B-1}, \tau, A, \alpha_c) \times p(\alpha_c)$ where, from the NIG framework,
\[
p\!\left(a_{c,t_B}^{T} \,\middle|\, c_1, c_2, a_{c,3}^{t_B-1}, \tau, A\right) \sim t\!\left(0,\; I_{T-t_B}/\alpha_c,\; s_{c*},\; v_{c0} + t_B - 1\right),
\]
with $s_{c*} = s_{c0} + (c_1, c_2)' \Sigma_c^{-1} (c_1, c_2) + \sum_{t=3}^{t_B-1} a_{ct}^2$.
It remains to sample the Phillips curve parameters. Here we condition on $\alpha_\pi$ in a further Gibbs step $p(\delta, V_\pi | \alpha_\pi, p^T, c^T, \pi^T, \mu_p, V_p, A, \tau, \alpha_p, \alpha_c)$ and $p(\alpha_\pi | \delta, V_\pi, p^T, c^T, \pi^T, \mu_p, V_p, A, \tau, \alpha_p, \alpha_c)$. Rescaling the Phillips curve equation (2.3) by $\sqrt{\alpha_\pi}$ lets the sampling of $\delta$ and $V_\pi$ remain unchanged with respect to Section 3.2. Finally, the sampling of $\alpha_\pi$, given all other quantities, can be made with ARMS by observing that
\[
p(\alpha_\pi | \delta, V_\pi, p^T, c^T, \pi^T, \mu_p, V_p, A, \tau, \alpha_p, \alpha_c) \propto p(a_{\pi t_B}, \ldots, a_{\pi T} | V_\pi, \alpha_\pi)\, p(\alpha_\pi),
\]
where the first term is the Normal distribution with mean 0 and variance–covariance matrix $V_\pi \alpha_\pi I_{T-t_B}$.
This scheme presents the advantage of sampling jointly from $p(\mu_p, V_p, \alpha_p | p^T)$ and $p(V_c, \alpha_c | A, \tau, c^T)$ without further conditioning for a greater efficiency of posterior estimates. This strategy is, however, not applicable to the sampling of $\alpha_\pi$ because of the autoregressive parameters in $\delta$.
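The kernel used for $\alpha_\pi$ is simple enough to write down explicitly. The sketch below only evaluates the log kernel, that is, the Gaussian log-likelihood of the post-break innovations plus the log of the scaled Beta prior. It does not implement ARMS, which the article uses to sample from this kernel. The function names, the default support bounds, and the residuals in the usage lines are assumptions made for illustration.

```python
import math


def log_beta_pdf(x, a, b):
    """Log density of a Beta(a, b) variate at x, with support (0, 1)."""
    if not 0.0 < x < 1.0:
        return float("-inf")
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def log_kernel_alpha(alpha, residuals, v_pi, a, b, lower=0.0, upper=1.0):
    """Log of p(a_{pi,tB}, ..., a_{pi,T} | V_pi, alpha) * p(alpha), up to a constant.

    residuals are the post-break innovations; the prior is Beta(a, b) on
    (alpha - lower) / (upper - lower)."""
    if not lower < alpha < upper:
        return float("-inf")
    n = len(residuals)
    sum_sq = sum(r * r for r in residuals)
    # Gaussian log-likelihood with common variance alpha * V_pi.
    log_lik = -0.5 * n * math.log(alpha * v_pi) - sum_sq / (2.0 * alpha * v_pi)
    scaled = (alpha - lower) / (upper - lower)
    return log_lik + log_beta_pdf(scaled, a, b)


if __name__ == "__main__":
    post_break = [0.4, -0.7, 0.2, 0.9, -0.5, 0.1]      # illustrative residuals
    grid = [i / 100.0 for i in range(1, 100)]
    values = [log_kernel_alpha(al, post_break, 1.0, 2.0, 2.0) for al in grid]
    print("grid point with the highest kernel value:", grid[values.index(max(values))])
```

A function of this kind is what a univariate sampler such as ARMS would call repeatedly. With a flat Beta(1, 1) prior the kernel peaks at the post-break mean squared innovation divided by $V_\pi$, which gives a quick sanity check.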
4. EMPIRICAL APPLICATION

We now illustrate our methodology with a Bayesian investigation of the Phillips curve-based output gap in the United States and in the euro area. The U.S. data have been downloaded from the U.S. Bureau of Economic Analysis website. The euro-zone data are from the OECD. For both cases, the sample is made up of 141 quarterly observations from 1970-3 to 2005-3. Longer series are available for the U.S. data, but imposing the same time span facilitates comparisons. GDP series
are in constant prices. The inflation rate is measured on the basis
of the consumer price index (CPI). The OECD CPI series has
only been available since 1990-1; for the period 1970–1989,
we linked it to the CPI series of the euro-zone dataset (Fagan,
Henry, and Mestre 2001). We also experimented with the use
of the GDP deflator, but we found the relationship between
the CPI-based inflation rate and real GDP stronger at cyclical frequencies. This agrees with a comment by Kuttner (1994,
p. 363). There was some evidence of moderate seasonal movements in the euro-zone price index; we removed them with
the seasonal adjustment program Tramo–Seats (see Gomez and
Maravall 1996). Finally, an additive outlier was detected in the
EU-12 CPI series at the date 1975-1. We, thus, added a dummy
variable to (2.3). The Bayesian analysis of the associated parameter is similar to that of the other coefficients of (2.3). Inflation
and the logarithm of GDP have been multiplied by 100.
For setting the hyperparameters of the prior distributions, we
consider the information available from macroeconomic theory
and from business cycle knowledge. For the U.S. case, we take
into account Kuttner’s results, which characterize fluctuations
in the U.S. economy as occurring with a 5-year recurrence and
with a contraction factor around .8. Comparable values are obtained in the univariate analysis of Harvey et al. (2002) with the
so-called first-order cycle and noninformative priors. We could
observe that a less informative prior yields similar posterior
modes with more dispersion. For EU-12, we consider the results
in Gerlach and Smets (1999) that describe euro-area cycle with
a contraction factor close to that of U.S. cycles but with a longer
periodicity, namely, about 8 years. We, thus, center the first moment of the τ and A prior distributions on these values without
imposing too much precision. Specifically, our priors are (τ −
2)/(141 − 2) ∼ Beta(2.44, 16.35) and A ∼ Beta(5.82, 2.45) for
the United States and (τ − 2)/(141 − 2) ∼ Beta(2.96, 10.70)
and A ∼ Beta(5.82, 2.45) for the EMU. The support for τ is set
to [2, 141] because 2 is the minimum periodicity and 141 is the
number of observations available. Next, we tune the prior distribution on the trend drift in order to reproduce an annual mean
growth rate of about 3% for the United States and 2.5% for the
EMU, with an associated variance that allows for an annual deviation of roughly 1.5 percentage points. The prior distribution
of the cycle innovation variance has been set so as to add a further 2.0-percentage-point deviation on the annual growth. For
the United States, we tune the prior on the scale parameters αc
and αp so as to get an expected value about .3 with associated
standard deviation .15. More flexibility is assigned to the parameter απ : The expected value is tuned to .5 for a standard
deviation at .2. We set the break date at 1984-1, in agreement
with Kim and Nelson (1999) and McConnell and Perez-Quiros
(2000). Euro-zone data do not need this extension as there is no
clear evidence of such a variance moderation.
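The link between the Beta hyperparameters quoted above and the stated prior beliefs can be checked with a few lines of arithmetic. The sketch below is illustrative: the function names are hypothetical, and the method-of-moments helper assumes a Beta distribution on the full unit interval, whereas the article does not report here the support bounds it used for the scale parameters $\alpha_m$.

```python
import math


def beta_mean(a, b):
    return a / (a + b)


def beta_sd(a, b):
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0)))


def scaled_beta_mean(a, b, lower, upper):
    """Mean of x when (x - lower) / (upper - lower) ~ Beta(a, b)."""
    return lower + (upper - lower) * beta_mean(a, b)


def beta_from_moments(mean, sd):
    """Beta(a, b) on (0, 1) matching a target mean and standard deviation."""
    common = mean * (1.0 - mean) / sd ** 2 - 1.0
    if common <= 0.0:
        raise ValueError("sd too large for a Beta distribution with this mean")
    return mean * common, (1.0 - mean) * common


if __name__ == "__main__":
    # Prior mean of the periodicity, in years, from the hyperparameters in the text.
    print("U.S.:", scaled_beta_mean(2.44, 16.35, 2.0, 141.0) / 4.0)
    print("EMU: ", scaled_beta_mean(2.96, 10.70, 2.0, 141.0) / 4.0)
    # Hypothetical tuning for a scale parameter with mean .3 and sd .15 on (0, 1).
    print(beta_from_moments(0.3, 0.15))
```

Under these formulas the prior mean of $\tau$ is about 20 quarters (5 years) for the United States and about 32 quarters (8 years) for the EMU, in line with the recurrence values cited from Kuttner and from Gerlach and Smets.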
Prior information is also available for the Phillips curve equation. In particular, inflation is expected to react positively to the
lagged output gap and to the lagged output growth. The contemporaneous correlation between shocks in gap and shocks
in inflation should also be positive. We, thus, impose a positive support for the distribution of the parameters β, λ, and κπ
through the index set Iδ. The other Phillips curve parameters
are set diffusely around 0. The prior densities of all quantities
of interest are depicted in Figures 1–3.
Having set the hyperparameters, we obtain draws from the
joint posterior distribution of gap and model parameters following the Markov chain Monte Carlo (MCMC) scheme previously
detailed. The chain runs 200,000 times after a burn-in phase
of 100,000 iterations. Simulations are recorded every 10 iterations, so all statistics reported are computed on samples of size
20,000. For a selection of variables of interest, Table 1 reports
the sample mode and standard deviation, the autocorrelations
at lags 1 and 5, the numerical standard error (NSE) associated
with the sample mean, the relative numerical efficiency (RNE),
and the p values of the Geweke (1992) convergence diagnostic (CD). The NSE is computed with a window on autocorrelations of length 400. The RNE is obtained as the ratio of the
NSE evaluated using only the output variance to the one that
takes into account the sample autocorrelations. Geweke’s CD
checks whether the average of the first 20% of simulations is
significantly different from the average of the last 50%. The selection
of variables we focus on is made up of the cycle periodicity and
amplitude, the inverse signal-to-noise ratio Vc /Vp , the output
mean growth μp , the contemporaneous correlation of the innovations in gap with those in inflation ρcπ , the impact multipliers
β and λ, the scale parameters αm ’s for the U.S. application, and
the gap estimated at the last sample date.
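
The diagnostics described above can be sketched as follows. This is a minimal illustration, not the authors' code: the text only states that the NSE uses a window of length 400 on the autocorrelations, so the Bartlett (triangular) taper used here is an assumption, and the function names are ours. The RNE is computed as the ratio of variances of the sample mean, the usual Geweke convention.

```python
import math

import numpy as np


def nse(draws, window=400):
    """Numerical standard error of the sample mean of a chain.

    Assumption: autocovariances up to `window` lags are weighted with a
    Bartlett (triangular) taper; the source does not name the taper.
    """
    x = np.asarray(draws, dtype=float)
    n = x.size
    xc = x - x.mean()
    lags = min(window, n - 1)
    gamma = np.array([xc[: n - k] @ xc[k:] / n for k in range(lags + 1)])
    weights = 1.0 - np.arange(1, lags + 1) / (lags + 1)
    long_run_var = gamma[0] + 2.0 * np.sum(weights * gamma[1:])
    return math.sqrt(long_run_var / n)


def rne(draws, window=400):
    """Relative numerical efficiency: variance of the mean under independence
    divided by the variance that accounts for autocorrelation."""
    x = np.asarray(draws, dtype=float)
    return (x.var() / x.size) / nse(x, window) ** 2


def geweke_p_value(draws, first=0.2, last=0.5, window=400):
    """Two-sided p value comparing the means of the first and last segments."""
    x = np.asarray(draws, dtype=float)
    head = x[: int(first * x.size)]
    tail = x[x.size - int(last * x.size):]
    z = (head.mean() - tail.mean()) / math.sqrt(
        nse(head, window) ** 2 + nse(tail, window) ** 2
    )
    return math.erfc(abs(z) / math.sqrt(2.0))
```

An RNE of .5 then means that the chain estimates the posterior mean about as precisely as an independent sample of half its size.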
The autocorrelations between draws are rather low: after
only five lags, the largest value is about .03 in absolute
value. The NSE takes values of magnitude 10^{-1} for the periodicity, 10^{-2} for the cycle point estimates and the signal-to-noise
ratio, and less than 10^{-2} for all other parameters. Hence, the
number of draws seems sufficient to estimate the posterior distribution and the moments of every quantity of interest with
a fair accuracy. Because the RNE values are almost always
above 50%, this precision is comparable to what we would have
achieved with independent samples of half-size. Only for the
U.S. second-period inverse signal-to-noise ratio does the RNE
take the relatively low value of 40%. The Geweke tests never
reject at the 5% level, so the diagnostics give no evidence against chain convergence.
We, thus, turn to analysis of the results.
Figure 1 displays prior and posterior distributions of periodicity and amplitude of GDP cycles in both the United States and
the EMU. It can be seen that the periodicity posterior mode is
about 7 years for the United States and about 9 years for the
EMU. Positive skewness makes posterior means slightly larger.
For both the United States and the EMU, the posterior mode is
a bit larger than that of the prior. This may reflect the relatively long expansion of the 1990s. There seems to be more
precision around the periodicity of cycles in the United States
than in the EMU, but this is mainly due to the difference between priors: Using the EU prior for U.S. data yields analogous
dispersion. Even if the posterior mode of the U.S. cycle periodicity is stable enough, the data appear to be only moderately
informative for this parameter. More precision is obtained for
the cycle amplitude parameter. With a mode of .8 and a comparable dispersion, the posterior distributions of U.S. and EMU
cycle amplitude seem much alike. This suggests that a similar
number of quarters is necessary to both economies for absorbing a demand shock.
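
To make the link between the amplitude A and the speed of shock absorption concrete, the sketch below maps (A, τ) to the AR(2) coefficients of the cycle equation reported in Table 1 and computes the number of quarters after which the cycle envelope A^k has halved. The function names and the half-life summary are ours, added for illustration.

```python
import cmath
import math


def cycle_ar2(A, tau):
    """AR(2) coefficients of c_t = phi1 * c_{t-1} + phi2 * c_{t-2} + a_ct."""
    phi1 = 2.0 * A * math.cos(2.0 * math.pi / tau)
    phi2 = -A ** 2
    return phi1, phi2


def cycle_roots(A, tau):
    """Roots of z^2 - phi1 * z - phi2 = 0; they equal A * exp(+-2*pi*i/tau)."""
    phi1, phi2 = cycle_ar2(A, tau)
    disc = cmath.sqrt(phi1 ** 2 + 4.0 * phi2)
    return (phi1 + disc) / 2.0, (phi1 - disc) / 2.0


def half_life(A):
    """Quarters needed for the damping envelope A**k to fall to one half."""
    return math.log(0.5) / math.log(A)


for label, A, tau in [("US", 0.82, 27.18), ("EMU", 0.83, 35.21)]:
    root, _ = cycle_roots(A, tau)
    print(label, "modulus", round(abs(root), 3),
          "half-life (quarters)", round(half_life(A), 2))
```

With the posterior modes of A from Table 1, the envelope halves in roughly three and a half quarters for both economies, which is one way to read the statement that they absorb a demand shock at a similar pace.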

Journal of Business & Economic Statistics, January 2008

Table 1. MCMC efficiency and convergence diagnostics

Model specification
GDP:        $y_t = p_t + c_t$,  $\Delta p_t = \mu_p + a_{pt}$,  $c_t - 2A\cos\{2\pi/\tau\}\, c_{t-1} + A^2 c_{t-2} = a_{ct}$
Inflation:  $\phi_\pi(L)\,\pi_t = \mu_\pi + \beta c_{t-1} + \lambda \Delta y_{t-1} + a^*_{\pi t}$

Variable         Mode     Sd.     ρ1     ρ5    NSE (×10^{-2})    RNE     CD

United States: variance break at 1984-1
τ               27.18    8.36    .02    .00        4.89         1.46    .44
A                 .82     .05    .20    .01         .06          .46    .74
μp                .76     .04    .00    .00         .03         1.07    .23
αcVc/αpVp         .32     .42    .37    .03         .50          .36    .98
ρcπ               .14     .09    .02    .00         .06         1.07    .45
λ                 .04     .03    .01    .00         .02         1.45    .53
β                 .04     .02    .06    .00         .02          .81    .54
αp                .18     .08    .06    .00         .07          .73    .36
αc                .15     .09    .07    .01         .06          .96    .47
απ                .53     .12    .01    .00         .09          .80    .98
c2005-3           .50     .86    .00    .01         .71          .74    .68

EMU
τ               35.21   12.67    .00    .00        8.15         1.21    .79
A                 .83     .05    .10    .00         .04          .99    .58
μp                .59     .04    .00    .00         .03          .63    .40
Vc/Vp             .33     .19    .21    .01         .16          .73    .84
ρcπ               .26     .12    .03    .00         .09          .93    .45
λ                 .08     .04    .00    .00         .02         1.23    .62
β                 .04     .02    .05    .01         .01         1.16    .07
c2005-3          −.44    1.11    .01    .00         .69         1.27    .63

NOTE: ρcπ is the correlation between a_ct and a_πt; Sd. stands for standard deviation; ρj represents the lag-j autocorrelation; NSE is the
numerical standard error of the mean; RNE is relative numerical efficiency; and CD denotes the p value of the Geweke convergence statistic.

Figure 1. Densities of cycle periodicity and amplitude (- - prior; — posterior).

Planas, Rossi, and Fiorentini: Bayesian Analysis of Output Gap

Figure 2. Densities of correlation between innovations in gap and in inflation, of inverse signal-to-noise ratio, and of output mean growth rate
(- - prior; — posterior).

Figure 2 shows the prior and posterior distributions of
the correlation coefficient ρcπ , the inverse signal-to-noise ratio Vc /Vp , and the output mean growth rate μp . For the United
States, the inverse signal-to-noise ratio reported is that obtained
in the second sample period, that is, αc Vc /(αp Vp ). As can be
seen, besides imposing a positive correlation, our priors are
diffuse enough. The posterior distribution of ρcπ is highly concentrated around a mode of about .15 for the United States
and about .25 for the EMU. For both the United States and the
EMU, the variance of the shocks in the gap is roughly one third
of the variance of the long-term shocks, and this is obtained
with a good accuracy: Shocks on potential output are, thus,
dominating these two GDP series. Notice that the posterior distributions put no weight on Vc = 0 and Vp = 0. Hence, the
pile-up problem mentioned in the Introduction does not occur.

Figure 2 also shows the distribution of the output mean growth
rate: the posterior modes imply an average annual growth rate
of about 3.0% for the United States and 2.4% for the EMU.
Figure 3 displays the prior and posterior densities of the scale
parameters α’s. The standard deviations of both trend and cycle
innovations are reduced by slightly more than one half, making the signal-to-noise ratio almost constant. The moderation
of the inflation shocks is less pronounced, of order one third.
The remarkable posterior precision around the scale parameters
confirms the relevance of the variance break specification for
U.S. GDP and inflation.
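
The orders of magnitude quoted in this discussion can be reproduced from the posterior modes in Table 1. The sketch below rests on two assumptions that are ours: each α multiplies the innovation variance after the break, so that standard deviations scale with the square root of α, and μp is a quarterly growth rate in percent, so that the annual rate is four times μp.

```python
import math

# Posterior modes read from Table 1 (United States).
alpha_modes = {"alpha_p": 0.18, "alpha_c": 0.15, "alpha_pi": 0.53}

# Assumption: alpha rescales the variance, hence sqrt(alpha) rescales the
# standard deviation; the reduction is one minus that ratio.
sd_reduction = {name: 1.0 - math.sqrt(a) for name, a in alpha_modes.items()}

# Assumption: mu_p is expressed in percent per quarter.
annual_growth = {"US": 4 * 0.76, "EMU": 4 * 0.59}

for name, red in sd_reduction.items():
    print(f"{name}: standard deviation reduced by {100 * red:.0f}%")
for area, g in annual_growth.items():
    print(f"{area}: annual mean growth of about {g:.1f}%")
```

Under these assumptions the trend and cycle standard deviations fall by somewhat more than one half and the inflation shock standard deviation by a bit less than one third, in line with the text.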
Figure 3. Densities of scale factors (- - prior; — posterior).

Figure 4 presents the prior and posterior distributions of
the impact multipliers β and λ. The posterior distributions put
small weight in the neighborhood of 0 so the data seem to validate the prior restriction on positive responses. More constancy
can be seen in the posterior response of inflation to shocks
on the gap than to shocks on the output growth. Because the
gap and output growth mainly differ in the periodicity of their
movements, it may be the noise amplification that the differencing operation implies that has obscured the empirical link
with inflation. This result suggests that Phillips curve regressions can be sensitive to the dynamic properties of the proxy
used for describing detrended output, as also discussed in Gali
et al. (2001).

Figure 4. Densities of impact multipliers (- - prior; — posterior).
Figure 5 shows the dynamic response of inflation to a shock
on the gap with associated 90% highest posterior density region (HPD; see Box and Tiao 1973, chap. 2.8). This impulse response function is obtained as the posterior mean of
the polynomial {β + λ(1 − L) + κπ φc (L)}/{φc (L)φπ (L)}. The
total inflationary effect of a shock on the gap is then {β +
κπ φc (1)}/{φc (1)φπ (1)}; its posterior distribution is displayed
in Figure 6, together with the distribution implied by the parameter priors. It can be seen that, besides positiveness, we have
not constrained the shape of the response by very much. The
dynamics of the propagation of a shock on the gap is similar in
the United States and in the EMU: After 1 year, the inflationary effects are not significantly different from 0 at the 10%
level. Besides the contemporaneous reaction related to the correlation parameter, the largest impact on inflation of a shock on
the gap occurs with a three-quarter delay. The total response of
inflation is slightly stronger in the euro area than in the United
States. According to posterior modes, a 1-percentage-point rise
in the gap in one quarter yields an inflation pressure of about
.45 points for the EMU against .38 for the United States. There
is, however, more uncertainty for the EMU than for the United
States.
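
The impulse response and the total effect described above can be computed by expanding the ratio of lag polynomials, taken here exactly as printed in the text. The sketch below does this for a single parameter draw; averaging over posterior draws would give the posterior mean. The numerical values of κπ and of φπ(L) are hypothetical placeholders chosen only to make the example run, and the function names are ours.

```python
import math

import numpy as np


def impulse_response(num, den, horizon):
    """First `horizon` coefficients of the power series num(L) / den(L)."""
    psi = np.zeros(horizon)
    for k in range(horizon):
        acc = num[k] if k < len(num) else 0.0
        for j in range(1, min(k, len(den) - 1) + 1):
            acc -= den[j] * psi[k - j]
        psi[k] = acc / den[0]
    return psi


def gap_shock_response(beta, lam, kappa_pi, phi_c, phi_pi, horizon=40):
    """Expand {beta + lam(1 - L) + kappa_pi phi_c(L)} / {phi_c(L) phi_pi(L)}."""
    num = kappa_pi * np.asarray(phi_c, dtype=float)
    num[0] += beta + lam
    num[1] -= lam
    den = np.convolve(phi_c, phi_pi)
    return impulse_response(num, den, horizon)


def total_response(beta, kappa_pi, phi_c, phi_pi):
    """Long-run effect {beta + kappa_pi phi_c(1)} / {phi_c(1) phi_pi(1)}."""
    c1, p1 = float(np.sum(phi_c)), float(np.sum(phi_pi))
    return (beta + kappa_pi * c1) / (c1 * p1)


A, tau = 0.82, 27.18                      # posterior modes from Table 1 (US)
phi_c = [1.0, -2.0 * A * math.cos(2.0 * math.pi / tau), A ** 2]
phi_pi = [1.0, -0.3]                      # hypothetical AR polynomial
psi = gap_shock_response(0.04, 0.04, 0.1, phi_c, phi_pi, horizon=600)
print(psi[:8], total_response(0.04, 0.1, phi_c, phi_pi))
```

Because the term λ(1 − L) vanishes at L = 1, the sum of the impulse response coefficients converges to the total effect, which gives a convenient consistency check.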

Figure 7 shows the posterior mean of the gap all along the
sample together with the maximum likelihood (ML) estimates.
For both the United States and EMU, the Bayesian and the ML
estimates are very close to each other. The sequence of turning
points suggests a delay in the fluctuations of the EMU economy
with respect to the United States. For instance, the four highest
peaks in the U.S. data occur in 1973-2, 1978-4, 1989-2, and
2000-2 and are followed by peaks in the EMU data in 1974-1,
1980-1, 1990-4, and 2000-3. Figure 7 also displays the 90%
HPD interval and the 90% ML confidence bands computed with
the Ansley–Kohn (1986) procedure that accounts for parameter uncertainty. Both intervals are centered around the null hypothesis of a zero-gap estimate. The effect of the U.S. variance
break is quite evident: The confidence interval shrinks by one
half in the second period. With respect to the constant parameter
model, the variance change yields a noticeable reduction of uncertainty after the break. For the EMU and the United States in
the last period, ML and Bayesian analysis yield a similar accuracy, with a slight advantage to the Bayesian approach. Figure 8
shows the posterior distribution of the gap estimate for the third
quarter of 2005, the last point of the sample, together with the
asymptotic distribution of the ML gap estimate. It can be seen
that for both the EMU and the United States, the use of prior information implies a nonnegligible gain in accuracy, even though
our priors were not very restrictive. This illustrates the usefulness of the Bayesian framework for characterizing unobserved
patterns when evidence in the data is only moderate.
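
For readers who wish to reproduce intervals of this kind from MCMC output, a 90% HPD interval can be approximated from the draws as the shortest interval containing 90% of them. The sketch below assumes a unimodal posterior; it is an illustration with names of our choosing, not the authors' implementation.

```python
import numpy as np


def hpd_interval(draws, mass=0.90):
    """Shortest interval containing a fraction `mass` of the draws.

    Assumption: the posterior is unimodal, so the HPD region is an interval.
    """
    x = np.sort(np.asarray(draws, dtype=float))
    n = x.size
    m = int(np.ceil(mass * n))            # number of draws inside the interval
    widths = x[m - 1:] - x[: n - m + 1]   # width of every candidate interval
    i = int(np.argmin(widths))
    return x[i], x[i + m - 1]
```

Applied to the recorded draws of the gap at a given date, the function returns the lower and upper bounds of the region plotted around the posterior mean.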
Figure 5. Average response of change in inflation to a shock on output gap with 90% HPD region.

Figure 6. Posterior density of total response of change in inflation to a shock on output gap (- - prior; — posterior).

Figure 7. Gap posterior mean (—) with 90% HPD region; ML estimate (- - -) with 90% confidence bands.

Figure 8. Densities of output gap in 2005-3: Bayesian (—) versus ML (- - -).

5. MODEL EXTENSIONS
Model (2.1)–(2.5) can also be used for estimating long-term unemployment or the NAIRU. In this case, the trend equation
in (2.1) needs to be modified as unemployment series are typically better described as a second-order random walk than as a
random walk plus drift. The trend equation, thus, becomes
\[
\Delta p_t = \mu_{p,t-1} + a_{pt}, \qquad \Delta \mu_{pt} = a_{\mu t}, \tag{5.1}
\]
where $a_{\mu t}$ is a Gaussian white noise with variance $V_\mu$ that is orthogonal to all other innovations. Because the stationary
transformation of $y_t$ is now $\Delta^2 y_t$, the output growth regressor in the
Phillips curve equation (2.2) or (2.3) must be replaced by $\Delta^2 y_t$.
The cycle equation (2.4) is left unchanged. The state space representation in Section 3.1 must be modified, the state becoming
$\xi_t = (p_t, \mu_t, c_t, c_{t-1}, \kappa_\pi a_{ct} + a_{\pi t})'$ with associated noise vector
$w_t = (a_{pt}, a_{\mu t}, a_{ct}, 0, \kappa_\pi a_{ct} + a_{\pi t})'$. For the prior on the trend
parameters, we consider
\[
p(V_p) = \mathrm{IG}(s_{p0}, v_{p0}), \qquad p(V_\mu) = \mathrm{IG}(s_{\mu 0}, v_{\mu 0}).
\]
The full conditional of interest is $p(V_p, V_\mu \mid p^T, \mu_p^T)$. The likelihood factorization
$p(p^T, \mu_p^T \mid V_p, V_\mu) = p(p^T \mid \mu_p^T, V_p)\, p(\mu_p^T \mid V_\mu)$
and the prior independence assumption allow us to consider
$p(V_p \mid p^T, \mu_p^T)$ and $p(V_\mu \mid \mu_p^T)$ sequentially. Specifying inverted
Gamma prior distributions for $V_p$ and $V_\mu$ yields full conditionals such that
\[
p(V_p \mid p^T, \mu_p^T) = \mathrm{IG}\Big( s_{p0} + \sum_{t=2}^{T} a_{pt}^2,\; v_{p0} + T - 1 \Big),
\]
\[
p(V_\mu \mid \mu_p^T) = \mathrm{IG}\Big( s_{\mu 0} + \sum_{t=2}^{T} a_{\mu t}^2,\; v_{\mu 0} + T - 1 \Big).
\]
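
The two full conditionals can be sampled directly once the trend and slope paths are given. The sketch below simulates a second-order random walk trend and then draws the two variances. It assumes the convention that V ~ IG(s, v) means s / V follows a chi-square distribution with v degrees of freedom; the paper's exact parameterization is defined in a part of the text not reproduced here, so this convention, like the function names and hyperparameter values, should be read as an assumption.

```python
import numpy as np


def simulate_trend(T, V_p, V_mu, rng):
    """Simulate Delta p_t = mu_{t-1} + a_pt and Delta mu_t = a_mu,t."""
    a_p = rng.normal(0.0, np.sqrt(V_p), T)
    a_mu = rng.normal(0.0, np.sqrt(V_mu), T)
    mu = np.cumsum(a_mu)
    p = np.zeros(T)
    for t in range(1, T):
        p[t] = p[t - 1] + mu[t - 1] + a_p[t]
    return p, mu


def draw_inverse_gamma(s, v, rng):
    """Assumed convention: V ~ IG(s, v) if and only if s / V ~ chi-square(v)."""
    return s / rng.chisquare(v)


def draw_trend_variances(p, mu, s_p0, v_p0, s_mu0, v_mu0, rng):
    """One Gibbs draw of (V_p, V_mu) given the trend and slope paths."""
    a_p = np.diff(p) - mu[:-1]   # a_pt = Delta p_t - mu_{t-1}, t = 2, ..., T
    a_mu = np.diff(mu)           # a_mu,t = Delta mu_t, t = 2, ..., T
    T = p.size
    V_p = draw_inverse_gamma(s_p0 + a_p @ a_p, v_p0 + T - 1, rng)
    V_mu = draw_inverse_gamma(s_mu0 + a_mu @ a_mu, v_mu0 + T - 1, rng)
    return V_p, V_mu
```

In a full sampler the paths p and mu would come from the state-sampling step rather than from a simulation, and this draw would be repeated at every sweep.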

Letting the prior of all the other parameters remain unchanged
makes the analysis in Section 3 still valid. One column of matrix Z in (3.1) needs to be modified, however, as it must embody
$\Delta^2 y_t$ instead of $\Delta y_t$. This also implies that the Yule–Walker
procedure in Appendix B must be corrected. Samples of the
state in its first time period can be obtained using the results in
Appendix A. Notice that two elements of the state now need to
be sampled, ct−1 and μt .
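As in Section 3.3, a break in the variances $V_p$ and $V_\mu$ at a
specified date $t_B$ can be introduced. Let $\alpha_\mu$ denote the parameter indexing the slope variance reduction. Still assuming independent Beta priors $\mathrm{Beta}(a_{\alpha_m}, b_{\alpha_m})$ for $m = p, \mu$, samples from
$p(V_p \mid \alpha_p, p^T, \mu_p^T)$ and $p(V_\mu \mid \alpha_\mu, \mu_p^T)$ can be obtained as before after a simple rescaling of the second-period shocks $a_{p t_B}^{T}$ and $a_{\mu t_B}^{T}$.
For the scale parameters $\alpha_p$ and $\alpha_\mu$, use can be made of ARMS
with posterior kernels evaluated as
\[
p(\alpha_p \mid p^T, \mu_p^T) \propto p\big(p_{t_B}^{T} \mid p_2^{t_B-1}, \alpha_p, \mu_p^T\big)\, p(\alpha_p)
= t\Big( \mu_{t_B-1}^{T-1},\; \alpha_p^{-1} I_{T-t_B},\; s_{p0} + \sum_{t=2}^{t_B-1} a_{pt}^2,\; v_{p0} + t_B - 2 \Big) \times \mathrm{Beta}\big(a_{\alpha_p}, b_{\alpha_p}\big)
\]
and
\[
p(\alpha_\mu \mid \mu_p^T) \propto p\big(\mu_{p t_B}^{T} \mid \mu_p^{t_B-1}, \alpha_\mu\big)\, p(\alpha_\mu)
= t\Big( 0,\; \alpha_\mu^{-1} I_{T-t_B},\; s_{\mu 0} + \sum_{t=2}^{t_B-1} a_{\mu t}^2,\; v_{\mu 0} + t_B - 2 \Big) \times \mathrm{Beta}\big(a_{\alpha_\mu}, b_{\alpha_\mu}\big).
\]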
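
The kernel for a scale parameter can be evaluated numerically once the variance has been integrated out. The sketch below does so for the slope shocks, under assumptions that are ours: the second-period shocks have variance α times the first-period variance, and V ~ IG(s, v) means s / V is chi-square with v degrees of freedom. With these conventions the Student-t term is proportional to α^{-n/2} (s + e'e/α)^{-(v+n)/2}, where e collects the n second-period shocks and (s, v) are updated with the first-period shocks. A grid evaluation is used in place of ARMS purely to keep the example short.

```python
import numpy as np


def log_kernel_alpha(alpha, first, second, s0, v0, a_beta, b_beta):
    """Log posterior kernel of a variance scale factor alpha in (0, 1).

    first, second: shocks before and after the break date (hypothetical inputs).
    Assumption: second-period shocks have variance alpha * V.
    """
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    n2 = second.size
    s1 = s0 + first @ first      # scale updated with first-period shocks
    v1 = v0 + first.size         # degrees of freedom updated likewise
    log_t = (-0.5 * n2 * np.log(alpha)
             - 0.5 * (v1 + n2) * np.log(s1 + (second @ second) / alpha))
    log_beta = (a_beta - 1.0) * np.log(alpha) + (b_beta - 1.0) * np.log1p(-alpha)
    return log_t + log_beta


def posterior_on_grid(first, second, s0, v0, a_beta, b_beta, size=981):
    """Normalized posterior weights of alpha on an evenly spaced grid."""
    grid = np.linspace(0.01, 0.99, size)
    logk = log_kernel_alpha(grid, first, second, s0, v0, a_beta, b_beta)
    w = np.exp(logk - logk.max())
    return grid, w / w.sum()
```

Drawing α from the normalized grid weights gives a crude substitute for an ARMS step; the same function applies to the level shocks after subtracting the slope from the differenced trend.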

Sometimes model (2.1)–(2.5) or its NAIRU version arises as
the reduced form of a macroeconomic model; see, for example, Planas et al. (2007). In this case, it is likely that the structural model suggests that some economic variables should also
explain the inflation dynamics. Thanks to the flexibility of the
NIG framework, such exogenous regressors can be added without difficulty using the results of Section 3.

Finally, it is worth considering the possibility of a correlation
of shocks on the cycle with those on the trend level. A univariate decomposition of U.S. GDP with correlated components has recently been discussed in Morley, Nelson, and Zivot
(2003). Whether the long-term trend and the short-term fluctuations in GDP should be correlated is an open issue: As far
as we know, no compelling reason for such a correlation has
yet been produced. Harvey and Koopman (2000) pointed out,
however, several shortcomings of such decompositions. Nevertheless, interested readers can easily introduce this correlation
into model (2.1)–(2.4). The simplest way is to specify the cycle
equation (2.4) as in


\[
\{1 - 2A\cos(2\pi/\tau)\, L + A^2 L^2\}\, c_t = a^*_{ct}
\]
and to represent the short-term innovations $a^*_{ct}$ as $a^*_{ct} \equiv \kappa_c a_{pt} + a_{ct}$,
$\kappa_c$ being a real parameter. The correlation with long-term
shocks is then trivially obtained as $\rho_{cp} = \kappa_c \sqrt{V_p} \big/ \sqrt{\kappa_c^2 V_p + V_c}$.
This parameterization is similar to the one used in (2.3); it has
the advantage that $a_{pt}$, $a_{ct}$, and $a_{\pi t}$ are uncorrelated. The prior
for $(\kappa_c, V_c)$ becomes
\[
p(\kappa_c, V_c) = \mathrm{NIG}(\kappa_{c0}, M_{c0}^{-1}, s_{c0}, v_{c0}).
\]
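
The expression for the correlation between cycle and trend shocks follows from Cov(a*_ct, a_pt) = κc Vp and Var(a*_ct) = κc² Vp + Vc. A short simulation, with arbitrary illustrative parameter values and function names of our choosing, confirms the formula.

```python
import numpy as np


def rho_cp(kappa_c, V_p, V_c):
    """Correlation between a*_ct = kappa_c * a_pt + a_ct and a_pt."""
    return kappa_c * np.sqrt(V_p) / np.sqrt(kappa_c ** 2 * V_p + V_c)


def simulated_rho_cp(kappa_c, V_p, V_c, n=200_000, seed=0):
    """Monte Carlo estimate of the same correlation."""
    rng = np.random.default_rng(seed)
    a_p = rng.normal(0.0, np.sqrt(V_p), n)
    a_c = rng.normal(0.0, np.sqrt(V_c), n)
    a_star = kappa_c * a_p + a_c
    return float(np.corrcoef(a_star, a_p)[0, 1])


print(rho_cp(-0.5, 1.0, 0.3), simulated_rho_cp(-0.5, 1.0, 0.3))
```

The sign of κc determines the sign of the correlation, and κc = 0 recovers the orthogonal decomposition of model (2.1)–(2.4).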

The full conditional of interest is now $p(\kappa_c, V_c \mid A, \tau, c^T, V_p, a_p^T)$,
the other ones being unchanged. We have
\[
p(\kappa_c, V_c \mid A, \tau, c^T, V_p, a_p^T)
\propto p(c_1, c_2 \mid A, \tau, \kappa_c, V_c, V_p, a_p^T)
\times \prod_{t=3}^{T} p(c_t \mid c_{t-1}, A, \tau, \kappa_c, V_c, V_p, a_p^T)\, p(\kappa_c, V_c).
\]
As before, the Normal-inverse Gamma framework leads to
\[
\prod_{t=3}^{T} p(c_t \mid c_{t-1}, A, \tau, \kappa_c, V_c, a_p^T)\, p(\kappa_c, V_c)
= \mathrm{NIG}(\kappa_{c*}, M_{c*}^{-1}, s_{c*}, \nu_{c*}),
\]
where
\[
M_{c*} = M_{c0} + a_{p3}^{T\prime}\, a_{p3}^{T},
\]


−1
κc∗ = Mc∗
κc ],
[Mc0 κc0 + aTp3