3. Structural change
Under structural change, any or all of $\gamma$, $\alpha$, $\beta$, $\mu$ or $\Omega_{\nu}$ could alter, as indeed could the forms of the relationships in the system, the distributional assumptions, and lag lengths, but these last are not considered here. Situations such as economic transition undoubtedly involve changes to cointegration links and growth rates, as well as to speeds of adjustment and equilibrium means. We will focus on the more 'normal' setting where $\gamma$, $\alpha$, or $\mu$ shift. However, changes that involve setting $\alpha = 0$, or changing $\alpha = 0$ to a non-zero value, inherently involve changes in the cointegration structure, and these are investigated below. Changes to $\beta$ could be studied, but pose rather different problems, as discussed later. Finally, changes to $\Omega_{\nu}$ are predicted by the following theory to have less of an impact, unless they are very large, so are not considered here.
Individual realizations of integrated time series almost inherently have non-zero data means, by force of their stochastic trends. Nevertheless, the I(0) components could have zero unconditional means, depending on the form of data transformation and units of measurement. Thus, in order to clarify the impacts of parameter changes, we also write the system Eq. (4) in I(0) space for the $n$ variables $y_t$, the first $r$ of which are the $\beta' x_t$ and the remaining $n - r$ are the relevant elements of $\Delta x_t$, which ensures a non-singular representation. Of course, this representation is inappropriate when $\beta$ can alter. Thus:
$$ y_t = \phi + \Pi y_{t-1} + \epsilon_t \quad \text{with} \quad \epsilon_t \sim \mathsf{IN}_n[0, \Omega_{\epsilon}]. \tag{5} $$
The unconditional expectation, or long-run mean, $E[y_t]$ of $y_t$ over $t = 1, \ldots, T$ is:
$$ E[y_t] = (I_n - \Pi)^{-1} \phi = \varphi \tag{6} $$
since the earlier assumptions ensure that $\Pi$ has all its eigenvalues less than unity in absolute value, leading to the homogeneous specification:
$$ y_t - \varphi = \Pi (y_{t-1} - \varphi) + \epsilon_t. \tag{7} $$
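As a minimal numerical sketch of Eqs. (5)-(7), the long-run mean can be computed by solving $(I_n - \Pi)\varphi = \phi$ after checking the eigenvalue condition. The function name `long_run_mean` and the parameter values are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def long_run_mean(phi, Pi):
    """Return (I_n - Pi)^{-1} phi, the long-run mean in Eq. (6)."""
    phi = np.asarray(phi, dtype=float)
    Pi = np.asarray(Pi, dtype=float)
    # Eq. (6) needs every eigenvalue of Pi to lie inside the unit circle.
    if np.max(np.abs(np.linalg.eigvals(Pi))) >= 1.0:
        raise ValueError("Pi has an eigenvalue on or outside the unit circle")
    return np.linalg.solve(np.eye(Pi.shape[0]) - Pi, phi)

# Hypothetical illustrative values, not taken from the paper.
Pi = np.array([[0.5, 0.1],
               [0.0, 0.4]])
phi = np.array([1.0, 0.6])
varphi = long_run_mean(phi, Pi)

# Eq. (5) and Eq. (7) give the same conditional mean for any lagged value.
y_lag = np.array([3.0, -1.0])
mean_eq5 = phi + Pi @ y_lag
mean_eq7 = varphi + Pi @ (y_lag - varphi)
```

Solving the linear system rather than forming the inverse explicitly is a numerical convenience only; the two are algebraically identical here.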
Under the assumption of constant and known parameters, Eq. (7) would deliver the one-step ahead outcome:
$$ y_{T+1} - \varphi = \Pi (y_T - \varphi) + \epsilon_{T+1}. \tag{8} $$
At time period $T$, however, we let $(\phi : \Pi)$ change to $(\phi^* : \Pi^*)$, where an asterisk denotes a post-break value, so from $T + 1$ onwards, the data are generated by:
$$ y_{T+h} = \phi^* + \Pi^* y_{T+h-1} + \epsilon_{T+h}, \quad h \geq 1. $$
We assume $\Pi^*$ still has all its eigenvalues less than unity in absolute value, and that the number and form of the cointegrating vectors remain the same. Let $\varphi^* = (I_n - \Pi^*)^{-1} \phi^*$, then:
$$ y_{T+h} - \varphi^* = \Pi^* (y_{T+h-1} - \varphi^*) + \epsilon_{T+h}. \tag{9} $$
The thrust of Hendry and Doornik (1997), building on Clements and Hendry (1994), is that changes in $\varphi$ are easy to detect, whereas those in $\phi$ and $\Pi$ are not when $\varphi$ is unchanged, so we consider that issue first. We refer to shifts in $\varphi$ as 'deterministic' changes since they change the long-run mean, whereas shifts in $\phi$ and $\Pi$ that leave $\varphi$ unchanged are referred to as 'stochastic' or 'non-deterministic', even though they might involve a changed intercept.
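Let $\epsilon_{T+1|T}$ denote the difference between Eqs. (8) and (9) at one-step ahead, taken as the post-break outcome less the constant-parameter outcome:
$$ \begin{aligned} \epsilon_{T+1|T} &= [\varphi^* + \Pi^* (y_T - \varphi^*) + \epsilon_{T+1}] - [\varphi + \Pi (y_T - \varphi) + \epsilon_{T+1}] \\ &= \varphi^* - \varphi + \Pi^* (y_T - \varphi^*) - \Pi (y_T - \varphi) \\ &= (I_n - \Pi^*)(\varphi^* - \varphi) + (\Pi^* - \Pi)(y_T - \varphi). \end{aligned} \tag{10} $$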
This would be the expected one-step ahead forecast error from using Eq. (8) as a model when Eq. (9) is the DGP, and all parameters are known. The first term on the last line is the equilibrium-mean shift, and the second is the slope change, which has an unconditional expectation of zero because $E[y_T] = \varphi$. Indeed, if a slope change occurred when the economy was near equilibrium, so $y_T \simeq \varphi$, and $\varphi$ did not alter, then $\epsilon_{T+1|T} \simeq 0$. Of course, if the economy was in substantial disequilibrium, a larger effect would result, but this seems to be part of the explanation for the non-detectability of shifts in $\phi$ and $\Pi$ that leave $\varphi$ unchanged.
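The decomposition in Eq. (10) can be checked numerically. The sketch below, under assumed parameter values that are purely illustrative, changes the dynamics while choosing the post-break intercept so that the long-run mean is unchanged; the equilibrium-mean term then vanishes and only the slope-change term remains. The helper `expected_error_terms` is a hypothetical name.

```python
import numpy as np

def expected_error_terms(phi, Pi, phi_star, Pi_star, y_T):
    """Return the two terms on the last line of Eq. (10)."""
    I = np.eye(len(phi))
    varphi = np.linalg.solve(I - Pi, phi)                  # pre-break long-run mean
    varphi_star = np.linalg.solve(I - Pi_star, phi_star)   # post-break long-run mean
    mean_shift = (I - Pi_star) @ (varphi_star - varphi)    # equilibrium-mean shift
    slope_change = (Pi_star - Pi) @ (y_T - varphi)         # slope change
    return mean_shift, slope_change

# Hypothetical pre-break parameters (illustration only).
Pi = np.array([[0.5, 0.1], [0.0, 0.4]])
phi = np.array([1.0, 0.6])
varphi = np.linalg.solve(np.eye(2) - Pi, phi)

# Dynamics change, with the intercept set so the long-run mean stays at varphi.
Pi_star = np.array([[0.8, 0.0], [0.1, 0.2]])
phi_star = (np.eye(2) - Pi_star) @ varphi

# A starting point away from equilibrium.
y_T = varphi + np.array([1.0, -1.0])
mean_shift, slope_change = expected_error_terms(phi, Pi, phi_star, Pi_star, y_T)

# Direct difference of the two conditional means, for comparison.
direct = (phi_star + Pi_star @ y_T) - (phi + Pi @ y_T)
```

Setting `y_T = varphi` in this sketch makes both terms zero, which is the near-equilibrium case described above.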
3.1. Detectable shifts
We now show that deterministic shifts, namely shifts from $\varphi$ to $\varphi^*$, whether induced by changes in the intercept $\phi$, or indirectly by changes in dynamics (i.e. via $(I_n - \Pi)^{-1} \phi$), are a detectable source of failure in linear dynamic econometric systems. We then demonstrate that even if every parameter alters, but this leaves the 'long-run means' $\varphi$ unchanged, so $\phi^* = (I_n - \Pi^*) \varphi$, structural change is not easily detected. Indeed, shifts in $(\phi, \Pi)$ with constant $\varphi$ transpire to be isomorphic to changes in mean-zero processes, where $\phi = \phi^* = 0$. Further, while breaks in $\Pi$ alone can cause forecast failures, their ease of detection depends on the magnitudes of the long-run means of the I(0) components relative to their error standard deviations. As Hendry and Doornik (1997) remark, when the long-run mean is non-zero, breaks shift the location of the data, inducing a short-run 'trend' to the new equilibrium mean, which is more easily detected than a one-off variance change around the origin. If Eq. (4) had additional dynamics, these could be expressed in growth-rate form, and hence re-written to have zero means around the equivalent of $\gamma$, so the same argument about detectability would apply to shifts in their parameters.
Because the system is dynamic, any break takes time to have its full effect, and the system takes time to reach a new equilibrium, so the data expectations alter in every period, making the process highly non-stationary. To develop the primary implications, we consider only the first period following a change at time $T$: later periods are amenable to analysis, as are successive changes, although the algebra becomes increasingly complicated. Let $\hat{\ }$ denote an estimate, so the forecast error at $T + 1$ immediately after the break (or ex post residual), using parameter estimates up to $T$, is $\hat{\epsilon}_{T+1|T} = y_{T+1} - \hat{y}_{T+1|T}$ where:
$$ \hat{\epsilon}_{T+1|T} = \phi^* + \Pi^* y_T + \epsilon_{T+1} - \hat{\phi} - \hat{\Pi} y_T. \tag{11} $$
We treat finite-sample biases in estimators as negligible (see Hendry, 1997, for an explanation), so set $E[\hat{\Pi}] = \Pi$ and $E[\hat{\phi}] = \phi$. Further, as almost all estimation methods match data means in-sample, $\hat{\varphi} = (I_n - \hat{\Pi})^{-1} \hat{\phi}$. Let $E[\hat{y}_{T+1|T} \mid y_T]$ denote the expected value computed from the model, namely the actual mean of the forecasts given the in-sample parameter values when Eq. (5) is assumed to hold (i.e. in ignorance of the shift); then, conditional on $y_T$:
$$ E[\hat{y}_{T+1|T} \mid y_T] = \phi + \Pi y_T. $$
This is to be contrasted with the actual data expectation at $T + 1$:
$$ E[y_{T+1} \mid y_T] = \phi^* + \Pi^* y_T. $$
Then, detectability depends primarily on $E[y_{T+1} \mid y_T] - E[\hat{y}_{T+1|T} \mid y_T]$. More precisely, from Eq. (11):
$$ \begin{aligned} E[\hat{\epsilon}_{T+1|T} \mid y_T] &= \phi^* + \Pi^* y_T + E[\epsilon_{T+1}] - E[\hat{\phi}] - E[\hat{\Pi}] y_T \\ &= (\phi^* - \phi) + (\Pi^* - \Pi) y_T \\ &= E[y_{T+1} \mid y_T] - E[\hat{y}_{T+1|T} \mid y_T]. \end{aligned} \tag{12} $$
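Moreover, taking the unconditional means of each term using Eq. (6):
$$ E[\hat{y}_{T+1|T}] = \phi + \Pi E[y_T] = (I_n - \Pi) \varphi + \Pi \varphi = \varphi \tag{13} $$
whereas:
$$ E[y_{T+1}] = \phi^* + \Pi^* \varphi = (I_n - \Pi^*) \varphi^* + \Pi^* \varphi^* - \Pi^* (\varphi^* - \varphi) = \varphi^* - \Pi^* (\varphi^* - \varphi), $$
so unconditionally:
$$ E[y_{T+1}] - E[\hat{y}_{T+1|T}] = E[\hat{\epsilon}_{T+1|T}] = (I_n - \Pi^*)(\varphi^* - \varphi). \tag{14} $$
This is simply the first term in the last line of Eq. (10).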
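A small simulation can illustrate Eq. (14). The sketch below is not the Monte Carlo design of the paper: it assumes known pre-break parameters, an identity error covariance, and hypothetical parameter values, and it compares the average one-step error after a break with $(I_n - \Pi^*)(\varphi^* - \varphi)$ in two cases. In the first, the dynamics change with the intercept fixed, so the long-run mean shifts. In the second, the intercept offsets the change in dynamics, so the long-run mean is constant.

```python
import numpy as np

def mean_one_step_error(phi, Pi, phi_star, Pi_star, reps=20000, burn_in=50, seed=0):
    """Monte Carlo mean of y_{T+1} - (phi + Pi y_T) when the DGP shifts after T.

    Assumes known pre-break parameters and errors drawn as IN_n[0, I_n].
    """
    rng = np.random.default_rng(seed)
    n = len(phi)
    varphi = np.linalg.solve(np.eye(n) - Pi, phi)
    # Start every replication at the pre-break long-run mean, then burn in.
    y = np.tile(varphi, (reps, 1))
    for _ in range(burn_in):
        y = phi + y @ Pi.T + rng.standard_normal((reps, n))
    y_next = phi_star + y @ Pi_star.T + rng.standard_normal((reps, n))  # post-break DGP
    forecast = phi + y @ Pi.T                                           # pre-break model
    return (y_next - forecast).mean(axis=0)

def eq14_bias(phi, Pi, phi_star, Pi_star):
    """Return (I_n - Pi*)(varphi* - varphi), the right-hand side of Eq. (14)."""
    I = np.eye(len(phi))
    varphi = np.linalg.solve(I - Pi, phi)
    varphi_star = np.linalg.solve(I - Pi_star, phi_star)
    return (I - Pi_star) @ (varphi_star - varphi)

# Hypothetical parameter values (illustration only).
Pi = np.array([[0.5, 0.1], [0.0, 0.4]])
phi = np.array([1.0, 0.6])
Pi_star = np.array([[0.8, 0.0], [0.1, 0.2]])

# Case 1: dynamics change, intercept fixed, so the long-run mean shifts.
sim_shift = mean_one_step_error(phi, Pi, phi, Pi_star)
theory_shift = eq14_bias(phi, Pi, phi, Pi_star)

# Case 2: intercept offsets the change in dynamics, long-run mean constant.
varphi = np.linalg.solve(np.eye(2) - Pi, phi)
phi_offset = (np.eye(2) - Pi_star) @ varphi
sim_const = mean_one_step_error(phi, Pi, phi_offset, Pi_star)
theory_const = eq14_bias(phi, Pi, phi_offset, Pi_star)
```

In the second case the simulated mean error should be close to zero even though both the intercept and the dynamics have changed.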
The key implication from Eq. (14) is that $E[y_{T+1}] = E[\hat{y}_{T+1|T}]$ when $\varphi^* = \varphi$. This can occur despite changes in the dynamics, represented by shifts in $\Pi$, or changes in $\phi$: for example, if $\{y_t\}$ is a mean-zero process; if $\varphi$ does not change; or if shifts in $\phi$ offset those in $\Pi$ to leave $\varphi$ constant, despite both dynamics and intercepts shifting. Further, for a given value of $\varphi^* - \varphi$, the effect is larger or smaller as $I_n - \Pi^*$ moves further from or closer to zero. Of course, the detectability of any shifts depends on their magnitude relative to the error standard deviations. Let $\Omega_{\epsilon}^{-1} = K'K$ so that:
$$ K y_t = K \phi + K \Pi K^{-1} K y_{t-1} + K \epsilon_t, $$
or letting $y_t^{+} = K y_t$:
$$ y_t^{+} = c + C y_{t-1}^{+} + \epsilon_t^{+}, $$
where $c = K \phi$, $\epsilon_t^{+} = K \epsilon_t \sim \mathsf{IN}_n[0, I_n]$ and $C = K \Pi K^{-1}$ (since $C$ and $\Pi$ are related by a similarity transform, the ordering-dependence of the transformation by $K$ does not matter here). Then, after the break:
$$ E[y_{T+1}^{+}] - E[\hat{y}_{T+1|T}^{+}] = K (I_n - \Pi^*)(\varphi^* - \varphi). $$
This is the appropriate metric for judging the 'magnitude' of a break.
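The scaling by $K$ can be sketched as follows, assuming $K$ is taken as the transpose of the Cholesky factor of $\Omega_{\epsilon}^{-1}$ (one of many valid choices satisfying $K'K = \Omega_{\epsilon}^{-1}$). The function names and numerical values are hypothetical. The example uses the same mean shift under two very different error variances, to show how the scaled magnitude differs across equations.

```python
import numpy as np

def scaling_matrix(Omega):
    """Return an upper-triangular K with K'K = Omega^{-1}, so K Omega K' = I_n."""
    # np.linalg.cholesky gives lower-triangular L with L L' = Omega^{-1}; take K = L'.
    return np.linalg.cholesky(np.linalg.inv(Omega)).T

def scaled_break(Omega, Pi_star, varphi, varphi_star):
    """Return K (I_n - Pi*)(varphi* - varphi), the break in error-standard-deviation units."""
    K = scaling_matrix(Omega)
    return K @ (np.eye(len(varphi)) - Pi_star) @ (varphi_star - varphi)

# Hypothetical values (illustration only): a small error variance in the
# first equation and a large one in the second.
Omega = np.array([[0.01, 0.0],
                  [0.0, 4.0]])
Pi_star = np.array([[0.8, 0.0], [0.1, 0.2]])
varphi = np.array([2.2, 1.0])
varphi_star = np.array([5.0, 1.375])

magnitude = scaled_break(Omega, Pi_star, varphi, varphi_star)
```

Under these assumed values the unscaled shift in the first equation is amplified by the small error standard deviation, whereas the shift in the second equation is negligible relative to its noise.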
We now illustrate the implications of this analysis for detecting structural change, using some numerical simulations of parameter-constancy tests. Although the
above analysis only applies to one-step ahead errors after a single break, the Monte Carlo will examine many forecast break-test horizons, and allows for two shifts.
Nevertheless, the implications will be seen to hold more generally.
4. An I(1) Monte Carlo