Directory UMM :Data Elmu:jurnal:J-a:Journal of Econometrics:Vol98.Issue2.Oct2000:
Journal of Econometrics 98 (2000) 365}383
A quasi-di!erencing approach to dynamic
modelling from a time series of
independent cross-sections
Sourafel Girma*
School of Economics, University of Nottingham, Nottingham NG7 2RD, UK
Received 4 May 1998; received in revised form 10 March 2000; accepted 13 March 2000
Abstract
We motivate and describe a GMM method of estimating linear dynamic models from
a time series of independent cross-sections. This involves subjecting the model to
a quasi-di!erencing transformation across pairs of individuals that belong to the same
group. Desirable features of the model include the fact that: (i) no aggregation is involved,
(ii) dynamic response parameters can vary across groups, (iii) the presence of unobserved
individual speci"c heterogeneity is explicitly allowed for, and (iv) the GMM estimators
are derived by realistically assuming group size asymptotics. The Monte Carlo experiments conducted to assess the "nite sample performances of the proposed estimators
provide us with encouragement. ( 2000 Elsevier Science S.A. All rights reserved.
JEL classixcation: C23
Keywords: Time series of cross sections; Panel data; GMM
1. Introduction
In this paper we propose a new method of using data from a time series of
independent cross-sections (abbreviated to TISICS in what follows) in "tting
* Corresponding author. Tel.: #44-115-951-4733; fax: #44-115-951-4159.
E-mail address: [email protected] (S. Girma).
0304-4076/00/$ - see front matter ( 2000 Elsevier Science S.A. All rights reserved.
PII: S 0 3 0 4 - 4 0 7 6 ( 0 0 ) 0 0 0 2 4 - 5
366
S. Girma / Journal of Econometrics 98 (2000) 365}383
linear dynamic models such as
y "ay
#bx #f #e ,
(1)
it
it~1
it
i
it
where i and t index individual units and time periods, respectively, f is an
individual e!ect and e is a disturbance term. The main problem is the fact that
such data sets do not track the same group of individuals over time, whereas the
econometric models of interest require some information on intra-individual
di!erences. Thus, it is not possible to consistently estimate the parameters of
interest unless additional information in the shape of instrumental variables is
available or further identifying restrictions are imposed on the model.
Deaton (1985) proposes a pseudo-panel approach to circumvent the identi"cation problem caused by the absence of repeated observations on individual
units. This involves placing individuals into distinct groups or cohorts, say G ,
c
for c"1,2, C and working with the expectation of model (1) at each point in
time conditional on cohort membership. Thus taking EMy Di(t)3G N, we have
it
c
(2)
yH "ayH #bxH #f H#eH
ct
ct
ctit
ct~1
ct
where superscripts asterisk denote the conditional expectations.
Eq. (2) is a model of the average behaviour of a cohort, much in the spirit of
the &representative agent' construction. Since cohort averages are e!ectively the
units of analysis and the number of cross-sections is often small in practice, one
might have to assume asymptotics on C to derive large-sample properties of
and f H are
the ensuing econometric estimators. On the other hand as yH
ct
ct~1
correlated by construction, it is desirable to di!erence the latter out of the
model. This requires that f H be invariant over time. One way to force this
ct
condition is to assume that all individuals within a cohort are homogeneous
with respect to the unobserved heterogeneity factors. It will then be possible to
apply standard dynamic panel data techniques by treating sample cohort
averages as error-ridden observations on typical individuals (Collado, 1997).
Notwithstanding the fact that the pseudo-panel approach o!ers a framework
for making joint use of independent cross-sectional information, there are
problems with some of its features. Firstly the assertion of intra-cohort homogeneity seems too strong. It is indeed di$cult to assume away unobserved
individual heterogeneity away in micro data. Secondly the aggregation framework might not be desirable. An estimation method that is based on individual
level data can make a more e$cient use of the available information and
potential problems of interpretation arising out of aggregation are likely to be
mitigated. Thirdly, the practice of establishing the large-sample properties of
econometric estimators and test statistics by driving the number of cohorts to
in"nity is not satisfactory. There is often a physical limit beyond which one
cannot increase the number of cohorts. The oft-cited example of date of birth
cohorts is a case in point. Finally, and this pertains to dynamic modelling in
particular, standard pooling estimators will lead to invalid inference if the
S. Girma / Journal of Econometrics 98 (2000) 365}383
367
response parameters are characterised by cohort-wise heterogeneity. This
follows from Pesaran and Smith (1995) who showed that dynamic modelling
from pooled heterogeneous panels is not trivial. Given that cohorts should be
chosen so that they are as distinct from each other as possible (cf. Deaton, 1985),
it is important to have a framework which allows for some sort of response
heterogeneity.
The above discussion demonstrates that there is clearly scope for studying
alternative estimators for dynamic models from TISICS. In a contribution to the
literature Mo$t (1993) suggests a two-stage least-squares approach where the
regressor y
in (1) is instrumented by the predicted value y(
from a linear
it~1
it~1
projection of the dependent variable on a vector of time-varying and timeinvariant variables, say S and Z , respectively. To make this operational, one
it
i
needs knowledge of S
. But it is not always clear where this information is
it~1
going to come from as TISICS data sets do not typically contain the history of
the variables for the cross-sectional units. Moreover two-stage least squares is
potentially problematic in that the "tted instruments (which need to be correlated with y
) will also have a non-zero correlation with the individual e!ect
it~1
f , since the two quantities are statistically related. The only obvious exception is
i
if S is a function of time alone, hence uncorrelated with the time invariant f .
it
i
But this will result in instruments that do not have much variation across
individuals in the likely case where the number of cross-sections is much smaller
than the cross-sectionals units. As a result the ensuing instrumental variables
estimators will be highly inaccurate. So, although Mo$t's (1993) paper generalises Deaton (1985) in the sense that the possibility of instrumentation procedures
other than grouping is discussed, it does not seem to be too promising due to its
rather strong informational requirement (Verbeek, 1992). In this paper we
explore the possibility of making inference on model (1) from individual level
data, by performing pair-wise quasi-di!erences across di!erent observed individuals within the same group. This produces the following regression function
at each time period and for all i and j : EMy Dy
, x N. This is potentially
it jt~1 it
estimable since CovMy , y
N can be identi"ed from the data as long there are
it jt~1
group-wise cross-sectional correlations.
The plan for the rest of the paper is as follows. In Section 2 we formally
describe the quasi-di!erenced model along with the maintained assumptions.
We identify the maximum number of linearly independent quasi-di!erences that
are available and those will be used as a basis for estimation. In Section 3 we
generate some population moment restrictions implied by the quasi-di!erenced
model and suggest a generalised method of moments (GMM) estimation framework. The asymptotic properties of the GMM estimators are realistically based
on having a large number of individuals per group-time cell. This compares
favourably to the Deaton-type estimators in which the number of group/time
periods is required to grow without limit. Another important advantage of the
quasi-di!erencing approach is that slope parameter heterogeneity across groups
368
S. Girma / Journal of Econometrics 98 (2000) 365}383
can easily be allowed since the method can be applied on a single group. In
Section 4 we conduct some Monte Carlo experiments to assess the "nite sample
performances of the proposed estimators. Section 5 concludes.
2. Model description
2.1. Basic identifying assumptions
We focus on the estimation of the following class of models form TISICS:
#b@ x #f #e
y "a yH
i(t) i(t)t
i(t)
i(t)t
i(t)t
i(t) i(t)t~1
t"2, ¹; i(t)"1,2, N .
(3)
t
Here x is a k-dimensional vector of control variables, f represents individualspeci"c e!ects and the e's are time-varying disturbances. The time index in
parentheses is used to emphasise the non-panel nature of the data. However,
when no ambiguity is apparent, we will sometimes drop this index for the sake of
notational elegance. It is assumed that we have cross-sectional observations for
the time index on the individual
t"1,2, ¹, with ¹*2. Notice that for yH
i(t)t~1
does not match to the (proper) time index. Such variables are not observed and
they will often be denoted by a superscript asterisk. We will next discuss some of
the basic identifying assumptions of the model.
H.1: The population is partitioned into a "nite set of mutually exclusive and
exhaustive groups G , c"1,2, C. Groups may be based on one or more
c
underlying observed variables that are constant over time for all individuals.
Without loss of generality, we assume that at each point in time n members are
drawn at random from G , ∀c.
c
H.2: a "a "a and b "b "b for ∀(i, j)3G .
i
j
c
i
j
c
c
This assumption restricts the individual level heterogeneity in response parameters by stating that those parameters are identical within a group, but we do
the next best thing by allowing for the possibility of group-wise heterogeneity.
H.3: Da D(1, ∀c.
c
H.4: y "k f #k e , where the k's are constants and f is a random
i(t)0
1c
2 i(t)0
c
group speci"c e!ect.
The above assumption implies that the initial (unobserved) conditions are
random draws from possibly di!erent populations, but share a common component in the speci"cation of a conditional mean. Assumption H.4 has an
important implication: CovMy , y
NO0 for ∀(i(t), j(s))3G and ∀t, s. That is,
i(t)t j(s)s)
c
di!erent individuals within the same group exhibit nonzero correlations.
The information obtained from this source of correlations will prove useful in
identifying the dynamic response parameters. While H.4 may not be an easily
testable restriction, it is not without its justi"cations. It has been noted in the
econometric literature that group structures are a common feature of many
S. Girma / Journal of Econometrics 98 (2000) 365}383
369
cross-sectional data sets (e.g., Frees, 1995). Researchers working with TISICS
data often assume that each member of the group has some information about
the average group value (Blundell et al., 1994).
#h #w where the eigenvectors of j lie within
H.5: x "t #j@x
i(t)
i(t)t~1
ct
i(t)t
i(t)t
the unit circle, t is a k-dimensional vector of individual speci"c drift and h is
i(t)
ct
group-time speci"c error component which can come about due to some shared
characteristics between individuals within a group-time cell. It can be seen that
h is a further source of within group correlations. From the above two
ct
assumptions, it is clear that intra-group correlation is directly proportional to
the variances of f and h .Further assumptions are:
c
ct
H.6: CovMf , e N"0, ∀i and t.
i ti
H.7: f , f , f , h and e are all mutually, independently distributed with mean
i c ct ct
it
zero and nonzero and "nite variances.
The above assumption allows for purely individual-speci"c heterogeneity,
even if individuals within a group share a common e!ect. This is in contrast to
Deaton's (1985) approach where within-group heterogeneity is not explicitly
allowed for.
H.8: Cov Mx , f N"0.
i(t)t j(s)
This is an innocuous assumption since f is speci"c to individual j(s).
j(s)
Making use of the above assumptions, we can show that CovMy , y NO0
i(t)t j(s)s
and equal ∀(i, j)3G . In other words, our assumptions imply an equicorrelation
c
structure within a group-time cell. Intuitively, without equicorrelation there
would not be enough independent information to estimate O(n2) covariances
per cell from just n observations. We like to note that in one form or another, the
assumption of equicorrelation amongst the units of analysis is prevalent in
econometric modelling. For example, the two-way panel estimation relies on
equicorrelation between individuals observed during the same time period (e.g.,
Hsiao, 1986). The covariance stationarity assumption employed in time series
models is also a case in point.
2.2. A quasi-diwerencing approach
We begin by writing (3) in its reduced-form representation as1
#n x #n yH #v
xH
y "n xH #2#n
i(t)t
tt i(t)t
t0 i(t)0
t(t~1) i(t)t~1
i(t)t
t1 i(t)1
(4)
with
v
i(t)t
t~1
1!at
f n
" + ase
#
i(t)(t~s)
1!a i(t) ts
s/0
"b@at~s for 1)s)t,
1 As the estimator proposed in this section deals with a single group, the index on a and b is
dropped.
370
S. Girma / Journal of Econometrics 98 (2000) 365}383
and
n "at.
t0
Quasi-di!erencing the model as My !ay
; ∀[i(t), j(t!1)] and ∀t*2N,
i(t)t
j(t~1)t~1
we obtain the following potentially estimable model:
y "ay
#b@x #g
i(t)t
j(t~1)t~1
i(t)t
ijt
where
(5)
g "at[y !y
]#Df #Dx #De
ijt
i(t)0
j(t~1)0
ijt
ijt
ijt
with
a!at
1!at
f !
f
,
*f "
ijt 1!a i(t)
1!a j(t~1)
t~2
N2
!xH
*x "b@ + ar`1MxH
j(t~1)(t~1~r)
i(t)(t~1~r)
ijt
r/0
and
t~1
*e " + arMe
!e
N#e .
ijt
i(t)t~r
j(t~1)t~r
i(t)t
r/1
Let D "[0
:I
]; D "[I
:0
] and A "D ?l ?I ,
1
(T~1)C1 T~1
2
T~1 (T~1)C1
2
2 n
n
where 0 ; I and l are the null matrix, the identity matrix and the vector of ones
a a
a
of dimension a, respectively. When all observations within a group are arranged
"rst by individuals and then by time periods, and the transformation matrix
B(a)"A !aA is applied to the resulting n¹]1 vector, it can be checked that
1
2
the pairwise quasi-di!erenced model of Eq. (4) will be produced. Since the
n2(¹!1)]n¹ matrix B(a) has rank equal to n¹!1, the transformed error
covariance matrix will be singular because r"n2(¹!1)!(n¹!1) equations
in the quasi-di!erenced model are redundant. Consequently, any estimation
strategy has to make due allowance for the singularity of the disturbance
covariance matrix. There are two main ways of dealing with this (e.g., Judge
et al., 1985): (a) the use of appropriate generalised inverse estimators, and (b) the
elimination of the redundant equations prior to estimation. Employing generalised inverses does not seem to be a terribly attractive alternative in our framework. This is because it involves a multiple of n4 terms and with possibly large n,
this will pose awkward computational problems. We thus attempt to identify
n¹!1 linearly independent pairwise quasi-di!erences within each group. To
2 Although x
is observed, it is left in the disturbances for reasons that will shortly become
j(t~1)t~1
apparent.
S. Girma / Journal of Econometrics 98 (2000) 365}383
371
do so we "rst consider the following set of quasi-di!erences within each group
between individuals with the same ordering in their respective time domains:
¸ "My !ay
; ∀i(t), i(t!1) and t*2N.
1
i(t)t
i(t~1)t~1
Note that i(t)"i(t!1) is to be understood that as the case where individual
i at time t and individual i at time (t!1) correspond to the same ordering within
their respective time domains. It can be veri"ed that ¸ consists of n(¹!1)
1
linearly independent quasi-di!erenced equations. By augmenting ¸ by the
1
following n!1 quasi-di!erences between the "rst observation at time t"2 and
all observations bar the "rst one at time t"1, ¸ "My
!ay
;
2
1(2)2
j(1)1
∀j*23G N, it can be checked that ¸"¸ X¸ will produce the desired full set
c1
1
2
of n¹!1 linearly independent quasi-di!erences. As a result ¸ will exhaust all
information contained in the pairwise quasi-di!erenced model. Of course this is
not the only set of linearly independent quasi-di!erences. However, since any
other such set is a nonsingular transformation of ¸, the same information will be
preserved.
Let = "[y
, x@ ]@ and de"ne c "[a, b@]@ as the true parameter
0
ijt
j(t~1)t~1 i(t)t
vector. When the above transformations are applied to Eq. (3), the following
model will then be available for estimation for
∀[i(t), j(t!1)]3S,S XS
1 2
y "ay
#b@x #g ,c@ = #g ,
0 ijt
ijt
i(t)t
j(t~1)t~1
i(t)t
ijt
where S "M[i(t), i(t!1)] and t*2N; S "M∀[1(2), j(1)], j(1)*2)N.
1
2
(6)
3. Inference
3.1. Identixcation
OLS on model (6) would produce inconsistent estimators because g is
ijt
correlated with both y
and x . This can easily be checked using the
j(t~1)t~1
i(t)t
underlying assumptions of the model. Fortunately, these assumptions also
imply that the composite error term gt satis"es the following two sets of linear
cij
moment conditions:
EMy
g N"0; t"2,2, ¹; s"1,2, t!1; ∀g(t!1)Oj(t!1)
g(t~s,t~s) ijt
and
EMx
g N"0; t"2,2, ¹; s"0,2, t!1;
g(t~s)t~s ijt
∀g(t!s) D g(t!s)Oi(t) or g(t!1)"j(t!1).
That is past and present values of the dependent and explanatory variables
within the same group can be used as instruments. Because of assumptions H.4
and H.5, a nonzero IV-regressor correlation is ensured. Notice that since the
372
S. Girma / Journal of Econometrics 98 (2000) 365}383
correlations in the x's between the di!erent individuals comes from the h 's,
ct
leaving the observable x
in the composite disturbance term ensures that
j(t~1)t~1
the latter does not contain a h term. Also notice that since g contains
ijt
y !y
, any y
will not be correlated with it because of the
i(t)0
j(t~1)0
g(t~s)(t~s)
equicorrelation assumption imposed on the model.
A problem that we face is the possibility of an in"nite number of instruments,
given that the number of observations per group-time cell is allowed to grow.
For example, X
; i"1,2, n can be used as instruments for the regrei(t~2)t~2
ssors in the quasi-di!erenced model. Since standard optimality results for GMM
estimators (e.g., Hansen, 1982) are not trivially applicable when the vector of
conditioning variables is in"nite dimensional, it pays to look for some ways to
economise on the number of instruments to be used. One apparent solution
might be to average the instruments over cells. In this case there would not be
enough internal variation in the instruments set, unless the number of groups
grows with n at a suitable rate. This is bound to adversely a!ect the precision of
the resulting estimators. In this paper we restrict ourselves to a set of "nite
orthogonality conditions by the simple, albeit ad hoc, device of using just one
orthogonality condition within each subset of the moment restrictions. For
example in the Monte Carlo experiments conducted in the next section, the
following moment restrictions are chosen to be the basis for estimating c:
, x@ N, where g is made to
EMZ g N"0 with Z "My
, x@
ijt ijt
ijt
g(t~1)t~1 g(t~1)t~1 g(t)t
correspond to j!1.3 We have to emphasise, however, that this is
entirely arbitrary. All that is needed is to assign distinct instruments to each
observation.
3.2. GMM inference
GMM estimation proceeds by forming sample analogs of those population
moment conditions. The sample analogs are usually sample averages of Z g
iji ijt
(also known as the disturbances of GMM) and the latter are expected to satisfy
appropriate laws of large numbers (LLNs) for all admissible points in the
parameter space. As far as justifying the use of LLNs is concerned, the simplest
case occurs when the random variables under consideration are independently
and identically distributed. In our case, however, there is some dependence
across observations due to the assumptions we employed and the quasi-di!erencing transformation of the data. In fact the disturbances of the model exhibit
contemporaneous as well as temporal correlations within groups. They are also
heterogeneous across time as can be veri"ed by inspecting the structure of gt .
cij
Alternative approaches to deriving LLNs that are applicable to a variety of non
i.i.d environments are proposed by several authors (e.g., White, 1984). These
3 We adopt the convention of putting g"n for j"1
S. Girma / Journal of Econometrics 98 (2000) 365}383
373
laws of large numbers typically place some restrictions on the dependence,
heterogeneity and moments of the data process.
The main problem in the present context is the contemporaneous within
group dependence introduced by the transformation function ¸ . One practical
2
solution may be to average all quasi-di!erences produced by ¸ and assume
2
that as n grows the dependence will become negligible (and accept the risk of any
"nite sample bias). In the sequel we work with the model produced by ¸ alone,
1
because the large sample argument with a "xed number of groups and time
periods is neater and the computations involved are simpler. Thus we have, for
i"1,2, n and t"2,2, ¹
y "ay
#b@x #g ,c@ = #g
0 iit
iit
i(t)
i(t~1)(t~1)
i(t)t
iit
and the resulting orthogonality conditions can be expressed as EMZ g N,
iit iit
N In the Appendix A
, x@
EMh N"0, with Z "My
, x@
it
iit
i~1(t~1)t~1 i~1(t~1)t~1 i~1(t)t
we prove that the following proposition holds:
Proposition 1.
T
n
1
1 0
+ + h P
it
n(¹!1)
it~2 i/1
as nPR with ¹ xxed.
De"ning S to be a sequence of weight matrices, the GMM estimator of
nT
c can thus be obtained upon solving
0
1
@
1
argmin
+h S
+h .
c n(¹!1)
it nT n(¹!1)
it
i,t
i,t
The optimal GMM estimator of c is obtained by setting the weight matrix in the
GMM minimand function to the inverse of
C
D C
C
D
D
1
+h
it
n(¹!1)
i,t
and it is given as c( "[=@ZX~1Z@=]~1[=@ZX~1Z@y], where the data and
instruments matrices are constructed in an obvious way. In Appendix B we
show that the following proposition holds under the assumptions of the model:
X,Cov
Proposition 2.
T n
$ N(0, X)
+ +h P
Jn(¹!1) t/2 i it
as nPR with ¹ xxed.
1
374
S. Girma / Journal of Econometrics 98 (2000) 365}383
Thus under some regularity conditions (White, 1984), c( will be asymptotically
c
normally distributed as
J¸(c( !c )PN(0,
A quasi-di!erencing approach to dynamic
modelling from a time series of
independent cross-sections
Sourafel Girma*
School of Economics, University of Nottingham, Nottingham NG7 2RD, UK
Received 4 May 1998; received in revised form 10 March 2000; accepted 13 March 2000
Abstract
We motivate and describe a GMM method of estimating linear dynamic models from
a time series of independent cross-sections. This involves subjecting the model to
a quasi-di!erencing transformation across pairs of individuals that belong to the same
group. Desirable features of the model include the fact that: (i) no aggregation is involved,
(ii) dynamic response parameters can vary across groups, (iii) the presence of unobserved
individual speci"c heterogeneity is explicitly allowed for, and (iv) the GMM estimators
are derived by realistically assuming group size asymptotics. The Monte Carlo experiments conducted to assess the "nite sample performances of the proposed estimators
provide us with encouragement. ( 2000 Elsevier Science S.A. All rights reserved.
JEL classixcation: C23
Keywords: Time series of cross sections; Panel data; GMM
1. Introduction
In this paper we propose a new method of using data from a time series of
independent cross-sections (abbreviated to TISICS in what follows) in "tting
* Corresponding author. Tel.: #44-115-951-4733; fax: #44-115-951-4159.
E-mail address: [email protected] (S. Girma).
0304-4076/00/$ - see front matter ( 2000 Elsevier Science S.A. All rights reserved.
PII: S 0 3 0 4 - 4 0 7 6 ( 0 0 ) 0 0 0 2 4 - 5
366
S. Girma / Journal of Econometrics 98 (2000) 365}383
linear dynamic models such as
y "ay
#bx #f #e ,
(1)
it
it~1
it
i
it
where i and t index individual units and time periods, respectively, f is an
individual e!ect and e is a disturbance term. The main problem is the fact that
such data sets do not track the same group of individuals over time, whereas the
econometric models of interest require some information on intra-individual
di!erences. Thus, it is not possible to consistently estimate the parameters of
interest unless additional information in the shape of instrumental variables is
available or further identifying restrictions are imposed on the model.
Deaton (1985) proposes a pseudo-panel approach to circumvent the identi"cation problem caused by the absence of repeated observations on individual
units. This involves placing individuals into distinct groups or cohorts, say G ,
c
for c"1,2, C and working with the expectation of model (1) at each point in
time conditional on cohort membership. Thus taking EMy Di(t)3G N, we have
it
c
(2)
yH "ayH #bxH #f H#eH
ct
ct
ctit
ct~1
ct
where superscripts asterisk denote the conditional expectations.
Eq. (2) is a model of the average behaviour of a cohort, much in the spirit of
the &representative agent' construction. Since cohort averages are e!ectively the
units of analysis and the number of cross-sections is often small in practice, one
might have to assume asymptotics on C to derive large-sample properties of
and f H are
the ensuing econometric estimators. On the other hand as yH
ct
ct~1
correlated by construction, it is desirable to di!erence the latter out of the
model. This requires that f H be invariant over time. One way to force this
ct
condition is to assume that all individuals within a cohort are homogeneous
with respect to the unobserved heterogeneity factors. It will then be possible to
apply standard dynamic panel data techniques by treating sample cohort
averages as error-ridden observations on typical individuals (Collado, 1997).
Notwithstanding the fact that the pseudo-panel approach o!ers a framework
for making joint use of independent cross-sectional information, there are
problems with some of its features. Firstly the assertion of intra-cohort homogeneity seems too strong. It is indeed di$cult to assume away unobserved
individual heterogeneity away in micro data. Secondly the aggregation framework might not be desirable. An estimation method that is based on individual
level data can make a more e$cient use of the available information and
potential problems of interpretation arising out of aggregation are likely to be
mitigated. Thirdly, the practice of establishing the large-sample properties of
econometric estimators and test statistics by driving the number of cohorts to
in"nity is not satisfactory. There is often a physical limit beyond which one
cannot increase the number of cohorts. The oft-cited example of date of birth
cohorts is a case in point. Finally, and this pertains to dynamic modelling in
particular, standard pooling estimators will lead to invalid inference if the
S. Girma / Journal of Econometrics 98 (2000) 365}383
367
response parameters are characterised by cohort-wise heterogeneity. This
follows from Pesaran and Smith (1995) who showed that dynamic modelling
from pooled heterogeneous panels is not trivial. Given that cohorts should be
chosen so that they are as distinct from each other as possible (cf. Deaton, 1985),
it is important to have a framework which allows for some sort of response
heterogeneity.
The above discussion demonstrates that there is clearly scope for studying
alternative estimators for dynamic models from TISICS. In a contribution to the
literature Mo$t (1993) suggests a two-stage least-squares approach where the
regressor y
in (1) is instrumented by the predicted value y(
from a linear
it~1
it~1
projection of the dependent variable on a vector of time-varying and timeinvariant variables, say S and Z , respectively. To make this operational, one
it
i
needs knowledge of S
. But it is not always clear where this information is
it~1
going to come from as TISICS data sets do not typically contain the history of
the variables for the cross-sectional units. Moreover two-stage least squares is
potentially problematic in that the "tted instruments (which need to be correlated with y
) will also have a non-zero correlation with the individual e!ect
it~1
f , since the two quantities are statistically related. The only obvious exception is
i
if S is a function of time alone, hence uncorrelated with the time invariant f .
it
i
But this will result in instruments that do not have much variation across
individuals in the likely case where the number of cross-sections is much smaller
than the cross-sectionals units. As a result the ensuing instrumental variables
estimators will be highly inaccurate. So, although Mo$t's (1993) paper generalises Deaton (1985) in the sense that the possibility of instrumentation procedures
other than grouping is discussed, it does not seem to be too promising due to its
rather strong informational requirement (Verbeek, 1992). In this paper we
explore the possibility of making inference on model (1) from individual level
data, by performing pair-wise quasi-di!erences across di!erent observed individuals within the same group. This produces the following regression function
at each time period and for all i and j : EMy Dy
, x N. This is potentially
it jt~1 it
estimable since CovMy , y
N can be identi"ed from the data as long there are
it jt~1
group-wise cross-sectional correlations.
The plan for the rest of the paper is as follows. In Section 2 we formally
describe the quasi-di!erenced model along with the maintained assumptions.
We identify the maximum number of linearly independent quasi-di!erences that
are available and those will be used as a basis for estimation. In Section 3 we
generate some population moment restrictions implied by the quasi-di!erenced
model and suggest a generalised method of moments (GMM) estimation framework. The asymptotic properties of the GMM estimators are realistically based
on having a large number of individuals per group-time cell. This compares
favourably to the Deaton-type estimators in which the number of group/time
periods is required to grow without limit. Another important advantage of the
quasi-di!erencing approach is that slope parameter heterogeneity across groups
368
S. Girma / Journal of Econometrics 98 (2000) 365}383
can easily be allowed since the method can be applied on a single group. In
Section 4 we conduct some Monte Carlo experiments to assess the "nite sample
performances of the proposed estimators. Section 5 concludes.
2. Model description
2.1. Basic identifying assumptions
We focus on the estimation of the following class of models form TISICS:
#b@ x #f #e
y "a yH
i(t) i(t)t
i(t)
i(t)t
i(t)t
i(t) i(t)t~1
t"2, ¹; i(t)"1,2, N .
(3)
t
Here x is a k-dimensional vector of control variables, f represents individualspeci"c e!ects and the e's are time-varying disturbances. The time index in
parentheses is used to emphasise the non-panel nature of the data. However,
when no ambiguity is apparent, we will sometimes drop this index for the sake of
notational elegance. It is assumed that we have cross-sectional observations for
the time index on the individual
t"1,2, ¹, with ¹*2. Notice that for yH
i(t)t~1
does not match to the (proper) time index. Such variables are not observed and
they will often be denoted by a superscript asterisk. We will next discuss some of
the basic identifying assumptions of the model.
H.1: The population is partitioned into a "nite set of mutually exclusive and
exhaustive groups G , c"1,2, C. Groups may be based on one or more
c
underlying observed variables that are constant over time for all individuals.
Without loss of generality, we assume that at each point in time n members are
drawn at random from G , ∀c.
c
H.2: a "a "a and b "b "b for ∀(i, j)3G .
i
j
c
i
j
c
c
This assumption restricts the individual level heterogeneity in response parameters by stating that those parameters are identical within a group, but we do
the next best thing by allowing for the possibility of group-wise heterogeneity.
H.3: Da D(1, ∀c.
c
H.4: y "k f #k e , where the k's are constants and f is a random
i(t)0
1c
2 i(t)0
c
group speci"c e!ect.
The above assumption implies that the initial (unobserved) conditions are
random draws from possibly di!erent populations, but share a common component in the speci"cation of a conditional mean. Assumption H.4 has an
important implication: CovMy , y
NO0 for ∀(i(t), j(s))3G and ∀t, s. That is,
i(t)t j(s)s)
c
di!erent individuals within the same group exhibit nonzero correlations.
The information obtained from this source of correlations will prove useful in
identifying the dynamic response parameters. While H.4 may not be an easily
testable restriction, it is not without its justi"cations. It has been noted in the
econometric literature that group structures are a common feature of many
S. Girma / Journal of Econometrics 98 (2000) 365}383
369
cross-sectional data sets (e.g., Frees, 1995). Researchers working with TISICS
data often assume that each member of the group has some information about
the average group value (Blundell et al., 1994).
#h #w where the eigenvectors of j lie within
H.5: x "t #j@x
i(t)
i(t)t~1
ct
i(t)t
i(t)t
the unit circle, t is a k-dimensional vector of individual speci"c drift and h is
i(t)
ct
group-time speci"c error component which can come about due to some shared
characteristics between individuals within a group-time cell. It can be seen that
h is a further source of within group correlations. From the above two
ct
assumptions, it is clear that intra-group correlation is directly proportional to
the variances of f and h .Further assumptions are:
c
ct
H.6: CovMf , e N"0, ∀i and t.
i ti
H.7: f , f , f , h and e are all mutually, independently distributed with mean
i c ct ct
it
zero and nonzero and "nite variances.
The above assumption allows for purely individual-speci"c heterogeneity,
even if individuals within a group share a common e!ect. This is in contrast to
Deaton's (1985) approach where within-group heterogeneity is not explicitly
allowed for.
H.8: Cov Mx , f N"0.
i(t)t j(s)
This is an innocuous assumption since f is speci"c to individual j(s).
j(s)
Making use of the above assumptions, we can show that CovMy , y NO0
i(t)t j(s)s
and equal ∀(i, j)3G . In other words, our assumptions imply an equicorrelation
c
structure within a group-time cell. Intuitively, without equicorrelation there
would not be enough independent information to estimate O(n2) covariances
per cell from just n observations. We like to note that in one form or another, the
assumption of equicorrelation amongst the units of analysis is prevalent in
econometric modelling. For example, the two-way panel estimation relies on
equicorrelation between individuals observed during the same time period (e.g.,
Hsiao, 1986). The covariance stationarity assumption employed in time series
models is also a case in point.
2.2. A quasi-diwerencing approach
We begin by writing (3) in its reduced-form representation as1
#n x #n yH #v
xH
y "n xH #2#n
i(t)t
tt i(t)t
t0 i(t)0
t(t~1) i(t)t~1
i(t)t
t1 i(t)1
(4)
with
v
i(t)t
t~1
1!at
f n
" + ase
#
i(t)(t~s)
1!a i(t) ts
s/0
"b@at~s for 1)s)t,
1 As the estimator proposed in this section deals with a single group, the index on a and b is
dropped.
370
S. Girma / Journal of Econometrics 98 (2000) 365}383
and
n "at.
t0
Quasi-di!erencing the model as My !ay
; ∀[i(t), j(t!1)] and ∀t*2N,
i(t)t
j(t~1)t~1
we obtain the following potentially estimable model:
y "ay
#b@x #g
i(t)t
j(t~1)t~1
i(t)t
ijt
where
(5)
g "at[y !y
]#Df #Dx #De
ijt
i(t)0
j(t~1)0
ijt
ijt
ijt
with
a!at
1!at
f !
f
,
*f "
ijt 1!a i(t)
1!a j(t~1)
t~2
N2
!xH
*x "b@ + ar`1MxH
j(t~1)(t~1~r)
i(t)(t~1~r)
ijt
r/0
and
t~1
*e " + arMe
!e
N#e .
ijt
i(t)t~r
j(t~1)t~r
i(t)t
r/1
Let D "[0
:I
]; D "[I
:0
] and A "D ?l ?I ,
1
(T~1)C1 T~1
2
T~1 (T~1)C1
2
2 n
n
where 0 ; I and l are the null matrix, the identity matrix and the vector of ones
a a
a
of dimension a, respectively. When all observations within a group are arranged
"rst by individuals and then by time periods, and the transformation matrix
B(a)"A !aA is applied to the resulting n¹]1 vector, it can be checked that
1
2
the pairwise quasi-di!erenced model of Eq. (4) will be produced. Since the
n2(¹!1)]n¹ matrix B(a) has rank equal to n¹!1, the transformed error
covariance matrix will be singular because r"n2(¹!1)!(n¹!1) equations
in the quasi-di!erenced model are redundant. Consequently, any estimation
strategy has to make due allowance for the singularity of the disturbance
covariance matrix. There are two main ways of dealing with this (e.g., Judge
et al., 1985): (a) the use of appropriate generalised inverse estimators, and (b) the
elimination of the redundant equations prior to estimation. Employing generalised inverses does not seem to be a terribly attractive alternative in our framework. This is because it involves a multiple of n4 terms and with possibly large n,
this will pose awkward computational problems. We thus attempt to identify
n¹!1 linearly independent pairwise quasi-di!erences within each group. To
2 Although x
is observed, it is left in the disturbances for reasons that will shortly become
j(t~1)t~1
apparent.
S. Girma / Journal of Econometrics 98 (2000) 365}383
371
do so we "rst consider the following set of quasi-di!erences within each group
between individuals with the same ordering in their respective time domains:
¸ "My !ay
; ∀i(t), i(t!1) and t*2N.
1
i(t)t
i(t~1)t~1
Note that i(t)"i(t!1) is to be understood that as the case where individual
i at time t and individual i at time (t!1) correspond to the same ordering within
their respective time domains. It can be veri"ed that ¸ consists of n(¹!1)
1
linearly independent quasi-di!erenced equations. By augmenting ¸ by the
1
following n!1 quasi-di!erences between the "rst observation at time t"2 and
all observations bar the "rst one at time t"1, ¸ "My
!ay
;
2
1(2)2
j(1)1
∀j*23G N, it can be checked that ¸"¸ X¸ will produce the desired full set
c1
1
2
of n¹!1 linearly independent quasi-di!erences. As a result ¸ will exhaust all
information contained in the pairwise quasi-di!erenced model. Of course this is
not the only set of linearly independent quasi-di!erences. However, since any
other such set is a nonsingular transformation of ¸, the same information will be
preserved.
Let = "[y
, x@ ]@ and de"ne c "[a, b@]@ as the true parameter
0
ijt
j(t~1)t~1 i(t)t
vector. When the above transformations are applied to Eq. (3), the following
model will then be available for estimation for
∀[i(t), j(t!1)]3S,S XS
1 2
y "ay
#b@x #g ,c@ = #g ,
0 ijt
ijt
i(t)t
j(t~1)t~1
i(t)t
ijt
where S "M[i(t), i(t!1)] and t*2N; S "M∀[1(2), j(1)], j(1)*2)N.
1
2
(6)
3. Inference
3.1. Identixcation
OLS on model (6) would produce inconsistent estimators because g is
ijt
correlated with both y
and x . This can easily be checked using the
j(t~1)t~1
i(t)t
underlying assumptions of the model. Fortunately, these assumptions also
imply that the composite error term gt satis"es the following two sets of linear
cij
moment conditions:
EMy
g N"0; t"2,2, ¹; s"1,2, t!1; ∀g(t!1)Oj(t!1)
g(t~s,t~s) ijt
and
EMx
g N"0; t"2,2, ¹; s"0,2, t!1;
g(t~s)t~s ijt
∀g(t!s) D g(t!s)Oi(t) or g(t!1)"j(t!1).
That is past and present values of the dependent and explanatory variables
within the same group can be used as instruments. Because of assumptions H.4
and H.5, a nonzero IV-regressor correlation is ensured. Notice that since the
372
S. Girma / Journal of Econometrics 98 (2000) 365}383
correlations in the x's between the di!erent individuals comes from the h 's,
ct
leaving the observable x
in the composite disturbance term ensures that
j(t~1)t~1
the latter does not contain a h term. Also notice that since g contains
ijt
y !y
, any y
will not be correlated with it because of the
i(t)0
j(t~1)0
g(t~s)(t~s)
equicorrelation assumption imposed on the model.
A problem that we face is the possibility of an in"nite number of instruments,
given that the number of observations per group-time cell is allowed to grow.
For example, X
; i"1,2, n can be used as instruments for the regrei(t~2)t~2
ssors in the quasi-di!erenced model. Since standard optimality results for GMM
estimators (e.g., Hansen, 1982) are not trivially applicable when the vector of
conditioning variables is in"nite dimensional, it pays to look for some ways to
economise on the number of instruments to be used. One apparent solution
might be to average the instruments over cells. In this case there would not be
enough internal variation in the instruments set, unless the number of groups
grows with n at a suitable rate. This is bound to adversely a!ect the precision of
the resulting estimators. In this paper we restrict ourselves to a set of "nite
orthogonality conditions by the simple, albeit ad hoc, device of using just one
orthogonality condition within each subset of the moment restrictions. For
example in the Monte Carlo experiments conducted in the next section, the
following moment restrictions are chosen to be the basis for estimating c:
, x@ N, where g is made to
EMZ g N"0 with Z "My
, x@
ijt ijt
ijt
g(t~1)t~1 g(t~1)t~1 g(t)t
correspond to j!1.3 We have to emphasise, however, that this is
entirely arbitrary. All that is needed is to assign distinct instruments to each
observation.
3.2. GMM inference
GMM estimation proceeds by forming sample analogs of those population
moment conditions. The sample analogs are usually sample averages of Z g
iji ijt
(also known as the disturbances of GMM) and the latter are expected to satisfy
appropriate laws of large numbers (LLNs) for all admissible points in the
parameter space. As far as justifying the use of LLNs is concerned, the simplest
case occurs when the random variables under consideration are independently
and identically distributed. In our case, however, there is some dependence
across observations due to the assumptions we employed and the quasi-di!erencing transformation of the data. In fact the disturbances of the model exhibit
contemporaneous as well as temporal correlations within groups. They are also
heterogeneous across time as can be veri"ed by inspecting the structure of gt .
cij
Alternative approaches to deriving LLNs that are applicable to a variety of non
i.i.d environments are proposed by several authors (e.g., White, 1984). These
3 We adopt the convention of putting g"n for j"1
S. Girma / Journal of Econometrics 98 (2000) 365}383
373
laws of large numbers typically place some restrictions on the dependence,
heterogeneity and moments of the data process.
The main problem in the present context is the contemporaneous within
group dependence introduced by the transformation function ¸ . One practical
2
solution may be to average all quasi-di!erences produced by ¸ and assume
2
that as n grows the dependence will become negligible (and accept the risk of any
"nite sample bias). In the sequel we work with the model produced by ¸ alone,
1
because the large sample argument with a "xed number of groups and time
periods is neater and the computations involved are simpler. Thus we have, for
i"1,2, n and t"2,2, ¹
y "ay
#b@x #g ,c@ = #g
0 iit
iit
i(t)
i(t~1)(t~1)
i(t)t
iit
and the resulting orthogonality conditions can be expressed as EMZ g N,
iit iit
N In the Appendix A
, x@
EMh N"0, with Z "My
, x@
it
iit
i~1(t~1)t~1 i~1(t~1)t~1 i~1(t)t
we prove that the following proposition holds:
Proposition 1.
T
n
1
1 0
+ + h P
it
n(¹!1)
it~2 i/1
as nPR with ¹ xxed.
De"ning S to be a sequence of weight matrices, the GMM estimator of
nT
c can thus be obtained upon solving
0
1
@
1
argmin
+h S
+h .
c n(¹!1)
it nT n(¹!1)
it
i,t
i,t
The optimal GMM estimator of c is obtained by setting the weight matrix in the
GMM minimand function to the inverse of
C
D C
C
D
D
1
+h
it
n(¹!1)
i,t
and it is given as c( "[=@ZX~1Z@=]~1[=@ZX~1Z@y], where the data and
instruments matrices are constructed in an obvious way. In Appendix B we
show that the following proposition holds under the assumptions of the model:
X,Cov
Proposition 2.
T n
$ N(0, X)
+ +h P
Jn(¹!1) t/2 i it
as nPR with ¹ xxed.
1
374
S. Girma / Journal of Econometrics 98 (2000) 365}383
Thus under some regularity conditions (White, 1984), c( will be asymptotically
c
normally distributed as
J¸(c( !c )PN(0,