Feasible GLS with Fixed Effects
thank Arindrajit Dube and Jason Lindo for bringing this issue to our attention. Within and LSDV estimation lead to the same cluster- robust standard errors if we apply For-
mula 11 to the respective regressions, or if we multiply this estimate by c = G G – 1. Differences arise, however, if we multiply by the small- sample correction
c given in
Equation12. Within estimation sets c = [G G − 1] [N − 1 N − K + 1]
since there are only K – 1 regressors—the within model is estimated without an intercept.
LSDV estimation uses c = [G G − 1] [N − 1 N − G − K + 1]
since the G clus- ter dummies are also included as regressors. For balanced clusters with
N
g
= N and
G large relative to K,
c 1 for within- estimation and
c N
N − 1
for LSDV es- timation. Suppose there are only two observations per cluster, due to only two indi-
viduals per household or two time periods in a panel setting, so N
g
= N = 2
. Then c
2 2 − 1 = 2 for LSDV estimation, leading to CRVE that is twice that from
within estimation. Within estimation leads to the correct fi nite- sample correction. In Stata, the within command xtreg y x, fe vcerobust gives the desired CRVE. The
alternative LSDV commands regress y x i.id_clu, vcecluster id_clu and, equivalently, regress y x, absorbid_clu vcecluster id_clu
use the wrong degrees- of- freedom cor- rection. If a CRVE is needed, then use xtreg. If there is reason to instead use regress
i.id , then the cluster- robust standard errors should be multiplied by the square root of
[N − K − 1] [N − G − K − 1]
, especially if N
g
is small. The inclusion of cluster- specifi c dummy variables increases the dimension of the
CRVE but does not lead to a corresponding increase in its rank. To see this, stack the dummy variable dh
ig
for cluster g into the N
g
× 1 vector dh
g
. Then d
′
h
g
ˆ u
g
= 0 , by the
OLS normal equations, leading to the rank of ˆ
V
clu
[ ˆ ]
falling by one for each cluster- specifi c effect. If there are k regressors varying within cluster and G – 1 dummies then,
even though there are K + G – 1 parameters 
, the rank of ˆ
V
clu
[ ˆ ]
is only the minimum of K and G – 1. And a test that
␣
1
, ..., ␣
G
are jointly statistically signifi cant is a test of G
– 1 restrictions since the intercept or one of the fi xed effects needs to be dropped. So even if the cluster- specifi c fi xed effects are consistently estimated that is, if
N
g
→ ∞ ,
it is not possible to perform this test if K G – 1, which is often the case. If cluster- specifi c effects are present, then the pairs cluster bootstrap must be
adapted to account for the following complication. Suppose Cluster 3 appears twice in a bootstrap resample. Then if clusters in the bootstrap resample are identifi ed from the
original cluster- identifi er, the two occurrences of Cluster 3 will be incorrectly treated as one large cluster rather than two distinct clusters.
In Stata, the bootstrap option idcluster ensures that distinct identifi ers are used in each bootstrap resample. Examples are regress y x i.id_clu, vceboot, clusterid_clu
idclusternewid reps400 seed10101 and, more simply, xtreg y x, fe vceboot,
reps400 seed10101 , as in this latter case Stata automatically accounts for this
complication.