Do Fixed Effects Fully Control for Within-Cluster Correlation?
$$y_{ig} = x_{ig}'\beta + \alpha_g + u_{ig} = x_{ig}'\beta + \sum_{h=1}^{G} \alpha_h \, dh_{ig} + u_{ig}, \tag{17}$$
where $dh_{ig}$, the $h$th of $G$ dummy variables, equals one if the $ig$th observation is in cluster $h$ and equals zero otherwise.
There are several different ways to obtain the same cluster-specific fixed effects estimator. The two most commonly used are the following. The least squares dummy variable (LSDV) estimator directly estimates the second line of Equation 17, with OLS regression
of $y_{ig}$ on $x_{ig}$ and the $G$ dummy variables $d1_{ig}, \dots, dG_{ig}$, in which case the dummy variable coefficients are $\hat{\alpha}_g = \bar{y}_g - \bar{x}_g'\hat{\beta}$, where $\bar{y}_g = N_g^{-1}\sum_{i=1}^{N_g} y_{ig}$ and $\bar{x}_g = N_g^{-1}\sum_{i=1}^{N_g} x_{ig}$. The within estimator, also called the fixed effects estimator, estimates $\beta$ just by OLS regression in the within or mean-differenced model
$$y_{ig} - \bar{y}_g = (x_{ig} - \bar{x}_g)'\beta + (u_{ig} - \bar{u}_g). \tag{18}$$
The main reason that empirical economists use the cluster-specific FE estimator is that it controls for a limited form of endogeneity of regressors. Suppose in Equation 17 that we view
$\alpha_g + u_{ig}$ as the error, and the regressors $x_{ig}$ are correlated with this error but only with the cluster-invariant component, that is, $\mathrm{Cov}[x_{ig}, \alpha_g] \neq 0$ while $\mathrm{Cov}[x_{ig}, u_{ig}] = 0$. Then OLS and FGLS regression of $y_{ig}$ on $x_{ig}$, as in Section II, leads to inconsistent estimation of $\beta$. Mean-differencing Equation 17 leads to the within Model 18 that has eliminated the problematic cluster-invariant error component $\alpha_g$. The resulting FE estimator is consistent for $\beta$ if either $G \to \infty$ or $N_g \to \infty$.
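The equivalence of the LSDV and within estimators, and the recovery of the dummy coefficients from cluster means, can be checked numerically. The following is a minimal sketch on simulated data; the design (30 clusters of 5 observations, a scalar regressor, the chosen coefficient values) is hypothetical and not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

G, Ng = 30, 5                          # hypothetical: 30 clusters, 5 obs each
cluster = np.repeat(np.arange(G), Ng)  # cluster index g for each observation
alpha = rng.normal(size=G)             # cluster effects alpha_g
# Regressor correlated with alpha_g but not with u_ig
x = 0.8 * alpha[cluster] + rng.normal(size=G * Ng)
beta = 1.5
y = beta * x + alpha[cluster] + rng.normal(size=G * Ng)

# LSDV: OLS of y on x and the G cluster dummies (no separate intercept)
D = (cluster[:, None] == np.arange(G)[None, :]).astype(float)
coef = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0]
beta_lsdv, alpha_hat = coef[0], coef[1:]

def cluster_means(v):
    """Return the G cluster means of the vector v."""
    return np.bincount(cluster, weights=v) / np.bincount(cluster)

# Within: OLS of (y - ybar_g) on (x - xbar_g), as in Model 18
ybar, xbar = cluster_means(y), cluster_means(x)
y_w, x_w = y - ybar[cluster], x - xbar[cluster]
beta_within = (x_w @ y_w) / (x_w @ x_w)

# Pooled OLS with an intercept, ignoring alpha_g, for comparison
X_pool = np.column_stack([np.ones_like(x), x])
beta_pooled = np.linalg.lstsq(X_pool, y, rcond=None)[0][1]

print("LSDV:", beta_lsdv, "within:", beta_within, "pooled:", beta_pooled)
```

The LSDV and within slope estimates agree up to floating-point error, and `alpha_hat` matches `ybar - xbar * beta_within`. The pooled estimate is expected to drift away from `beta` in this design because the regressor is correlated with the cluster effect.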
The cluster-robust variance matrix formula given in Section II carries over immediately to OLS estimation in the FE model, again assuming $G \to \infty$.
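As a reminder of what carries over, and assuming the Section II formula is the usual sandwich form without a finite-sample correction, the CRVE for the within regression can be written as follows. The double-dot notation for mean-differenced quantities is introduced here for illustration and is not the source's notation.

$$\hat{V}_{\text{clu}}[\hat{\beta}] = (\ddot{X}'\ddot{X})^{-1}\left(\sum_{g=1}^{G} \ddot{X}_g'\,\hat{\ddot{u}}_g \hat{\ddot{u}}_g'\,\ddot{X}_g\right)(\ddot{X}'\ddot{X})^{-1}$$

Here $\ddot{X}_g$ stacks the rows $(x_{ig} - \bar{x}_g)'$ for cluster $g$, and $\hat{\ddot{u}}_g$ stacks the within residuals $(y_{ig} - \bar{y}_g) - (x_{ig} - \bar{x}_g)'\hat{\beta}$.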
In the remainder of this section, we consider some practicalities. First, including fixed effects generally does not control for all the within-cluster correlation of the error and one should still use the CRVE. Second, when cluster sizes are small and degrees-of-freedom corrections are used, the CRVE should be computed by within rather than LSDV estimation. Third, FGLS estimators need to be bias-corrected when cluster sizes are small. Fourth, tests of fixed versus random effects models should use a modified version of the Hausman test.
A. Do Fixed Effects Fully Control for Within-Cluster Correlation?
While cluster-specific effects will control for part of the within-cluster correlation of the error, in general they will not completely control for within-cluster error correlation, not to mention heteroskedasticity. So the CRVE should still be used. There are several ways to make this important point.
Suppose we have data on students in classrooms in schools. A natural model, a special case of a hierarchical model, is to suppose that there is both an unobserved school effect and, on top of that, an unobserved classroom effect. Letting $i$ denote individual, $s$ school, and $c$ classroom, we have $y_{isc} = x_{isc}'\beta + \alpha_s + \delta_c + \varepsilon_{isc}$. A regression with school-level fixed effects or random effects will control for within-school correlation but not the additional within-classroom correlation.
Suppose we have a short panel ($T$ fixed,
$N \to \infty$) of uncorrelated individuals and estimate $y_{it} = x_{it}'\beta + \alpha_i + u_{it}$. Then the error $u_{it}$ may be correlated over time (that is, within-cluster) due to omitted factors that evolve progressively over time. Inoue and Solon (2006) provides a test for this serial correlation. Cameron and Trivedi (2005, p. 710) presents an FE individual-level panel data example, with log-earnings regressed on log-hours, in which cluster-robust standard errors are four times the default. Serial correlation in the error may be due to omitting lagged $y$ as a regressor. When $y_{i,t-1}$ is included as an additional regressor in the FE model, the Arellano-Bond estimator is used and, even with $y_{i,t-1}$ included, the Arellano-Bond methodology requires that we first test whether the remaining error $u_{it}$ is serially correlated.
Finally, suppose we have a single cross-section or a single time series. This can be
viewed as regression on a single cluster. Then in the model $y_i = \alpha + x_i'\beta + u_i$ (or $y_t = \alpha + x_t'\beta + u_t$), the intercept is the cluster-specific fixed effect. There are many reasons why the error $u_i$ (or $u_t$) may be correlated in this regression.
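The short-panel case discussed above can be illustrated by simulation. The sketch below assumes AR(1) errors and an AR(1) regressor within each individual; all design values (`N`, `T`, `rho`, `beta`) are hypothetical. The cluster-robust standard error is computed with the plain sandwich formula and no finite-sample correction, which is an assumption made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, rho, beta = 1000, 8, 0.8, 1.0   # hypothetical design values

def ar1(n, t, rho, rng):
    """Stationary AR(1) draws with unit variance, one row per individual."""
    e = np.empty((n, t))
    e[:, 0] = rng.normal(size=n)
    for s in range(1, t):
        e[:, s] = rho * e[:, s - 1] + np.sqrt(1 - rho**2) * rng.normal(size=n)
    return e

alpha = rng.normal(size=(N, 1))       # individual effects alpha_i
x = alpha + ar1(N, T, rho, rng)       # regressor correlated with alpha_i
u = ar1(N, T, rho, rng)               # serially correlated error u_it
y = beta * x + alpha + u

# Within transformation: subtract individual means
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
sxx = (xw**2).sum()
b_fe = (xw * yw).sum() / sxx
res = yw - b_fe * xw                  # within residuals

# Default standard error, treating errors as i.i.d. after the FE
s2 = (res**2).sum() / (N * (T - 1) - 1)
se_default = np.sqrt(s2 / sxx)

# Cluster-robust standard error, clustering on the individual
score = (xw * res).sum(axis=1)        # sum over t of xw_it * res_it
se_cluster = np.sqrt((score**2).sum()) / sxx

# First-order autocorrelation of the within residuals
r1 = (res[:, 1:] * res[:, :-1]).sum() / (res[:, :-1]**2).sum()

print("beta_FE:", b_fe)
print("default SE:", se_default, "cluster-robust SE:", se_cluster)
print("lag-1 residual autocorrelation:", r1)
```

In this design the within residuals remain positively autocorrelated after the individual effects are removed, and the cluster-robust standard error exceeds the default one, which is the pattern the text describes.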