Cluster-Robust F-tests

3. What if you have multiway clustering and few clusters? Sometimes we are worried about multiway clustering, but one or both of the ways has few clusters. Currently, we are not aware of an ideal approach to deal with this problem. One potential solution is to add sufficient control variables to minimize concerns about clustering in one of the ways, and then use a one-way few-clusters cluster-robust approach on the other way. Another potential solution is to model one of the ways of clustering parametrically, such as with a common shock or an autoregressive error model. Then you can construct a variance estimator that is a hybrid: parametric in the modeled dimension and cluster-robust in the remaining dimension.

VII. Extensions

The preceding material has focused on the OLS and FGLS estimators and on tests of a single coefficient. The basic results generalize to multiple hypothesis tests, instrumental variables (IV) estimation, nonlinear estimators, and generalized method of moments (GMM). These extensions are incorporated in Stata, though Stata generally computes test p-values and confidence intervals using standard normal and chi-squared distributions rather than T and F distributions. For nonlinear models, stronger assumptions are needed to ensure that the estimator of $\beta$ retains its consistency in the presence of clustering. We provide a brief overview.

A. Cluster-Robust F-tests

Consider Wald joint tests of several restrictions on the regression parameters. Except in the special case of linear restrictions and OLS with iid normal errors, asymptotic theory yields only a chi-squared distributed statistic, say $W$, that is $\chi^2_h$ distributed, where $h$ is the number of linearly independent restrictions.

Alternatively, we can use the related F statistic, $F = W/h$. This yields the same p-value as the chi-squared test if we treat $F$ as being $F_{h,\infty}$ distributed. In the cluster case, a finite-sample adjustment instead treats $F$ as being $F_{h,G-1}$ distributed. This is analogous to using the $T_{G-1}$ distribution rather than $N[0,1]$ for a test on a single coefficient.

In Stata, the finite-sample adjustment of using the $T_{G-1}$ distribution for a t-test on a single coefficient, and the $F_{h,G-1}$ distribution for an F-test, is done only after OLS regression with the command regress. Otherwise, Stata reports critical values and p-values based on the $N[0,1]$ and $\chi^2_h$ distributions. Thus, Stata does no finite-cluster correction for tests and confidence intervals following instrumental variables estimation commands, nonlinear model estimation commands, or even after the command regress in the case of tests and confidence intervals using the commands testnl and nlcom. The discussion in Section VI was limited to inference after OLS regression, but it seems reasonable to believe that for other estimators one should also base inference on the $T_{G-1}$ and $F_{h,G-1}$ distributions, and even then tests may overreject when there are few clusters.

Some of the few-cluster methods of Section VI can be extended to tests of more than one restriction following OLS regression. The Wald test can be based on the bias-adjusted variance matrices CR2VE or CR3VE, rather than CRVE. For a bootstrap with asymptotic refinement of a Wald test of $H_0: R\beta = r$, in the $b$th resample we compute

$$W_b = (R\hat{\beta}_b - R\hat{\beta})' \left[ R \, \hat{V}_{clu}[\hat{\beta}_b] \, R' \right]^{-1} (R\hat{\beta}_b - R\hat{\beta}).$$
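To illustrate how much the choice of reference distribution can matter, the following minimal sketch compares the p-value of a joint test under the $\chi^2_h$ distribution with the p-value under the $F_{h,G-1}$ distribution. The values of W, h, and G are hypothetical, chosen only for illustration. The helper functions chi2_sf, betainc, and f_sf are names introduced here, written with the Python standard library only; they are not part of Stata or of any package discussed in the text.

```python
import math


def chi2_sf(w, h):
    """P(chi2_h > w), from the series for the regularized lower
    incomplete gamma function P(h/2, w/2)."""
    a, x = h / 2.0, w / 2.0
    if x <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    for n in range(1, 2000):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return 1.0 - lower


def _guard(value, tiny=1e-300):
    """Keep continued-fraction terms away from exact zero."""
    return tiny if abs(value) < tiny else value


def _betacf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _guard(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_sf(f, d1, d2):
    """P(F_{d1,d2} > f), using the identity with the incomplete beta function."""
    if f <= 0.0:
        return 1.0
    return betainc(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))


# Hypothetical inputs: Wald statistic, number of restrictions, number of clusters.
W, h, G = 6.0, 2, 10
F = W / h

p_chi2 = chi2_sf(W, h)      # same p-value as treating F as F(h, infinity)
p_f = f_sf(F, h, G - 1)     # finite-sample adjustment: F(h, G - 1)

print(f"F = {F:.3f}")
print(f"p-value, chi-squared({h}): {p_chi2:.4f}")
print(f"p-value, F({h}, {G - 1}): {p_f:.4f}")
```

With these hypothetical inputs, closed forms are available because h = 2: the chi-squared p-value is exp(-3), about 0.0498, and the $F_{2,9}$ p-value is (1 + 6/9)^(-4.5), about 0.1004. The same statistic therefore rejects at the 5 percent level under the chi-squared distribution but not under the finite-sample adjustment.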
Extension of the data-determined degrees of freedom method of Section VI.D to tests of more than one restriction requires, at a minimum, extension of Theorem 4 of Bell and McCaffrey (2002) from the case of a single component of $\beta$ to $R\beta$. An alternative ad hoc approach would be to use the $F_{h,v}$ distribution, where $v$ is an average (possibly weighted by estimator precision) of the values of $v$ defined in Equation 26, computed separately for each exclusion restriction.

For the estimators discussed in the remainder of Section VII, the rank of $\hat{V}_{clu}[\hat{\beta}]$ is again the minimum of $G-1$ and the number of parameters ($K$). This means that at most $G-1$ restrictions can be tested using a Wald test, in addition to the usual requirement that $h \le K$.
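The rank bound is easiest to see in the OLS case. The sketch below assumes the standard sandwich form of the CRVE without any finite-sample scaling factor, and uses $X_g$ and $\hat{u}_g$ for the regressors and OLS residuals of cluster $g$, and $s_g$ for the cluster score; this notation is introduced here for illustration and is not taken from the text above.

```latex
\begin{align*}
\hat{V}_{clu}[\hat{\beta}]
  &= (X'X)^{-1} \Big( \sum_{g=1}^{G} s_g s_g' \Big) (X'X)^{-1},
  \qquad s_g = X_g' \hat{u}_g, \\
\sum_{g=1}^{G} s_g
  &= X' \hat{u} = 0
  \quad \text{(OLS normal equations)}, \\
\operatorname{rank}\big( \hat{V}_{clu}[\hat{\beta}] \big)
  &= \operatorname{rank}\big( [\, s_1, \dots, s_G \,] \big)
  \le \min(G - 1, K).
\end{align*}
```

The middle matrix is a sum of $G$ rank-one terms built from $K \times 1$ vectors that sum to zero, so at most $G-1$ of them are linearly independent. Because the Wald statistic inverts $R \hat{V}_{clu}[\hat{\beta}] R'$, which is $h \times h$, that matrix can have full rank only if $h$ does not exceed this bound.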

B. Instrumental Variables Estimators