Scott-Clayton 633
IV. Bounding the Potential Bias Due to Selection
It is fair to ask whether the results above could be biased by differential selection, given that an explicit goal of the program was to increase in-state enrollment among qualified students. Yet the potential for nonrandom selection need not make the evaluation problem intractable; Manski (1995), Lee (2009), and others suggest methods for bounding selection bias which influence the approach I take below.
To understand how selection may bias the findings presented above, recall the before-after model as specified in Equation 2. The concern is that those who enter the sample as "eligible enrollees" after the implementation of PROMISE may be different from eligible enrollees who entered the sample in earlier cohorts. Any differences captured by the covariates in X_i (including gender, race/ethnicity, age at entry, ACT score, and high school GPA) can be controlled for, but other differences may remain. To control for these remaining differences, one would ideally like to include in all regressions an indicator of whether the student was induced by PROMISE to become an eligible enrollee, instead estimating:
    y_{it} = \tilde{\alpha} + \tilde{\beta}\,\mathrm{after}_t + X_i \tilde{\delta} + \gamma Z_i + \varepsilon_{it}    (5)

where Z_i is equal to 1 if the student was induced to become an eligible enrollee because of PROMISE, and zero otherwise.^27 The coefficient γ estimates how different these marginal enrollees are from intramarginal enrollees, after controlling for other observable characteristics.
If at least some students are induced to become eligible enrollees because of the program, and these students are different in unobservable ways (γ > 0), then the estimated \hat{\beta} will not converge to the true \tilde{\beta} from Equation 5. If X_i were completely orthogonal to Z_i (that is, if none of the covariates were useful proxies for Z_i), then:

    \hat{\beta} - \tilde{\beta} \rightarrow \left[ \Pr(Z_i = 1 \mid \mathrm{after}_t = 1) \right] \times \gamma    (6)
In words, Equation 6 says that the sign and magnitude of the bias will depend on two factors: (1) what fraction of eligible enrollees are "marginal," that is, induced to become eligible enrollees by PROMISE, and (2) how different marginal enrollees are from intramarginal enrollees, as measured by the parameter γ. This is an upper bound on the potential bias; it will be smaller to the extent that the covariates in X_i help proxy for the unobserved Z_i. In this section, I first estimate factor (1) using publicly available enrollment trend data, and then test the sensitivity of the main findings to varying assumptions about factor (2).
To estimate the impact of PROMISE on eligible enrollment, Figure 6 plots four different college enrollment rates for WV high school graduates: the percent enrolling in a public WV institution as a PROMISE-eligible student, the percent enrolling in a public WV institution as a PROMISE-ineligible student, the percent enrolling in a WV private institution, and the percent enrolling in an out-of-state
27. This analysis follows a framework used by Jonathan Guryan (2004).
634 The Journal of Human Resources
Figure 6 Fall Enrollment Rates of Recent West Virginia High School Graduates
Source: Author’s calculations using WVHEPC data and publicly available statistics; see text for details.
institution.^28
The figure indicates that the percent of WV high school graduates enrolling in public WV institutions as PROMISE-eligible students jumped by four to five percentage points between 2001 and 2002, from a baseline of about 15 percent.^29 This suggests that out of every 20 students eligible for the program in 2002
28. Trends in the number of high school graduates and private WV enrollments come from publicly available WVHEPC reports (WVHEPC 2002; WVHEPC 2003; WVHEPC 2005). The annual Digest of Education Statistics provides data on the home states of first-time college freshmen by institution, but only in even-numbered years (U.S. Department of Education 2003, Table 207; U.S. Department of Education 2004, Table 204; U.S. Department of Education 2005, Table 203). WVHEPC reports include estimates of out-of-state enrollments, but the data are based on surveys of high school administrators. In even-numbered years, the IPEDS out-of-state enrollment numbers are consistently about 75 percent of the level of the WVHEPC estimates. I use the IPEDS statistics in even years and impute the out-of-state enrollments in odd years as 75 percent of the WVHEPC estimates for those years.
29. The difference between just 2001 and 2002 is 3.7 percentage points, but including additional years increases the average before-after difference to about five percentage points.
and later, 15 would have met the initial requirements and enrolled in a public WV institution with or without the scholarship, while four to five (20 to 25 percent) appear to be "marginal" enrollees.^30
Figure 6 also provides some information about where these marginal enrollees came from, and where they did not. Between 2001 and 2002, the out-of-state enrollment rate declined by 1.2 percentage points. If one assumes that this entire decrease represents students switching to WV public institutions as eligible enrollees, then one-quarter to one-third of marginal enrollees were induced from out-of-state. The percentage may be much lower if some of those induced from out-of-state decided to use their PROMISE scholarship at a WV private institution (private WV enrollment does tick upward in 2002). These are the students most likely to create a positive bias, so it is reassuring that they cannot account for more than a third of the enrollment increase, or more than 6 percent (= 1.2/20) of all PROMISE-eligible enrollees.
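This back-of-envelope arithmetic can be reproduced in a few lines. The sketch below is illustrative only and is not part of the original analysis: the 4.5-point jump and the roughly 20 percent post-PROMISE eligible-enrollment rate are approximations taken from the discussion of Figure 6 above.

```python
# Illustrative sketch of the bounding arithmetic from Figure 6.
# All inputs are approximations taken from the text, not exact data.

baseline_eligible_rate = 0.15   # pre-PROMISE share enrolling as eligible students
jump = 0.045                    # four-to-five point jump between 2001 and 2002
after_eligible_rate = baseline_eligible_rate + jump   # roughly 20 percent

# Fraction of post-PROMISE eligible enrollees who are "marginal,"
# i.e., Pr(Z_i = 1 | after_t = 1) in Equation 6: roughly 20-25 percent.
marginal_share = jump / after_eligible_rate

# Out-of-state enrollment fell by 1.2 points; if that entire decline switched
# in-state, it accounts for at most this share of the enrollment increase...
oos_decline = 0.012
oos_share_of_increase = oos_decline / jump            # roughly one-quarter to one-third

# ...and at most this share of all PROMISE-eligible enrollees (about 6 percent).
oos_share_of_eligible = oos_decline / after_eligible_rate

print(round(marginal_share, 3),
      round(oos_share_of_increase, 3),
      round(oos_share_of_eligible, 3))
```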
It is impossible to identify in the data precisely who these 6 percent are, but one approach to bounding, following Lee (2009), is to make the extreme assumption that these marginal students represent the top 6 percent of values for a given outcome, and then reestimate the effects with these top values excluded. But in the case where multiple, related outcomes are available, Lee (2009) bounds, which were designed for the case of a single outcome, can be too conservative. In the present case, it is empirically impossible for marginal students to simultaneously represent the top 6 percent of values for every outcome of interest. For example, the top 6 percent of PROMISE recipients by cumulative college GPA had a five-year graduation rate of only 83 percent (not 100 percent), a four-year graduation rate of 68 percent (not 100 percent), and accumulated an average of only 118 credits (just below the median, not the 94th percentile, of credit accumulation).
Thus, instead of throwing out the top 6 percent of treatment-group values for each outcome individually, I reestimate the effects for all outcomes after "trimming" the treatment group (the "after" cohorts) based on the 94th percentile of a key outcome, here either cumulative GPA (3.90 or above) or cumulative credits earned (149 or above). The results are shown in Table 5. Column 1 restates the baseline estimates for comparison, and Columns 2 and 3 provide the adjusted estimates after trimming the sample. Even under this rather extreme assumption, the coefficients shrink but generally remain above zero, and several key impacts retain significance, including the effects on first-year outcomes, school-year earnings, meeting the 120-credit threshold, and earning a BA within four years. Interestingly, trimming based on cumulative outcomes over four years has virtually no effect on the estimated effect on first-year credits, which is arguably the outcome most proximal to the policy (because most students were meeting the 2.75 GPA threshold even prior to PROMISE; and recall that 25 percent of recipients lost the scholarship after the first year).
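The trimming step itself reduces to a simple filter before reestimation. The sketch below illustrates the mechanics on synthetic data; the variable names and the simple difference-in-means estimator are hypothetical stand-ins for the paper's actual regressions and WVHEPC files.

```python
import numpy as np

# Hypothetical sketch of the trimming exercise on synthetic data:
# drop "after"-cohort students at or above the 94th percentile of a key
# outcome, then reestimate the before-after difference on the trimmed sample.
rng = np.random.default_rng(1)
n = 2_000
after = rng.integers(0, 2, n)                   # 1 = post-PROMISE cohort
gpa = rng.normal(3.0, 0.5, n) + 0.05 * after    # outcome with a small true effect

# 94th percentile computed within the treatment ("after") group only:
cutoff = np.quantile(gpa[after == 1], 0.94)
keep = (after == 0) | (gpa < cutoff)            # trim only the treatment group

# Before-after differences (a stand-in for the regression estimates):
naive = gpa[after == 1].mean() - gpa[after == 0].mean()
trimmed = gpa[keep & (after == 1)].mean() - gpa[keep & (after == 0)].mean()
print(f"before-after difference: {naive:.3f} untrimmed, {trimmed:.3f} trimmed")
```

Because only the top of the treatment-group distribution is removed, the trimmed estimate is mechanically smaller than the untrimmed one, which is why surviving significance after trimming is informative.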
Note that this analysis only examines the effects of positive selection; if one made similarly extreme assumptions about negatively selected marginal students, those
30. While this clearly limits the potential for compositional change, it is still a sizable enrollment effect. This estimate is slightly higher than Cornwell, Mustard, and Sridhar (2006) find for Georgia HOPE, and comparable to Dynarski's (2002) estimate for seven state programs.
Table 5
Bounding Exercise

Outcome                               (1) Baseline    (2) Trim Top         (3) Trim Top
                                                      6 Percent of         6 Percent of
                                                      "After" Cohorts      "After" Cohorts
                                                      Based on             Based on
                                                      Cumulative GPA       Cumulative Credits
GPA, end of Year 1                    0.077 (0.006)   0.038 (0.007)        0.054 (0.004)
Credits earned, end of Year 1         1.830 (0.118)   1.740 (0.111)        1.578 (0.114)
Semesters enrolled over four years    0.146 (0.045)   0.142 (0.045)        0.087 (0.046)
Credits earned over four years        5.782 (1.136)   5.540 (1.174)        2.108 (1.135)
Cumulative GPA over four years^a      0.039 (0.018)   -0.011 (0.016)       0.012 (0.020)
Weekly school-year earnings^b         -9.55 (2.10)    -8.66 (2.23)         -7.52 (2.16)
Had 120 credits within four years     0.111 (0.018)   0.104 (0.018)        0.082 (0.018)
3.0+ GPA over four years              0.035 (0.012)   0.020 (0.012)        0.019 (0.013)
Earned BA within four years           0.067 (0.005)   0.049 (0.005)        0.045 (0.005)
Earned BA within five years           0.037 (0.012)   0.024 (0.012)        0.012 (0.012)
Sample size                           12,911          12,464               12,462

Source: Author's calculations using WVHEPC administrative data on first-time degree-seeking freshmen aged 19 and younger, enrolling in the fall semester of school years 2000-2001 through 2003-2004. Unless otherwise noted, the sample is restricted to West Virginia residents who met the high school GPA (3.0+) and ACT/SAT (21/1000+) score requirements for PROMISE eligibility.
Notes: Robust standard errors, clustered by cohort, are in parentheses. Stars indicate the significance of individual findings at the p < 0.10, p < 0.05, or p < 0.01 level; note that due to the small number of clusters, critical values are taken from the relevant small-sample t distribution rather than the standard normal. All regressions include indicator controls for gender, race/ethnicity, age, high school GPA and GPA squared, and indicators for each ACT score. The proportion of the sample that is trimmed, 6 percent, is calculated based on an analysis of the enrollment shifts displayed in Figure 6. I then identify the set of students in the "after" cohorts with the top 6 percent of values on either cumulative GPA (equivalent to a 3.90 or above after four years) or cumulative credits earned after four years (149 credits or above), respectively, and reestimate the effects with these students excluded.
a. For students who drop out, cumulative GPA is imputed as the cumulative GPA when last enrolled.
b. I calculate average weekly earnings based on the four quarters of school-year employment data that are available for all cohorts, corresponding to the spring of the second (sophomore) year, the spring and fall of the third year, and the fall of the fourth year following enrollment.
who otherwise would not have enrolled at all or would have enrolled with an ACT score below the cutoff, the net effect of selection may well be a downward rather than upward bias.^31
V. Inside the Black Box: Are Impacts Driven by Cost-of-College or Incentive Effects?