
Scott-Clayton 633

IV. Bounding the Potential Bias Due to Selection

It is fair to ask whether the results above could be biased by differential selection, given that an explicit goal of the program was to increase in-state enrollment among qualified students. Yet the potential for nonrandom selection need not make the evaluation problem intractable; Manski (1995), Lee (2009), and others suggest methods for bounding selection bias, which influence the approach I take below. To understand how selection may bias the findings presented above, recall the before-after model as specified in Equation 2. The concern is that those who enter the sample as “eligible enrollees” after the implementation of PROMISE may be different from eligible enrollees who entered the sample in earlier cohorts. Any differences captured by the covariates in $X_i$ (including gender, race/ethnicity, age at entry, ACT score, and high school GPA) can be controlled for, but other differences may remain. To control for these remaining differences, one would ideally like to include in all regressions an indicator of whether the student was induced by PROMISE to become an eligible enrollee, instead estimating:

$$y_{it} = \tilde{\alpha} + \tilde{\beta}\,\mathrm{after}_t + X_i \tilde{\delta} + \lambda Z_i + \varepsilon_{it} \qquad (5)$$

where $Z_i$ is equal to 1 if the student was induced to become an eligible enrollee because of PROMISE, and zero otherwise.27 The coefficient $\lambda$ estimates how different these marginal enrollees are from intramarginal enrollees, after controlling for other observable characteristics. If at least some students are induced to become eligible enrollees because of the program, and these students are different in unobservable ways ($\lambda > 0$), then the $\hat{\beta}$ estimated from Equation 2 will not converge to the true $\tilde{\beta}$ in Equation 5.
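To make the selection concern concrete, the following simulation (a sketch with invented parameter values, not estimates from the paper) generates data from the model in Equation 5 and shows that a before-after comparison omitting $Z_i$ picks up an extra term equal to the marginal-enrollee share times $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000  # large sample so the simulated estimate is precise

# Invented parameter values, chosen only for illustration.
beta_true = 0.5    # true effect on intramarginal eligible enrollees
lam = 0.3          # lambda: outcome gap, marginal vs. intramarginal enrollees
p_marginal = 0.25  # Pr(Z_i = 1 | after_t = 1): marginal share of after cohort

after = rng.integers(0, 2, n)             # pre/post-PROMISE cohort indicator
z = (rng.random(n) < p_marginal) * after  # marginal enrollees exist only post
y = beta_true * after + lam * z + rng.normal(0.0, 1.0, n)

# Difference in means = the before-after estimate with Z_i omitted
# (no covariates here, so nothing proxies for Z_i).
beta_hat = y[after == 1].mean() - y[after == 0].mean()
bias = beta_hat - beta_true
print(bias)  # should be close to p_marginal * lam = 0.075
```

Adding covariates correlated with $Z_i$ would shrink this gap, which is why the product of the marginal share and $\lambda$ is an upper bound rather than the bias itself.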
If $X_i$ were completely orthogonal to $Z_i$ (that is, if none of the covariates were useful proxies for $Z_i$), then:

$$\hat{\beta} - \tilde{\beta} \rightarrow \left[ \Pr(Z_i = 1 \mid \mathrm{after}_t = 1) \right] \times \lambda \qquad (6)$$

In words, Equation 6 says that the sign and magnitude of the bias will depend on two factors: (1) what fraction of eligible enrollees are “marginal,” that is, induced to become eligible enrollees by PROMISE, and (2) how different marginal enrollees are from intramarginal enrollees, as measured by the parameter $\lambda$. This is an upper bound on the potential bias; it will be smaller to the extent that the covariates in $X_i$ help proxy for the unobserved $Z_i$. In this section, I first estimate the fraction of marginal enrollees using publicly available enrollment trend data, and then test the sensitivity of the main findings to varying assumptions about how different these marginal enrollees may be.

To estimate the impact of PROMISE on eligible enrollment, Figure 6 plots four different college enrollment rates for WV high school graduates: the percent enrolling in a public WV institution as a PROMISE-eligible student, the percent enrolling in a public WV institution as a PROMISE-ineligible student, the percent enrolling in a WV private institution, and the percent enrolling in an out-of-state institution.28

[Figure 6: Fall Enrollment Rates of Recent West Virginia High School Graduates. Source: Author’s calculations using WVHEPC data and publicly available statistics; see text for details.]

The figure indicates that the percent of WV high school graduates enrolling in public WV institutions as PROMISE-eligible students jumped by four to five percentage points between 2001 and 2002, from a baseline of about 15 percent.29 This suggests that out of every 20 students eligible for the program in 2002

27. This analysis follows a framework used by Jonathan Guryan (2004).
28. Trends in the number of high school graduates and private WV enrollments come from publicly available WVHEPC reports (WVHEPC 2002; WVHEPC 2003; WVHEPC 2005).
The annual Digest of Education Statistics provides data on the home states of first-time college freshmen by institution, but only in even-numbered years (U.S. Department of Education 2003, Table 207; U.S. Department of Education 2004, Table 204; U.S. Department of Education 2005, Table 203). WVHEPC reports include estimates of out-of-state enrollments, but the data are based on surveys of high school administrators. In even-numbered years, the IPEDS out-of-state enrollment numbers are consistently about 75 percent of the level of the WVHEPC estimates. I use the IPEDS statistics in even years and impute the out-of-state enrollments in odd years as 75 percent of the WVHEPC estimates for those years.
29. The difference between just 2001 and 2002 is 3.7 percentage points, but including additional years increases the average before-after difference to about five percentage points.

and later, 15 would have met the initial requirements and enrolled in a public WV institution with or without the scholarship, while four to five (20 to 25 percent) appear to be “marginal” enrollees.30 Figure 6 also provides some information about where these marginal enrollees came from, and where they did not. Between 2001 and 2002, the out-of-state enrollment rate declined by 1.2 percentage points. If one assumes that this entire decrease represents students switching to WV public institutions as eligible enrollees, then one-quarter to one-third of marginal enrollees were induced from out-of-state. The percentage may be much lower if some of those induced from out-of-state decided to use their PROMISE scholarship at a WV private institution (private WV enrollment does tick upward in 2002). These are the students most likely to create a positive bias, so it is reassuring that they cannot account for more than a third of the enrollment increase, or more than 6 percent (= 1.2/20) of all PROMISE-eligible enrollees.
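These shares follow from simple arithmetic on the enrollment rates read off Figure 6; a quick check (rates in percentage points, taken from the text, with 4.5 as the midpoint of the four-to-five-point jump):

```python
# Enrollment figures as described in the text (percentage points of
# WV high school graduates).
baseline_eligible = 15.0   # eligible-enrollment rate before PROMISE
jump = 4.5                 # increase in that rate between 2001 and 2002
out_of_state_drop = 1.2    # decline in the out-of-state enrollment rate

total_eligible = baseline_eligible + jump         # about 20 percent overall
marginal_share = jump / total_eligible            # marginal share of eligibles
oos_share_of_marginal = out_of_state_drop / jump  # at most, from out of state
oos_share_of_eligible = out_of_state_drop / total_eligible

print(round(marginal_share, 2))         # about 0.20 to 0.25
print(round(oos_share_of_marginal, 2))  # about one-quarter to one-third
print(round(oos_share_of_eligible, 2))  # about 0.06
```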
It is impossible to identify in the data precisely who these 6 percent are, but one approach to bounding, following Lee (2009), is to make the extreme assumption that these marginal students represent the top 6 percent of values for a given outcome, and then reestimate the effects with these top values excluded. But when multiple, related outcomes are available, Lee (2009) bounds, which were designed for the case of a single outcome, can be too conservative. In the present case, it is empirically impossible for marginal students to simultaneously represent the top 6 percent of values for every outcome of interest. For example, the top 6 percent of PROMISE recipients by cumulative college GPA had a five-year graduation rate of only 83 percent (not 100 percent), a four-year graduation rate of 68 percent (not 100 percent), and accumulated an average of only 118 credits (which is just below the median, not the 94th percentile, of credit accumulation). Thus, instead of throwing out the top 6 percent of treatment-group values for each outcome individually, I reestimate the effects for all outcomes after “trimming” the treatment group (the “after” cohorts) based on the 94th percentile of a key outcome, here either cumulative GPA (3.90 or above) or cumulative credits earned (149 or above). The results are shown in Table 5. Column 1 restates the baseline estimates for comparison, and Columns 2 and 3 provide the adjusted estimates after trimming the sample. Even under this rather extreme assumption, the coefficients shrink but generally remain above zero, and several key impacts retain significance, including the effects on first-year outcomes, school-year earnings, meeting the 120-credit threshold, and earning a BA within four years.
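The trimming step can be sketched as follows. This is a minimal illustration on synthetic data with invented column names; the paper's actual estimates include covariates and cohort-clustered standard errors, which are omitted here:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names are invented for illustration.
rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "after": rng.integers(0, 2, n),  # post-PROMISE cohort indicator
    "cum_gpa": np.clip(rng.normal(3.0, 0.5, n), 0.0, 4.0),
    "ba_within_4yr": (rng.random(n) < 0.3).astype(int),
})

# Trim the top 6 percent of the "after" cohorts on a key outcome (here
# cumulative GPA), i.e. keep after-cohort students below its 94th percentile.
trim_share = 0.06
cutoff = df.loc[df["after"] == 1, "cum_gpa"].quantile(1 - trim_share)
trimmed = df[(df["after"] == 0) | (df["cum_gpa"] < cutoff)]

def before_after(d: pd.DataFrame, outcome: str) -> float:
    """Raw before-after difference in means (covariates omitted here)."""
    return (d.loc[d["after"] == 1, outcome].mean()
            - d.loc[d["after"] == 0, outcome].mean())

baseline = before_after(df, "ba_within_4yr")      # analog of Table 5, Column 1
bounded = before_after(trimmed, "ba_within_4yr")  # analog of Columns 2 and 3
```

Because only the treatment-group tail is dropped, the reestimated coefficient serves as a lower-bound estimate under the extreme assumption that the marginal students occupy that tail.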
Interestingly, trimming based on cumulative outcomes over four years has virtually no effect on the estimated effect on first-year credits, which is arguably the outcome most proximal to the policy (most students were meeting the 2.75 GPA threshold even prior to PROMISE, and recall that 25 percent of recipients lost the scholarship after the first year). Note that this analysis only examines the effects of positive selection; if one made similarly extreme assumptions about negatively-selected marginal students (those who otherwise would not have enrolled at all, or would have enrolled with an ACT score below the cutoff), the net effect of selection may well be a downward rather than upward bias.31

30. While this clearly limits the potential for compositional change, it is still a sizable enrollment effect. This estimate is slightly higher than Cornwell, Mustard, and Sridhar (2006) find for Georgia HOPE, and comparable to Dynarski’s (2002) estimate for seven state programs.

Table 5
Bounding Exercise

                                                      (2) Trim top 6% of   (3) Trim top 6% of
                                                      “after” cohorts by   “after” cohorts by
Outcome                              (1) Baseline     cumulative GPA       cumulative credits
GPA, end of Year 1                   0.077 (0.006)    0.038 (0.007)        0.054 (0.004)
Credits earned, end of Year 1        1.830 (0.118)    1.740 (0.111)        1.578 (0.114)
Semesters enrolled over four years   0.146 (0.045)    0.142 (0.045)        0.087 (0.046)
Credits earned over four years       5.782 (1.136)    5.540 (1.174)        2.108 (1.135)
Cumulative GPA over four years (a)   0.039 (0.018)    -0.011 (0.016)       0.012 (0.020)
Weekly school-year earnings (b)      -9.55 (2.10)     -8.66 (2.23)         -7.52 (2.16)
Had 120 credits within four years    0.111 (0.018)    0.104 (0.018)        0.082 (0.018)
3.0+ GPA over four years             0.035 (0.012)    0.020 (0.012)        0.019 (0.013)
Earned BA within four years          0.067 (0.005)    0.049 (0.005)        0.045 (0.005)
Earned BA within five years          0.037 (0.012)    0.024 (0.012)        0.012 (0.012)
Sample size                          12,911           12,464               12,462

Source: Author’s calculations using WVHEPC administrative data on first-time degree-seeking freshmen aged 19 and younger, enrolling in the fall semester of school years 2000–2001 through 2003–2004. Unless otherwise noted, the sample is restricted to West Virginia residents who met the high school GPA (3.0+) and ACT/SAT (21/1000+) score requirements for PROMISE eligibility.
Notes: Robust standard errors, clustered by cohort, are in parentheses. Stars indicate the significance of individual findings at the p < 0.10, p < 0.05, or p < 0.01 level (note that due to the small number of clusters, critical values are taken from the relevant small-sample t distribution rather than the standard normal). All regressions include indicator controls for gender, race/ethnicity, age, high school GPA and GPA squared, and indicators for each ACT score. The proportion of the sample that is trimmed, 6 percent, is calculated based on an analysis of the enrollment shifts displayed in Figure 6. I then identify the set of students in the “after” cohorts with the top 6 percent of values on either cumulative GPA (equivalent to a 3.90 or above after four years) or cumulative credits earned after four years (149 credits or above), respectively, and reestimate the effects with these students excluded.
a. For students who drop out, cumulative GPA is imputed as the cumulative GPA when last enrolled.
b. I calculate average weekly earnings based on the four quarters of school-year employment data that are available for all cohorts, corresponding to the spring of the second (sophomore) year, the spring and fall of the third year, and the fall of the fourth year following enrollment.

V. Inside the Black Box: Are Impacts Driven by Cost-of-College or Incentive Effects?