Sample Selection and the Horowitz- Manski Bounds

Our empirical results characterize the heterogeneous impact of JC training at differ- ent points of the wage distribution. The estimated bounds for a sample that excludes the group of Hispanics suggest positive effects of JC on wages, both at the mean and throughout the wage distribution. For the various non- Hispanic demographic groups analyzed, the statistically signifi cant estimated average effects are bounded between 4.6 and 12 percent, while the statistically signifi cant quantile treatment effects are bounded between 2.7 and 14 percent. Our analysis by race and gender reveals that the positive effects for blacks appear larger in the lower half of their wage distribution, while for whites the effects appear larger in the upper half of their wage distribution. Non- Hispanic females in the lower part of their wage distribution do not show statisti- cally signifi cant positive effects of JC on their wages, while those in the upper part do. Lastly, our set of estimated bounds for Hispanics are wide and include zero. 3 The paper is organized as follows. Section II presents the sample selection problem and the Horowitz and Manski 2000 bounds. Sections III and IV discuss, respectively, bounds on average and quantile treatment effects, as well as the additional assump- tions we consider. Section V describes the JC program and the NJCS, and Section VI presents the empirical results from our application. Section VII concludes.

II. Sample Selection and the Horowitz- Manski Bounds

We describe the sample selection problem in the context of estimating the causal effect of a training program for example, JC on wages, where the problem arises because —even in the presence of random assignment—only the wages of those employed are observed. Formally, consider having access to data on N individuals and defi ne a binary treatment T i , which indicates whether individual i has participated in the program T i = 1 or not T i = 0. We start with the following assumption: Assumption A. T i is randomly assigned. To illustrate the sample selection problem, assume for the moment that the indi- vidual’s wage is a linear function of a constant term, the treatment indicator T i and a set of pretreatment characteristics X 1i . 4 1 Y i = β + T i β 1 + X 1i β 2 + U 1i , where Y i is the latent wage for individual i, which is observed conditional on the self- selection process into employment. This process is also assumed for the moment to be linearly related to a constant, the treatment indicator T i and a set of pretreatment characteristics X 2i , 2 S i = δ + T i δ 1 + X 2i δ 2 + U 2i , 3. Our set of estimated bounds for Hispanics does not employ one of our assumptions individual-level monotonicity of the treatment on employment for reasons that will be discussed in subsequent sections. 4. Linearity is assumed here to simplify the exposition of the sample selection problem. The nonparametric approach to address sample selection employed in this paper does not impose linearity or functional form assumptions to partially identify the treatment effects of interest. Similarly, S i is a latent variable representing the individual’s propensity to be em- ployed. Let S i denote the observed employment indicator that takes values S i = 1 if individual i is employed and 0 otherwise. Then, S i = 1[S i ≥ 0]=1, where 1[·] is an indicator function. Therefore, we observe individual i’s wage, Y i , when i is employed S i = 1 and it remains latent when unemployed S i = 0. In this setting, which assumes treatment effects are constant over the population, the parameter of interest is β 1 . Conventionally, point identifi cation of β 1 requires strong assumptions such as joint independence of the errors U 1i , U 2i in the wage and employment equations and the regressors T 1 , X 1i , and X 2i , plus bivariate normality of U 1i , U 2i Heckman 1979. The bivariate normality assumption about the errors can be relaxed by relying on exclusion restrictions Heckman 1990; Imbens and Angrist 1994, which requires variables that determine employment but do not affect wages, or equivalently, variables in X 2i that do not belong in X 1i . However, it is well known that fi nding such variables that go along with economic reasoning in this situation is extremely diffi cult Angrist and Krueger 1999, 2001. More generally, in many economic applications it is diffi cult to fi nd valid exclusion restrictions. An alternative approach suggests that the parameters can be bounded without rely- ing on distributional assumptions or on the availability and validity of exclusion re- strictions. Horowitz and Manski 2000; HM hereafter proposed a general framework to construct bounds on treatment effects when data are missing due to a nonrandom process, such as self- selection into nonemployment S i 0, provided that the out- come variable has a bounded support. To illustrate HM’s bounds, let Y i and Y i 1 be the potential counterfactual wages for unit i under control T i = 0 and treatment T i = 1, respectively. The relationship between these potential wages and the observed Y i is that Y i = Y i 1T i + Y i 01 − T i . Defi ne the average treatment effect ATE as: 3 ATE = E[Y i 1 − Y i 0] = E[Y i 1] − E[Y i 0] . Conditional on T i and the observed employment indicator S i , the ATE in Equation 3 can be written as: 4 ATE = E[Y i | T i = 1, S i = 1] PrS i = 1 | T i = 1 +E[Y i 1 | T i = 1, S i = 0] PrS i = 0 | T i = 1 −E[Y i | T i = 0, S i = 1] PrS i = 1 | T i = 0 −E[Y i 0 | T i = 0, S i = 0] PrS i = 0 | T i = 0. . Examination of Equation 4 reveals that, under random assignment, we can identify from the data all the conditional probabilities PrS i = s | T i = t, for t, s = 0, 1 and also the expectations of the wage when conditioning on S i =1 E[Y i | T i = 1, S i = 1] and E[Y i | T i = 0, S i = 1]. Sample selection into nonemployment makes it impossible to point identify E[Y i 1 | T i = 1, S i = 0] and E[Y i 0 | T i = 0, S i = 0]. We can, however, construct HM bounds on these unobserved objects provided that the support of the outcome lies in a bounded interval [Y LB , Y UB ] , because this implies that the values for these unobserved objects are restricted to such interval. Thus, HM’s upper and lower bounds UB HM and LB HM , respectively are identifi ed as follows: 5 UB HM = E[Y i | T i = 1, S i = 1] PrS i = 1 | T i = 1 + Y UB PrS i = 0 | T i = 1 −E[Y i | T i = 0, S i = 1] PrS i = 1 | T i = 0 − Y LB PrS i = 0 | T i = 0 LB HM = E[Y i | T i = 1, S i = 1] PrS i = 1 | T i = 1 + Y LB PrS i = 0 | T i = 1 −E[Y i | T i = 0, S i = 1] PrS i = 1 | T i = 0 − Y UB PrS i = 0 | T i = 0. Note that these bounds do not employ distributional or exclusion restrictions as- sumptions. They are nonparametric and allow for heterogeneous treatment effects— that is, nonconstant effects over the population. On the other hand, a cost of imposing only Assumption A and boundedness of the outcome is that the HM bounds are often wide. Indeed, this is the case in our application, as will be shown below. For this rea- son, we take this approach as a building block and proceed by imposing more structure through the use of assumptions that are typically weaker than the distributional and exclusion restriction assumptions needed for point identifi cation.

III. Bounds on Average Treatment Effects