
years of schooling and DTC; a male-female differential in the quality of educational attainment; occupational segregation of females into sectors where the returns to schooling are relatively high; biased estimates attributable to a failure to take account of sample selection; and biased estimates attributable to a failure to take account of the endogeneity of schooling or work experience. Doubtless this list is incomplete. The present intention is to argue that, of these explanations, the first may be an important one. It is suggested that schooling may have two effects on earnings, at least for females: a direct human capital effect, and an indirect effect via an attenuation of the adverse impact of DTC.

There are two reasons for hypothesizing that the impact of discrimination may not be uniform in the labor market and that, in particular, it may be inversely related to the level of schooling. First, it is possible that the better educated an individual is, the more likely he or she is to have a degree or other formal qualification that would help to standardize wage offers regardless of sex. Second, it is possible that the better educated a woman is, the more likely she is to be capable of resisting discrimination.

Similar arguments may be made with respect to that component of the unexplained earnings gap attributable to tastes and circumstances. It is possible that the better educated a woman is, the more likely she is to be willing to seek employment outside the low-paying traditionally female occupations. At the same time, it is possible that the better educated she is and the greater her potential earnings, the more capable she is of paying for childcare and other services that allow her to seek a wage offer that fully values her characteristics. The impact of these factors may be inversely related to the level of schooling, and failure to allow for them could impart an upward bias in the estimated female schooling coefficient.

IV. Evidence from the National Longitudinal Survey of Youth 1979–

A. Data and Method

The data set used for the present analysis is the NLSY 1979–, a panel study sponsored by the Bureau of Labor Statistics and managed by the Center for Human Resource Research at the Ohio State University. It consists of a nationally representative core sample of 6,111 individuals aged 14–21 in 1979, the base year, and supplementary oversamples of minorities, poor whites, and those serving in the military. The survey was fielded annually until 1994 and since then it has been fielded biennially.

The data used in the present analysis were taken from the core sample. Hourly earnings and other work variables for the current or most recent job at the 1988, 1992, 1996, and 2000 interviews were pooled, with earnings being converted into 1996 constant dollars using the Urban Consumer Price Index. For the wage equations, observations were dropped if the respondent was currently attending school, if high school transcript data had not been collected or were incomplete, if hourly earnings were less than $2.50 or more than $100 at 1996 prices, if hours worked were 0 or exceeded 60, or if there were missing data. Altogether in the wage equations there were 11,451 observations relating to 3,852 individuals. Table 1 presents summary statistics for key variables.

Dougherty 973

To take account of the fact that there were multiple observations for most respondents, the model was fitted using random effects, the appropriate procedure in the case of a random sample from a large population if the unobserved heterogeneity is independent of the correlates (Baltagi 2001; Hsiao 2002). One attraction of the NLSY data set for fitting wage equations is that it includes a large number of control variables including, in particular, measures of cognitive ability that allow one to minimize potential bias attributable to unobserved heterogeneity. Fixed effects estimation would in principle have been preferable, but the schooling variable would have been washed out.⁹
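The sample restrictions listed above can be expressed as a simple filter. The following is a minimal sketch, not the author's code: the `Observation` record, its field names, and the helper `to_1996_dollars` are hypothetical, and the price index values passed to the deflator are placeholders rather than actual CPI figures. The source does not state the period over which hours are measured, so the field is simply called `hours_worked`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One respondent-interview record (hypothetical field names)."""
    hourly_earnings_1996: Optional[float]  # hourly earnings in 1996 dollars
    hours_worked: Optional[float]          # hours worked, period as in the survey
    attending_school: bool                 # currently attending school
    transcript_complete: bool              # high school transcript data collected and complete


def to_1996_dollars(nominal: float, cpi_interview_year: float, cpi_1996: float) -> float:
    """Deflate nominal earnings to 1996 prices with a price index."""
    return nominal * cpi_1996 / cpi_interview_year


def keep_for_wage_equation(obs: Observation) -> bool:
    """Apply the exclusion rules stated in the text."""
    if obs.attending_school or not obs.transcript_complete:
        return False
    if obs.hourly_earnings_1996 is None or obs.hours_worked is None:
        return False  # missing data
    if obs.hourly_earnings_1996 < 2.50 or obs.hourly_earnings_1996 > 100.0:
        return False
    if obs.hours_worked == 0 or obs.hours_worked > 60:
        return False
    return True
```

A record with earnings of 12.00 and 40 hours would be retained, whereas one with earnings of 2.00, or with 61 hours, would be dropped.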

B. Results

Column 1 of Table 2 shows the result of a conventional regression of the logarithm of hourly earnings on a female dummy variable, years of schooling, and a set of control variables. The latter comprised actual work experience and its square; tenure with the current employer and its square; dummy variables for black and Hispanic ethnicity; a dummy variable for being married with spouse present; the arithmetic reasoning, word knowledge, paragraph comprehension, numerical operations, and coding speed test scores from the Armed Services Vocational Aptitude Battery; dummy variables for living in the country or on a farm when aged 14; a dummy variable for the purchase of magazines by anyone in the family when the respondent was aged 14; a dummy variable for living in the northeast, north-central, or west census regions; a dummy variable for living in an urban area; and the local unemployment rate. All of the control variables were interacted with the female dummy variable.

9. A Hausman test rejects random effects in favor of fixed effects, but this is inevitable with such a large sample.
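The specification described in this subsection may be easier to follow in symbols. The notation below is introduced here for illustration and is an assumption, not the author's: $w_{it}$ is hourly earnings of respondent $i$ at interview $t$, $F_i$ the female dummy, $S_{it}$ years of schooling, $X_{it}$ the vector of control variables, and $u_i$ the individual random effect.

```latex
\ln w_{it} = \alpha + \delta F_i + \beta S_{it} + \gamma \,(S_{it} \times F_i)
           + X_{it}'\theta + F_i \, X_{it}'\phi + u_i + \varepsilon_{it}
```

Under this notation the estimated return to a year of schooling is $\beta$ for males and $\beta + \gamma$ for females, so that the coefficient of the interactive term, $\gamma$, is the male-female differential.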
The Journal of Human Resources 974

Table 1
Summary statistics

                                       Males     Females
Geometric mean of hourly earnings      13.61      10.63
Mean years of schooling                13.39      13.63
Years of work experience               12.41      11.45
Years of tenure                         5.16       4.54
Black                                   0.09       0.11
Hispanic                                0.06       0.07
Government worker                       0.12       0.20
Self-employed                           0.07       0.05
Collective bargaining                   0.19       0.13
Number of observations                 5,734      5,717

Table 2
Wage equations, dependent variable logarithm of hourly earnings

                                1           2           3           4           5
Female                       −0.3560     −0.3917     −0.1636     −0.1232      0.0958
                             (0.0968)    (0.0976)    (0.1076)    (0.1112)    (0.1307)
Schooling                     0.0490      0.0491      0.0505      0.0490      0.0505
                             (0.0041)    (0.0041)    (0.0046)    (0.0041)    (0.0046)
Schooling × female            0.0196        —         0.0173      0.0097      0.0070
                             (0.0057)                (0.0062)    (0.0062)    (0.0065)
Schooling × female × 1988       —         0.0273        —           —           —
                                         (0.0058)
Schooling × female × 1992       —         0.0220        —           —           —
                                         (0.0057)
Schooling × female × 1996       —         0.0183        —           —           —
                                         (0.0057)
Schooling × female × 2000       —         0.0158        —           —           —
                                         (0.0058)
Index of DTC                    —           —           —        −0.5597     −0.6079
                                                                 (0.1323)    (0.1986)
Inverse of Mills’ ratio         —           —        −0.0162        —        −0.0178
                                                     (0.0427)                (0.0389)
IMR × female                    —           —        −0.2288        —        −0.2382
                                                     (0.0595)                (0.0558)
R²                            0.3917      0.3952        —         0.3930        —
χ²                              —           —          0.14         —          0.21
n                            11,451      11,451      12,946      11,451      12,946

Notes: Standard errors in parentheses. The markers for significance at the 5 and 1 percent levels did not survive in this copy. For controls, see text.

The regression yields years of schooling coefficients of 0.0490 for males and 0.0686 for females, respectively, the differential of 0.0196 being significant at the 1 percent level and entirely in keeping with the findings in the literature review. To investigate the stability of the differential, the female × years of schooling interactive term was replaced by triple interactive terms for the four sample period years. The differential declines from 1988 to 2000 but remains highly significant (Column 2).

Column 3 of Table 2 presents the results of reestimating the model allowing for sample selection.
The explanatory variables in the probit regression comprised all of those in the wage equation, with age, a dummy variable for having a child aged younger than six in the household, and another dummy variable for having a child younger than 16 but not younger than six in the household, each with female interactive terms, added as identifying variables. The model was fitted using maximum likelihood estimation with the inverse of Mills’ ratio interacted with the female dummy variable. There is evidence of significant selectivity for females but not males, the reduction in the negative coefficient of the female dummy variable suggesting that to a large extent the latter reflects the impact of selectivity, rather than being female per se. The male-female differential in the schooling coefficients is reduced to 0.0173, mostly as a consequence of the marginal increase in the male schooling coefficient to 0.0505. In line with most previous studies, the impact of adjusting for sample selection bias does not appear to be dramatic.¹⁰

To investigate the relationship between the impact of schooling on earnings and the impact of DTC, a Blinder–Oaxaca decomposition of the earnings gap was performed for each year of schooling, with those respondents with fewer than 11 years, or more than 17, being grouped into single categories. Table 3 shows the mean earnings of males and females by years of schooling, female earnings adjusted for differences in coefficients,¹¹ and the unexplained part of the decomposition attributed to DTC. The index of DTC thus computed for each year of schooling on the whole varies inversely with years of schooling as hypothesized. The estimates for some of the less numerous categories are imprecise but the negative association is confirmed by a descriptive regression of the index on years of schooling that yields a schooling coefficient of −0.0136, with t statistic −54.6.
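The logic of a Blinder–Oaxaca decomposition with male coefficients used to value characteristics can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the function name, the two-element vectors (a constant and one characteristic), and all numbers are hypothetical, and the scaling the author applies to obtain the index of DTC is not given in this section, so the unexplained part is simply reported in log points.

```python
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def blinder_oaxaca(
    mean_x_male: Sequence[float],
    mean_x_female: Sequence[float],
    beta_male: Sequence[float],
    beta_female: Sequence[float],
) -> Tuple[float, float, float]:
    """Split the gap in mean log earnings into an explained part
    (differences in characteristics, valued at male coefficients) and an
    unexplained part (differences in coefficients, at female means)."""
    gap = dot(mean_x_male, beta_male) - dot(mean_x_female, beta_female)
    explained = dot(
        [m - f for m, f in zip(mean_x_male, mean_x_female)], beta_male
    )
    unexplained = dot(
        mean_x_female, [bm - bf for bm, bf in zip(beta_male, beta_female)]
    )
    return gap, explained, unexplained


# Hypothetical illustration: first element is the constant, second is
# years of work experience. None of these numbers come from the paper.
gap, explained, unexplained = blinder_oaxaca(
    mean_x_male=[1.0, 12.0],
    mean_x_female=[1.0, 11.0],
    beta_male=[2.0, 0.05],
    beta_female=[1.8, 0.05],
)
```

By construction the explained and unexplained parts sum to the total gap. In the analysis described above the decomposition is repeated within each schooling category, so schooling itself would not appear among the characteristics.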
This negative association is consistent with the hypothesis that a minor but important side-benefit of schooling for females is that it reduces the gap in male and female earnings attributable to DTC. If this is the case, it would appear inevitable that estimates of the returns to schooling will be higher for females than for males in a conventional regression specification. The size of the coefficient in the descriptive regression suggests that about a half of the extra returns to schooling enjoyed by females could be attributed to this effect. To make the same point in a different way, Column 4 of Table 2 shows the results of controlling for DTC by including the index in the regression specification. Column 5 additionally allows for selectivity.

10. Kenny et al. (1979) found that allowing for selection bias and endogeneity reduced their years of college coefficient from 0.041 to 0.038. Heckman (1980) found that allowing for selection bias increased the schooling coefficient for white, married women from 0.076 to 0.078. Blau and Beller (1988) found that allowing for sample selection bias increased the male schooling coefficient in 1971 from 0.065 to 0.069 and increased the female coefficient in 1981 from 0.054 to 0.061. However, it had no effect on the female coefficient in 1971 or the male coefficient in 1981. Wellington (1993) did not find significant selection bias and there were minimal changes in the schooling coefficients for white men and women when allowance was made for it.

11. To make the decompositions comparable with those in most other studies, male coefficients were used for valuing characteristics. The analysis was repeated using Reimers’ decomposition instead (characteristics valued by the average of the male and female coefficients). The results, presented in Appendix 2, are very similar.
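The selectivity terms in Columns 3 and 5 of Table 2 are built on the inverse of Mills’ ratio, the ratio of the standard normal density to the standard normal distribution function evaluated at the probit index. The sketch below shows only how that ratio is computed for a given index value; it is not the maximum likelihood procedure used in the paper, and the function names are hypothetical.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """phi(z) / Phi(z) for a probit index z.

    For very large negative z the denominator underflows; a production
    implementation would need a more careful formula in that range.
    """
    return std_normal_pdf(z) / std_normal_cdf(z)
```

The ratio falls as the index rises: the more likely an individual is to be observed working, the smaller the selectivity correction.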
A comparison of Columns 1 and 4, or Columns 3 and 5, suggests that failure to allow for variation in DTC has no effect on the male schooling coefficient but biases upward the female coefficient by approximately one percentage point and accounts for half of the differential in the schooling coefficients.

Heckman, Lochner, and Todd (2003) suggest that a wage equation should allow for an interaction between the effects of schooling and experience on earnings. When such a term was added to the specification, together with a further interaction with the female dummy, it was found that it did have a significant coefficient (0.0021, standard error 0.0004), while the female interaction did not (−0.0008, standard error 0.0006). The introduction of these variables caused the estimates of the returns to schooling of those with no experience to be relatively low (0.0236 for males, 0.0526 for females), but it did not otherwise affect the substance of the findings. For males and females with mean years of work experience, the differential in the schooling coefficients was 0.0190 in the specification without the index of DTC, and 0.0104 in the specification including it. These figures are close to those reported in Columns 1 and 4 of Table 2.
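The arithmetic behind these comparisons can be checked directly from the reported coefficients. The sketch below uses hypothetical function names. Evaluating the differential at 12.41 years of experience (the male mean in Table 1) is an assumption made here, since the text does not say which mean was used, and because the published coefficients are rounded the result can only approximate the reported figure of 0.0190.

```python
def share_of_differential_removed(diff_without_index: float,
                                  diff_with_index: float) -> float:
    """Fraction of the female-male schooling differential that disappears
    when the index of DTC is added to the specification."""
    return (diff_without_index - diff_with_index) / diff_without_index


def differential_at_experience(gap_at_zero: float,
                               female_interaction: float,
                               experience: float) -> float:
    """Female-male gap in the return to schooling at a common level of
    experience, given the schooling x experience x female coefficient."""
    return gap_at_zero + female_interaction * experience


# Schooling x female coefficients from Table 2.
cols_1_and_4 = share_of_differential_removed(0.0196, 0.0097)
cols_3_and_5 = share_of_differential_removed(0.0173, 0.0070)

# Returns at zero experience and the female interaction, as reported in the text.
gap_at_zero = 0.0526 - 0.0236
gap_at_mean = differential_at_experience(gap_at_zero, -0.0008, 12.41)

print(round(cols_1_and_4, 2), round(cols_3_and_5, 2), round(gap_at_mean, 3))
```

The shares come out at roughly one half for Columns 1 and 4 and somewhat more for Columns 3 and 5, in line with the statement that the index accounts for about half of the differential.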

C. Other Possible Causes of the Differential