OLS Regression Estimates Results

Hubbard 573

I deflate all wage values using the Personal Consumption Expenditures (PCE) price index to 1982 dollars, and drop all observations with annual wages less than $3,484, or one-half the minimum wage in 1982. For a 52-week year, this is equivalent to the $67/week threshold used in Katz and Murphy (1992) and Mulligan and Rubinstein (2008).⁷ Finally, I also exclude observations flagged as containing “allocated” (imputed) values for education or the amount or source of wage and salary income. No results are sensitive to the inclusion or exclusion of either type of imputed data. Table 1 provides descriptive statistics for the sample.
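The sample restrictions above can be sketched in a few lines. This is only an illustration, not the code behind the reported estimates: the record fields (`wage`, `year`, `educ_allocated`, `wage_allocated`) and the index values in `pce_by_year` are hypothetical placeholders, and the sketch assumes the $3,484 floor is applied to wages after deflating to 1982 dollars.

```python
# Illustrative only: field names and PCE index values are placeholders,
# not CPS variable names or published PCE figures.
REAL_WAGE_FLOOR_1982 = 3484.0  # one-half of the 1982 minimum wage, in 1982 dollars


def to_1982_dollars(nominal_wage, pce_in_year, pce_in_1982):
    """Deflate a nominal wage to 1982 dollars with a price index."""
    return nominal_wage * pce_in_1982 / pce_in_year


def keep_observation(record, pce_by_year):
    """Apply the real-wage floor and the allocation-flag exclusions."""
    real_wage = to_1982_dollars(
        record["wage"], pce_by_year[record["year"]], pce_by_year[1982]
    )
    if real_wage < REAL_WAGE_FLOOR_1982:
        return False
    if record["educ_allocated"] or record["wage_allocated"]:
        return False
    return True


pce_by_year = {1982: 100.0, 2000: 150.0}  # placeholder index values
records = [
    {"year": 1982, "wage": 20000.0, "educ_allocated": False, "wage_allocated": False},
    {"year": 1982, "wage": 3000.0, "educ_allocated": False, "wage_allocated": False},
    {"year": 2000, "wage": 5000.0, "educ_allocated": False, "wage_allocated": False},
    {"year": 2000, "wage": 30000.0, "educ_allocated": True, "wage_allocated": False},
]
sample = [r for r in records if keep_observation(r, pce_by_year)]
weekly_floor = REAL_WAGE_FLOOR_1982 / 52  # weekly equivalent of the annual floor
print(len(sample), weekly_floor)
```

In this toy input, the second and third records fall below the real-wage floor and the fourth carries an allocation flag, so only the first survives. The weekly figure is simply the annual floor divided by 52.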

V. Results

I calculate college wage premiums by sex using alternate specifications, each designed to account for topcoding of wage data. First, I run ordinary least squares (OLS) regressions in which topcoded wage values are adjusted to eliminate bias from censoring. Second, I deploy the Tobit model for censored regression. Third, I use median regressions, which are not sensitive to the values of upper-tail wages. Regardless of the method used, I find no female advantage in the college wage premium in recent years. I then examine separately the premiums associated with bachelor’s degrees and advanced degrees.

A. OLS Regression Estimates

My specification is fairly standard in the literature. I run a set of yearly regressions of log wages on a dummy for female sex, a dummy for college completion, an interaction between college completion and female sex, and a set of other controls, all of which are interacted with female sex:

\[
\ln y_i = \alpha + \beta\, Fem_i + \gamma\, Educ_i + \delta\, Fem_i \cdot Educ_i + \theta X_i + \varepsilon_i \tag{1}
\]

For my initial OLS regressions, \(y_i\) is annual wage income and \(Educ_i\) is a dummy for college graduate. The vector \(X_i\) includes potential experience, potential experience squared, and Census region dummies, all of them interacted with the dummy for female sex. The difference between the male and female college wage premium is \(\delta\), the coefficient on the female-college interaction term.

Figure 2 shows the estimates of the college wage premiums from an OLS log wage regression when wages are recensored at $100,000. This generates the familiar conclusion that the college wage premium for women has been consistently higher than the premium for men. As we see from Figure 3, the gender difference is statistically significant in nearly all years, and appears to be growing in recent years.

7. The results are not sensitive to the trimming of outliers. See Bollinger and Chandra (2005).

574 The Journal of Human Resources

Table 1
Descriptive Statistics for Sample of March CPS Cross-Sections 1970–2008

| Year | N | Female | College | Share Censored | Share Topcoded |
|------|--------|-------|-------|-------|-------|
| 1970 | 13,725 | 0.328 | 0.322 | 0.003 | 0.003 |
| 1971 | 13,249 | 0.334 | 0.328 | 0.003 | 0.003 |
| 1972 | 13,157 | 0.331 | 0.334 | 0.005 | 0.005 |
| 1973 | 13,699 | 0.327 | 0.344 | 0.005 | 0.005 |
| 1974 | 14,377 | 0.335 | 0.350 | 0.007 | 0.007 |
| 1975 | 11,255 | 0.349 | 0.362 | 0.005 | 0.005 |
| 1976 | 13,979 | 0.351 | 0.373 | 0.007 | 0.007 |
| 1977 | 13,666 | 0.350 | 0.374 | 0.009 | 0.009 |
| 1978 | 14,100 | 0.362 | 0.377 | 0.012 | 0.012 |
| 1979 | 17,542 | 0.370 | 0.382 | 0.017 | 0.017 |
| 1980 | 17,383 | 0.374 | 0.386 | 0.022 | 0.022 |
| 1981 | 15,714 | 0.380 | 0.399 | 0.007 | 0.007 |
| 1982 | 15,303 | 0.390 | 0.430 | 0.011 | 0.011 |
| 1983 | 16,026 | 0.400 | 0.426 | 0.011 | 0.011 |
| 1984 | 16,461 | 0.395 | 0.427 | 0.005 | 0.005 |
| 1985 | 16,989 | 0.398 | 0.430 | 0.006 | 0.006 |
| 1986 | 16,982 | 0.397 | 0.439 | 0.009 | 0.009 |
| 1987 | 19,006 | 0.402 | 0.428 | 0.010 | 0.010 |
| 1988 | 18,363 | 0.405 | 0.440 | 0.012 | 0.012 |
| 1989 | 19,291 | 0.397 | 0.443 | 0.016 | 0.016 |
| 1990 | 18,849 | 0.409 | 0.444 | 0.017 | 0.017 |
| 1991 | 18,248 | 0.417 | 0.449 | 0.018 | 0.018 |
| 1992 | 18,160 | 0.424 | 0.473 | 0.022 | 0.022 |
| 1993 | 16,790 | 0.412 | 0.486 | 0.027 | 0.027 |
| 1994 | 17,313 | 0.409 | 0.490 | 0.034 | 0.034 |
| 1995 | 14,905 | 0.416 | 0.492 | 0.028 | 0.012 |
| 1996 | 14,855 | 0.413 | 0.488 | 0.030 | 0.014 |
| 1997 | 14,734 | 0.421 | 0.496 | 0.038 | 0.015 |
| 1998 | 14,802 | 0.419 | 0.510 | 0.046 | 0.020 |
| 1999 | 14,838 | 0.421 | 0.519 | 0.052 | 0.023 |
| 2000 | 13,849 | 0.421 | 0.531 | 0.060 | 0.026 |
| 2001 | 23,114 | 0.425 | 0.539 | 0.064 | 0.029 |
| 2002 | 22,227 | 0.428 | 0.553 | 0.076 | 0.017 |
| 2003 | 21,719 | 0.426 | 0.554 | 0.078 | 0.016 |
| 2004 | 21,366 | 0.421 | 0.557 | 0.084 | 0.018 |
| 2005 | 21,568 | 0.425 | 0.557 | 0.094 | 0.019 |
| 2006 | 21,414 | 0.424 | 0.580 | 0.106 | 0.023 |
| 2007 | 21,460 | 0.431 | 0.591 | 0.111 | 0.023 |
| 2008 | 21,251 | 0.440 | 0.601 | 0.124 | 0.026 |

Note: “Share Censored” is the share of wage observations that are topcoded or greater than $100,000.

Table 2
Estimates of College Wage Premium with Standard Errors, March CPS 1970–2008

| Year | Male Premium: OLS Censored (1) | Male Premium: OLS Adjusted (2) | Male Premium: Tobit (3) | Male Premium: Median (4) | Female – Male Gender Difference: OLS Censored (1) | Female – Male Gender Difference: OLS Adjusted (2) | Female – Male Gender Difference: Tobit (3) | Female – Male Gender Difference: Median (4) |
|------|------|------|------|------|------|------|------|------|
| 1970 | 0.376 (0.010) | 0.378 (0.010) | 0.376 (0.010) | 0.380 (0.009) | 0.036 (0.020) | 0.034 (0.020) | 0.035 (0.020) | 0.095 (0.016) |
| 1971 | 0.395 (0.010) | 0.398 (0.010) | 0.396 (0.010) | 0.391 (0.009) | 0.069 (0.017) | 0.065 (0.017) | 0.067 (0.017) | 0.097 (0.017) |
| 1972 | 0.390 (0.010) | 0.395 (0.011) | 0.392 (0.010) | 0.378 (0.009) | 0.045 (0.017) | 0.040 (0.017) | 0.043 (0.017) | 0.093 (0.017) |
| 1973 | 0.382 (0.010) | 0.387 (0.010) | 0.384 (0.010) | 0.371 (0.008) | 0.038 (0.017) | 0.033 (0.017) | 0.036 (0.017) | 0.073 (0.015) |
| 1974 | 0.358 (0.010) | 0.364 (0.010) | 0.361 (0.010) | 0.358 (0.008) | 0.051 (0.016) | 0.046 (0.017) | 0.049 (0.016) | 0.067 (0.014) |
| 1975 | 0.373 (0.010) | 0.379 (0.011) | 0.376 (0.011) | 0.355 (0.011) | 0.071 (0.016) | 0.065 (0.017) | 0.068 (0.016) | 0.083 (0.019) |
| 1976 | 0.377 (0.009) | 0.384 (0.009) | 0.381 (0.009) | 0.354 (0.011) | 0.036 (0.016) | 0.029 (0.016) | 0.033 (0.016) | 0.073 (0.019) |
| 1977 | 0.335 (0.010) | 0.344 (0.010) | 0.340 (0.010) | 0.320 (0.010) | 0.054 (0.016) | 0.045 (0.017) | 0.050 (0.016) | 0.090 (0.017) |
| 1978 | 0.327 (0.010) | 0.339 (0.010) | 0.334 (0.010) | 0.324 (0.009) | 0.055 (0.016) | 0.044 (0.016) | 0.049 (0.016) | 0.067 (0.015) |
| 1979 | 0.327 (0.009) | 0.343 (0.009) | 0.335 (0.009) | 0.309 (0.008) | 0.061 (0.014) | 0.045 (0.015) | 0.052 (0.014) | 0.083 (0.014) |
| 1980 | 0.329 (0.009) | 0.348 (0.010) | 0.340 (0.009) | 0.321 (0.009) | 0.065 (0.014) | 0.047 (0.015) | 0.054 (0.015) | 0.087 (0.015) |
| 1981 | 0.351 (0.010) | 0.359 (0.010) | 0.355 (0.010) | 0.336 (0.010) | 0.053 (0.016) | 0.046 (0.016) | 0.050 (0.016) | 0.093 (0.017) |
| 1982 | 0.377 (0.011) | 0.388 (0.011) | 0.382 (0.011) | 0.371 (0.010) | 0.042 (0.017) | 0.032 (0.017) | 0.037 (0.017) | 0.061 (0.017) |
| 1983 | 0.418 (0.010) | 0.429 (0.011) | 0.423 (0.010) | 0.402 (0.011) | 0.028 (0.015) | 0.017 (0.016) | 0.023 (0.015) | 0.055 (0.019) |
| 1984 | 0.419 (0.010) | 0.425 (0.010) | 0.422 (0.010) | 0.394 (0.011) | 0.052 (0.015) | 0.046 (0.016) | 0.050 (0.015) | 0.078 (0.017) |
| 1985 | 0.447 (0.010) | 0.455 (0.010) | 0.451 (0.010) | 0.419 (0.010) | 0.026 (0.015) | 0.020 (0.016) | 0.023 (0.015) | 0.069 (0.017) |
| 1986 | 0.442 (0.010) | 0.453 (0.010) | 0.447 (0.010) | 0.430 (0.011) | 0.054 (0.016) | 0.046 (0.016) | 0.050 (0.016) | 0.069 (0.017) |
| 1987 | 0.457 (0.009) | 0.467 (0.010) | 0.462 (0.010) | 0.446 (0.010) | 0.033 (0.015) | 0.024 (0.015) | 0.028 (0.015) | 0.046 (0.016) |
| 1988 | 0.464 (0.010) | 0.475 (0.011) | 0.470 (0.010) | 0.457 (0.012) | 0.043 (0.016) | 0.034 (0.016) | 0.038 (0.016) | 0.052 (0.018) |
| 1989 | 0.493 (0.010) | 0.511 (0.010) | 0.502 (0.010) | 0.480 (0.010) | 0.041 (0.015) | 0.026 (0.015) | 0.033 (0.015) | 0.064 (0.017) |
| 1990 | 0.488 (0.010) | 0.506 (0.010) | 0.497 (0.010) | 0.471 (0.012) | 0.031 (0.015) | 0.015 (0.016) | 0.022 (0.015) | 0.053 (0.018) |
| 1991 | 0.518 (0.010) | 0.540 (0.011) | 0.529 (0.011) | 0.504 (0.012) | 0.016 (0.015) | -0.002 (0.016) | 0.007 (0.016) | 0.030 (0.020) |
| 1992 | 0.511 (0.010) | 0.536 (0.011) | 0.524 (0.011) | 0.514 (0.011) | 0.022 (0.016) | 0.001 (0.016) | 0.011 (0.016) | 0.033 (0.017) |
| 1993 | 0.528 (0.011) | 0.556 (0.012) | 0.542 (0.011) | 0.524 (0.011) | 0.025 (0.016) | 0.002 (0.017) | 0.013 (0.016) | 0.035 (0.017) |
| 1994 | 0.517 (0.011) | 0.554 (0.012) | 0.537 (0.012) | 0.527 (0.011) | 0.069 (0.017) | 0.041 (0.018) | 0.053 (0.017) | 0.059 (0.017) |
| 1995 | 0.531 (0.012) | 0.567 (0.013) | 0.559 (0.012) | 0.537 (0.014) | 0.053 (0.017) | 0.021 (0.018) | 0.029 (0.018) | 0.051 (0.022) |
| 1996 | 0.510 (0.011) | 0.547 (0.012) | 0.538 (0.012) | 0.525 (0.013) | 0.073 (0.017) | 0.043 (0.018) | 0.050 (0.018) | 0.055 (0.020) |
| 1997 | 0.528 (0.011) | 0.570 (0.013) | 0.560 (0.012) | 0.533 (0.015) | 0.051 (0.017) | 0.020 (0.018) | 0.026 (0.018) | 0.051 (0.023) |
| 1998 | 0.520 (0.011) | 0.576 (0.012) | 0.563 (0.012) | 0.536 (0.013) | 0.051 (0.017) | 0.004 (0.018) | 0.015 (0.018) | 0.029 (0.020) |
| 1999 | 0.521 (0.011) | 0.585 (0.013) | 0.569 (0.012) | 0.541 (0.011) | 0.090 (0.017) | 0.039 (0.018) | 0.052 (0.018) | 0.073 (0.017) |
| 2000 | 0.527 (0.012) | 0.597 (0.014) | 0.581 (0.013) | 0.570 (0.011) | 0.052 (0.018) | -0.001 (0.020) | 0.011 (0.020) | 0.012 (0.018) |
| 2001 | 0.542 (0.010) | 0.617 (0.011) | 0.600 (0.011) | 0.582 (0.011) | 0.057 (0.015) | 0.000 (0.016) | 0.013 (0.016) | 0.018 (0.018) |
| 2002 | 0.543 (0.010) | 0.624 (0.012) | 0.613 (0.011) | 0.608 (0.012) | 0.044 (0.015) | -0.015 (0.017) | -0.007 (0.017) | -0.021 (0.018) |
| 2003 | 0.510 (0.010) | 0.590 (0.012) | 0.580 (0.012) | 0.559 (0.012) | 0.060 (0.016) | 0.003 (0.017) | 0.011 (0.017) | 0.019 (0.018) |
| 2004 | 0.520 (0.010) | 0.611 (0.012) | 0.599 (0.012) | 0.583 (0.011) | 0.067 (0.016) | -0.004 (0.018) | 0.006 (0.018) | 0.021 (0.018) |
| 2005 | 0.530 (0.010) | 0.635 (0.012) | 0.621 (0.011) | 0.611 (0.011) | 0.041 (0.016) | -0.037 (0.017) | -0.026 (0.017) | -0.045 (0.018) |
| 2006 | 0.522 (0.010) | 0.633 (0.012) | 0.618 (0.012) | 0.589 (0.012) | 0.060 (0.016) | -0.019 (0.018) | -0.008 (0.017) | 0.002 (0.019) |
| 2007 | 0.522 (0.010) | 0.630 (0.012) | 0.615 (0.012) | 0.610 (0.012) | 0.071 (0.015) | -0.002 (0.018) | 0.007 (0.017) | 0.005 (0.018) |
| 2008 | 0.526 (0.010) | 0.655 (0.012) | 0.638 (0.012) | 0.612 (0.011) | 0.079 (0.016) | -0.014 (0.018) | -0.002 (0.017) | 0.008 (0.018) |

Note: Standard errors in parentheses.

Figure 2. College Wage Premium, OLS Regressions, Recensored Wages

Figure 3. Gender Difference in the Premium, Recensored OLS Regressions, with 95 Percent Confidence Interval Bands

As noted above, the problem with these estimates is that censoring wages at a topcode or censoring point \(T\) will bias downward the coefficients of variables that tend to raise wages above \(T\). Because the true values of topcoded wage observations are not available, my approach is to replace topcoded log wage observations with their expected value; in other words, I choose an “adjustment factor” \(A\) such that I can replace topcoded wage observations (observations where \(y_i = T\)) with new values \(y_i = AT\), where the log of the adjusted wage is equal to the expected value of the log of the true unobserved wage:

\[
\ln AT = E[\ln y_i \mid y_i \ge T]
\]

I generate year- and sex-specific adjustment factors for each year in my sample. To do this, I employ the common assumption that upper-tail wages are Pareto distributed. Given a Pareto distribution with minimum value \(m\) and the Pareto index parameter \(k\), the parameter \(k\) uniquely determines the adjustment factor \(A\) and vice versa. If wages are Pareto distributed, then for any topcode value \(T\),

\[
\ln AT = E[\ln y_i \mid y_i \ge T] = \ln T + \frac{1}{k} \;\Rightarrow\; A = e^{1/k} \tag{2}
\]

Thus, my task is to estimate \(k\) by year and sex. I construct the likelihood function for a censored Pareto distribution

\[
L(k) = \prod_{i=1}^{n} \left(\frac{m}{T}\right)^{k} \cdot \prod_{j=n+1}^{N} \left(\frac{k\, m^{k}}{y_j^{\,k+1}}\right) \tag{3}
\]

where there are \(N\) observations, of which the first \(n\) observations are topcoded at \(T\). Maximizing log likelihood yields the following estimate for \(k\):

\[
\hat{k} = \frac{N - n}{\sum_{j=n+1}^{N} \ln y_j + n \ln T - N \ln m} \tag{4}
\]

The only unquantified term in Equation 4 is \(m\). Rather than impose an a priori value for \(m\), I take advantage of the fact that the adjustment factor \(A\), and therefore \(k\), is known for each sex in recent years. With this information, I derive an estimate of \(m\), which I then use in Equation 4 to estimate \(\hat{k}\) for all years.
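The estimator in Equation 4 and the adjustment factor in Equation 2 can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not the code used for the paper's estimates: the function names are hypothetical, the wages are simulated rather than CPS data, the values of `true_k`, `m`, and `topcode` are made up, and the sample is assumed to contain only upper-tail observations (every wage at least `m`).

```python
import math
import random


def pareto_k_hat(wages, topcode, m):
    """Censored-Pareto maximum likelihood estimate of k (Equation 4).

    Assumes every wage is at least m, and that any wage at or above
    `topcode` is topcoded, so only the topcode itself is known for it.
    """
    if any(y < m for y in wages):
        raise ValueError("all wages must be at least the Pareto minimum m")
    uncensored = [y for y in wages if y < topcode]
    N = len(wages)
    n = N - len(uncensored)  # number of topcoded observations
    denominator = (
        sum(math.log(y) for y in uncensored)
        + n * math.log(topcode)
        - N * math.log(m)
    )
    return (N - n) / denominator


def adjustment_factor(k):
    """Adjustment factor A = exp(1/k) from Equation 2."""
    return math.exp(1.0 / k)


# Simulated check with made-up parameters (not CPS data).
rng = random.Random(0)
true_k, m, topcode = 2.2, 30_000.0, 90_000.0
wages = [min(m * rng.paretovariate(true_k), topcode) for _ in range(50_000)]

k_hat = pareto_k_hat(wages, topcode, m)
A = adjustment_factor(k_hat)
print(f"k_hat = {k_hat:.3f}, A = {A:.3f}")
```

With a large simulated sample, `k_hat` should land close to `true_k` even though every draw above the topcode is recorded only as the topcode. The sketch does not handle the degenerate case in which all observations are topcoded.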
Since 1995, each topcoded wage observation in the CPS public-use data has been replaced with the mean wages of all topcoded observations in the same sex-by-race-by-FTFY-status demographic cell; in other words, for each year and sex during 1995–2008, the CPS data provide \(E[y_i \mid y_i \ge T]\). For each sex, I calculate the average ratio of these expected values to the topcodes; call this sex-specific ratio \(R\).⁸ Again using the Pareto distribution, I derive a value for \(k\) directly from the ratio of topcoded values to the topcode:

\[
k = \frac{R}{R - 1} \tag{5}
\]

I then calibrate \(m\) for each sex such that the maximum likelihood estimate \(\hat{k}\) matches the \(k\) derived directly from the 1995–2008 CPS data.⁹ Armed with this estimate of \(m\), I then use the estimator for \(k\) given in Equation 4 to generate adjustment factors for each sex and year in the sample. Finally, I recensor 1995–2008 wage observations at their topcodes and multiply all topcoded wage observations for 1967–2008 by their year- and sex-specific adjustment factors before taking logs of all wage values.¹⁰

8. The values for \(R\) are 1.835 for men and 1.884 for women.

9. Because nominal and real income levels are changing over time, I define \(m\) as a multiple of the median for that year: 1.149 (men) or 0.929 (women) times the median wage income.

Figure 4. College Wage Premium, OLS Regressions, Adjusted Wages

When I reestimate college wage premiums using adjusted wages, a very different picture emerges. See Figures 4 and 5.¹¹ I find little or no gender difference in the college wage premium since about 1990.
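As a worked illustration of Equation 5 and the final adjustment step, the sketch below converts the sex-specific ratios \(R\) from footnote 8 into the implied Pareto index and adjustment factor, and then applies a factor to a wage. It is a sketch only: the function names are hypothetical, and the factors it prints are the ones implied by the 1995–2008 average ratios, not the year- and sex-specific factors obtained from Equation 4.

```python
import math


def pareto_k_from_ratio(R):
    """Equation 5: Pareto index implied by R = E[y | y >= T] / T."""
    if R <= 1.0:
        raise ValueError("R must exceed 1 for a finite-mean Pareto tail")
    return R / (R - 1.0)


def adjustment_factor(k):
    """Equation 2: A = exp(1/k)."""
    return math.exp(1.0 / k)


def log_adjusted_wage(wage, topcode, A):
    """Recensor at the topcode, scale topcoded values by A, then take logs."""
    recensored = min(wage, topcode)
    if recensored >= topcode:
        return math.log(A * topcode)
    return math.log(recensored)


# Sex-specific ratios R as reported in footnote 8.
R_BY_SEX = {"men": 1.835, "women": 1.884}

for sex, R in R_BY_SEX.items():
    k = pareto_k_from_ratio(R)
    A = adjustment_factor(k)
    print(f"{sex}: k = {k:.3f}, A = {A:.3f}")
```

The implied \(A\) is smaller than \(R\) for both sexes. This is expected, because \(A\) targets the mean of log wages among topcoded observations while \(R\) is based on the mean of wage levels.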

B. Tobit Regression Estimates