
264 M.J. Hilmer / Economics of Education Review 20 (2001) 263–278

of Lee (1983) type two-stage corrections for self-selection bias, we extend the analysis to consider such selectivity correction models. The estimated results for the years of college completed by 2- and 4-year attendees suggest that the choice of specification of the first-stage college attendance equation may have a significant impact on the second-stage selectivity-corrected coefficient estimates. Namely, the effects of several key variables are estimated to be statistically significant under some specifications but not others. Prominent among these are test scores, which are only estimated to have large and significant effects among 4-year attendees for the ordered probit, and a series of family background and high school performance measures, which are only estimated to have large and significant effects among 2-year attendees for the multinomial logit. In addition to the estimated coefficients differing across specifications, predicted outcomes for students of different genders and ethnicities possessing average sample characteristics appear to differ across specifications. Hence, the results suggest the importance of considering specification issues before estimating the college attendance equation, especially when it is used as the first stage of selection correction models.

2. Econometric issues

The econometric specifications examined in this study are well known and are all examples of discrete choice models.$^{1}$ In the context of college attendance, the models all assume that a student makes his or her attendance decision on the basis of a latent variable: the expected utility of an attendance option, the probability of college graduation, or more generally the underlying propensity to attend college. Unfortunately, the researcher does not directly observe the latent variable. Instead, he or she only observes the student’s actual attendance decision, which for purposes of this study is 4-year college attendance [$E_i = 2$], 2-year college attendance [$E_i = 1$], or non-attendance [$E_i = 0$].$^{2}$ The discrete choice models discussed below are all methods of “backtracking” from the observed attendance decision to the underlying relationships between certain explanatory variables and the attendance path decision. While the basic goals of the three models are the same, they differ according to the assumptions made about the relationship between the different attendance options.

$^{1}$ Descriptions of the models analyzed in this study are available in most econometric texts. For a nice intuitive discussion of these types of models see Kennedy (1998). For a more rigorous treatment, the classic references are Maddala (1983) and Greene (1997).

$^{2}$ There may be some question as to the definition of 2-year colleges. In this study, a student is defined as attending a 2-year college if they are taking academic courses at a 2-year college. Students taking only vocational courses are defined as being non-attendees. All three models in this study could also have been estimated with vocational school defined as a separate attendance path. As with Weiler (1989), doing so does not significantly alter the results.

2.1. Multinomial logit

Estimation of the multinomial logit follows directly from expected utility maximization.
As with other random utility models, the multinomial logit assumes that a student chooses which attendance path to follow by comparing the indirect utility provided by each path and choosing the one that provides the highest. For the current application, the student’s attendance path choice can be defined as:

$$
E_i =
\begin{cases}
2 & \text{if } B_2'X_i + \varepsilon_i > \max(B_1'X_i + \varepsilon_i,\ B_0'X_i + \varepsilon_i) \\
1 & \text{if } B_1'X_i + \varepsilon_i > \max(B_2'X_i + \varepsilon_i,\ B_0'X_i + \varepsilon_i) \\
0 & \text{if } B_0'X_i + \varepsilon_i > \max(B_2'X_i + \varepsilon_i,\ B_1'X_i + \varepsilon_i)
\end{cases}
\tag{1}
$$

where $X_i$ is a vector of observed individual characteristics and state-level relative net attendance costs that affect the student’s expected utility from each attendance option and $\varepsilon_i$ is an i.i.d. log Weibull distributed error term.$^{3}$ Parameters to be estimated by maximum likelihood are $B_0$, $B_1$, and $B_2$.

The multinomial logit has gained favor in estimating discrete choice models due to its computational ease. Namely, the probability of choosing each potential outcome can be easily expressed and the resulting log-likelihood function can be maximized in a straightforward fashion. A potential shortcoming of the multinomial logit is its reliance on the independence of irrelevant alternatives (IIA). The IIA property assumes that the relative probability of two existing outcomes is unaffected by the addition of a third outcome. For example, suppose that an individual’s choice is initially between two different outcomes and that he or she is evenly split between the two. Now, suppose we add a third alternative that is nearly identical to the second. We would then expect the probability of choosing the second outcome to be split in half and the probability of choosing the first outcome to be unaffected.
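This thought experiment can be sketched numerically. The snippet below computes multinomial logit choice probabilities from systematic utilities using the standard logit formula; the function name `mnl_probabilities` and the utility values are illustrative assumptions, not taken from the paper. It shows what the logit formula yields when a third alternative with the same systematic utility as the second is added.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit probabilities: exp(V_j) / sum_k exp(V_k)."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(utilities)
    weights = [math.exp(v - top) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities: two equally attractive options, then a clone of the second.
two_way = mnl_probabilities([0.0, 0.0])
three_way = mnl_probabilities([0.0, 0.0, 0.0])

# The ratio of the first two probabilities is the same in both choice sets.
ratio_before = two_way[0] / two_way[1]
ratio_after = three_way[0] / three_way[1]
print(two_way, three_way, ratio_before, ratio_after)
```

Because each probability depends on the other alternatives only through the common denominator, the ratio of any two probabilities does not change when an alternative is added, whatever utilities are supplied.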
Unfortunately, the IIA property does not account for this, but rather splits the probabilities equally among all three alternatives in order to keep the relative probabilities of the first two options equal.$^{4}$ Hence, in cases where two alternatives are close substitutes, the multinomial logit may be inappropriate as it relies on the IIA property. Hausman and McFadden (1984) suggest a specification test, based on dropping a category from the estimation and observing whether the estimated coefficients change, that can be used to assess the validity of the IIA property in the multinomial logit model. This test provides a means to test whether the multinomial logit is an appropriate specification for this exercise.

$^{3}$ The log Weibull (Type I extreme-value) distribution is assumed due to its convenient property that the cumulative distribution of the difference between any two random variables distributed log Weibull is given by the logistic function (Kennedy, 1998, p. 244).

2.2. Ordered probit

The ordered probit assumes that the variable of interest follows a strict ordering based on the value of the latent variable. Hilmer (1998) suggests that the latent variable is the student’s subjective probability of graduation and that his or her decision follows the natural ordering of students with the highest probabilities attending 4-year colleges, students with midrange probabilities attending 2-year colleges, and students with the lowest probabilities attending neither institution.$^{5}$ Accordingly, the student’s attendance path can be defined as:

$$
E_i =
\begin{cases}
2 & \text{if } \alpha_2 < \delta'X_i + \mu_i < \infty \\
1 & \text{if } \alpha_1 < \delta'X_i + \mu_i < \alpha_2 \\
0 & \text{if } -\infty < \delta'X_i + \mu_i < \alpha_1
\end{cases}
\tag{2}
$$

where $X_i$ is a vector of factors affecting the student’s subjective probability of graduation and $\mu_i$ is a normally distributed error term.
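Under the ordering just defined, each attendance probability is a difference of normal distribution functions evaluated at the thresholds. The following is a minimal sketch, assuming the error term has unit variance (the usual normalization, not stated in the text) and using hypothetical values for the index and the two thresholds; the function name is illustrative.

```python
from statistics import NormalDist


def ordered_probit_probabilities(index, alpha1, alpha2):
    """Return (P[E=0], P[E=1], P[E=2]) for a latent index delta'X_i.

    Assumes a standard normal error and thresholds alpha1 < alpha2.
    """
    if not alpha1 < alpha2:
        raise ValueError("thresholds must satisfy alpha1 < alpha2")
    cdf = NormalDist().cdf
    p_none = cdf(alpha1 - index)                            # latent value below alpha1
    p_two_year = cdf(alpha2 - index) - cdf(alpha1 - index)  # between the thresholds
    p_four_year = 1.0 - cdf(alpha2 - index)                 # above alpha2
    return p_none, p_two_year, p_four_year


# Hypothetical student with index 0 and thresholds at -1 and 1.
probs = ordered_probit_probabilities(0.0, -1.0, 1.0)
print(probs)
```

Raising the index shifts probability mass toward 4-year attendance and away from non-attendance, which is the single-index ordering the model imposes.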
$\alpha_1$ and $\alpha_2$ partition the student’s attendance path choice into the decision to attend a 4-year college, attend a 2-year college, or attend no postsecondary institution and therefore represent the minimum probability levels at which a student chooses to attend a 4-year college and a 2-year college. Parameters to be estimated by maximum likelihood are $\delta$, $\alpha_1$, and $\alpha_2$.

$^{4}$ As a simplified example, suppose that in the absence of 2-year colleges a student is equally likely to choose to attend a 4-year college (1/2) as to not attend college (1/2). Now, suppose the student is given the choice between a 2-year college and a 4-year college and assume that he or she views the two as perfect substitutes. We would then expect the probabilities of non-attendance, 2-year attendance, and 4-year attendance to be 1/2, 1/4, 1/4. This is not how the multinomial logit treats the probabilities, however. Due to the IIA property, the multinomial logit treats the probabilities as 1/3, 1/3, 1/3 in order to keep the relative probabilities of non- and 4-year attendance constant.

$^{5}$ Hilmer (1998) explains the intuition as follows: “To avoid the time cost associated with transferring, a student who thinks he or she is likely to graduate will start at a university. A student who is uncertain about his or her ability will start at a community college since the foregone cost of the first 2 years will be much lower should he or she be forced to drop out. A student who is not likely to graduate will choose to work since doing so will make him or her better off than attending a community college for 2 years and dropping out.”

A primary difference between the multinomial logit and the ordered probit is that, due to the assumed natural ordering, the latter does not require the IIA property. However, for the model to be appropriate, the assumed natural ordering must be realistic.
For example, the natural ordering of 4-year/2-year/non-attendance seems reasonable, at least for students expecting to receive a Bachelor’s degree, due to the lower attendance cost at 2-year colleges and the transfer cost associated with transferring from a 2-year college to a 4-year college.$^{6}$ On the other hand, if one were examining the decision between public and private 4-year colleges, assuming a natural ordering of private/public may not be reasonable, as it has been demonstrated that many students choose to attend public institutions that are potentially lower in quality than the private colleges they would otherwise have chosen, in order to take advantage of the in-kind subsidy afforded by public higher education (Ganderton, 1992). This observation suggests that the estimated thresholds in the ordered probit model should always be significant. If not, then we might conclude that the assumed natural ordering, and consequently the ordered probit, is an inappropriate specification for this exercise. While this observation is potentially valuable in determining whether the ordered probit is inappropriate, it would be of limited value in assessing whether it is superior to the alternative models we are discussing.

$^{6}$ Community college transfer students may be forced to take longer to graduate for a variety of reasons. For example, community college students often take smaller class loads than university students, and as a result, are required to either spend longer taking classes at the community college before transferring or at the university after transferring. Either way, such students will be required to spend longer in school before receiving their degree.

2.3. Bivariate probit with sample selection

The bivariate probit with sample selection (Greene, 1998) assumes that the potential student makes two sequential decisions: (1) whether to attend a postsecondary institution and (2) if so, which type of institution to attend. The model can thus be defined as:$^{7}$

$$
\begin{aligned}
& Z_{1i} = \phi_1'X_{1i} + \varepsilon_{1i}, \qquad E_2 = 1 \text{ if } Z_{1i} > 0, \quad E_1 = 1 \text{ otherwise} \\
& Z_{2i} = \phi_2'X_{2i} + \varepsilon_{2i}, \qquad Z_{1i} \text{ observed if } Z_{2i} > 0, \quad E_0 = 1 \text{ otherwise} \\
& (\varepsilon_{1i}, \varepsilon_{2i}) \sim \mathrm{BVN}(0, 0, 1, 1, \rho)
\end{aligned}
\tag{3}
$$

where $Z_{1i}$ and $Z_{2i}$ are the latent variables determining the attendance/non-attendance and 2-year/4-year attendance decisions, $X_{1i}$ and $X_{2i}$ are vectors of individual- and state-specific characteristics affecting those decisions, and the error terms $\varepsilon_{1i}$ and $\varepsilon_{2i}$ are distributed bivariate normal (BVN) with $\rho$ representing the correlation coefficient between the two. Parameters to be estimated by maximum likelihood are $\phi_1$, $\phi_2$, and $\rho$.

As with the ordered probit, a potential benefit of the bivariate probit with sample selection is that by assuming the two attendance decisions are made sequentially, the model does not rely on the IIA property. A potential drawback is the requirement that the error terms from the two equations be distributed jointly normal. Due to this requirement, it should be possible to determine whether the model is inappropriate by testing whether the assumed joint normality of the two error terms holds. Again, while such a test is valuable in determining whether the bivariate probit with sample selection is inappropriate, it would be of limited value in assessing whether a model for which joint normality is not rejected is superior to the models discussed above.

$^{7}$ This nesting structure is much like the nested logit model proposed by Weiler (1989). Because the two nesting structures are similar and the bivariate probit is computationally simpler, especially in two-stage selectivity-correction models, the bivariate probit is much more popular.

2.4. Correcting for self-selection bias

An application of the college attendance equation that has recently become popular is as the first stage in two-stage econometric models that correct OLS estimates for the presence of self-selection bias. For example, Brewer, Eide and Ehrenberg (1999) estimate a multinomial logit selection model to correct for potential selectivity bias in the return to elite private colleges, while Ganderton (1992) estimates a bivariate probit selection model to correct for potential selection bias in student quality choices at public and private universities. The problem inherent in such studies is that the observed outcomes are the result of non-random decisions. Namely, the return to quality for students attending elite, private institutions and the quality choices of students attending public and private universities are only observed for students making the non-random decisions to attend elite, private institutions and public or private universities, and not for the entire population of college-age students. As Heckman (1979) and others demonstrate, this non-randomness, or self-selection of college attendance choices, violates the familiar Gauss–Markov assumptions. Consequently, estimating the desired outcomes by OLS yields potentially biased results.

To correct for the potential of self-selection bias, most studies employ the two-stage methodology of Lee (1983). According to this methodology, it is possible to correct for the non-random assignment to different attendance paths by: (1) estimating the student’s self-selected college attendance choice and (2) using those results to calculate selectivity-correction terms that, when included as regressors in the second-stage functions, correct for the potential self-selection bias.
The intuition behind this procedure is that the selectivity-correction terms are derived from the attendance path estimates and therefore include the important effect that a student’s unobservable characteristics have on his or her attendance path decision. Including those terms in the second-stage functions then corrects for the bias that may be induced by students with identical observable characteristics non-randomly self-selecting different attendance paths and subsequently making persistence decisions that differ due strictly to differences in their unobservable characteristics.

In the work below, we examine the effect that choice of specification for the college attendance equation has on the two-stage selectivity-corrected results by estimating the years of college that a student completes. The econometric model to be estimated is a system of reduced form equations that can be specified as:

$$
E_i \text{ estimated as a multinomial logit, ordered probit, or bivariate probit} \tag{4}
$$

$$
Y_i = \eta_1 W_i + \eta_2 \lambda_i + \nu_i \tag{5}
$$

where $Y_i$ represents the number of years of college that the student completes, $W_i$ is a vector of observed individual characteristics affecting the student’s college persistence decision, $\lambda_i$ is the selectivity-correction term derived from the first-stage Eq. (4), and $\nu_i$ is a stochastic error term. Parameters to be estimated are $\eta_1$ and $\eta_2$, with $\eta_1$ representing the selectivity-corrected results.
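To make the second step concrete, the sketch below computes a selectivity-correction term from a first-stage predicted probability and appends it to the second-stage regressors. It assumes the commonly cited form of the Lee-type term, $\phi(\Phi^{-1}(P_i))/P_i$, where $P_i$ is the predicted probability of the path the student actually chose; sign conventions differ across applications, and the text does not state which form is used here. The function names, probabilities, and characteristic values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def lee_correction_term(prob_chosen):
    """Assumed correction term phi(Phi^{-1}(P)) / P for one student.

    prob_chosen is the first-stage predicted probability of the attendance
    path the student actually chose, from any first-stage specification.
    """
    if not 0.0 < prob_chosen < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    z = STD_NORMAL.inv_cdf(prob_chosen)
    return STD_NORMAL.pdf(z) / prob_chosen


def second_stage_rows(characteristics, probs):
    """Append lambda_i to each student's W_i to form the second-stage regressors."""
    return [list(w) + [lee_correction_term(p)]
            for w, p in zip(characteristics, probs)]


# Hypothetical data: W_i = (constant, test score) and first-stage probabilities.
W = [(1.0, 0.8), (1.0, -0.3), (1.0, 0.1)]
P = [0.9, 0.5, 0.2]
rows = second_stage_rows(W, P)
print(rows)
```

The term shrinks toward zero as the predicted probability of the chosen path approaches one, so students whose choice is well explained by observables receive little correction. Regressing $Y_i$ on these rows by OLS would give the second-stage estimates; standard errors from such a regression would still need adjusting for the estimated regressor.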

3. Data