379 S.L. DesJardins et al. Economics of Education Review 18 1999 375–390
ture research, one drawback of model (1) is that it assumes that all the determinants of the event are accounted for by the explanatory variables $z_k$. Model (1) also assumes that the effects of the explanatory variables are constant over time. Violations of either of these assumptions, which are common when doing social science research, may cause biased estimates.
The model outlined below generalizes model (1) by allowing for time-varying effects and by including an unobserved heterogeneity variable. The new model is therefore a substantial improvement over the proportional hazards model presented above (see McCall, 1994 for details). To account for unobserved heterogeneity, it is assumed that the event of interest is influenced by a random variable $u$, where $u$ is unobserved and distributed independently of $z_k$. Let $G$ denote the cumulative distribution function (c.d.f.) of $u$. For identification purposes the mean of $u$ is fixed at 1 (the unit mean assumption is simply a normalization; the mean could be fixed at any finite value). Let $P(K = k \mid K > k - 1, z_1, \ldots, z_k, u)$ represent the conditional probability that the event occurs in period $k$ given that it has not occurred in the first $k - 1$ periods of enrollment. The values of the time-varying regressors in periods 1 through $k$, $z_1, \ldots, z_k$, are observable and the unobserved variable is specified as $u$. It is assumed that

$$P(K = k \mid K > k - 1, z_1, \ldots, z_k, u) = 1 - \exp\left(-\exp(a_k + b_k z_k)\, u\right) \qquad (3)$$

where $b_k$ measures the (possibly time-varying) effect of $z_k$ in period $k$ and $a_k$ is again a time-varying constant term, $k = 1, 2, 3, \ldots$. Model (1) is the special case of (3) where $b_k = b$ for all $k$ and $u$ equals 1 with probability one. Model (3) is estimated by maximum likelihood and non-parametric maximum likelihood techniques (see Heckman & Singer, 1984). McCall (1994) has shown that $G$ is non-parametrically identified so that the latter method is feasible.
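
To make the structure of model (3) concrete, the following is a minimal Python sketch, not the authors' FORTRAN routine. It evaluates the period-k hazard in (3) and the likelihood contribution of a single spell when the distribution G is approximated by a small set of mass points, in the spirit of the non-parametric approach cited above. The function names, the list-based data layout, and the use of discrete mass points with fixed weights are assumptions made for illustration only.

```python
import math


def hazard(a_k, b_k, z_k, u=1.0):
    """Probability of the event in period k given survival to k, as in (3).

    a_k: period-specific constant; b_k, z_k: equal-length lists of
    coefficients and regressors; u: unobserved heterogeneity (u = 1
    reproduces the case without heterogeneity).
    """
    index = a_k + sum(b * z for b, z in zip(b_k, z_k))
    return 1.0 - math.exp(-math.exp(index) * u)


def likelihood_given_u(a, b, z, u, event_period=None):
    """Likelihood of one spell conditional on u.

    a, b, z hold one entry per observed period. If event_period is None
    the spell is treated as censored after the last observed period.
    """
    lik = 1.0
    for k in range(len(z)):
        h = hazard(a[k], b[k], z[k], u)
        if event_period is not None and k + 1 == event_period:
            return lik * h          # survived k periods, event in period k+1
        lik *= 1.0 - h              # survived this period
    return lik


def mixture_likelihood(a, b, z, mass_points, weights, event_period=None):
    """Integrate u out using hypothetical mass points (weights sum to 1)."""
    return sum(
        w * likelihood_given_u(a, b, z, m, event_period)
        for m, w in zip(mass_points, weights)
    )
```

With mass points such as 0.5 and 1.5 and equal weights, the mean of u is 1, which matches the normalization described above. In an actual estimation the mass points, the weights, and the coefficients would all be chosen to maximize the sum of log likelihood contributions over individuals.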
The competing risks models, which jointly estimate first stopout and graduation and were run as robustness checks of our single risks results, specify a functional form similar to (3) for each risk. Each risk has a separate unobserved heterogeneity variable, although it is possible that they are correlated.
4. Specifying the empirical model
4.1. The sample and statistical routine

Table 1 provides a detailed description of the composition of the sample.

Table 1
Descriptive statistics of the sample

Variable                    Term one range   Term one mean   Term one SE
Asians                      0–1              0.05            -
Blacks                      0–1              0.02            -
Whites                      0–1              0.91            -
Hispanics                   0–1              0.009           -
Females                     0–1              0.47            -
Disabled                    0–1              0.02            -
ACT score                   3–36             22.8            5.02
HS rank                     1–99             70.9            23.24
From metro area             0–1              0.65            -
From out of state MN        0–1              0.16            -
From reciprocity state      0–1              0.14            -
From other US state         0–1              0.06            -
Enrollment age              15–39            18.3            1.2
Institute of Technology     0–1              0.21            -
General College             0–1              0.14            -
College of Liberal Arts     0–1              0.65            -
Cum GPA (a)                 0–4.00           2.58            0.87
Athlete (a)                 0–1              0.03            -
Transfer credits            1–39             13.5            9.7
Loan (a)                    40–3011          774             316
Earnings (a)                11–2734          603             466
Scholarship (a)             8–2430           458             528
Grants (a)                  18–1908          609             373
Workstudy (a)               3–4049           1955            1392

(a) Indicates possibly time-varying regressors.

The original sample consisted of 4100 students who entered the University of Minnesota (Minneapolis campus only) as New High School students in the fall term of 1986.[2] After deleting records with missing information the effective sample size used in the event history procedure was 3975, or roughly 97% of the original sample. Twenty-two terms of data were collected on these individuals from a variety of institutional sources. It should be noted that this dataset includes only one record per person. In contrast, when using logistic regression to do event history modeling one must construct a “person-period” dataset which includes a record for each time period in which the individual is at risk of the event (see Allison, 1984; Singer & Willett, 1991; Yamaguchi, 1991; DesJardins, 1993).
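
The difference between a one-record-per-person file and a “person-period” file can be illustrated with a short Python sketch. The field names (`id`, `duration`, `event`) and the two example students are hypothetical and are not taken from the study data; the sketch only shows the reshaping step that the logistic regression approach would require.

```python
def to_person_period(records):
    """Expand one-record-per-person data into person-period rows.

    Each input record is a dict with hypothetical keys:
      id       - person identifier
      duration - number of terms the person was observed at risk
      event    - 1 if the event occurred in the last observed term,
                 0 if the spell was censored
    One output row is produced for every term at risk.
    """
    rows = []
    for rec in records:
        for term in range(1, rec["duration"] + 1):
            rows.append({
                "id": rec["id"],
                "term": term,
                # the event indicator is 1 only in the final term of a
                # spell that actually ended with the event
                "event": int(rec["event"] == 1 and term == rec["duration"]),
            })
    return rows


# Illustrative input: student A has the event in term 3,
# student B is censored after term 2.
students = [
    {"id": "A", "duration": 3, "event": 1},
    {"id": "B", "duration": 2, "event": 0},
]
person_period = to_person_period(students)
```

Two person records become five person-period rows here. With 22 terms of data on several thousand students the expanded file grows accordingly, which is the practical cost the single-record layout avoids.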
After construction, the dataset was moved to a Cray X-MP-EA supercomputer housed at the Minnesota Supercomputer Institute. The single risks models were estimated with a FORTRAN program initially developed by Bruce Meyer of Northwestern University and modified for our purposes by co-author Brian McCall. The competing risks specifications were estimated using a statistical model designed and programmed by McCall. The maximum likelihood technique used to estimate the models is an iterative process and, coupled with the large number of parameters included as regressors, the amount of memory needed to estimate the models is substantial.

[2] New High School students are students entering the University with fewer than 39 transfer credits. Some of these students may have taken college course work while in high school through the Postsecondary Educational Opportunity Program funded by the state of Minnesota.
As indicated in Table 1, the value of some of the independent variables may change from term to term. Also, the averages cited for each of the financial aid variables are conditional on receipt of an aid offer, and the transfer credit average includes only students with previous college credits, which is why the transfer credit and financial aid variables are italicized in Table 1. Another note about the financial aid variables is necessary. For the following analyses the amount of aid, by type, offered to an individual is the relevant measure. Aid offered is used in an attempt to mitigate the self-selection endogeneity bias that results if financial aid paid is used.
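
Because the aid averages are conditional on receipt of an offer, they are not directly comparable to averages taken over the whole sample. A minimal Python sketch of the distinction follows; the dollar amounts are invented for illustration and a zero is assumed to mean that no offer was made.

```python
def conditional_mean(amounts):
    """Mean over students with a positive offer only (None if nobody has one)."""
    offered = [x for x in amounts if x > 0]
    return sum(offered) / len(offered) if offered else None


def unconditional_mean(amounts):
    """Mean over all students, counting those without an offer as zero."""
    return sum(amounts) / len(amounts) if amounts else None


# Hypothetical loan offers for four students; two received no offer.
loan_offers = [0, 500, 0, 1000]
mean_given_offer = conditional_mean(loan_offers)   # averages 500 and 1000
mean_all = unconditional_mean(loan_offers)         # averages all four values
```

The conditional figure is the larger of the two whenever some students receive no offer, so the means in Table 1 describe the typical size of an offer rather than the typical aid per enrolled student.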
4.2. The empirical model

As mentioned above, there are four different model specifications with respect to the outcome of interest: time to first stopout, time to dropout, time to “censored” dropout, and a competing risks model of the duration to first stopout and graduation. For each of the models estimated the specification of the dependent variable is duration until the time of the relevant event (events in the competing risks case). Thus, conditional on having the event, for each individual in the sample we know the time (the number of terms) to the relevant events. Time to events, therefore, is the fundamental outcome of interest in each of the models estimated.
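
As an illustration of a duration-type dependent variable, the sketch below derives a time to first stopout and a censoring flag from a term-by-term enrollment history. It assumes, purely for illustration, that a stopout is recorded as the first term in which the student is not enrolled and that a student enrolled in every observed term is censored at the last term. The definitions actually used in the study may differ, and graduation is ignored here.

```python
def time_to_first_stopout(enrolled_by_term):
    """Return (duration_in_terms, event_observed) for one student.

    enrolled_by_term: list of booleans, one per term starting with the
    first term of enrollment. The duration is the number of the first
    term with no enrollment; if there is none, the spell is censored
    at the number of terms observed.
    """
    for term, enrolled in enumerate(enrolled_by_term, start=1):
        if not enrolled:
            return term, True
    return len(enrolled_by_term), False


# Hypothetical histories: one student misses term 3, one never stops out
# during the 22 observed terms.
stopout_case = time_to_first_stopout([True, True, False, True])
censored_case = time_to_first_stopout([True] * 22)
```

Each student contributes a single (duration, event) pair, which is consistent with the one-record-per-person layout described in Section 4.1.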
As is the case for many discrete dependent variable models, an unobserved continuous variable representing the individual’s utility level actually generates the discrete outcome of interest. It is assumed that each student makes a decision about continued enrollment by weighing the future costs and benefits of going to college. If the net benefits are negative (positive), the student exits (remains enrolled in) college. Thus, the assumption that students make rational utility calculations allows us to implicitly include student intentions, a factor found to be important in the student-departure literature (Tinto, 1975; Bean, 1978). That students base the choice of whether to stay or not to stay in school on internal optimality conditions is an often overlooked but important point. “It is this choice component that distinguishes the econometric analysis of transition data from standard applied statistical analysis of survival and transition data and gives a richness but also an added complexity to econometric work” (Lancaster, 1990, p. 6).
The vectors of regressors $z_1, \ldots, z_k$ specified in (3) include individual background, organizational, and environmental variables. The independent variables included in the models were chosen based on theoretical considerations and previous research on student departure. Individual background variables include race, gender, age, initial home location and whether the student has a disability. Race is entered into the models by inclusion of three dummy variables (Asian-American, African-American, and Chicano/Hispanic). The reference group is white students. American-Indian and international students were omitted because of the small number of these students in the sample. Race was included in the empirical specification since many studies of student departure have found that minority students tend to have higher probabilities of dropout and stopout, and lower probabilities of graduation than majority students. There may very well be race differences but it is also possible that these results are a function of insufficient control variables in models used to date. Race was also included because little information is available about the time profile of college departure for minority students.
Gender is specified by inclusion of a dummy variable indicating whether the student is female. Over the years, conflicting results have been found with regard to the relationship between gender and college departure. Thus, this variable is included to examine whether there are longitudinal differentials by gender. Age at the time of initial enrollment is also included as a regressor and it is hypothesized that older students are more likely to stop out and drop out than traditional college students given the likelihood of significant time constraints (jobs, family). A variable indicating whether the student is disabled is also included. No prior empirical research was found indicating whether disabled students were at higher risk of leaving college before graduation. During the 1980s, however, disabled students at the study institution had voiced concerns about physical access to buildings, classrooms, and special programs. Therefore, the inclusion of this variable seemed appropriate.
Home location is included as a control and operationalized by a series of dummy variables. These variables include whether a student is from the Twin Cities metropolitan area, from greater Minnesota, or from a tuition reciprocity agreement state. The reference category consists of students who are non-Minnesotans and not from a reciprocity state. Research has shown that distance from campus to a student’s home is associated with persistence (Ramist, 1981). Also, since students from tuition reciprocity agreement states receive discounted tuition relative to the normal non-resident tuition, we included the reciprocity controls to examine the effects that these agreements have on student departure.
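
The dummy-variable coding with an omitted reference category, used above for race and home location, can be sketched in Python as follows. The category labels and the `from_` prefix are hypothetical names chosen for this example; only the coding scheme itself (one indicator per non-reference category) follows the text.

```python
def dummy_code(value, categories, reference):
    """Return 0/1 indicators for every category except the reference.

    A person in the reference category gets zeros on all indicators, so
    each coefficient is interpreted relative to that group.
    """
    if reference not in categories:
        raise ValueError("reference must be one of the categories")
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"from_{c}": int(value == c) for c in categories if c != reference}


# Hypothetical labels for the home location groups described above; the
# reference group is students from non-reciprocity states.
HOME_LOCATIONS = ["metro", "greater_mn", "reciprocity_state", "other_us_state"]

metro_student = dummy_code("metro", HOME_LOCATIONS, "other_us_state")
reference_student = dummy_code("other_us_state", HOME_LOCATIONS, "other_us_state")
```

Four home location groups yield three indicators. Including a fourth indicator alongside the period-specific constant would make the regressors perfectly collinear, which is why one group is left out.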
Precollege variables are also considered appropriate background variables and include a student’s overall score on the ACT entrance exam and the student’s high school rank percentile. Expectations are that students with high scholastic aptitudes should have relatively more academic potential and would be less likely to exit before graduation (see Spady, 1970). It is also possible, however, that there are differences in what the ACT exam score and high school rank percentile (typically used as ability proxies) are measuring. ACT scores measure ability within the pool of all entrance exam test takers, and if students with high ACT scores have more schooling options, they may be prone to leave an institution if they perceive it to be a bad academic fit. High school rank percentile reflects variation within one’s high school and, after controlling for other ability measures, may also be thought of as a proxy for student effort.
Also included is a variable indicating the number of transfer credits of University matriculants. Students who have had some prior college experience (they could have taken college course work while in high school) should be better able to adjust to college life, be more likely to become academically and socially integrated (Tinto, 1975), and, therefore, be more likely to persist and graduate than students entering college for the first time. There is at least one alternative hypothesis though: students who enter with previous college course work may be “movers” who are searching for the right institutional fit and are therefore more likely to leave.
Institution-related variables are included to examine the effects of student interactions with the institution. The initial collegiate unit of enrollment of a student is included to examine whether there are college-specific environmental factors that help to explain student departure from college. Cross-sectional designs have found that students in the Institute of Technology (IT) are less likely to drop out and more likely to graduate than College of Liberal Arts (CLA) students (Matross & DesJardins, 1994). General College (GC) students,[3] on the other hand, appear to have lower chances of graduating and higher chances of dropping out than students enrolled in other collegiate units. It is not clear, however, that these subgroup results garnered from aggregate graduation and retention rate data will hold after accounting for other factors usually found to be related to student departure. Also, even though students may change collegiate units during their academic careers, we were only interested in their initial college of enrollment because most student departure at this institution takes place from these units. In future studies we will examine how transferring among colleges (allowing collegiate unit to vary by term) affects student departure decisions, especially at the upper division.

[3] General College enrolls underprepared and other special needs students and prepares them for transfer to schools and colleges of the University and other higher education institutions. General College does not grant degrees.
A student’s grade point average for each term of enrollment is calculated and included to control for variations in academic performance. One’s grade point average is also hypothesized to be the reward for successful academic achievement. Financial aid offered is included for each term and disaggregated into its component parts: loans, scholarships, grants, workstudy earnings, and earnings as a student employee (other than workstudy) on campus. Typically, financial circumstances are considered environmental variables because they are out of the control of the institution (Bean, 1981). Since state and institutional policymakers do have some direct control over the way aid is distributed, aid variables are considered organizational in this model. For instance, grants are included separately from scholarships in order to examine whether there are differences in how these sources of aid independently affect student departure [St John and Starkey (1995) discovered that financial subsidies have differential effects]. To examine these effects in more detail we plan to disaggregate grants into federal and state components so that we can evaluate whether these aid packages have differential effects on student departure.
Finally, a dummy variable is included indicating whether the student is an athlete during each term of enrollment. Athletes’ rates of dropout and graduation have been a source of much discussion nationally and at the study institution. Therefore a variable distinguishing athletes from the general student population is included in an effort to better understand the longitudinal nature of athletes’ academic progress (Naughton, 1996).
5. The empirical results