Split Population and Unobserved Individual Heterogeneity Identifiability Issues

166 The Journal of Human Resources

out. This approach has been used in the econometric literature in the context of a "split-population" framework for a single risk (Schmidt and Witte 1989).⁸

In the context of a grouped duration model, a straightforward way to incorporate the possibility of defective risks is to redefine the specific survival function as

$$\tilde{S}_j(m) = 1 - P_j + P_j S_j(m),$$

where $P_j$ is the proportion of movers associated with destination $j$. Thus, the survival probability is given by the proportion of $j$ stayers, $1 - P_j$, who do not exit into destination $j$ with probability 1, plus the proportion of movers, $P_j$, multiplied by the corresponding probability of not having exited into $j$ by $m$, $S_j(m)$.

Taking $P_j$ as additional unknown parameters to be estimated, the new parameterization of the specific survivor function can be employed in a likelihood function identical to Equation 11. In order to guarantee that $P_j$ lies between zero and one, we employ the logit reparameterization

$$P_j = \frac{\exp(\mu_j)}{1 + \exp(\mu_j)}.$$

This use of a logit link function is inconsequential in terms of finding evidence of stayers since it does not preclude the possibility of $P_j$ being as close to one or zero as needed. A natural extension of this model is to allow $P_j$ to depend on a set of regressors $z$, leading to an extended logit link function

$$P_j = \frac{\exp(\mu_j + z'\gamma_j)}{1 + \exp(\mu_j + z'\gamma_j)}$$

(see Yamaguchi 1992). That is, we can again use the structure of Equation 11 to specify the likelihood function

$$L(\theta \mid t, j, x) = \left\{ \prod_{m=1}^{K-1} \prod_{j=1}^{2} \left[ \frac{P_j \left[ S_j(m-3) - S_j(m) \right]}{1 - P_j + P_j S_j(m-3)} \right]^{\delta_{mj}} \right\} \left\{ \prod_{m=2}^{K} \prod_{j=1}^{2} \left[ \frac{1 - P_j + P_j S_j(m)}{1 - P_j + P_j S_j(m-3)} \right] \right\}^{1-\delta_{mi}}, \tag{14}$$

where $\theta$ now represents the vectors $\beta_j$, $\mu_j$, $\gamma_j$, and the baseline hazard parameters.

A censored observation results from the interplay of being a U-E stayer (namely, $1 - P_1$), being a U-I stayer ($1 - P_2$), being a U-E mover and not exiting to E ($P_1 S_1(m)$), and being a U-I mover and not exiting to I ($P_2 S_2(m)$).
The probability of observing an incomplete duration will be given by the product of the probabilities of not exiting to employment (being a U-E stayer plus being a surviving U-E mover) and not exiting to inactivity (being a U-I stayer plus being a surviving U-I mover).
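As a minimal sketch of the building blocks just described, the following Python fragment computes the logit mover share, the split-population survivor function, and the conditional exit and no-exit probabilities that appear as the bracketed terms of the likelihood. All function names and the numerical values are hypothetical and chosen only for illustration; this is not the authors' TSP code.

```python
import math

def mover_share(mu, z=None, gamma=None):
    """Logit link: P_j = exp(mu + z'gamma) / (1 + exp(mu + z'gamma))."""
    index = mu
    if z is not None and gamma is not None:
        index += sum(zk * gk for zk, gk in zip(z, gamma))
    return 1.0 / (1.0 + math.exp(-index))

def split_survivor(p, s):
    """Split-population survivor: stayers (1 - p) plus surviving movers (p * s)."""
    return 1.0 - p + p * s

def exit_prob(p, s_prev, s_now):
    """Probability of exiting to j in the interval, given survival to its start."""
    return p * (s_prev - s_now) / split_survivor(p, s_prev)

def stay_prob(p, s_prev, s_now):
    """Probability of not exiting to j in the interval, given survival to its start."""
    return split_survivor(p, s_now) / split_survivor(p, s_prev)

def incomplete_prob(p1, s1, p2, s2):
    """Probability of an incomplete duration: no exit to E and no exit to I."""
    return split_survivor(p1, s1) * split_survivor(p2, s2)

# Illustrative (made-up) values for one destination and one interval.
p = mover_share(mu=0.0)          # equals 0.5 when mu = 0
print(exit_prob(p, 0.8, 0.4), stay_prob(p, 0.8, 0.4))
print(incomplete_prob(0.6, 0.3, 0.2, 0.5))
```

For a given destination and interval the exit and no-exit probabilities sum to one, which is a convenient sanity check when coding the likelihood.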

G. Split Population and Unobserved Individual Heterogeneity

In order to account for both defective risks and gamma heterogeneity, we employ the transformation

$$\hat{S}_j(m) = 1 - P_j + P_j \left[ 1 + \sigma_j^2 \Lambda_j(m) \right]^{-1/\sigma_j^2}.$$

In short, inserting this definition into Equation 11 we define the following likelihood function

$$L(\theta, \sigma^2 \mid t, j, x) = \left\{ \prod_{m=1}^{K-1} \prod_{j=1}^{2} \left[ \frac{P_j \left\{ \left[ 1 + \sigma_j^2 \Lambda_j(m-3) \right]^{-1/\sigma_j^2} - \left[ 1 + \sigma_j^2 \Lambda_j(m) \right]^{-1/\sigma_j^2} \right\}}{1 - P_j + P_j \left[ 1 + \sigma_j^2 \Lambda_j(m-3) \right]^{-1/\sigma_j^2}} \right]^{\delta_{mj}} \right\} \left\{ \prod_{m=1}^{K-1} \prod_{j=1}^{2} \left[ \frac{1 - P_j + P_j \left[ 1 + \sigma_j^2 \Lambda_j(m) \right]^{-1/\sigma_j^2}}{1 - P_j + P_j \left[ 1 + \sigma_j^2 \Lambda_j(m-3) \right]^{-1/\sigma_j^2}} \right] \right\}^{1-\delta_m}. \tag{15}$$

In this case, there are two sources of unobserved heterogeneity competing with each other to account for unforeseen factors. On one hand, there is a distinction between movers and stayers, in terms of both employment and inactivity. On the other hand, conditional on being a mover, there is an error term in the specific hazard function that accounts for omitted variables.

Finally, we note that the ML routine from the econometric package TSP (Time Series Processor) was employed to obtain the maximum likelihood estimates. In each case, starting values from a simple single risk specification were used.

8. See also Pudney and Thomas (1995) for an extension of the split-population model to multiple destinations, and Maller and Zhou (1996) for an exploitation of duration models with long-term survivors.

Addison and Portugal 167
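The combined survivor function can be illustrated with a short Python sketch. The function names and parameter values below are hypothetical. The fragment checks two properties of this functional form: the gamma-mixed mover survivor approaches exp(-Λ) as the heterogeneity variance shrinks to zero, and the combined survivor levels off at the stayer share 1 - P as the integrated hazard grows.

```python
import math

def gamma_mixed_survivor(cum_hazard, sigma2):
    """Mover survivor under gamma heterogeneity: [1 + sigma2 * Lambda]^(-1/sigma2)."""
    return (1.0 + sigma2 * cum_hazard) ** (-1.0 / sigma2)

def split_gamma_survivor(p, cum_hazard, sigma2):
    """Split-population survivor with gamma-mixed movers."""
    return 1.0 - p + p * gamma_mixed_survivor(cum_hazard, sigma2)

# Illustrative values only: the survivor declines toward 1 - p = 0.3.
for cum_hazard in (0.0, 0.5, 2.0, 10.0, 1000.0):
    print(cum_hazard, split_gamma_survivor(0.7, cum_hazard, 0.5))
```

Because both the stayer share and the gamma variance flatten the right tail of the survivor function, the two components can be hard to separate in practice, which is the sense in which they compete with each other.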

H. Identifiability Issues

The specification of our models is based on a delicate compromise between parsimony and computational simplicity on the one hand, and flexibility and goodness of fit on the other. In the interests of tractability, we have assumed (a) independent competing risks, (b) conditional proportional hazards, and (c) parametric gamma unobserved individual heterogeneity. We will not elaborate on these assumptions other than to note that they are fairly conventional and well studied (see, in particular, Heckman and Honoré 1989 and Crowder 2001, on the identifiability of competing risk models).

A crucial issue for the present study is the identifiability of defective risks. There are two questions that we have to address. First, is it possible to identify nonparametrically the presence of defective risks? And second, subject to an affirmative answer to this question, can one be confident that the follow-up period is long enough to enable the detection of long-term survivors? Under ideal data conditions (say, observation over an infinite time horizon) the answer to the first question is obviously yes, because one can then distinguish between finite and infinite durations. But even under less than ideal conditions one can still detect nonparametrically the presence of long-term survivors if the follow-up period is sufficiently long.

In practice, the presence of defective risks would be indicated by an empirical survivor function (such as is obtained from the Kaplan-Meier estimator) that converges to a positive value, meaning that after some point, corresponding to the maximum noncensored time, all actual survival times will always be censored. Critical to this assessment is the length of the follow-up period. It is clear that if the follow-up period is too short there is no hope of identifying the presence of defective risks because the censored observations can be generated by either movers (finite durations) or stayers (infinite durations).
Maller and Zhou (1996) discuss the tests for the presence of long-term survivors and sufficient follow-up in the context of a single risk framework. In our study we feel confident that having a follow-up period of up to 98 months is sufficient to detect the presence of long-term survivors. Fulfilling this informational requirement comes at the cost of having to deal with a stock-sampling plan.

Through functional form assumptions one can obtain parametric identifiability of defectiveness at the cost of some additional arbitrariness. Here we have explored two routes: the mover-stayer paradigm and the unlucky draws approach. Whereas the latter approach lends itself to a number of structural job search interpretations under which the hazard rate converges to zero at too fast a rate to allow all individuals to transition, the former arrives naturally from a sequential decision framework or from a finite mixture model (McLachlan and Peel 2000). These are, in essence, two distinct ways to interpret duration data with defective risks.

Nevertheless, it has to be admitted at the outset that one cannot tell which model applies solely from the duration data. In this sense, the two approaches are observationally equivalent. Identification hinges on the functional forms of the hazard functions employed.⁹

A crucial issue with defective risks models is the behavior of the model in the right tail of the duration distribution. The reason why we selected two flexible subhazard functions, piecewise-constant and polynomial, was the need to avoid imposing too much structure on the data, notably at the right tail of the survival distribution. We hope that by proceeding in this way there is a lesser risk of artificially generating evidence of defective risks.

Finally, we decided to impose the proportional hazards functional form on the conditional specific hazard function.
This form of hazard is more natural and has greater appeal than the one that results from imposing proportionality on the unconditional hazard function (Ibrahim, Chen, and Sinha 2001). It also provides a simple interpretation of the hazard regression coefficients. Proportionality is not necessarily indicated by theory and can and should be tested, as here.
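The nonparametric check discussed in this section, an empirical survivor function that levels off at a positive value once every remaining observation is censored, can be sketched with a small Kaplan-Meier routine. The data below are a made-up toy sample, not the survey data used in the study, and the function name is hypothetical. A plateau of this kind is only suggestive when the follow-up period is long enough, as the text stresses.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate at each distinct time with an observed exit.

    times:  durations (any consistent unit, e.g. months)
    events: 1 if the exit was observed, 0 if the spell is censored
    Returns a list of (time, survivor estimate) pairs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        exits = 0
        n_at_t = 0
        # Gather all spells that end (by exit or censoring) at time t.
        while i < len(data) and data[i][0] == t:
            exits += data[i][1]
            n_at_t += 1
            i += 1
        if exits > 0:
            surv *= 1.0 - exits / at_risk
            curve.append((t, surv))
        at_risk -= n_at_t
    return curve

# Toy sample: every spell longer than 12 months is censored.
times = [3, 6, 6, 9, 12, 24, 48, 96, 96, 96]
events = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
curve = kaplan_meier(times, events)
last_exit_time, plateau = curve[-1]
print(last_exit_time, plateau, max(times))
```

In the toy sample the estimate stops falling after the largest uncensored duration and stays flat up to the end of follow-up, which is the pattern that would point to long-term survivors.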

IV. Findings