
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Dynamic Treatment Assignment
Peter Fredriksson & Per Johansson
To cite this article: Peter Fredriksson & Per Johansson (2008) Dynamic Treatment Assignment,
Journal of Business & Economic Statistics, 26:4, 435-445, DOI: 10.1198/073500108000000033
To link to this article: http://dx.doi.org/10.1198/073500108000000033

Published online: 01 Jan 2012.

Download by: [Universitas Maritim Raja Ali Haji]

Date: 12 January 2016, At: 22:50

Dynamic Treatment Assignment:
The Consequences for Evaluations
Using Observational Data
Peter FREDRIKSSON
Institute for Labour Market Policy Evaluation (IFAU) and Department of Economics, Uppsala University, Uppsala,
Sweden (peter.fredriksson@ifau.uu.se)

Per JOHANSSON
Institute for Labour Market Policy Evaluation (IFAU) and Department of Economics, Uppsala University, Uppsala,
Sweden (per.johansson@ifau.uu.se)
We discuss estimation of treatment effects when the timing of treatment is the outcome of a stochastic
process. We show that the duration framework in discrete time provides a fertile ground for effect evaluations. We suggest easy-to-use nonparametric survival function matching estimators that can be used to
estimate the time profile of the treatment. We study the small-sample properties of the proposed estimators
and apply one of them to evaluate the effects of an employment subsidy program. We find that the longer-run program effects are positive. The estimated time profile suggests locking-in effects while participating
in the program and a significant upward jump in the employment hazard on program completion.
KEY WORDS: Dynamic treatment assignment; Method of matching; Program evaluation; Treatment
effects.

1. INTRODUCTION
The prototypical evaluation problem is cast in a framework
in which treatment is offered only once. Thus treatment assignment is a static problem, and the information contained in the
timing of treatment is typically ignored (see Heckman, Lalonde,
and Smith 1999 and Imbens 2004 for overviews of the literature). This prototype concurs rather poorly with how most real-world programs work. Often it makes more sense to think of the
assignment to treatment as a dynamic process, where the start
of treatment is the outcome of a stochastic process.
This article is concerned with program evaluations when
(a) there are no restrictions on the timing of the individual treatment and (b) the timing of treatment is linked to the potential
outcome of interest. Evaluations of labor market programs often include these two features. A common evaluation problem
is when individuals may enter, for instance, a training program
at any time during the unemployment spell and interest lies in
the employment effects of this training program. This situation
can raise complex simultaneity issues, because both the outcome of interest (employment) and actual treatment status are

functions of potential unemployment duration; even if intended
treatment status were randomized at the time of unemployment
entry, dynamic selection would determine actual treatment status if a time span existed between randomization and the start
of treatment.
Our main objective is thus to discuss estimation of treatment
effects when assignment to treatment is without restrictions
on when to participate (i.e., it is the outcome of a stochastic
process). We propose an estimator of the effect of treatment on
the treated in this situation and examine the small-sample properties of this estimator using Monte Carlo simulations. Finally,
we illustrate the evaluation problem and the usefulness of the
estimator in an empirical application.
The empirical evaluation concerns the effect of an employment subsidy (ES) program on future employment. These

data were previously analyzed by Forslund, Johansson, and
Lindquist (2004). The subsidy was targeted at the long-term unemployed, registered as unemployed at the public employment
service (PES) for at least 12 months. The subsidy amounted
to 50% of total wage costs and was paid for a maximum
of 6 months. This evaluation problem has the features described above because after having become eligible (i.e., after
12 months), the subsidy program may start at any point in time.
In general, two types of approaches have been used in previous work to analyze the evaluation problem that we consider

here. One of these is to try to estimate the effect of treatment
on the treated on the employment propensity some fixed time
period after treatment; a recent example of this was provided
by Gerfin and Lechner (2002). In this approach, it is common
to use nonparametric matching procedures and to regard treatment assignment as a static problem. The other common approach is to estimate the effect on the hazard to employment.
Usually, more structure is imposed on the form of the hazard,
but there is also greater concern about unobserved heterogeneity than in the first approach. The most recent work (e.g., Abbring and van den Berg 2003; van den Berg, van der Klaauw,
and van Ours 2004) in this vein has explicitly recognized that
treatment assignment is a dynamic process and has shown that
if sufficient structure is imposed, this can be useful for identification purposes. Abbring and van den Berg (2003) showed that
a (homogenous) treatment effect can be identified, assuming a
mixed proportional hazards model in discrete time.
The estimator that we propose here uses elements from both
of these approaches. It is a nonparametric matching estimator in
discrete time. But unlike most estimators that take a matching
approach, the estimator recognizes that treatment assignment


© 2008 American Statistical Association

Journal of Business & Economic Statistics
October 2008, Vol. 26, No. 4
DOI 10.1198/073500108000000033


is a dynamic process; assuming that the process of treatment
assignment is static inevitably yields a biased estimator.
Compared with the approach of Abbring and van den Berg
(2003), we thus can be more flexible in terms of specifying the
hazard; the cost of this added flexibility is that we must assume
that unobserved heterogeneity does not simultaneously determine the duration to employment if not treated and the duration
to treatment.
We focus on estimating the additive effect of entering ES
on the survival probability in unemployment for those entering this program. If the appropriate conditional independence
assumptions are fulfilled, we can estimate the time profile of

such treatment effects. Moreover, we can estimate treatment effects for different entry time points, as well as the overall effect of the program averaged over pretreatment durations. This
overall estimate is akin to the treatment effect estimated in a
randomized experiment in which actual treatment status is randomized among the stock of eligibles. Our matching estimator
of the overall effect balances the pretreatment durations in the
treated and comparison groups in the same way as random assignment in the experiment.
Estimating the time profile of treatment effects is particularly
interesting when it comes to the evaluation of labor market programs. Usually we think of program participation as an investment in current time (and money) for a potential increase in the
future employment probability. We would expect search activity to be decreased during the in-program period—that is, there
is a locking-in effect—whereas the outflow to employment may
increase after program completion.
The results from the empirical application concur with this
prior. During the first 6 months after entry into the subsidy
program, the employment outflow is lower among the treated;
this locking-in effect is particularly evident for those who enter
early after becoming eligible. After the maximum duration of
the program (i.e., after 6 months), a sharp increase in the employment hazard occurs. One may interpret the peak occurring
after program completion as evidence of displacement effects;
employers use the subsidy to fill vacancies that otherwise would
be filled by hiring on the regular market.
The rest of this article is organized as follows. In Section 2 we

present the evaluation framework. In Section 3 we consider estimation and inference, and suggest an estimator of the effect of
treatment on the treated on the survival rate in unemployment,
that is based on a discrete time assumption. In Section 4 we
study the small-sample performance of this estimator. In Section 5 we estimate the effects of the employment subsidy program on the survival rate in unemployment for those entering
the ES program. We present some conclusions in Section 6.

2. THE EVALUATION FRAMEWORK
We consider a pool of workers, indexed by i = 1, . . . , N, who
became unemployed on the same date. These individuals are
all eligible for the program as long as they stay unemployed.
These individuals are exposed to two types of risk at time t:
either they get a job offer, with probability $\lambda_0(t)$ per unit time,
or they get an offer to participate in a treatment (a program),
with probability $\gamma(t)$ per unit time. We make the following key
assumption concerning the job hazard and the program entry
hazard at time t:

Assumption (unconfoundedness).
$$\lambda_{0i}(t) = \lambda_0(t), \qquad \gamma_i(t) = \gamma(t).$$
In other words, these two hazards are assumed to be identical across individuals conditional on the date of unemployment
entry and observed covariates. (Note that we suppress observed
covariates for most of our analysis.)
This assumption implies competing risks; that is, there is no
unobserved heterogeneity that jointly determines employment
and treatment assignment. Note that, because our assumption
rules out unobserved heterogeneity, it also rules out differences
in the hazard at t due to unobserved anticipation about future
(t + 1, . . .) events. (See Richardson and van den Berg 2001 for
a good discussion of anticipation effects in a competing-risks
framework.) Whether these assumptions are reasonable or not
depends on the richness of the data and the application in question.
It is also important to note what is not implied by our key
assumption. We emphasize two things: (a) There is no implied
restriction on the variation in potential duration across individuals, and (b) the response to treatment may well vary across
individuals (but they are not allowed to act on this heterogeneity).
Let us introduce some notation to further illustrate the evaluation framework. Let $T^0$ denote potential unemployment duration if not treated and $S$ denote the potential duration until treatment start. In this setting, $S$ is stochastically dependent on $T^0$.
Whether or not we observe $S$ depends, inter alia, on whether the
individual had the luck to receive a job offer before receiving
an offer to participate in treatment. In particular, the treatment
indicator (D = 1) is observed if and only if unemployment duration if not treated is longer than the time period until the start
of the program,
$$D = 1(T^0 > S), \tag{1}$$
where 1(·) is the indicator function. Note that the dynamic treatment assignment (1) is not an issue in the literature on sequential treatments (e.g., Robins 1986; Lechner 2004). In this literature, treatment is assumed to occur at the start of a given period,
and the outcomes occur at the end of this period.
We introduce two additional potential outcomes. For individuals who have survived up to s, we define $W^1(s)$ as survival time
after s if treated at s, and $W^0(s)$ as survival time after s if not
treated at s or thereafter. Furthermore, we define the treatment
indicator
$$D(s) = 1(T^0 > s). \tag{2}$$

In general, there are (at least) two policy questions of interest.
First, one may be interested in the effect of the program for
those entering the program at a given duration. This parameter
is defined as
$$\Delta^1(s) = E\left[W^1(s) - W^0(s) \mid D(s) = 1\right]. \tag{3}$$
This is a relevant parameter if one is interested mainly in the
effectiveness of offering a treatment at different durations.
However, ideally one also would like to know the overall effect of the program, that is, if the treatment reduces the average

duration in unemployment for all of those who enter the program. Thus the parameter of interest is
$$\Delta^1 = E(W^1 - W^0 \mid D = 1), \tag{4}$$
where $E(W^0 \mid D = 1)$ is the posttreatment duration for the treated
if not treated. We now focus on the estimation of (4). The problems encountered when estimating (4) also pertain when estimating (3).
Provided that there is no right-censoring of unemployment
durations, it is possible to estimate $E(W^1 \mid D = 1)$ as
$$\hat{W}^1 = \frac{1}{n_1} \sum_{i=1}^{n_1} (T_i - s_i),$$
where $n_1$ is the number of individuals receiving treatment. The
evaluation problem involves not observing the posttreatment
duration without treatment for the treated. What makes the evaluation nonstandard in the dynamic treatment assignment setting
is that we lack (treatment) start dates for those not treated. Thus
only the pretreatment duration for the treated is observed. This
is different than in the experimental situation, in which treatment for the stock of eligibles is offered at some fixed time
point, and the fairly uncommon situation in which a program
starts after a fixed time period.
One consequence of the data-generating process (1) is that
the potential duration if not treated is longer for the treated than
for the nontreated. Using the notation of Dawid (1979), this can
be expressed as
$$W^0 \not\perp\!\!\!\perp D. \tag{5}$$
Thus any attempt (e.g., Lechner 1999) to estimate treatment effects based on the static treatment indicator, D, will yield biased
estimates.
However, under the unconfoundedness assumption, the probability distribution of the potential duration if not treated is independent of D(s), that is,
$$W^0 \perp\!\!\!\perp D(s). \tag{6}$$
Equation (6) implies that the estimator of $E(W^0 \mid D = 1)$ can
be based on matching treated individuals at s with nontreated
individuals at s. Applying this procedure, we get a matched-comparison sample and can estimate $E(W^0 \mid D = 1)$ as
$$\hat{W}^0 = \frac{1}{n_1} \sum_{i=1}^{n_1} (T_{c_i} - s_i) = \frac{1}{n_1} \sum_{i=1}^{n_1} W_{c_i}, \tag{7}$$
where $W_{c_i} = T_{c_i} - s_i$ is the observed unemployment duration
after $s_i$ for a (randomly assigned) matched individual ($c_i$). The
treatment effect is then estimated as
$$\hat{\Delta}^1 = \hat{W}^1 - \hat{W}^0.$$

Proposition 1.
a. Under the null hypothesis of no treatment effect (H0), $\hat{\Delta}^1$ is unbiased.
b. Under the alternative (H1), $\hat{\Delta}^1$ is biased.
c. If the treatment effects have the same sign for all entry durations, then $\hat{\Delta}^1$ is biased toward 0.

Proof. The duration of the comparison individual, $c_i$, is
given as
$$W_{c_i} = (T^0_{c_i} - s_i)\, 1(T^0_{c_i} < S_{c_i}) + W^1_{c_i}\, 1(T^0_{c_i} \ge S_{c_i}) = W^0_{c_i} + 1(T^0_{c_i} \ge S_{c_i}) \left(W^1_{c_i} - W^0_{c_i}\right).$$
$W_{c_i}$ consists of two parts: the postduration as nontreated at $s_i$
and the difference in duration if treated and not treated from the
time of treatment, $S_{c_i}$, if treated in the future ($S_{c_i} > s_i$). Thus
the fact that comparison individuals may be treated in the future
biases the estimate. If the treatment effects have the same sign
for all entry durations, then the estimate is attenuated.
This estimator is useful only when it comes to testing for the
existence of a treatment effect. In the realistic case where the treatment effects are of the same sign for all entry durations, the
estimator is biased toward 0; thus this opportunity may be of
little practical use.
To sum up this section, we have illustrated that any estimator
of (4) using a time-invariant treatment indicator will be biased.
This follows from the treatment assignment process (1), which
implies (5). Transforming a world in which treatment assignment is the outcome of two stochastic processes to an idealized
world in which treatment assignments and outcomes occur at
single time points simply is not possible (see Fredriksson and
Johansson 2004 for a detailed discussion). Because (6) holds,
valid tests for treatment effects may be based on the observed
duration and the time-varying treatment indicator D(s). But the
power of these tests is likely to be low, because the estimators
are likely biased toward 0 if there is a treatment effect.
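The contrast between (5) and (6) can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not taken from the article: there is no treatment effect, $T^0$ and $S$ are independent geometric durations on 0, 1, 2, . . . , and the hazard values, the sample size, and the function names are hypothetical.

```python
import numpy as np

def simulate(n=400_000, p0=0.047, ps=0.02, seed=0):
    """Draw potential durations T0 (to employment) and S (to treatment).

    Assumption: both are geometric on {0, 1, 2, ...} and independent,
    so the unconfoundedness assumption holds by construction.
    """
    rng = np.random.default_rng(seed)
    t0 = rng.geometric(p0, size=n) - 1  # NumPy's support starts at 1
    s = rng.geometric(ps, size=n) - 1
    return t0, s

def static_gap(t0, s):
    """Mean T0 of the treated (D = 1) minus mean T0 of the nontreated."""
    d = t0 > s  # static treatment indicator, as in (1)
    return t0[d].mean() - t0[~d].mean()

def dynamic_gap(t0, s, s_star):
    """Mean remaining duration W0 = T0 - s_star, comparing those treated
    at s_star with those still unemployed and not yet treated at s_star."""
    at_risk = (t0 > s_star) & (s >= s_star)
    treated_now = at_risk & (s == s_star)
    not_yet = at_risk & (s > s_star)
    w0 = t0 - s_star
    return w0[treated_now].mean() - w0[not_yet].mean()

t0, s = simulate()
print("static gap :", static_gap(t0, s))      # clearly positive
print("dynamic gap:", dynamic_gap(t0, s, 6))  # close to zero
```

Although treatment has no effect in this setup, the static comparison shows a large gap because the treated are selected on long $T^0$, whereas the comparison conditional on survival to s shows essentially none.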

3. ESTIMATION AND INFERENCE IN
DISCRETE TIME
The previous section outlined a set of “impossibility results.”
In this section we are more constructive and consider strategies to nonparametrically estimate and test for treatment effects
when time is discrete. For all practical applications concerning
unemployment durations, this assumption is not restrictive; it
may be more restrictive in other circumstances, in which case
one should bear in mind the potential bias caused by time aggregation. Also in this section we suppress observed heterogeneity.
The extension to observed heterogeneity is relegated to the Appendix because it is relatively straightforward.
3.1 Matching Estimator of the Survival Difference
Let us think of $W^k(s)$, k = 0, 1, as being measured in discrete time. Then we define the potential employment outcomes in time period w if treated at s as follows: $Y^1(w, s) = 1(W^1(s) = w)$ is the employment status at w if treated at s,
and $Y^0(w, s) = 1(W^0(s) = w)$ is the employment status if not
treated at s or thereafter. Furthermore, we define the risk sets
$R^k(w, s) = \sum_i 1(W^k_i(s) \ge w)$, that is, the number of individuals
still at risk of employment at w should they have treatment status k at s.


The hazard rate for those treated at s can be estimated using
the sample receiving treatment at s,
$$\hat{\chi}^1(w, s) = \frac{\sum_{i \in D(s)=1} Y^1_i(w, s)}{R^1(w, s)} = \frac{\sum_{i \in D(s)=1} 1(W^1_i(s) = w)}{\sum_{i \in D(s)=1} 1(W^1_i(s) \ge w)}, \qquad w = 1, \ldots, L - s,$$
where L denotes calendar time. (Here the censoring date is assumed to be independent of treatment status.)
How should the counterfactual hazard to employment be calculated? From (6), we have that for a given time period, s, treatment status contains no information about the potential posttreatment duration if not treated. The discrete-time equivalent
to this condition is
$$\{Y^0(w, s)\}_{w=1}^{\infty} \perp\!\!\!\perp D(s), \qquad \forall s. \tag{8}$$

Equation (8) follows from the unconfoundedness assumption. The implication of (8) is that censoring into a program is
independent of the outcome. Thus we can compare the hazard
rate for those who received treatment at s with those who did
not. This implies that the estimates of the counterfactual hazard
to employment for those treated at s can be based on those not
yet treated at s,
$$\hat{\chi}^0(w, s) = \frac{\sum_{i \in D(s)=0} Y^0_i(w, s)}{R^0(w, s)} = \frac{\sum_{i \in D(s)=0} 1(W^0_i(s) = w)}{\sum_{i \in D(s)=0} 1(W^0_i(s) \ge w)}, \qquad w = 1, \ldots, L - s.$$
Conditioning on s, the survival function for the treated and the
counterfactual survival function can be estimated as
$$\hat{F}^k(w, s) = \prod_{u=1}^{w} \left(1 - \hat{\chi}^k(u, s)\right), \qquad w = 1, \ldots, L - s, \quad k = 0, 1.$$

By the independence condition (8), we can calculate the effect
of entering the program at s as the difference between the two
survival functions, that is,
$$\hat{\Delta}(w, s) = \hat{F}^1(w, s) - \hat{F}^0(w, s), \qquad w = 1, \ldots, L - s. \tag{9}$$

Under the assumption of no treatment heterogeneity (across
individuals), the variance of the estimator can be calculated as

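$$\widehat{\mathrm{var}}(\hat{\Delta}(w, s)) = \widehat{\mathrm{var}}(\hat{F}^1(w, s)) + \widehat{\mathrm{var}}(\hat{F}^0(w, s)),$$
where
$$\widehat{\mathrm{var}}(\hat{F}^k(w, s)) = [\hat{F}^k(w, s)]^2 \sum_{u=1}^{w} \frac{\hat{\chi}^k(u, s)}{R^k(u, s) - \sum_{i \in D(s)=k} Y^k_i(u, s)}, \qquad k = 0, 1, \tag{10}$$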
is the asymptotic variance of the estimated survival functions
based on the formula of Greenwood (1926) (see Kaplan and
Meier 1958 for a justification).
This result implies that the variance can be consistently estimated under the null hypothesis (i.e., no effect). Inference
then can be based on the regularity conditions that yield asymptotic normality of the Kaplan–Meier estimator. If there is
treatment heterogeneity, then asymptotic normal confidence intervals based on (10) most likely are too small (Imbens 2004).

A common approach to calculating the variance of matching estimators is the bootstrap; however, Abadie and Imbens (2006)
have shown that there is no theoretical justification for using the
bootstrap to calculate the variance of matching estimators.
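A minimal sketch of the conditional estimator (9) and the Greenwood-type variance (10) is given below. It is an illustration, not the authors' implementation. The function names and the input layout are hypothetical: each group is described by post-s durations and an indicator that equals 1 for an exit to employment and 0 for censoring (for the comparison group, entry into the program at a later date is treated as censoring). A spell censored at u is counted as at risk at u, which is an assumed convention.

```python
import numpy as np

def km_survival(durations, events, horizon):
    """Discrete-time survival function and Greenwood variance.

    durations: post-s durations (1, 2, ...) for one group at entry time s
    events:    1 if the spell ends in employment, 0 if censored
    Returns arrays indexed by w = 1, ..., horizon.
    """
    durations = np.asarray(durations)
    events = np.asarray(events)
    surv = np.empty(horizon)
    var = np.empty(horizon)
    f, greenwood_sum = 1.0, 0.0
    for u in range(1, horizon + 1):
        at_risk = np.sum(durations >= u)                  # risk set R(u, s)
        exits = np.sum((durations == u) & (events == 1))  # exits at u
        hazard = exits / at_risk if at_risk > 0 else 0.0
        f *= 1.0 - hazard
        if at_risk > exits:  # skip the term when the denominator is zero
            greenwood_sum += hazard / (at_risk - exits)
        surv[u - 1] = f
        var[u - 1] = f ** 2 * greenwood_sum
    return surv, var

def effect_at_s(w_treated, e_treated, w_comp, e_comp, horizon):
    """Survival difference (9) and its variance under no heterogeneity."""
    f1, v1 = km_survival(w_treated, e_treated, horizon)
    f0, v0 = km_survival(w_comp, e_comp, horizon)
    return f1 - f0, v1 + v0

surv, var = km_survival([1, 2, 2, 3], [1, 1, 0, 1], horizon=3)
print(surv, var)
```

Summing the two variances relies on the treated and comparison samples at a given s being disjoint, which is the setting of (9) and (10).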
Aggregation Over Pretreatment Durations. The estimator
defined by (9) has a clear interpretation; however, for policy
analysis, one is generally interested in estimating the treatment
effect for the entire treated population. If there are sufficient
data such that (9) is estimable, then the most obvious way of
aggregating over pretreatment durations is by taking the expectation over the distribution of treatment starts in the treated population, that is,
$$\hat{\Delta}(w) = \sum_{s=0}^{L} p^1(w, s) \times \hat{\Delta}(w, s), \qquad w = 1, \ldots, L - s, \tag{11}$$
where $p^1(w, s) = R^1(w, s) / \sum_{s=0}^{L} R^1(w, s)$ is the sample probability of having received treatment at s among the treated who
were still unemployed at w. Calculating the variance of this estimator is complicated by the fact that it is a function of two stochastic variables: the estimated treatment effects and the time
until the start of treatment. The only conceivable option is to
treat the distribution of the start of treatment as given. This implies that we are focusing on the “sample treatment effect,” that
is, the average treatment effect for those treated in the data under consideration (see Imbens 2004). Under this assumption,
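the variance can be calculated as
$$\mathrm{var}(\hat{\Delta}(w)) = \sum_{s=0}^{L} [p^1(w, s)]^2 \times \mathrm{var}(\hat{\Delta}(w, s)) + 2 \sum_{s=0}^{L-1} \sum_{s'=s+1}^{L} p^1(w, s)\, p^1(w, s') \times \mathrm{cov}[\hat{\Delta}(w, s), \hat{\Delta}(w, s')]. \tag{12}$$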

The covariances arise because uncensored spells at s are potentially used to estimate $\hat{F}^k(w, s')$, k = 0, 1, at $s' > s$.
Estimating the variance defined in (12) is particularly difficult. Moreover, estimating the conditional survival functions
with precision may not be possible, because there are too few
individuals at each (or some) s. This limits the uses of (11).
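The weighting in (11) can be sketched as follows. The layout is an assumption made for illustration: both inputs are rectangular arrays with one row per entry time s and one column per w, and cells outside the range w = 1, . . . , L − s are padded with a zero risk set so that they receive zero weight. The function name is hypothetical.

```python
import numpy as np

def aggregate_effects(risk_sets, effects):
    """Weighted average of conditional effects, as in (11).

    risk_sets[s][w-1]: R^1(w, s), treated at s still at risk at w
    effects[s][w-1]:   estimated survival difference for entry time s
    """
    r = np.asarray(risk_sets, dtype=float)
    d = np.asarray(effects, dtype=float)
    total = r.sum(axis=0)  # sum over s of R^1(w, s)
    # p^1(w, s) = R^1(w, s) / sum_s R^1(w, s); zero where nobody is at risk
    weights = np.divide(r, total, out=np.zeros_like(r), where=total > 0)
    return (weights * d).sum(axis=0)

# Two entry times, two horizons (illustrative numbers only)
print(aggregate_effects([[10, 5], [30, 5]], [[-0.1, 0.2], [0.1, 0.0]]))
```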
A more convenient way to average over the pretreatment duration is to first estimate the average hazard rate and then calculate the survival function. The average hazard rate for the
treated is equal to
$$\hat{\chi}^1(w) = \sum_{s=0}^{L} p^1(w, s) \times \frac{\sum_{i \in D(s)=1} Y^1_i(w, s)}{R^1(w, s)} = \sum_{s=0}^{L} p^1(w, s) \times \hat{\chi}^1(w, s).$$

Given the independence condition (8), the average hazard rate
for the treated if not treated can be estimated as
$$\hat{\chi}^0(w) = \sum_{s=0}^{L} p^0(w, s) \times \frac{\sum_{i \in D(s)=0} Y^0_i(w, s)}{R^0(w, s)} = \sum_{s=0}^{L} p^0(w, s) \times \hat{\chi}^0(w, s),$$
where $p^0(w, s) = R^0(w, s) / \sum_{s=0}^{L} R^0(w, s)$ is the estimated
probability to be at risk at w if not treated at s for the treated
population.
Now the difference in these two averaged hazards has a clear
interpretation. But if the main purpose of the analysis is to test
for the existence of a treatment effect, then power—rather than
interpretability—is the major issue. For the purpose of testing,
consider the estimator
$$\hat{\Delta}^1(w) = \hat{F}^1(w) - \hat{F}^0(w), \qquad w = 1, \ldots, L, \tag{13}$$
where $\hat{F}^k(w) = \prod_{u=1}^{w} (1 - \hat{\chi}^k(u))$, w = 1, . . . , L, k = 0, 1. In
general, when the treatment effects vary with pretreatment duration, interpreting $\hat{\Delta}^1(w)$ may be difficult, because it does not
equal the difference in average survival rates due to treatment
for the treated population. The estimator (13) does have two
virtues, however. First, the power in rejecting a false null hypothesis is likely to be much greater than that when using an
estimator based on the difference in the averaged hazards; second, if the hazard rates, χ k (w, s), do not vary with s, then this
estimator has a clean interpretation.
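With weights of the form $p^k(w, s) = R^k(w, s) / \sum_s R^k(w, s)$, the averaged hazard reduces to pooled exits divided by the pooled risk set, because the $R^k(w, s)$ terms cancel. The sketch below uses that simplification to compute (13). The array layout (rows indexed by s, columns by w, zero-padded where undefined) and the function names are assumptions for illustration.

```python
import numpy as np

def averaged_survival(exits, risk):
    """Survival function built from the averaged hazard.

    exits[s][w-1]: number of exits to employment at w, entry time s
    risk[s][w-1]:  risk set R^k(w, s)
    sum_s p(w, s) * exits(w, s) / R(w, s) equals
    sum_s exits(w, s) / sum_s R(w, s) for the weights defined in the text.
    """
    exits = np.asarray(exits, dtype=float)
    risk = np.asarray(risk, dtype=float)
    total_risk = risk.sum(axis=0)
    hazard = np.divide(exits.sum(axis=0), total_risk,
                       out=np.zeros_like(total_risk), where=total_risk > 0)
    return np.cumprod(1.0 - hazard)

def aggregated_effect(exits1, risk1, exits0, risk0):
    """Difference of the two averaged survival functions, as in (13)."""
    return averaged_survival(exits1, risk1) - averaged_survival(exits0, risk0)

# Illustrative counts only
print(aggregated_effect([[1, 1], [2, 0]], [[10, 8], [10, 8]],
                        [[2, 2], [2, 2]], [[20, 16], [20, 16]]))
```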
Note, finally, that the hazard $\hat{\chi}^0(w)$ and the survival function
$\hat{F}^0(w)$ define matching estimators. To apply these, we simply
need to create the same distribution of pretreatment durations
as for those who actually enter the program. The estimator (13)
thus balances the pretreatment duration, analogously to what
random assignment accomplishes in experimental data. Random assignment then balances the pretreatment duration (and
any other covariates) for the treated and control groups, which
is why we do not need to condition on the length of the period
before treatment entry when estimating the treatment effect.
What about the variance of $\hat{\Delta}^1(w)$? The variance is equal to
$$\mathrm{var}(\hat{\Delta}^1(w)) = \mathrm{var}(\hat{F}^0(w)) + \mathrm{var}(\hat{F}^1(w)) - 2\,\mathrm{cov}(\hat{F}^1(w), \hat{F}^0(w)). \tag{14}$$
Because individuals used as controls may be treated in the
future and thus included in the estimation of both $\hat{F}^0(w)$
and $\hat{F}^1(w')$, $w' > w$, there is a covariance term to worry
about. We have that $\mathrm{cov}(\hat{F}^1(w), \hat{F}^0(w)) = 0$ if w = 1 and
$\mathrm{cov}(\hat{F}^1(w), \hat{F}^0(w)) \ne 0$, ∀w > 1. With no treatment heterogeneity (across individuals), we can apply the formula of
Greenwood (1926) to estimate the two variances,
$$\mathrm{var}(\hat{F}^k(w)) = [\hat{F}^k(w)]^2 \times \sum_{u=1}^{w} \frac{\hat{\chi}^k(u)}{\sum_{s=0}^{L} \left[R^k(u, s) - \sum_{i \in D(s)=k} Y^k_i(u, s)\right]}, \qquad k = 0, 1.$$
The crux is to estimate the covariance term. At present, we have
no solution. An (admittedly simple-minded) approach is to ignore the covariance. The approximation error involved is small
if there are few treated individuals compared with the number
of never treated individuals, because then the covariance is negligible relative to the variance. In the next section, we present
some Monte Carlo evidence to evaluate whether ignoring the
covariance is a serious omission.

4. MONTE CARLO SIMULATION
Here we study the small-sample performance of the estimators introduced earlier. Toward this end, we assume that T(0),
W(1), and S are geometrically distributed. Thus probability distributions of the durations until employment and treatment are
given by
$$\Pr(T_i(0) = t) = p_0(x_i)\, q_0(x_i)^t$$
and
$$\Pr(S_i = s) = p_s(x_i)\, q_s(x_i)^s.$$
Here $p_0(x_i) = 1 - q_0(x_i)$ is the hazard rate if not treated and
$p_s(x_i) = 1 - q_s(x_i)$ is the treatment hazard; $p_0(x_i)$ and $p_s(x_i)$
are assumed to be logistic [i.e., $p_0(x_i) = (1 + \exp(-(a_0 + a_1 x_i)))^{-1}$, $p_s(x_i) = (1 + \exp(-(b_0 + b_1 x_i)))^{-1}$]; x is taken to
be uniformly distributed and fixed in repeated samples; and
$a_0 = -3.0$, $b_0 = -6.5$ or $-5.5$, and $b_1 = a_1$ is either 1 or 0.
In the homogenous case (i.e., $b_1 = a_1 = 0$), the exit hazard if
not treated is .047 and the hazard into treatment γ(t) is .0015
($b_0 = -6.5$) or .0041 ($b_0 = -5.5$). The proportion treated is on
average equal to 2.9% and 7.6% in these two situations, respectively. The case where 2.9% are treated resembles the situation
encountered in the empirical application described in the next
section. We also consider inference in a situation in which a
larger fraction is treated.
The probability distribution of the duration until employment
after having entered treatment at s is modeled as
$$\Pr(W^1_i(s) = w) = p_1(x_i)\, q_1(x_i)^w,$$
where $p_1(x_i) = (1 + \exp(-(a_0 + a_s + a_1 x_i)))^{-1}$ is the exit hazard to employment. The no treatment effect situation is obtained
with $a_s = 0$, and the constant treatment effect situation is obtained with $a_s = .69$. In the case with no observed heterogeneity, the difference in the hazard rate between treated and nontreated is 4.3 percentage points, that is, $p_1(x_i) - p_0(x_i) = .043$.
Figure 1 illustrates the two data-generating processes (treated
and not treated) without heterogeneity and also shows the treatment effect (i.e., the difference between the two survival functions). Note that the difference between the two survival functions is greatest around 15–18 months after program entry.
Toward the end of the evaluation period, the difference becomes smaller,
likely reducing the power of the Wald test:
$$\hat{\Delta}^1(w) \Big/ \sqrt{\widehat{\mathrm{var}}(\hat{\Delta}^1(w))}.$$

Figure 1. The simulated survival functions and the treatment effect
without heterogeneity [$F^1(t)$; $F^0(t)$; $F^0(t) - F^1(t)$].

In the experiment, the numbers of individuals (N) are taken to
be 500, 1,500, and 6,000. This implies that when N = 500, the
number of treated, n1 , is on average 14.5 and 38, respectively,
whereas when N = 6,000, the number of treated is 174 and 456.
The number of replicates is always taken to be 1,000. For the
matching algorithm, the “treated” individual, i, is matched (or
compared) with the subsample of individuals fulfilling tc > si ,
c = 1, . . . , nci , where nci denotes the number of such individuals.
The unique match is found as
$$c_i = \arg\min_{c \in \{1, \ldots, n_{c_i}\}} |x_{1i} - x_{1c}|.$$

After a match is found for individual i, the process starts over
again until n1 comparable individuals are found in the comparison sample. The process is started by randomly drawing an
individual in the treatment sample; then another random draw
is made from the remaining n1 − 1 treated individuals and so
on until n1 matching individuals are found.
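The homogenous data-generating process and the matching step just described can be sketched as below. Several choices are assumptions that the text does not pin down: x is drawn from U(0, 1), matching is without replacement, comparison candidates must not yet be treated at $s_i$, and the no-treatment-effect case is used so that observed durations equal $T^0$. Function names are hypothetical. The closed-form share treated, $\Pr(T^0 > S) = p_s q_0 / (1 - q_s q_0)$, follows from summing $p_s q_s^s q_0^{s+1}$ over s.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def share_treated(a0, b0):
    """Pr(T0 > S) for independent geometric T0 and S (homogenous case)."""
    p0, ps = logistic(a0), logistic(b0)
    q0, qs = 1.0 - p0, 1.0 - ps
    return ps * q0 / (1.0 - qs * q0)

def simulate(n, a0=-3.0, b0=-6.5, a1=0.0, b1=0.0, seed=0):
    """Draw the covariate and the two potential durations."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=n)                           # assumed U(0, 1)
    t0 = rng.geometric(logistic(a0 + a1 * x)) - 1     # support 0, 1, 2, ...
    s = rng.geometric(logistic(b0 + b1 * x)) - 1
    return x, t0, s

def match(x, t0, s, seed=0):
    """Nearest-neighbour match on x among spells with t_c > s_i.

    Treated individuals are processed in random order; each comparison
    individual is used at most once (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    treated = np.flatnonzero(t0 > s)
    available = np.ones(len(x), dtype=bool)
    matches = {}
    for i in rng.permutation(treated):
        # still unemployed and not yet treated at s_i, not already used
        cand = np.flatnonzero((t0 > s[i]) & (s > s[i]) & available)
        if cand.size == 0:
            continue
        c = cand[np.argmin(np.abs(x[cand] - x[i]))]
        matches[i] = c
        available[c] = False
    return matches

print(share_treated(-3.0, -6.5), share_treated(-3.0, -5.5))
x, t0, s = simulate(6000)
print(len(match(x, t0, s)), "matched pairs")
```

The two printed shares round to the 2.9% and 7.6% quoted in the text. A comparison individual may itself enter treatment after $s_i$, which is the source of the covariance term in (14).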
It is noteworthy that this approach does not match exactly on
the covariates and thus could lead to biased estimates. However,
the evidence given by Abadie and Imbens (2002) suggests that
because we have only one covariate (implying that we match on
the true propensity score), the variance dominates the bias, and
thus, asymptotically, we can ignore the bias.
4.1 Results
We restrict our presentation of the results to the performance
of (13) and (14), suitably adapted to heterogeneity when necessary; see the Appendix. Inference using the aggregate estimator
is performed using asymptotic Wald tests, ignoring the covariance term in (14). The conditional estimators (9) and (10) perform as expected (i.e., no bias and correct sizes of the Wald tests
for all sample sizes); to save space, these results are omitted.
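For concreteness, a double-sided asymptotic Wald test of the kind used here can be sketched as follows. The function name and its inputs are hypothetical; the variance passed in is assumed to be whichever estimate is being used (for the aggregate estimator, the one that ignores the covariance term).

```python
import math

def wald_test(effect, variance, level=0.05):
    """Two-sided test of H0: effect = 0 using a normal approximation.

    Returns the test statistic, the p-value, and the rejection decision.
    """
    z = effect / math.sqrt(variance)
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value, p_value < level

print(wald_test(0.3, 0.01))
```

The empirical size reported in the tables corresponds to the share of replicates in which such a test rejects when there is no treatment effect.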
The biases of the aggregated estimator when w = 1, 12, and
36 are given in Table 1. In general, the bias is small but slightly
greater when there is heterogeneity; this is particularly true
when 7.6% of the population is treated, in which case the bias with heterogeneity is around 2% at w = 36.
The size and power of the 5% nominal level double-sided
Wald tests are displayed in Table 2. When 2.9% of the eligibles are treated, the Wald tests perform satisfactorily. The tests
are significantly undersized for small N and w, but the size improves with N. The power increases with N for all w’s both with
and without heterogeneity. When the fraction treated is 7.6%,
the general result is that the actual size is much too large for all
N’s; that the power is not monotonic in w is related to the fact
that the treatment effect is significantly smaller at w = 36 than
at w = 12. Thus the variance is underestimated (i.e., we overstate the information in the data) when ignoring the covariance.
We have heuristically stated that when the number of treated is
small, this would be a possible approach; however, in this setting the error involved is large when 7.6% of the population is
treated. Note that when w = 1, we estimate only a one-period-ahead effect, and there is no covariance between the survival
functions. This is also reflected in Table 2: when w = 1, the real
size is correct for the matching estimator.

Table 1. Bias of the aggregated estimator (%)

Fraction treated            2.9%                        7.6%
w \ N               500    1,500    6,000       500    1,500    6,000

No treatment effect, no heterogeneity
 1                 −.09     −.06      .09      −.01     −.01      .02
12                  .08     −.45      .01      −.24      .05      .09
36                  .25      .06     −.08      −.10      .19      .03

No treatment effect, heterogeneity
 1                 −.28      .08      .11      −.07     −.07      .05
12                 −.20      .27      .33      1.21     1.47     1.36
36                  .23      .46      .11      1.80     1.69     1.46

Treatment effect, no heterogeneity
 1                 −.11      .11      .18       0       −.02      .03
12                  .83      .73      .97      −.09      .08      .13
36                  .72      .92      .78       .16      .31      .16

Treatment effect, heterogeneity
 1                 −.35      .35      .22       .02      .04      .10
12                 −.04     −.31     −.30       .41      .61      .53
36                  .78      .80      .75      2.76     2.61     2.41

4.2 Summary
In our Monte Carlo simulation, we have considered only the
performance of the aggregated estimator. The estimator conditioning on a particular entry point, s, performs well in all of the
different configurations that we consider. The aggregated estimator has a very slight bias when there is heterogeneity (2–3%
in the worst case). When performing inference, we ignored a
covariance stemming from the fact that individuals may enter
the survival function as nontreated or as treated. Heuristically,
we argued that if the number of treated is small compared with
the number of never-treated, then simply ignoring the covariance is a viable approach. The Monte Carlo results support this
argument; the procedure works well even when the sample size is as
small as 500, at least when the fraction of treated is 2.9% of the
eligible population, but is less viable when the fraction treated
is 7.6%.

Table 2. Size and power of the aggregated estimator (%)

               Fraction treated: 2.9%       Fraction treated: 7.6%
w \ N           500    1,500    6,000        500    1,500    6,000

No treatment effect, no heterogeneity
 1              1.5     3.50     4.40       4.90     5.10     5.00
12              4.3     5.10     4.40       8.00     7.00     7.70
36              3.5     6.40     5.00      13.60    13.70    12.30

No treatment effect, heterogeneity
 1              2.3     4.80     4.00       5.30     3.70     4.90
12              5.2     5.10     5.60       8.10    11.60    15.50
36              0       3.40     5.30       9.80    16.00    20.50

Treatment effect, no heterogeneity
 1              3.3    12.60    32.40      30.00    71.10    99.90
12             24.9    54.30    99.50      95.30   100.00   100.00
36              4      37.90    97.40      65.30    97.80   100.00

Treatment effect, heterogeneity
 1              5.9    15.50    43.70      49.20    93.70   100.00
12             24.8    57.10    99.10      96.60   100.00   100.00
36               .2     3.90    63.10      26.80    70.20    99.50

It should be stressed that the aggregated estimator (13) is of
little use if the fraction of treated is large, because then it is
generally possible to use the estimator that conditions on s. But
if the fraction of treated is low, then sample size may not permit
tests for a treatment effect for given s, and then inference using
the aggregate estimator may be a good alternative.


5. AN EMPIRICAL APPLICATION
In this section we evaluate the effects of an ES program using the estimators proposed in Section 3. The ES program was
administered by the Swedish PES. The prime objective of the
PES is to provide the unemployed with job search assistance;
a secondary objective is to provide training and subsidized employment. To receive unemployment insurance (UI) or cash assistance (CA), the unemployed need to register at the PES; thus
all long-term unemployed are registered at the PES. As a way to
monitor the unemployed, officials at the PES may require that
the unemployed either take a job or take part in a training program or subsidized employment. If an unemployed individual
refuses, then she or he may lose her or his UI or CA.
The ES program was introduced on January 1, 1998; Forslund et al. (2004) have provided a thorough description of the
program. The subsidy was targeted at the long-term unemployed, that is, individuals registered as unemployed at the PES
for at least 12 months. The subsidy amounted to 50% of the total wage costs and was paid for a maximum period of 6 months.
We use unemployment register data from the National
Labour Market Board, which contains information on all individuals registering at the PES in Sweden since August 1991.
The database includes information on, among other factors,
age, educational attainment, and sex, as well as the individuals’ registration date, job training activities, and starting dates
of participation in various labor market programs.
For each individual registered at the PES, we observe an
event history including the number of spells and days of unemployment. We drop all individuals who left the register before the introduction of the ES program from the data set. We
also exclude all individuals for whom the first spell of unemployment occurred before January 1, 1992, as well as those for whom all
registered spells were shorter than 365 days. The reason for the
last two exclusions is that previous labor market history is the
key variable for the matching estimator, and the main eligibility criterion for the program is continuous unemployment for at
least 365 days.
We focus on individuals age 25 to 63 years at the time of
registration at the PES. We chose the lower age limit because
eligibility conditions were changed for individuals under age
25. We impose the upper age limit because retirement is imminent for individuals above age 63 years. A spell of unemployment is defined as an uninterrupted period of time during
which an unemployed person is registered at the PES. The spell
is ended if the unemployed person has a job for a period of at
least 30 days or leaves the register for a period of at least 30
days for any other reason. We aggregate the daily data to
monthly intervals for two reasons. The first reason
is measurement error regarding the exact day of the start of a
job spell. The main cause of this measurement error is the strategy used by PES officers for obtaining the information on when
the job spell began. If the unemployed individual has not been
in contact with the PES office for some specific time period, the
individual is asked over the phone whether or not she or he is
employed. The second reason is that we want to ameliorate anticipation effects. The no-anticipation assumption requires that
the individual knows neither the exact date of the program start
nor the exact date of a start of a job spell. This is a much more
plausible assumption at the monthly interval than at the daily
frequency.
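
The spell definition and the monthly aggregation described above can be sketched as follows. This is a minimal illustration, not the procedure actually applied to the register data: the function names are hypothetical, and the fixed 30-day month is an assumption, since the text does not state the exact rule used to convert days to months.

```python
from datetime import date

MIN_GAP_DAYS = 30    # from the text: an exit of fewer than 30 days does not end a spell
DAYS_PER_MONTH = 30  # assumption: the day-to-month conversion rule is not given


def merge_spells(periods, min_gap=MIN_GAP_DAYS):
    """Merge registration periods (start, end) separated by fewer than min_gap days."""
    merged = []
    for start, end in sorted(periods):
        if merged and (start - merged[-1][1]).days < min_gap:
            # The gap is too short to count as an exit: extend the current spell.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def months_elapsed(start, end, days_per_month=DAYS_PER_MONTH):
    """Completed months between two dates, using a fixed month length."""
    return (end - start).days // days_per_month


# Hypothetical registration history for one individual.
periods = [
    (date(1998, 1, 1), date(1998, 6, 30)),
    (date(1998, 7, 10), date(1999, 3, 1)),  # 10-day gap: same spell
    (date(1999, 6, 1), date(1999, 8, 1)),   # 92-day gap: new spell
]
spells = merge_spells(periods)
```

Working at the monthly level means that an error of a few days in the recorded start of a job spell usually leaves the measured duration unchanged.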
It is possible to have more than one spell of unemployment
of at least 365 days without interruption while the ES program
has been running. Thus an individual can be eligible for the program more than once. The unit of observation is chosen to be
every time that an individual becomes eligible for the ES program. In the analysis, we use information on each individual’s
total number of spells and total days of unemployment before
becoming eligible for the ES. For an individual who is eligible
more than once, the total number of days and spells is updated
each time he or she becomes eligible. Thus the data include
only persons who have been eligible for the ES program on at
least one occasion.
The individuals in the data are separated into two different
groups: those who start the ES program after having become
eligible and eligibles who do not start the program. Each time
that a person becomes eligible, the total number of days until he
or she either leaves the PES office or becomes right-censored is
calculated. The point in time for right-censoring is October 1,
2002 or when a person leaves the register for another destination than work. For those who enter the ES program, the duration to ES is calculated as well.
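
The construction of durations and censoring indicators can be sketched as follows. The function and argument names are hypothetical; the censoring date is the one stated in the text, and exits to destinations other than work are treated as right-censored, as described above.

```python
from datetime import date

CENSOR_DATE = date(2002, 10, 1)  # end of the observation window stated in the text


def duration_and_event(eligible_on, left_on=None, left_for_work=False,
                       censor_date=CENSOR_DATE):
    """Return (days at risk, event) for one eligibility spell.

    event is True only if the spell ended in work on or before censor_date;
    ongoing spells and exits to other destinations are right-censored.
    """
    if left_on is None or left_on > censor_date:
        # Still registered at the end of the observation window.
        return (censor_date - eligible_on).days, False
    return (left_on - eligible_on).days, left_for_work
```

For participants, the duration to ES could be computed in the same way by passing the program start date in place of the exit date.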
A total of 631,358 individuals, age 25–63, were eligible for
ES between January 1998 and October 2002; 3% of the eligible
spells ended in ES. The mean characteristics of the ES participants and nonparticipants in the eligible population are reported
in Table 3.
A significantly higher fraction (64% compared with 39%) of
the ES spells ended up in employment. This does not indicate
a positive treatment effect; to some extent, it reflects the fact
that the program participants on average registered earlier at the
PES (see T0 in Table 3) and thus on average had spent a longer time looking for a job. Males and non-Nordic immigrants are overrepresented and disabled individuals are underrepresented among
the participants. Participants are younger, more educated, and
have spent less time at the PES before the last period of unemployment. Given reasonable priors about how these characteristics should influence the exit to employment, the participants
should be expected to leave unemployment more rapidly than
the nonparticipants.
To apply our estimators in the present setting, we must check
whether selection on observables is a reasonable assumption.
To make a long story short, we believe that this assumption
is palatable in the present setting; nonetheless, we attempt to
substantiate this claim somewhat.
In a stated preference experiment, Eriksson (1997) found that
the heterogeneity of the PES caseworker was more important
than the heterogeneity of the individuals in determining program participation. Carling and Richardson (2004) reported evidence in the same vein. They compared the effects of eight
different programs on the probability of finding a job and found
that their results did not reflect selection by demonstrating that
program placement depended more on the employment service
office that the job seeker had visited than on his or her observed
characteristics.

Table 3. Mean characteristics of participants (ES), nonparticipants
(no ES), and exactly matched sample (matched)

Variable                       ES,      No ES,   ES–No ES,   Matched,
                               mean     mean     t-value     mean

Duration, months              34.38     23.37      62.90
Employed                        .64       .39      71.54

Covariates
Male                            .61       .41      56.38       .56
Non-Nordic                      .21       .14      25.41       .14
No UI                           .16       .18      −5.43       .11
Disabled                        .06       .10     −20.41       .02
Upper-secondary degree          .43       .35      24.08       .42
University degree               .12       .12      −3.43       .09
Age ≤ 30                        .26       .22      11.34       .24
30 < age ≤ 40                   .32       .31       3.39       .30
40 < age ≤ 50                   .27       .24       7.04       .25

Days in register during previous spell (TD)
TD = 0                          .41       .38       9.44       .51
0 < TD ≤ 100                    .05       .05       4.28       .02
100 < TD ≤ 500                  .22       .20       8.37       .19
500 < TD ≤ 1000                 .18       .18       1.22       .15

Number of previous programs (TP)
TP = 0                          .41       .38       7.15       .51
0 < TP ≤ 5                      .42       .39       7.82       .35
5 < TP ≤ 15                     .17       .22     −18.62       .14

Month turning eligible (T0) (January 1998 = 1; October 2002 = 118)
T0                            58.33     70.22     −65.75     60.18

It may well be that individual unobserved characteristics are
correlated with the PES office. Thus, ideally we would like to
control for the PES office in our analysis. This is not possible,
however; instead, we control for local labor market in the analysis. The local labor market and the PES office overlap to some
extent; at the time, there were 100 local labor markets (defined
on the basis of commuting patterns) and about 300 PES offices.
For the specific ES program that we consider, there also is
survey evidence on the selection process (see Lundin 2000).
The survey was directed to the caseworkers at the PES. The
most important piece of evidence is that only 6% maintained
that the initiative to suggest ES came from the eligible, suggesting that individual self-selection to ES is not a big problem.
Nevertheless, unobserved heterogeneity may remain an issue if
caseworkers have more information about the unemployed than
we do. However, Eriksson (1997) showed that caseworkers categorize the motivation of identical unemployed individuals very
differently, suggesting that this is not so problematic. Moreover,
we have detailed information on the previous labor market history (presumably a good indicator of motivation/skills), and we
can control for the local labor market in which the individual is
registered.
Another threat to the selection on observables assumption
stems from the amount of information possessed by the unemployed. We must assume that the unemployed individual knows neither
the exact date of entering ES nor the exact date of starting a
job; there must be some randomness in these events. The expiration of UI benefits may be considered a violation of this
assumption; however, this is not true. Because the expiration of
UI occurs at the same time point for all unemployed with the
same unemployment duration, the hazards to ES or work are
affected in the same way for all unemployed.
We match on the covariates listed in Table 3 and the local
labor market in which the individual is registered. In this application, we use a one-to-one exact matching estimator. Exact
matching implies that the common support restriction significantly reduces the sample (a 60% reduction). An alternative is
to use a propensity score-matching estimator (see the Appendix and
Fredriksson and Johansson 2003 for an application). Propensity score-matching increases the efficiency of the estimators,
because it is easier to find a match, but also may introduce
bias. In this particular application, a propensity score-matching
approach has no implications for the results.
Index the treated at s = 0, ..., L − 1 by i and the comparison
group at s by c. The unique match (for each s) is then given by

$$c_i = x_i \equiv x_c, \qquad c \in N(s), \tag{15}$$

where N(s) is the number of individuals in the comparison
group. If there is more than one individual in the comparison
group with the same values of the covariates, then we randomize over the potential matches. If no unique match from c is
found for individual i, then this individual is removed from
the estimation. With complete pairs of treated and nontreated
individuals, (13) is estimated using the difference between the Kaplan–Meier
survival functions of the treated and the nontreated for the subset of treated individuals with support common to the comparison sample. Matching is based on 7,651 treated individuals.
Descriptive statistics for the matched pairs are given in Table 3.
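
The matching step described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the application: the function name and the record layout are hypothetical, the comparison group at each s is assumed to have been constructed beforehand, and a comparison individual may be drawn more than once, a detail that the text does not specify.

```python
import random
from collections import defaultdict


def exact_match(treated, comparisons, seed=0):
    """One-to-one exact matching with random tie-breaking.

    treated, comparisons: lists of (unit_id, s, x), where s is the
    pretreatment duration and x is a hashable tuple of covariate values.
    Returns {treated_id: comparison_id}. Treated units without an exact
    match are dropped (the common support restriction).
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for cid, s, x in comparisons:
        pool[(s, x)].append(cid)
    matches = {}
    for tid, s, x in treated:
        candidates = pool.get((s, x))
        if candidates:
            # Randomize over potential matches with identical covariates.
            # Assumption: a comparison unit may be reused for another treated unit.
            matches[tid] = rng.choice(candidates)
    return matches


# Hypothetical records: (id, s, (sex, university degree)).
treated = [("t1", 0, ("m", 1)), ("t2", 3, ("f", 0)), ("t3", 0, ("f", 1))]
comparisons = [("c1", 0, ("m", 1)), ("c2", 0, ("m", 1)), ("c3", 3, ("f", 0))]
matches = exact_match(treated, comparisons)
```

In this example, `t3` has no comparison individual with identical covariates at s = 0 and is therefore removed, which is how exact matching shrinks the sample.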
5.1 Results
In this application, there is a sufficient number of treated individuals to compute treatment effects by pretreatment duration;
see (9) and (10). Figure 2(c) plots the treatment effect for those
entering during months 0–3 after eligibility, (b) plots the treatment effect for those entering during months 36–39, and (a)
shows the treatment effects in the same graph. The increased
variance of the estimates in (b) is due to the fact that there were
only 206 treated individuals on the common support. The general message is that treatment effects are similar irrespective of
the timing of program entry. A formal Kolmogorov–Smirnov
test of equality of the two distributions substantiates this conclusion; the test has a p-value of .48.

Figure 2. Treatment effects for early and late entrants [legend: 95% confidence interval; difference (36 ≤ t ≤ 39); difference (0 ≤ t ≤ 3)]

Because there is no evidence suggesting that treatment effects vary by pretreatment duration, we move on to the aggregate estimator (13). Figure 3 shows the results. The confidence
intervals are calculated using the approximate covariance estimator (14). In this application this is a good approximation
because the fraction treated is small (3%); furthermore, only
256 of the 7,651 treated individuals are included in the set of
comparison individuals earlier on.

Figure 3. Treatment effects for all entrants: (a) matched; (b) not matched (legend: difference; 95% confidence interval).

Figure 3(a) shows the result when controlling for both pre-ES duration and covariates, whereas the estimate in (b) controls only for pre-ES duration. This means that the difference
between (a) and (b) reflects the effects of observed heterogeneity (including the common support restriction). These results
indicate that individuals with favorable characteristics are more
likely to participate in the ES program.
From Figure 3, we can see that after an initial period of about
6 months with a negligible (negative) treatment effect, a downward jump occurs; from then on, the effect gradually becomes
smaller, but it is significant over the remainder of the follow-up horizon. This scenario is consistent with an initial period of
locking-in and a subsequent period with a positive treatment effect. The sum of the effects over the entire follow-up horizon is
7.78 months, which implies that unemployment duration
decreased by 14% over the follow-up horizon for the average
individual.
A likely explanation for the downward jump in the estimated
treatment effect after 6 months is that the participants simply
tend to stay on at the workplace where they were employed
with the subsidy. On the one hand, this is an intended effect of
the program; on the other hand, this result may be seen as an
indication that the program tends to displace regular employment; that is, employers use the subsidy to fill vacancies that
would have been filled by hiring on the regular market in the
absence of the program. The qualitative evidence reported by
Lundin (2000) is consistent with this interpretation.

6. CONCLUDING REMARKS

In this article we have considered the evaluation problem using observational data when the start of treatment is the outcome of a stochastic process. We have shown that the duration
framework in discrete time (with a time-varying treatment indicator) is a fertile ground for effect evaluations. A treatment
effect estimator that is based on a static treatment indicator invariably will be biased.
We have suggested easy-to-use nonparametric matching estimators of the survival functions. These estimators do not rely
on strong assumptions about the functional forms of the two
processes generating the inflow into programs and employment.
We have assumed that selection is based purely on observables.
Whether or not the conditional independence assumptions required for the estimators are reasonable depends crucially on
the richness of the information in the data. But even if we assume that unobserved heterogeneity is not an issue, the evaluation problem is demanding o