Directory UMM :Data Elmu:jurnal:J-a:Journal of Econometrics:Vol96.Issue1.May2000:
                                                                                Journal of Econometrics 96 (2000) 183}199
E$cient estimation of binary choice models
under symmetry
Songnian Chen*
Department of Economics, The Hong Kong University of Science and Technology, Clear Water Bay,
Kowloon, Hong Kong, People+s Republic of China
Received 1 July 1997; received in revised form 1 June 1999
Abstract
This paper proposes a semiparametric maximum likelihood estimator for both
the intercept and slope parameters in a binary choice model under symmetry and
index restrictions. The estimator attains the semiparametric e$ciency bound in Cosslett
(1987) under the symmetry and independence restrictions. Compared with the estimator
of Klein and Spady (1993), which attains the semiparametric e$ciency bound in
Chamberlain (1986), and Cosslett (1987) under the independence restriction, we
show that there are possible e$ciency gains in estimating the slope parameters by
imposing the additional symmetry restriction. A small Monte Carlo study is carried
out to illustrate the usefulness of our estimator. ( 2000 Elsevier Science S.A.
All rights reserved.
JEL classixcation: C31; C34
Keywords: Binary choice; Symmetry; Semiparametric estimation; E$ciency bound
1. Introduction
In this paper we consider the estimation of the binary choice model de"ned by
d"1Mv (x, h )'uN,
0
* Tel.: #852-2358-7602; fax: #852-2358-2084.
E-mail address: [email protected] (S. Chen)
0304-4076/00/$ - see front matter ( 2000 Elsevier Science S.A. All rights reserved.
PII: S 0 3 0 4 - 4 0 7 6 ( 9 9 ) 0 0 0 5 9 - 7
184
S. Chen / Journal of Econometrics 96 (2000) 183}199
where 1MAN is the usual indicator function of the event A, x is a vector of
exogenous variables, v (x, h ) is a given parametric function of x with an
0
unknown parameter vector h , and u is an unobservable error term. In most
0
applications v (x, h ) is assumed to be a linear function, then the model (the
0
linear binary choice model hereafter) becomes
d"1Ma #xb 'uN
0
0
with h "(a , b@ )@, where a and b are the intercept and slope parameters,
0
0
0
0 0
respectively.
For the binary choice model, traditionally, the logit and probit estimation
methods are the most popular approaches by restricting the error distribution to
parametric families. However, misspeci"cation of the parametric distributions,
in general, will result in inconsistent estimates for likelihood-based approaches.
In addition, speci"c functional forms for the error distribution cannot usually be
justi"ed by economic theory. This consideration has motivated recent interest in
estimating the binary choice model by semiparametric methods which do not
assume a parametric distribution for u. Most attention has been focused on the
linear binary choice model (see Horowitz, 1993, for a survey) in semiparametric
estimation literature. Under various weak distributional assumptions on the
error term u, a number of semiparametric estimators have been proposed for h
0
(Ahn et al., 1996; Cavanagh and Sherman, 1998; Chen and Lee, 1998; Cosslett,
1983; Han, 1987; HaK rdle and Stoker, 1989; Horowitz and HaK rdle, 1996;
Ichimura, 1993; Klein and Spady, 1993; Powell et al., 1989; Sherman, 1993),
among others, for estimating the slope parameters with no location restriction
on u, and Chen, 1998, 1999a; Horowitz, 1992; Lewbel, 1997; Manski, 1985 for
estimating the intercept and/or slope parameter with location restrictions imposed on u). Cosslett (1987) considered semiparametric e$ciency bounds
for the parameters in the binary choice model under the assumption that the
error term is independent of the regressors and various location restrictions.
The estimators by Chen and Lee (1998) and Klein and Spady (1993), attain the
e$ciency bound of Cosslett (1987) (and Chamberlain, 1986) for the slope
parameter under the independence restriction. Cosslett (1987) further pointed
out that under an additional symmetry restriction, not only is the location
parameter identi"ed, but there is possible e$ciency gain for estimating the slope
parameter. Therefore, exploiting the symmetry restriction could potentially lead
to more e$cient estimator for the slope parameter than the estimator of Klein
and Spady (1993).
Consistent estimation of the intercept term is also important for the binary
choice model (see Chen, 1998, 1999a and Lewbel, 1997 for detailed discussions).
In the context of the sample selection model, by imposing symmetry and index
restrictions, Chen (1999b) recently considered Jn-consistent estimation of both
the intercept and slope parameters for the outcome equation without the usual
exclusion restriction (see Chamberlain, 1986; Powell, 1989, etc.). However,
S. Chen / Journal of Econometrics 96 (2000) 183}199
185
Chen's (1999b) approach requires an initial Jn-consistent estimator for the
intercept term of the binary choice selection equation.
In this paper, we consider joint estimation of the intercept and slope
parameters under index and symmetry restrictions. Following Klein and
Spady (1993), we propose a semiparametric maximum likelihood estimator
with both the index and symmetry restrictions taken into account in formulating the semiparametric likelihood function. The resulting estimator attains the
e$ciency bound of Cosslett (1987) under the independence and symmetry
restrictions.
The article is organized as follows. Section 2 introduces our estimator and
considers its large sample properties. In Section 2, we also compare Cosslett's
e$ciency bounds under the independence restriction with and without the
symmetry restriction, and discuss possible e$ciency gain by imposing
the symmetry restriction. In Section 3 a small Monte Carlo study is carried out
to assess the "nite sample performance of the estimator. Section 4 concludes.
2. Motivation and the estimator
For the above binary choice model, it is well known that some scale and
location normalization is necessary in order to identify h . In parametric
0
models, such as the probit and logit models, scale normalization is accomplished
by setting the variance of u to one. In the semiparametric literature for the linear
binary choice model, two common scale normalization schemes are to set either
D b D "1 or the coe$cient of one component of b to one. Location restrictions
0
0
are imposed on various locational measures, for example, the conditional mean
or median, of u directly. In this paper, the location and scale restrictions are
imposed by setting v (x, h )"a #v (x, b ) with v (x, b )"v (x )#
0
0
1
0
1
0
1a 1
v (x , b ), and the error term u being symmetrically distributed conditional on x.
1b 2 0
Traditionally, the binary choice model is estimated by the maximum likelihood method, most notably for the logit and probit models. For a random
sample of size n, maximum likelihood estimator is de"ned by maximizing the
log-likelihood function
n
ln ¸ (h)" + [d ln F (v (x , h ))#(1!d ) ln(1!F (v (x , h )))],
n
i
x
i
i
x
i
i/1
where the error term u is assumed to have distribution F ( ) ) given x. However,
x
misspeci"cation of the parametric distribution function, in general, will lead to
inconsistent estimates. Consequently, various semiparametric estimators have
been proposed without a parametric speci"cation for the error distribution. In
particular, under the index restriction that the error distribution depends on
x only through v (x, b ), Klein and Spady (1993) proposed a semiparametric
1
0
186
S. Chen / Journal of Econometrics 96 (2000) 183}199
maximum likelihood estimator for b by maximizing
0
n
ln ¸ (b)" + q [d ln F (v (x , b))#(1!d ) ln(1!F (v (x , b)))],
,4
ni i
n 1 i
i
n 1 i
i/1
where F is a nonparametric estimator for the unknown distribution of u!a ,
n
0
and q are some trimming functions adopted for technical convenience. Since no
ni
location restriction is imposed in their case, they only provide an estimator for
b . Under the condition that the error term is independent of the regressors,
0
they further show that their estimator attains Cosslett's (1987) (and Chamberlain's, 1986) semiparametric e$ciency bound.
In this paper, we consider joint estimation of (a , b ) under a location
0 0
restriction in the form of conditional symmetry and an index restriction stronger
than the one in Klein and Spady (1993). More speci"cally, we assume that the
conditional density of u given x, f (u D x) is an even function of u, and depends on
x only through v2(x, h ); that is, f (!u D x)"f (u D x) and f (u D x)"f (u D v2(x, h ).
0
0
Following Klein and Spady (1993), we propose a semiparametric maximum
likelihood estimator; moreover, we take into account the symmetry as well as
the index restriction in formulating the semiparametric likelihood function.
Under our assumptions, the location parameter a can be consistently esti0
mated; furthermore, there are possible e$ciency gains in estimating b by
0
imposing the symmetry restriction.
To formulate the semiparametric likelihood function, let g(t, h) denote
the density of v (x, h) evaluated at t, P(t, h)"E(d D v (x, h)"t), P (t, h)"
0
E(1!d D v (x, h)"t), g (t, h)"P(t, h)g(t, h)
and
g (t, h)"P (t, h)g(t, h)"
1
0
0
g(t, h)!g (t, h) with P(t)"P(t, h ) and g(t)"g(t, h ). Notice that under the
1
0
0
symmetry and index restrictions, it can be easily shown that
P(t)"P (!t)
0
namely,
E(d D v (x, h )"t)"E((1!d) D v (x, h )"!t)
0
0
which, in turn, suggests that under our conditions G (t, h ) is a natural nonn
0
parametric kernel estimator for P(t), where G (t, h) is de"ned as the solution to
n
the following problem1:
C
A
B
A
n
v (x , h)!t
v (x , h)#t
i
i
min + (d !k)2H
#(1!d !k)2H
i
i
a
a
k i/1
n
n
1 I thank one referee for pointing this out.
BD
,
S. Chen / Journal of Econometrics 96 (2000) 183}199
187
where H is a kernel function and a is a sequence of bandwidth converging to
n
zero as the sample size increases to in"nity, that is,
+n [d H((v (x , h)!t)/a )#(1!d )H((v (x , h)#t)/a )]
i
n
i
i
n .
G (t, h)" i/1 i
n
+n [H((v (x , h)!t)/a )#H((v (x , h)#t)/a )]
i
n
i
n
i/1
It is clear that G (t, h) locally approximates (for h in a neighborhood of h )
n
0
G (t, h ). Also, g (t, h), g (t, h), and g(t, h) can be estimated nonparametrically by
n
0
1
0
g (t, h), g (t, h), and g (t, h), respectively, where
n1
n0
n
v (x , h)!t
1 n
i
+ dH
,
g (t, h)"
i
n1
a
na
n
n i/1
v (x , h)!t
1 n
i
+ (1!d )H
,
g (t, h)"
i
n0
a
na
n
n i/1
and g (t, h)"g (t, h)#g (t, h). Hence, G (t, h) can be rewritten as
n
n1
n0
n
g (t, h)#g (!t, h)
n0
.
G (t, h)" n1
n
g (t, h)#g (!t, h)
n
n
These nonparametric estimators will provide the basis for constructing the
semiparametric likelihood function.
Following Klein and Spady (1993) we introduce probability and likelihood
trimmings to deal with technical di$culties arising from small estimated densities. For the probability trimming, let
B
A
A
B
d (t, h),ae{[ez1 /(1#ez1 )]
z "[(ae{@5!g (t, h ))/ae{@4],
n
n
1n
n
n1
1
0
0
d (t, h),ae{[ez /(1#ez )]
z "[(ae{@5!g (t, h ))/ae{@4],
n
0n
n
n
n0
0
z"[(ae{@5!g (t, h ))/ae{@4],
d (t, h),ae{[ez/(1#ez)]
n
n
n1
n
n
then we de"ne the estimated probability function as
g (t, h)#g (!t, h)#d (t, h)#d (!t, h)
n0
1n
0n
GK (t, h)" n1
.
n
g (t, h)#g (!t, h)#d (t, h)#d (!t, h)
n
n
n
n
We now consider the likelihood trimming. Let hK be a preliminary consistent
1
estimate for h for which D hK !h D "O (n~1@3). The estimators mentioned
0
1
0
1
above satisfy this condition. Then the likelihood trimming function is de"ned as
q( (h)"q( (h)q( (h), where
i
1i
0i
q( (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
1i
n1
i
n1
i
q( (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
0i
n0
i
n0
i
where
q(s,e),M1#exp[(ae@5!s)/ae@4]N~1
n
n
188
S. Chen / Journal of Econometrics 96 (2000) 183}199
with eA(e@(1. As in Klein and Spady (1993), the likelihood trimming is more
severe than the probability trimming. Finally, we de"ne our semiparametric
maximum likelihood estimator, hK , by maximizing
QK (h, q( )"+ (q( (hK )/2)[d ln[GK 2(v (x , h), h]
n
i
n
i 1
i
# (1!d ) ln[(1!GK (v (x , h), h ))2].
(1)
i
n
i
Let Dr m(z) denote the rth-order partial of m(z) with respect to z and
z
D0m(z)"m(z). Let c denote a generic positive constant. We make the following
z
assumptions.
Assumption 1. The observations (d , x ), i"1, 2, 2 , n are i.i.d. The conditional
i i
density of u, f (u D x), depends on x only through v2(x, h ), and symmetric around
0
0, namely, f (u D x)"f (u D v (x, h ))"f (u D v2(x, h )) and f (!u D x)"f (u D x); and
0
0
f (u D v (x, h )"t) is four times continuously di!erentiable with respect to (u, t)
0
and its derivatives are uniformly bounded.
Assumption 2. The parameter vector h lies in a compact parameter space, H. The
true value of h, h , is in the interior of H.
0
Assumption 3. There exist P, P that do not depend on x such that
0(P)E(d D x)4P(1.
Assumption 4. The index function v (x, h) is smooth in that for h in a neighborhood of h and all s:
0
M D D*r+v (s; h) D , D LD*r+v (s; h)/Lx D N(c (r"0, 1, 2, 3, 4).
h
h
Also, the conditional density of x given x , f 1 2 (x ) is supported on [a, b] and
1
2 x @x 1
smooth in that for all x
D Dr f 1 2 (s) D (c (r"0, 1, 2, 3, 4; s3[a, b]).
s x @x
Assumption 5. The kernel function H(v) is a symmetric function that integrates
to one, four times continuously di!erentiable with bounded derivatives. Also,
H(v) is a higher-order kernel in that :v2H(v) dv"0, and :v4H(v) dv(R.
Assumption 6. The bandwidth sequence Ma N is chosen such that n~1@(6`2e{)
n
(a (n~1@(8~eA).
n
Assumption 7. The model is identi"ed in that for almost all x,
G(v (x, h ), h )"G(v (x, h ), h ) implies h "h ,
H H
0 0
H
0
S. Chen / Journal of Econometrics 96 (2000) 183}199
189
where
g (v (x, h), h)#g (!v (x, h), h)
0
G(v (x, h), h)" 1
.
g(v (x, h), h)#g(!v (x, h), h)
Assumption 1 describes the model and the data. The symmetry restriction is
a location normalization. Compared with the usual index restriction in the
literature that f (u D x)"f (u D v (x, h )) (see, e.g. Ichimura, 1993; Klein and Spady,
0
1993), our index restriction is slightly stronger; however, in both cases, the
unknown form of heteroscedasticity allowed is very limited. Assumptions 2}6
are similar to those in Klein and Spady (1993) (see Klein and Spady, 1993 for
detailed discussions); in particular, a higher-order bias reducing kernel is used
here. However, following Klein and Spady (1993), we can also adopt an adaptive
kernel function with local smoothing that has similar bias reduction properties.
Speci"cally, we set the bandwidth to be variable and data dependent. For
sample size n, let lK "(1/na )+n H((v (x , h)!t)/a ) be a preliminary density
j
np i/1
i
np
estimator with bandwidth a . For
0(e@(1, set the bandwidth for observation
np
j to a "a p( (h)¸K , where p( (h) is the sample standard deviation of v (x, h),
nj
n
j
¸K "M[ lK #(1!q( )/m]N
j
j
pj
with
q( "q( lK , ae{ )
pj
j np
and m being the geometric mean of the lK . As pointed out by Klein and Spady
j
(1993), the estimated probabilities using local smoothing are always positive,
whereas those based on higher-order kernels can be negative. Assumption 7 is
an identi"cation condition. Note that G(v (x, h ), h )"P(v (x, h )) under our
0 0
0
index and symmetry restrictions. Klein and Spady (1993) provided two sets of
models for which the slope parameters are identi"ed, allowing both monotonic
and nonmonotonic models. We "rst consider the identi"cation of the slope
parameter. For a "xed h"(a, b@)@, if
G(v (x, h), h)"G(v (x, h ), h )"P(v (x, h ))
0 0
0
for almost all x, then de"ne G (x, b)"G(a#v (x, b), (a, b@)@), which can be
a
1
viewed as a function of v (x, b), then by following the arguments of Klein and
1
Spady (1993) we can show that b"b if the assumptions in Theorems 1 or 2 of
0
Klein and Spady (1993) are satis"ed. Once the slope parameter is identi"ed, we
only need to show that
G(v (x, (a, b@ )@), (a, b@ )@)"G(v (x, h ), h )
0
0 0
0
for almost all x implies a"a . It is straightforward to show that under our
0
index and symmetry restrictions
G(v (x, (a, b@ )@), (a, b@ )@)"g (x, a)P(v (x, h ))
0
w1
0
0
# g (x, a)P(v (x, h #2(a!a ))),
w2
0
0
(2)
190
S. Chen / Journal of Econometrics 96 (2000) 183}199
where
g(v (x, (a, b@ )@), (a, b@ )@)
0
0
g (x, a)"
w1
g(v (x, (a, b@ )@), (a, b@ )@), (a, b@ )@)#g(!v (x, (a, b@ )@), (a, b@ )@)
0
0
0
0
0
g(v (x, h ))
0
"
g(v (x, h )#g(!v (x, h )#2a !2a)
0
0
0
and
g(!v (x, (a, b@ )@), (a, b@ )@)
0
0
g (x, a)"
w2
g(v (x, (a, b@ )@), (a, b@ )@)#g(!v (x, (a, b@ )@), (a, b@ )@)
0
0
0
0
g(!v (x, h )#2a !2a)
0
0
"
g(v (x, h )#g(!v (x, h )#2a !2a)
0
0
0
with g (x, a)#g (x, a)"1. Thus (2) is equivalent to
w1
w2
g (x, a)[P(v (x, h )#2(a!a ))!P(v (x, h ))]"0.
(3)
w2
0
0
0
When the support of v (x, h ) is large enough so that there exists an interval
0
[a , b ], we have g(t)g(!t#2a !2a)'0 for t3[a , b ] and (a, b )3H, then
v v
0
v v
0
identi"cation of a follows if
0
P(t)"P(t#2(a!a ))
(4)
0
for t 3[a , b ], implies a"a , which, in turn, obviously holds for monotonic
v v
0
models. Even many nonmonotonic models, (4) typically ensures the identi"cation of a ; for example, identi"cation of a follows easily from (4) when
0
0
P(t)'0.5 if and only if t'0, and dP(t)/dt changes sign only once for t'0.
Theorem 1. Under Assumptions 1}7, the estimator hK is consistent and asymptotically normal,
D
Jn(hK !h )P N(0, R),
0
where
GC DC
R"E
LG LG
Lh Lh@
1
P(v)(1!P(v))
DH
~1
with v"v (x, h ) and LG/Lh"[LG(v (x, h ), h )]/Lh.
0
0 0
Proof. See the appendix. h
We now turn to the issue of e$ciency. Klein and Spady (1993) have shown
that their semiparametric maximum likelihood estimator attains the asymptotic
e$ciency bound of Chamberlain (1986) and Cosslett (1987) associated with the
S. Chen / Journal of Econometrics 96 (2000) 183}199
191
independence restriction. We will show that by also taking the symmetry
restriction into account, our estimator attains the asymptotic e$ciency bound of
Cosslett (1987) associated with both the independence and symmetry restrictions.
Lemma 2. For the binary choice model under consideration here, the asymptotic
covariance of hK achieves the ezciency bound as specixed in Cosslett (1987) under
the condition that the error is independent of the regressors, and symmetrically
distributed around the origin.
Following the arguments of Lemma 2 of Chen and Lee (1998) and Klein and
Spady (1993), we can show that under the independence and symmetry restrictions
A K BD
C
L
Lv
Lv
v
[E(d D v (x , h)"v (x, h)] D
"f (v)
!E
i
h/h0
Lh
Lh
Lh
and
C
A K BD
Lv
Lv
L
[E(d D v (x , h)"!v (x, h)] D 0 "!f (v)
#E
!v
i
h/h
Lh
Lh
Lh
,
where Lv/Lh"[Lv (x, h )]/Lh. Therefore,
0
Lv
LG
"f (v)
!K(v) ,
Lh
Lh
A
B
where
C C KD
A K BD
Lv
Lv
1
K(v)"
g(v)E
v !g(!v)E
!v
Lh
Lh
g(v)#g(!v)
,
as speci"ed in Cosslett (1987). Therefore,
G
R~1"E
A
BA
f 2(v)
Lv
!K(v)
P(v)(1!P(v)) Lh
BH
Lv
@
!K(v) .
Lh
(5)
Then Lemma 2 follows by comparing (5) and Eq. (3.27) in Cosslett (1987).
Since the estimator by Klein and Spady bK attains the e$ciency bound under
,4
the independence restriction, while our estimator hK "(a( , bK @)@ attains the e$ciency bound under both the independence and symmetry restrictions, it is of
interest to examine whether the additional symmetry restriction leads to any
e$ciency gain in estimating b . From Cosslett (1987), the e$ciency bound for
0
b under the independence restriction is
0
Lv
Lv
Lv
Lv
@
f 2(v)
I "E
!E
!E
v
v
,
1
Lb
Lb
Lb
P(v)(1!P(v)) Lb
G
A
C K DBA
C K DB H
192
S. Chen / Journal of Econometrics 96 (2000) 183}199
while the corresponding e$ciency bound for b under the independence and
0
symmetry restrictions is ((I~1) )~1, where (I~1) is the lower right block of
2 bb
2 bb
I~1, with
2
BA
A
G
Lv
f 2(v)
I "E
!K(v)
2
P(v)(1!P(v)) Lh
BH
Lv
@
!K(v) .
Lh
By some algebraic manipulation, we can show that
TC KD
C KD
TC KD
U
T C KD U
Lv
Lv
v !K(v), E
v !K(v)
((I~1) )~1!I " E
2 bb
1
Lb
Lb
! E
U
Lv
v !K(v), gH SgH, gHT~1
Lb
] gH, E
Lv
v !K(v) ,
Lb
where gH"[2g(!v)]/[ g(v)#g(!v)], and S ) , ) T denotes an inner product
such that
C
D
f 2(v)
g g .
Sg , g T"E
1 2
P(v)(1!P(v)) 1 2
Consequently ((I~1) )~1!I , can be viewed as the corresponding norm of the
2 bb
1
projection residual of E[Lv/Lb D v]!K(v) onto gH, it is thus semi-de"nite, as
expected. When v (x, h) is linear, and x has a multivariate normal distribution, we
have E(x D v)"a#bv and E(x D !v)"a!bv, thus
C KD
E
C C K D C K DD
Lv
Lv
Lv
g(!v)
v !K(v)"
E
v !E
!v
Lb
Lb
Lb
g(v)#g(!v)
"cgH.
Therefore, in this special case ((I~1) )~1"I , and there will be no e$ciency
2 bb
1
gain in imposing the symmetry restriction. In fact, the semiparametric e$ciency
bound for the slope parameter under the independence restriction in this case
agrees with the parametric bound, as noted in Cosslett (1987). In general,
however,
C C K D C K DD
g(!v)
Lv
Lv
E
v !E
!v
g(v)#g(!v)
Lh
Lh
is of full rank, and there is indeed e$ciency gain in estimating the slope
parameter by imposing the additional symmetry restriction.
S. Chen / Journal of Econometrics 96 (2000) 183}199
193
3. A small Monte Carlo study
In this section we perform some Monte Carlo simulations to investigate the
"nite sample performance of our estimator. The data is generated according to
the following model:
d"1Ma #x #b x #u'0N
0
1
0 2
with (a , b )"(0, 1). Four di!erent designs are constructed by varying the
0 0
distributions of the regressors and error term.
Notice that given the estimator for b , bK , of Klein and Spady (1993), a can
0 ,4
0
also be estimated by a( , de"ned by maximizing QK (a, bK , q( ) in (1) with respect to
1c
n
,4
a. Here we consider the "nite sample performance of our estimator, (a( , bK ), the
probit estimator (a( , bK ), a( and bK de"ned above. We examine the e$ciency
1" 1" 1c
,4
loss relative to a correct parametric speci"cation and e$ciency gain in imposing
symmetry and resulting joint estimation.
The results from 300 replications for each design are presented with sample
sizes of 100 and 300. For each estimator under consideration, we report the
mean value (Mean), the standard deviation (SD), and the root mean square error
(RMSE). As in Klein and Spady (1993), we report results for the semiparametric
estimators obtained without probability or likelihood trimming since estimates
obtained are not sensitive to trimming. Also, as in Klein and Spady (1993),
adaptive local smoothing with standard normal kernel is used for the
semiparametric estimators, with local smoothing parameters de"ned without
any trimming. The nonstochastic window width a and the corresponding pilot
n
window component a are set to n~1@7 to satisfy the restriction speci"ed above.
np
Tables 1 and 2 report the results for two homoscedastic designs in which u is
drawn from N(0, 1) independent of (x , x ). In Homoscedastic Design I,
1 2
Table 1
Homoscedastic design I
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
0.000
1.026
!0.003
1.037
!0.005
1.030
0.005
1.000
0.003
1.007
0.003
1.002
0.149
0.238
0.165
0.283
0.160
0.248
0.091
0.131
0.095
0.160
0.095
0.138
0.149
0.239
0.165
0.285
0.160
0.250
0.091
0.131
0.095
0.160
0.095
0.138
300
194
S. Chen / Journal of Econometrics 96 (2000) 183}199
both x and x are drawn from N(0, 1) independent of each other, while x is
1
2
1
still drawn from N(0, 1) and x is drawn independently from a standardized chi2
squared distribution with 3 degrees of freedom with zero mean and unit variance
in Homoscedastic Design II. The probit estimator, as expected, performs best in
both designs, while our estimator is competitive and compared favorably with
bK , even though earlier discussions on e$ciency suggests that both the es,4
timator of Klein and Spady (1993) and our estimator attain the parametric
e$ciency bound when (x , x ) is jointly normal. In both cases, a( is also quite
1 2
c1
competitive compared with our estimator for the intercept.
Tables 3 and 4 report the results for two heteroscedastic designs in which
u"0.7uH(1#0.5v#0.25v2) for v"a #x #b x with uH drawn from
0
1
0 2
Table 2
Homoscedastic design II
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
!0.004
0.995
!0.007
1.017
!0.012
0.977
0.009
1.008
0.004
1.016
0.002
0.998
0.148
0.234
0.169
0.335
0.160
0.240
0.087
0.139
0.094
0.169
0.092
0.140
0.148
0.234
0.169
0.335
0.160
0.241
0.088
0.139
0.094
0.170
0.092
0.140
300
Table 3
Heteroscedastic design I
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
!0.004
1.003
!0.010
1.007
!0.012
1.004
0.009
0.981
0.002
1.002
0.001
0.996
0.251
0.339
0.241
0.308
0.240
0.302
0.164
0.231
0.133
0.217
0.130
0.193
0.251
0.339
0.241
0.308
0.240
0.302
0.164
0.232
0.133
0.217
0.130
0.193
300
S. Chen / Journal of Econometrics 96 (2000) 183}199
195
Table 4
Heteroscedastic design II
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
!0.056
0.910
!0.009
0.987
!0.014
0.978
0.083
0.904
0.000
1.014
0.003
1.010
0.235
0.366
0.223
0.336
0.216
0.311
0.164
0.220
0.136
0.193
0.137
0.190
0.241
0.377
0.223
0.336
0.217
0.312
0.184
0.240
0.136
0.194
0.137
0.190
300
N(0, 1) independent of (x , x ). In Heteroscedastic Design I, (x , x ) is drawn
1 2
1 2
as in Homoscedastic Design I, and in Heteroscedastic Design II, (x , x ) is
1 2
drawn as in Homoscedastic Design II. In Heteroscedastic Design I, the probit
estimator still has small bias despite misspeci"cation, but its bias increases
signi"cantly in Heteroscedastic Design II. In both cases, the probit estimator
has the largest RMSE, whereas all the semiparametric estimators perform well.
4. Conclusions
In this paper we have proposed a semiparametric maximum likelihood
estimator for both the intercept and slope parameters in a binary choice
model under symmetry and index restrictions. The estimator is consistent
and asymptotically normal. Furthermore, it attains the semiparametric e$ciency bound in Cosslett (1987) under the symmetry and independence restrictions.
A small Monte Carlo study is carried out to illustrate the usefulness of
our estimator. Compared with the estimator of Klein and Spady (1993),
which attains the semiparametric e$ciency bound in Chamberlain (1986) and
Cosslett (1987) under the independence restriction, we show that there are
possible e$ciency gains in imposing the additional symmetry restriction. At the
same time, however, our approach depends crucially on the validity of
the symmetry restriction. The symmetry requirement is a testable restriction.
One apparent approach is to use Hausman test by comparing our estimator and
the estimator of Klein and Spady (1993). Alternatively, conditional moment tests
(Newey, 1985) can be used since the conditional moment restriction
E(d D x)"G(v (x, h ), h ) holds almost surely only when the symmetry restriction
0 0
is satis"ed.
196
S. Chen / Journal of Econometrics 96 (2000) 183}199
Recently, Chamberlain (1986) derived the semiparametric e$ciency bound for
the slope parameters for the binary choice sample selection model under the
independence restriction. Various semiparametric estimators have been proposed for the slope parameters. In particular, the estimators by Ai (1997) and
Chen and Lee (1998) attain the e$ciency bound. In view of the results obtained
in the paper, there might be possible e$ciency gains in imposing an additional
symmetry restriction. This is a topic for future research.
Acknowledgements
I would like to thank Bo HonoreH , Roger Klein, Lung-Fei Lee and Jim Powell
and two anonymous referees for their helpful comments. I am also grateful to
Roger Klein for providing Gauss codes.
Appendix. Proof of Theorem 1
Theorem 1 is proved by following Chen and Lee (1998), Klein and Spady
(1993) and Lee (1994). Consistency follows from arguments similar to those in
Klein and Spady (1993) by showing that the semiparametric likelihood function
converges uniformly in h to a function uniquely maximized at h . To prove
0
asymptotic normality, we "rst consider a Taylor expansion to obtain
D
C
L2QK (h`, q( ) ~1 LQK (h , q( )
1
0 ,
Jn(hK !h )" !
Jn
0
Lh Lh@
Lh
(A.1)
where h` lies between hK and h . Similar to Lemma 5 of Klein and Spady (1993),
1
0
we can show that
C
D
L2QK (h`, q( )
1
LG LG
1 "E
!
#o (1).
1
Lh Lh@
Lh Lh@ P(v)(1!P(v))
For the gradient term in (A.1), similar to Lemma 3 of Klein and Spady (1993), we
have the following Taylor expansion for the likelihood trimming term
q( (hK ,eA)"q( (h , eA)#¸ (h , eA)(hK !h )
i 1
i 0
ni 0
1
0
# 1(hK !h )@Q (h`, eA)(hK !h ),
2 1
1
0
0 ni 2
where ¸ (h, eA) and Q (h, eA) represent the "rst- and second-order partial derivani
ni
tive functions and h` lies between hK and h . Thus, we obtain
2
1
0
LQK (h , q( )
d !GK
LGK
1 n
0 "
i
i
Jn
+ q( (hK , eA) i
i
1
G
K
(1!G
K
)
Lh
Lh
Jn i/1
i
i
"S #S (hK !h )#Jn(hK !h )@S (hK !h ),
n1
n2 1
0
1
0 n3 1
0
S. Chen / Journal of Econometrics 96 (2000) 183}199
197
where
d !GK
LGK
1 n
i
i,
+ q( (h , eA) i
S "
i 0
n1 Jn
GK (1!GK ) Lh
i
i
i/1
C
1 n
d !GK
LGK
i
i
+ ¸ (h , eA) i
S "
ni
0
n2
GK (1!GK ) Lh
Jn i/1
i
i
D
and
C
D
d !GK
LGK
1 n
i
i
S " + Q (h`, eA) i
ni 2
n3
GK (1!GK ) Lh
n
i
i
i/1
with GK "GK (v (x , h ), h ) and LGK /Lh"[LGK (v (x , h ), h )]/Lh . First, write
i
n
i 0 0
i
n
i 0 0
S as
n1
S "S #S #S ,
n1
n10
n11
n12
where
d !P(v ) LG(v (x , h , h )
1 n
i
i
i 0 0 ,
+ q (h )
S "
i 0 P(v )(1!P(v ))
n10 Jn
Lh
i
i
i/1
1 n
+ (d !P(v ))
S "
i
i
n11 Jn
i/1
C
q( (h , eA) LGK
LG(v (x , h ,h )
q (h )
i 0
i!
i 0 0
i 0
GK (1!GK ) Lh
Lh
P(v )(1!P(v ))
i
i
i
i
D
and
GK !P(v ) LGK
1 n
i
i
+ q( (h , eA) i
S "
i 0
n12 Jn
GK (1!GK ) Lh
i
i
i/1
with q (h)"q (h)q (h), for
i
1i
0i
q (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
1i
1
i
1
i
and
q (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
0i
0
i
0
i
Following the arguments in Lee (1994, pp. 381}384), which involves Taylor
expansion and repeated applications of Propositions 4 and 6 of Lee (1994),
we can show that S "o (1). For S , by applying Taylor expansion
n11
1
n12
and U-statistic projections as in Lee (1994) and Chen and Lee (1998), we can
198
S. Chen / Journal of Econometrics 96 (2000) 183}199
show that
GK !P(v ) LG(v (x , h ), h )
1 n
i
i
i 0 0 #o (1)
+ q (h )
S "
i 0 P(v )(1!P(v ))
n12 Jn
1
Lh
i
i
i/1
1 n
"
+ q (h )g(v )g(!v )(q(v )!q(!v ))(d !P(v ))#o (1),
i
i
i
i i
i
1
Jn i/1 i 0
where
C A K B A K BD
f (v)
Lv
Lv
q(v)"
E
v #E
!v
F(v)(1!F(v))[ g(v)#g(!v)]
Lh
Lh
.
Note that q(v)"q(!v), thus S "o (1). By similar arguments, we can show
n12
1
that
n~1@3S "o (1)
n2
1
hence, S (hK !h )"o (1). Finally, we turn to S . Following the arguments of
n2 1
0
1
n3
Lemma 5 in Klein and Spady (1993), we can show that
d !GK LGK
1 n
i
i "o (1).
+ n~1@6Q (h`,eA) i
ni
1
GK (1!GK ) Lh
n
i
i
i/1
Therefore, Jn(hK !h )@S (hK !h )"o (1). Combining the above results, we
1
0 n3 1
0
1
have
CC
LG LG
1
Jn(hK !h )" E
0
Lh Lh@ P(v)(1!P(v))
DD
~1
n
d !P(v ) LG(v (x , h ), h )
i
i
i 0 0 #o (1).
+ q (h )
1
Lh
Jn i/1 i 0 P(vi )(1!P(vi ))
1
Then Theorem 1 follows by applying a central limit theorem (Ser#ing, 1980,
p. 32). K
References
Ai, C., 1997. A semiparametric maximum likelihood estimator. Econometrica 65, 933}963.
Ahn, H., Ichimura, H., Powell, J.L., 1996. Simple estimators for monotone index models. Department of Economics, University of California at Berkeley.
Cavanagh, C., Sherman, B., 1998. Rank estimators for monotonic index models. Journal of Econometrics 84, 351}382.
Chamberlain, G., 1986. Asymptotic e$ciency in semiparametric models with censoring. Journal of
Econometrics 32, 189}218.
S. Chen / Journal of Econometrics 96 (2000) 183}199
199
Chen, S., 1998. Rank estimation of a location parameter in the binary choice model. Hong Kong
University of Science and Technology.
Chen, S., 1999a. Semiparametric estimation of a location parameter in the binary choice model.
Econometric Theory, 15, 77}98.
Chen, S., 1999b. Distribution-free estimation of the random coe$cient dummy endogenous variable
model. Journal of Econometrics 91, 171}199.
Chen, S., Lee, L.F., 1998. E$cient semiparametric scoring estimation of sample selection models.
Econometric Theory 14, 423}462.
Cosslett, S.R., 1983. Distribution-free maximum likelihood estimator of the binary choice model.
Econometrica 51, 765}782.
Cosslett, S.R., 1987. E$ciency bounds for distribution-free estimators of the binary choice and the
censored regression models. Econometrica 55, 559}587.
Han, A.K., 1987. Nonparametric analysis of a generalized regression model: the maximum rank
correlation estimator. Journal of Econometrics 35, 303}316.
HaK rdle, W., Stoker, T., 1989. Investigating smooth multiple regression by the method of average
derivative. Journal of the American Statistical Association 84, 986}995.
Horowitz, J.L., 1992. A smooth maximum score estimator for the binary response model. Econometrica 60, 505}531.
Horowitz, J.L., 1993. Semiparametric and nonparametric estimation of quantal response models. In:
Maddala. G.S., Rao., C.R., Vinod, H.D. (Eds.), Handbook of Statistics, Vol. 11.
Horowitz, J.L., HaK rdle, W., 1996. Direct semiparametric estimation of single-index models with
discrete covariates. Journal of the American Statistical Association 91, 1623}1640.
Ichimura, H., 1993. Semiparametric least squares (SLS) and weighted SLS estimation of single-index
models. Journal of Econometrics 58, 71}120.
Klein, R.W., Spady, R.S., 1993. An e$cient semiparametric estimator of the binary response model.
Econometrica 61, 387}421.
Lee, L.F., 1994. Semiparametric instrumental variable estimation of simultaneous equation sample
selection models. Journal of Econometrics 63, 341}388.
Lewbel, A., 1997. Semiparametric estimation of location and other discrete choice moments.
Econometric Theory 13, 32}51.
Manski, C.F., 1985. Semiparametric analysis of discrete response: asymptotic properties of the
maximum score estimator. Journal of Econometrics 27, 313}333.
Newey, W.K., 1985. Maximum likelihood speci"cation testing and conditional moment tests.
Econometrica 53, 1047}1070.
Powell, J.L., 1989. Semiparametric estimation of bivariate latent variable models. Social Research
Institute, University of Wisconsin, Madison.
Powell, J.L., Stock, J.H., Stoker, T.M., 1989. Semiparametric estimation of weighted average
derivatives. Econometrica 57, 1403}1430.
Sherman, R.P., 1993. The limiting distribution of the maximum rank correlation estimator. Econometrica 61, 123}138.
Ser#ing, R.S., 1980. Approximation Theorems of Mathematical Statistics. Chapman & and Hall,
New York.
                                            
                E$cient estimation of binary choice models
under symmetry
Songnian Chen*
Department of Economics, The Hong Kong University of Science and Technology, Clear Water Bay,
Kowloon, Hong Kong, People+s Republic of China
Received 1 July 1997; received in revised form 1 June 1999
Abstract
This paper proposes a semiparametric maximum likelihood estimator for both
the intercept and slope parameters in a binary choice model under symmetry and
index restrictions. The estimator attains the semiparametric e$ciency bound in Cosslett
(1987) under the symmetry and independence restrictions. Compared with the estimator
of Klein and Spady (1993), which attains the semiparametric e$ciency bound in
Chamberlain (1986), and Cosslett (1987) under the independence restriction, we
show that there are possible e$ciency gains in estimating the slope parameters by
imposing the additional symmetry restriction. A small Monte Carlo study is carried
out to illustrate the usefulness of our estimator. ( 2000 Elsevier Science S.A.
All rights reserved.
JEL classixcation: C31; C34
Keywords: Binary choice; Symmetry; Semiparametric estimation; E$ciency bound
1. Introduction
In this paper we consider the estimation of the binary choice model de"ned by
d"1Mv (x, h )'uN,
0
* Tel.: #852-2358-7602; fax: #852-2358-2084.
E-mail address: [email protected] (S. Chen)
0304-4076/00/$ - see front matter ( 2000 Elsevier Science S.A. All rights reserved.
PII: S 0 3 0 4 - 4 0 7 6 ( 9 9 ) 0 0 0 5 9 - 7
184
S. Chen / Journal of Econometrics 96 (2000) 183}199
where 1MAN is the usual indicator function of the event A, x is a vector of
exogenous variables, v (x, h ) is a given parametric function of x with an
0
unknown parameter vector h , and u is an unobservable error term. In most
0
applications v (x, h ) is assumed to be a linear function, then the model (the
0
linear binary choice model hereafter) becomes
d"1Ma #xb 'uN
0
0
with h "(a , b@ )@, where a and b are the intercept and slope parameters,
0
0
0
0 0
respectively.
For the binary choice model, traditionally, the logit and probit estimation
methods are the most popular approaches by restricting the error distribution to
parametric families. However, misspeci"cation of the parametric distributions,
in general, will result in inconsistent estimates for likelihood-based approaches.
In addition, speci"c functional forms for the error distribution cannot usually be
justi"ed by economic theory. This consideration has motivated recent interest in
estimating the binary choice model by semiparametric methods which do not
assume a parametric distribution for u. Most attention has been focused on the
linear binary choice model (see Horowitz, 1993, for a survey) in semiparametric
estimation literature. Under various weak distributional assumptions on the
error term u, a number of semiparametric estimators have been proposed for h
0
(Ahn et al., 1996; Cavanagh and Sherman, 1998; Chen and Lee, 1998; Cosslett,
1983; Han, 1987; HaK rdle and Stoker, 1989; Horowitz and HaK rdle, 1996;
Ichimura, 1993; Klein and Spady, 1993; Powell et al., 1989; Sherman, 1993),
among others, for estimating the slope parameters with no location restriction
on u, and Chen, 1998, 1999a; Horowitz, 1992; Lewbel, 1997; Manski, 1985 for
estimating the intercept and/or slope parameter with location restrictions imposed on u). Cosslett (1987) considered semiparametric e$ciency bounds
for the parameters in the binary choice model under the assumption that the
error term is independent of the regressors and various location restrictions.
The estimators by Chen and Lee (1998) and Klein and Spady (1993), attain the
e$ciency bound of Cosslett (1987) (and Chamberlain, 1986) for the slope
parameter under the independence restriction. Cosslett (1987) further pointed
out that under an additional symmetry restriction, not only is the location
parameter identi"ed, but there is possible e$ciency gain for estimating the slope
parameter. Therefore, exploiting the symmetry restriction could potentially lead
to more e$cient estimator for the slope parameter than the estimator of Klein
and Spady (1993).
Consistent estimation of the intercept term is also important for the binary
choice model (see Chen, 1998, 1999a and Lewbel, 1997 for detailed discussions).
In the context of the sample selection model, by imposing symmetry and index
restrictions, Chen (1999b) recently considered Jn-consistent estimation of both
the intercept and slope parameters for the outcome equation without the usual
exclusion restriction (see Chamberlain, 1986; Powell, 1989, etc.). However,
S. Chen / Journal of Econometrics 96 (2000) 183}199
185
Chen's (1999b) approach requires an initial Jn-consistent estimator for the
intercept term of the binary choice selection equation.
In this paper, we consider joint estimation of the intercept and slope
parameters under index and symmetry restrictions. Following Klein and
Spady (1993), we propose a semiparametric maximum likelihood estimator
with both the index and symmetry restrictions taken into account in formulating the semiparametric likelihood function. The resulting estimator attains the
e$ciency bound of Cosslett (1987) under the independence and symmetry
restrictions.
The article is organized as follows. Section 2 introduces our estimator and
considers its large sample properties. In Section 2, we also compare Cosslett's
e$ciency bounds under the independence restriction with and without the
symmetry restriction, and discuss possible e$ciency gain by imposing
the symmetry restriction. In Section 3 a small Monte Carlo study is carried out
to assess the "nite sample performance of the estimator. Section 4 concludes.
2. Motivation and the estimator
For the above binary choice model, it is well known that some scale and
location normalization is necessary in order to identify h . In parametric
0
models, such as the probit and logit models, scale normalization is accomplished
by setting the variance of u to one. In the semiparametric literature for the linear
binary choice model, two common scale normalization schemes are to set either
D b D "1 or the coe$cient of one component of b to one. Location restrictions
0
0
are imposed on various locational measures, for example, the conditional mean
or median, of u directly. In this paper, the location and scale restrictions are
imposed by setting v (x, h )"a #v (x, b ) with v (x, b )"v (x )#
0
0
1
0
1
0
1a 1
v (x , b ), and the error term u being symmetrically distributed conditional on x.
1b 2 0
Traditionally, the binary choice model is estimated by the maximum likelihood method, most notably for the logit and probit models. For a random
sample of size n, maximum likelihood estimator is de"ned by maximizing the
log-likelihood function
n
ln ¸ (h)" + [d ln F (v (x , h ))#(1!d ) ln(1!F (v (x , h )))],
n
i
x
i
i
x
i
i/1
where the error term u is assumed to have distribution F ( ) ) given x. However,
x
misspeci"cation of the parametric distribution function, in general, will lead to
inconsistent estimates. Consequently, various semiparametric estimators have
been proposed without a parametric speci"cation for the error distribution. In
particular, under the index restriction that the error distribution depends on
x only through v (x, b ), Klein and Spady (1993) proposed a semiparametric
1
0
186
S. Chen / Journal of Econometrics 96 (2000) 183}199
maximum likelihood estimator for b by maximizing
0
n
ln ¸ (b)" + q [d ln F (v (x , b))#(1!d ) ln(1!F (v (x , b)))],
,4
ni i
n 1 i
i
n 1 i
i/1
where F is a nonparametric estimator for the unknown distribution of u!a ,
n
0
and q are some trimming functions adopted for technical convenience. Since no
ni
location restriction is imposed in their case, they only provide an estimator for
b . Under the condition that the error term is independent of the regressors,
0
they further show that their estimator attains Cosslett's (1987) (and Chamberlain's, 1986) semiparametric e$ciency bound.
In this paper, we consider joint estimation of (a , b ) under a location
0 0
restriction in the form of conditional symmetry and an index restriction stronger
than the one in Klein and Spady (1993). More speci"cally, we assume that the
conditional density of u given x, f (u D x) is an even function of u, and depends on
x only through v2(x, h ); that is, f (!u D x)"f (u D x) and f (u D x)"f (u D v2(x, h ).
0
0
Following Klein and Spady (1993), we propose a semiparametric maximum
likelihood estimator; moreover, we take into account the symmetry as well as
the index restriction in formulating the semiparametric likelihood function.
Under our assumptions, the location parameter a can be consistently esti0
mated; furthermore, there are possible e$ciency gains in estimating b by
0
imposing the symmetry restriction.
To formulate the semiparametric likelihood function, let g(t, h) denote
the density of v (x, h) evaluated at t, P(t, h)"E(d D v (x, h)"t), P (t, h)"
0
E(1!d D v (x, h)"t), g (t, h)"P(t, h)g(t, h)
and
g (t, h)"P (t, h)g(t, h)"
1
0
0
g(t, h)!g (t, h) with P(t)"P(t, h ) and g(t)"g(t, h ). Notice that under the
1
0
0
symmetry and index restrictions, it can be easily shown that
P(t)"P (!t)
0
namely,
E(d D v (x, h )"t)"E((1!d) D v (x, h )"!t)
0
0
which, in turn, suggests that under our conditions G (t, h ) is a natural nonn
0
parametric kernel estimator for P(t), where G (t, h) is de"ned as the solution to
n
the following problem1:
C
A
B
A
n
v (x , h)!t
v (x , h)#t
i
i
min + (d !k)2H
#(1!d !k)2H
i
i
a
a
k i/1
n
n
1 I thank one referee for pointing this out.
BD
,
S. Chen / Journal of Econometrics 96 (2000) 183}199
187
where H is a kernel function and a is a sequence of bandwidth converging to
n
zero as the sample size increases to in"nity, that is,
+n [d H((v (x , h)!t)/a )#(1!d )H((v (x , h)#t)/a )]
i
n
i
i
n .
G (t, h)" i/1 i
n
+n [H((v (x , h)!t)/a )#H((v (x , h)#t)/a )]
i
n
i
n
i/1
It is clear that G (t, h) locally approximates (for h in a neighborhood of h )
n
0
G (t, h ). Also, g (t, h), g (t, h), and g(t, h) can be estimated nonparametrically by
n
0
1
0
g (t, h), g (t, h), and g (t, h), respectively, where
n1
n0
n
v (x , h)!t
1 n
i
+ dH
,
g (t, h)"
i
n1
a
na
n
n i/1
v (x , h)!t
1 n
i
+ (1!d )H
,
g (t, h)"
i
n0
a
na
n
n i/1
and g (t, h)"g (t, h)#g (t, h). Hence, G (t, h) can be rewritten as
n
n1
n0
n
g (t, h)#g (!t, h)
n0
.
G (t, h)" n1
n
g (t, h)#g (!t, h)
n
n
These nonparametric estimators will provide the basis for constructing the
semiparametric likelihood function.
Following Klein and Spady (1993) we introduce probability and likelihood
trimmings to deal with technical di$culties arising from small estimated densities. For the probability trimming, let
B
A
A
B
d (t, h),ae{[ez1 /(1#ez1 )]
z "[(ae{@5!g (t, h ))/ae{@4],
n
n
1n
n
n1
1
0
0
d (t, h),ae{[ez /(1#ez )]
z "[(ae{@5!g (t, h ))/ae{@4],
n
0n
n
n
n0
0
z"[(ae{@5!g (t, h ))/ae{@4],
d (t, h),ae{[ez/(1#ez)]
n
n
n1
n
n
then we de"ne the estimated probability function as
g (t, h)#g (!t, h)#d (t, h)#d (!t, h)
n0
1n
0n
GK (t, h)" n1
.
n
g (t, h)#g (!t, h)#d (t, h)#d (!t, h)
n
n
n
n
We now consider the likelihood trimming. Let hK be a preliminary consistent
1
estimate for h for which D hK !h D "O (n~1@3). The estimators mentioned
0
1
0
1
above satisfy this condition. Then the likelihood trimming function is de"ned as
q( (h)"q( (h)q( (h), where
i
1i
0i
q( (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
1i
n1
i
n1
i
q( (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
0i
n0
i
n0
i
where
q(s,e),M1#exp[(ae@5!s)/ae@4]N~1
n
n
188
S. Chen / Journal of Econometrics 96 (2000) 183}199
with eA(e@(1. As in Klein and Spady (1993), the likelihood trimming is more
severe than the probability trimming. Finally, we de"ne our semiparametric
maximum likelihood estimator, hK , by maximizing
QK (h, q( )"+ (q( (hK )/2)[d ln[GK 2(v (x , h), h]
n
i
n
i 1
i
# (1!d ) ln[(1!GK (v (x , h), h ))2].
(1)
i
n
i
Let Dr m(z) denote the rth-order partial of m(z) with respect to z and
z
D0m(z)"m(z). Let c denote a generic positive constant. We make the following
z
assumptions.
Assumption 1. The observations (d , x ), i"1, 2, 2 , n are i.i.d. The conditional
i i
density of u, f (u D x), depends on x only through v2(x, h ), and symmetric around
0
0, namely, f (u D x)"f (u D v (x, h ))"f (u D v2(x, h )) and f (!u D x)"f (u D x); and
0
0
f (u D v (x, h )"t) is four times continuously di!erentiable with respect to (u, t)
0
and its derivatives are uniformly bounded.
Assumption 2. The parameter vector h lies in a compact parameter space, H. The
true value of h, h , is in the interior of H.
0
Assumption 3. There exist P, P that do not depend on x such that
0(P)E(d D x)4P(1.
Assumption 4. The index function v (x, h) is smooth in that for h in a neighborhood of h and all s:
0
M D D*r+v (s; h) D , D LD*r+v (s; h)/Lx D N(c (r"0, 1, 2, 3, 4).
h
h
Also, the conditional density of x given x , f 1 2 (x ) is supported on [a, b] and
1
2 x @x 1
smooth in that for all x
D Dr f 1 2 (s) D (c (r"0, 1, 2, 3, 4; s3[a, b]).
s x @x
Assumption 5. The kernel function H(v) is a symmetric function that integrates
to one, four times continuously di!erentiable with bounded derivatives. Also,
H(v) is a higher-order kernel in that :v2H(v) dv"0, and :v4H(v) dv(R.
Assumption 6. The bandwidth sequence Ma N is chosen such that n~1@(6`2e{)
n
(a (n~1@(8~eA).
n
Assumption 7. The model is identi"ed in that for almost all x,
G(v (x, h ), h )"G(v (x, h ), h ) implies h "h ,
H H
0 0
H
0
S. Chen / Journal of Econometrics 96 (2000) 183}199
189
where
g (v (x, h), h)#g (!v (x, h), h)
0
G(v (x, h), h)" 1
.
g(v (x, h), h)#g(!v (x, h), h)
Assumption 1 describes the model and the data. The symmetry restriction is
a location normalization. Compared with the usual index restriction in the
literature that f (u D x)"f (u D v (x, h )) (see, e.g. Ichimura, 1993; Klein and Spady,
0
1993), our index restriction is slightly stronger; however, in both cases, the
unknown form of heteroscedasticity allowed is very limited. Assumptions 2}6
are similar to those in Klein and Spady (1993) (see Klein and Spady, 1993 for
detailed discussions); in particular, a higher-order bias reducing kernel is used
here. However, following Klein and Spady (1993), we can also adopt an adaptive
kernel function with local smoothing that has similar bias reduction properties.
Speci"cally, we set the bandwidth to be variable and data dependent. For
sample size n, let lK "(1/na )+n H((v (x , h)!t)/a ) be a preliminary density
j
np i/1
i
np
estimator with bandwidth a . For
0(e@(1, set the bandwidth for observation
np
j to a "a p( (h)¸K , where p( (h) is the sample standard deviation of v (x, h),
nj
n
j
¸K "M[ lK #(1!q( )/m]N
j
j
pj
with
q( "q( lK , ae{ )
pj
j np
and m being the geometric mean of the lK . As pointed out by Klein and Spady
j
(1993), the estimated probabilities using local smoothing are always positive,
whereas those based on higher-order kernels can be negative. Assumption 7 is
an identi"cation condition. Note that G(v (x, h ), h )"P(v (x, h )) under our
0 0
0
index and symmetry restrictions. Klein and Spady (1993) provided two sets of
models for which the slope parameters are identi"ed, allowing both monotonic
and nonmonotonic models. We "rst consider the identi"cation of the slope
parameter. For a "xed h"(a, b@)@, if
G(v (x, h), h)"G(v (x, h ), h )"P(v (x, h ))
0 0
0
for almost all x, then de"ne G (x, b)"G(a#v (x, b), (a, b@)@), which can be
a
1
viewed as a function of v (x, b), then by following the arguments of Klein and
1
Spady (1993) we can show that b"b if the assumptions in Theorems 1 or 2 of
0
Klein and Spady (1993) are satis"ed. Once the slope parameter is identi"ed, we
only need to show that
G(v (x, (a, b@ )@), (a, b@ )@)"G(v (x, h ), h )
0
0 0
0
for almost all x implies a"a . It is straightforward to show that under our
0
index and symmetry restrictions
G(v (x, (a, b@ )@), (a, b@ )@)"g (x, a)P(v (x, h ))
0
w1
0
0
# g (x, a)P(v (x, h #2(a!a ))),
w2
0
0
(2)
190
S. Chen / Journal of Econometrics 96 (2000) 183}199
where
g(v (x, (a, b@ )@), (a, b@ )@)
0
0
g (x, a)"
w1
g(v (x, (a, b@ )@), (a, b@ )@), (a, b@ )@)#g(!v (x, (a, b@ )@), (a, b@ )@)
0
0
0
0
0
g(v (x, h ))
0
"
g(v (x, h )#g(!v (x, h )#2a !2a)
0
0
0
and
g(!v (x, (a, b@ )@), (a, b@ )@)
0
0
g (x, a)"
w2
g(v (x, (a, b@ )@), (a, b@ )@)#g(!v (x, (a, b@ )@), (a, b@ )@)
0
0
0
0
g(!v (x, h )#2a !2a)
0
0
"
g(v (x, h )#g(!v (x, h )#2a !2a)
0
0
0
with g (x, a)#g (x, a)"1. Thus (2) is equivalent to
w1
w2
g (x, a)[P(v (x, h )#2(a!a ))!P(v (x, h ))]"0.
(3)
w2
0
0
0
When the support of v (x, h ) is large enough so that there exists an interval
0
[a , b ], we have g(t)g(!t#2a !2a)'0 for t3[a , b ] and (a, b )3H, then
v v
0
v v
0
identi"cation of a follows if
0
P(t)"P(t#2(a!a ))
(4)
0
for t 3[a , b ], implies a"a , which, in turn, obviously holds for monotonic
v v
0
models. Even many nonmonotonic models, (4) typically ensures the identi"cation of a ; for example, identi"cation of a follows easily from (4) when
0
0
P(t)'0.5 if and only if t'0, and dP(t)/dt changes sign only once for t'0.
Theorem 1. Under Assumptions 1}7, the estimator hK is consistent and asymptotically normal,
D
Jn(hK !h )P N(0, R),
0
where
GC DC
R"E
LG LG
Lh Lh@
1
P(v)(1!P(v))
DH
~1
with v"v (x, h ) and LG/Lh"[LG(v (x, h ), h )]/Lh.
0
0 0
Proof. See the appendix. h
We now turn to the issue of e$ciency. Klein and Spady (1993) have shown
that their semiparametric maximum likelihood estimator attains the asymptotic
e$ciency bound of Chamberlain (1986) and Cosslett (1987) associated with the
S. Chen / Journal of Econometrics 96 (2000) 183}199
191
independence restriction. We will show that by also taking the symmetry
restriction into account, our estimator attains the asymptotic e$ciency bound of
Cosslett (1987) associated with both the independence and symmetry restrictions.
Lemma 2. For the binary choice model under consideration here, the asymptotic
covariance of hK achieves the ezciency bound as specixed in Cosslett (1987) under
the condition that the error is independent of the regressors, and symmetrically
distributed around the origin.
Following the arguments of Lemma 2 of Chen and Lee (1998) and Klein and
Spady (1993), we can show that under the independence and symmetry restrictions
A K BD
C
L
Lv
Lv
v
[E(d D v (x , h)"v (x, h)] D
"f (v)
!E
i
h/h0
Lh
Lh
Lh
and
C
A K BD
Lv
Lv
L
[E(d D v (x , h)"!v (x, h)] D 0 "!f (v)
#E
!v
i
h/h
Lh
Lh
Lh
,
where Lv/Lh"[Lv (x, h )]/Lh. Therefore,
0
Lv
LG
"f (v)
!K(v) ,
Lh
Lh
A
B
where
C C KD
A K BD
Lv
Lv
1
K(v)"
g(v)E
v !g(!v)E
!v
Lh
Lh
g(v)#g(!v)
,
as speci"ed in Cosslett (1987). Therefore,
G
R~1"E
A
BA
f 2(v)
Lv
!K(v)
P(v)(1!P(v)) Lh
BH
Lv
@
!K(v) .
Lh
(5)
Then Lemma 2 follows by comparing (5) and Eq. (3.27) in Cosslett (1987).
Since the estimator by Klein and Spady bK attains the e$ciency bound under
,4
the independence restriction, while our estimator hK "(a( , bK @)@ attains the e$ciency bound under both the independence and symmetry restrictions, it is of
interest to examine whether the additional symmetry restriction leads to any
e$ciency gain in estimating b . From Cosslett (1987), the e$ciency bound for
0
b under the independence restriction is
0
Lv
Lv
Lv
Lv
@
f 2(v)
I "E
!E
!E
v
v
,
1
Lb
Lb
Lb
P(v)(1!P(v)) Lb
G
A
C K DBA
C K DB H
192
S. Chen / Journal of Econometrics 96 (2000) 183}199
while the corresponding e$ciency bound for b under the independence and
0
symmetry restrictions is ((I~1) )~1, where (I~1) is the lower right block of
2 bb
2 bb
I~1, with
2
BA
A
G
Lv
f 2(v)
I "E
!K(v)
2
P(v)(1!P(v)) Lh
BH
Lv
@
!K(v) .
Lh
By some algebraic manipulation, we can show that
TC KD
C KD
TC KD
U
T C KD U
Lv
Lv
v !K(v), E
v !K(v)
((I~1) )~1!I " E
2 bb
1
Lb
Lb
! E
U
Lv
v !K(v), gH SgH, gHT~1
Lb
] gH, E
Lv
v !K(v) ,
Lb
where gH"[2g(!v)]/[ g(v)#g(!v)], and S ) , ) T denotes an inner product
such that
C
D
f 2(v)
g g .
Sg , g T"E
1 2
P(v)(1!P(v)) 1 2
Consequently ((I~1) )~1!I , can be viewed as the corresponding norm of the
2 bb
1
projection residual of E[Lv/Lb D v]!K(v) onto gH, it is thus semi-de"nite, as
expected. When v (x, h) is linear, and x has a multivariate normal distribution, we
have E(x D v)"a#bv and E(x D !v)"a!bv, thus
C KD
E
C C K D C K DD
Lv
Lv
Lv
g(!v)
v !K(v)"
E
v !E
!v
Lb
Lb
Lb
g(v)#g(!v)
"cgH.
Therefore, in this special case ((I~1) )~1"I , and there will be no e$ciency
2 bb
1
gain in imposing the symmetry restriction. In fact, the semiparametric e$ciency
bound for the slope parameter under the independence restriction in this case
agrees with the parametric bound, as noted in Cosslett (1987). In general,
however,
C C K D C K DD
g(!v)
Lv
Lv
E
v !E
!v
g(v)#g(!v)
Lh
Lh
is of full rank, and there is indeed e$ciency gain in estimating the slope
parameter by imposing the additional symmetry restriction.
S. Chen / Journal of Econometrics 96 (2000) 183}199
193
3. A small Monte Carlo study
In this section we perform some Monte Carlo simulations to investigate the
"nite sample performance of our estimator. The data is generated according to
the following model:
d"1Ma #x #b x #u'0N
0
1
0 2
with (a , b )"(0, 1). Four di!erent designs are constructed by varying the
0 0
distributions of the regressors and error term.
Notice that given the estimator for b , bK , of Klein and Spady (1993), a can
0 ,4
0
also be estimated by a( , de"ned by maximizing QK (a, bK , q( ) in (1) with respect to
1c
n
,4
a. Here we consider the "nite sample performance of our estimator, (a( , bK ), the
probit estimator (a( , bK ), a( and bK de"ned above. We examine the e$ciency
1" 1" 1c
,4
loss relative to a correct parametric speci"cation and e$ciency gain in imposing
symmetry and resulting joint estimation.
The results from 300 replications for each design are presented with sample
sizes of 100 and 300. For each estimator under consideration, we report the
mean value (Mean), the standard deviation (SD), and the root mean square error
(RMSE). As in Klein and Spady (1993), we report results for the semiparametric
estimators obtained without probability or likelihood trimming since estimates
obtained are not sensitive to trimming. Also, as in Klein and Spady (1993),
adaptive local smoothing with standard normal kernel is used for the
semiparametric estimators, with local smoothing parameters de"ned without
any trimming. The nonstochastic window width a and the corresponding pilot
n
window component a are set to n~1@7 to satisfy the restriction speci"ed above.
np
Tables 1 and 2 report the results for two homoscedastic designs in which u is
drawn from N(0, 1) independent of (x , x ). In Homoscedastic Design I,
1 2
Table 1
Homoscedastic design I
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
0.000
1.026
!0.003
1.037
!0.005
1.030
0.005
1.000
0.003
1.007
0.003
1.002
0.149
0.238
0.165
0.283
0.160
0.248
0.091
0.131
0.095
0.160
0.095
0.138
0.149
0.239
0.165
0.285
0.160
0.250
0.091
0.131
0.095
0.160
0.095
0.138
300
194
S. Chen / Journal of Econometrics 96 (2000) 183}199
both x and x are drawn from N(0, 1) independent of each other, while x is
1
2
1
still drawn from N(0, 1) and x is drawn independently from a standardized chi2
squared distribution with 3 degrees of freedom with zero mean and unit variance
in Homoscedastic Design II. The probit estimator, as expected, performs best in
both designs, while our estimator is competitive and compared favorably with
bK , even though earlier discussions on e$ciency suggests that both the es,4
timator of Klein and Spady (1993) and our estimator attain the parametric
e$ciency bound when (x , x ) is jointly normal. In both cases, a( is also quite
1 2
c1
competitive compared with our estimator for the intercept.
Tables 3 and 4 report the results for two heteroscedastic designs in which
u"0.7uH(1#0.5v#0.25v2) for v"a #x #b x with uH drawn from
0
1
0 2
Table 2
Homoscedastic design II
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
!0.004
0.995
!0.007
1.017
!0.012
0.977
0.009
1.008
0.004
1.016
0.002
0.998
0.148
0.234
0.169
0.335
0.160
0.240
0.087
0.139
0.094
0.169
0.092
0.140
0.148
0.234
0.169
0.335
0.160
0.241
0.088
0.139
0.094
0.170
0.092
0.140
300
Table 3
Heteroscedastic design I
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
!0.004
1.003
!0.010
1.007
!0.012
1.004
0.009
0.981
0.002
1.002
0.001
0.996
0.251
0.339
0.241
0.308
0.240
0.302
0.164
0.231
0.133
0.217
0.130
0.193
0.251
0.339
0.241
0.308
0.240
0.302
0.164
0.232
0.133
0.217
0.130
0.193
300
S. Chen / Journal of Econometrics 96 (2000) 183}199
195
Table 4
Heteroscedastic design II
n
Estimator
Mean
SD
RMSE
100
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
a(
1"
a(
1"
a(
#
a(
,4
a(
1
a(
2
!0.056
0.910
!0.009
0.987
!0.014
0.978
0.083
0.904
0.000
1.014
0.003
1.010
0.235
0.366
0.223
0.336
0.216
0.311
0.164
0.220
0.136
0.193
0.137
0.190
0.241
0.377
0.223
0.336
0.217
0.312
0.184
0.240
0.136
0.194
0.137
0.190
300
N(0, 1) independent of (x , x ). In Heteroscedastic Design I, (x , x ) is drawn
1 2
1 2
as in Homoscedastic Design I, and in Heteroscedastic Design II, (x , x ) is
1 2
drawn as in Homoscedastic Design II. In Heteroscedastic Design I, the probit
estimator still has small bias despite misspeci"cation, but its bias increases
signi"cantly in Heteroscedastic Design II. In both cases, the probit estimator
has the largest RMSE, whereas all the semiparametric estimators perform well.
4. Conclusions
In this paper we have proposed a semiparametric maximum likelihood
estimator for both the intercept and slope parameters in a binary choice
model under symmetry and index restrictions. The estimator is consistent
and asymptotically normal. Furthermore, it attains the semiparametric e$ciency bound in Cosslett (1987) under the symmetry and independence restrictions.
A small Monte Carlo study is carried out to illustrate the usefulness of
our estimator. Compared with the estimator of Klein and Spady (1993),
which attains the semiparametric e$ciency bound in Chamberlain (1986) and
Cosslett (1987) under the independence restriction, we show that there are
possible e$ciency gains in imposing the additional symmetry restriction. At the
same time, however, our approach depends crucially on the validity of
the symmetry restriction. The symmetry requirement is a testable restriction.
One apparent approach is to use Hausman test by comparing our estimator and
the estimator of Klein and Spady (1993). Alternatively, conditional moment tests
(Newey, 1985) can be used since the conditional moment restriction
E(d D x)"G(v (x, h ), h ) holds almost surely only when the symmetry restriction
0 0
is satis"ed.
196
S. Chen / Journal of Econometrics 96 (2000) 183}199
Recently, Chamberlain (1986) derived the semiparametric e$ciency bound for
the slope parameters for the binary choice sample selection model under the
independence restriction. Various semiparametric estimators have been proposed for the slope parameters. In particular, the estimators by Ai (1997) and
Chen and Lee (1998) attain the e$ciency bound. In view of the results obtained
in the paper, there might be possible e$ciency gains in imposing an additional
symmetry restriction. This is a topic for future research.
Acknowledgements
I would like to thank Bo HonoreH , Roger Klein, Lung-Fei Lee and Jim Powell
and two anonymous referees for their helpful comments. I am also grateful to
Roger Klein for providing Gauss codes.
Appendix. Proof of Theorem 1
Theorem 1 is proved by following Chen and Lee (1998), Klein and Spady
(1993) and Lee (1994). Consistency follows from arguments similar to those in
Klein and Spady (1993) by showing that the semiparametric likelihood function
converges uniformly in h to a function uniquely maximized at h . To prove
0
asymptotic normality, we "rst consider a Taylor expansion to obtain
D
C
L2QK (h`, q( ) ~1 LQK (h , q( )
1
0 ,
Jn(hK !h )" !
Jn
0
Lh Lh@
Lh
(A.1)
where h` lies between hK and h . Similar to Lemma 5 of Klein and Spady (1993),
1
0
we can show that
C
D
L2QK (h`, q( )
1
LG LG
1 "E
!
#o (1).
1
Lh Lh@
Lh Lh@ P(v)(1!P(v))
For the gradient term in (A.1), similar to Lemma 3 of Klein and Spady (1993), we
have the following Taylor expansion for the likelihood trimming term
q( (hK ,eA)"q( (h , eA)#¸ (h , eA)(hK !h )
i 1
i 0
ni 0
1
0
# 1(hK !h )@Q (h`, eA)(hK !h ),
2 1
1
0
0 ni 2
where ¸ (h, eA) and Q (h, eA) represent the "rst- and second-order partial derivani
ni
tive functions and h` lies between hK and h . Thus, we obtain
2
1
0
LQK (h , q( )
d !GK
LGK
1 n
0 "
i
i
Jn
+ q( (hK , eA) i
i
1
G
K
(1!G
K
)
Lh
Lh
Jn i/1
i
i
"S #S (hK !h )#Jn(hK !h )@S (hK !h ),
n1
n2 1
0
1
0 n3 1
0
S. Chen / Journal of Econometrics 96 (2000) 183}199
197
where
d !GK
LGK
1 n
i
i,
+ q( (h , eA) i
S "
i 0
n1 Jn
GK (1!GK ) Lh
i
i
i/1
C
1 n
d !GK
LGK
i
i
+ ¸ (h , eA) i
S "
ni
0
n2
GK (1!GK ) Lh
Jn i/1
i
i
D
and
C
D
d !GK
LGK
1 n
i
i
S " + Q (h`, eA) i
ni 2
n3
GK (1!GK ) Lh
n
i
i
i/1
with GK "GK (v (x , h ), h ) and LGK /Lh"[LGK (v (x , h ), h )]/Lh . First, write
i
n
i 0 0
i
n
i 0 0
S as
n1
S "S #S #S ,
n1
n10
n11
n12
where
d !P(v ) LG(v (x , h , h )
1 n
i
i
i 0 0 ,
+ q (h )
S "
i 0 P(v )(1!P(v ))
n10 Jn
Lh
i
i
i/1
1 n
+ (d !P(v ))
S "
i
i
n11 Jn
i/1
C
q( (h , eA) LGK
LG(v (x , h ,h )
q (h )
i 0
i!
i 0 0
i 0
GK (1!GK ) Lh
Lh
P(v )(1!P(v ))
i
i
i
i
D
and
GK !P(v ) LGK
1 n
i
i
+ q( (h , eA) i
S "
i 0
n12 Jn
GK (1!GK ) Lh
i
i
i/1
with q (h)"q (h)q (h), for
i
1i
0i
q (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
1i
1
i
1
i
and
q (h)"q(g (v (x , h), h), eA)q(g (!v (x , h), h), eA),
0i
0
i
0
i
Following the arguments in Lee (1994, pp. 381}384), which involves Taylor
expansion and repeated applications of Propositions 4 and 6 of Lee (1994),
we can show that S "o (1). For S , by applying Taylor expansion
n11
1
n12
and U-statistic projections as in Lee (1994) and Chen and Lee (1998), we can
198
S. Chen / Journal of Econometrics 96 (2000) 183}199
show that
GK !P(v ) LG(v (x , h ), h )
1 n
i
i
i 0 0 #o (1)
+ q (h )
S "
i 0 P(v )(1!P(v ))
n12 Jn
1
Lh
i
i
i/1
1 n
"
+ q (h )g(v )g(!v )(q(v )!q(!v ))(d !P(v ))#o (1),
i
i
i
i i
i
1
Jn i/1 i 0
where
C A K B A K BD
f (v)
Lv
Lv
q(v)"
E
v #E
!v
F(v)(1!F(v))[ g(v)#g(!v)]
Lh
Lh
.
Note that q(v)"q(!v), thus S "o (1). By similar arguments, we can show
n12
1
that
n~1@3S "o (1)
n2
1
hence, S (hK !h )"o (1). Finally, we turn to S . Following the arguments of
n2 1
0
1
n3
Lemma 5 in Klein and Spady (1993), we can show that
d !GK LGK
1 n
i
i "o (1).
+ n~1@6Q (h`,eA) i
ni
1
GK (1!GK ) Lh
n
i
i
i/1
Therefore, Jn(hK !h )@S (hK !h )"o (1). Combining the above results, we
1
0 n3 1
0
1
have
CC
LG LG
1
Jn(hK !h )" E
0
Lh Lh@ P(v)(1!P(v))
DD
~1
n
d !P(v ) LG(v (x , h ), h )
i
i
i 0 0 #o (1).
+ q (h )
1
Lh
Jn i/1 i 0 P(vi )(1!P(vi ))
1
Then Theorem 1 follows by applying a central limit theorem (Ser#ing, 1980,
p. 32). K
References
Ai, C., 1997. A semiparametric maximum likelihood estimator. Econometrica 65, 933}963.
Ahn, H., Ichimura, H., Powell, J.L., 1996. Simple estimators for monotone index models. Department of Economics, University of California at Berkeley.
Cavanagh, C., Sherman, B., 1998. Rank estimators for monotonic index models. Journal of Econometrics 84, 351}382.
Chamberlain, G., 1986. Asymptotic e$ciency in semiparametric models with censoring. Journal of
Econometrics 32, 189}218.
S. Chen / Journal of Econometrics 96 (2000) 183}199
199
Chen, S., 1998. Rank estimation of a location parameter in the binary choice model. Hong Kong
University of Science and Technology.
Chen, S., 1999a. Semiparametric estimation of a location parameter in the binary choice model.
Econometric Theory, 15, 77}98.
Chen, S., 1999b. Distribution-free estimation of the random coe$cient dummy endogenous variable
model. Journal of Econometrics 91, 171}199.
Chen, S., Lee, L.F., 1998. E$cient semiparametric scoring estimation of sample selection models.
Econometric Theory 14, 423}462.
Cosslett, S.R., 1983. Distribution-free maximum likelihood estimator of the binary choice model.
Econometrica 51, 765}782.
Cosslett, S.R., 1987. E$ciency bounds for distribution-free estimators of the binary choice and the
censored regression models. Econometrica 55, 559}587.
Han, A.K., 1987. Nonparametric analysis of a generalized regression model: the maximum rank
correlation estimator. Journal of Econometrics 35, 303}316.
HaK rdle, W., Stoker, T., 1989. Investigating smooth multiple regression by the method of average
derivative. Journal of the American Statistical Association 84, 986}995.
Horowitz, J.L., 1992. A smooth maximum score estimator for the binary response model. Econometrica 60, 505}531.
Horowitz, J.L., 1993. Semiparametric and nonparametric estimation of quantal response models. In:
Maddala. G.S., Rao., C.R., Vinod, H.D. (Eds.), Handbook of Statistics, Vol. 11.
Horowitz, J.L., HaK rdle, W., 1996. Direct semiparametric estimation of single-index models with
discrete covariates. Journal of the American Statistical Association 91, 1623}1640.
Ichimura, H., 1993. Semiparametric least squares (SLS) and weighted SLS estimation of single-index
models. Journal of Econometrics 58, 71}120.
Klein, R.W., Spady, R.S., 1993. An e$cient semiparametric estimator of the binary response model.
Econometrica 61, 387}421.
Lee, L.F., 1994. Semiparametric instrumental variable estimation of simultaneous equation sample
selection models. Journal of Econometrics 63, 341}388.
Lewbel, A., 1997. Semiparametric estimation of location and other discrete choice moments.
Econometric Theory 13, 32}51.
Manski, C.F., 1985. Semiparametric analysis of discrete response: asymptotic properties of the
maximum score estimator. Journal of Econometrics 27, 313}333.
Newey, W.K., 1985. Maximum likelihood speci"cation testing and conditional moment tests.
Econometrica 53, 1047}1070.
Powell, J.L., 1989. Semiparametric estimation of bivariate latent variable models. Social Research
Institute, University of Wisconsin, Madison.
Powell, J.L., Stock, J.H., Stoker, T.M., 1989. Semiparametric estimation of weighted average
derivatives. Econometrica 57, 1403}1430.
Sherman, R.P., 1993. The limiting distribution of the maximum rank correlation estimator. Econometrica 61, 123}138.
Ser#ing, R.S., 1980. Approximation Theorems of Mathematical Statistics. Chapman & and Hall,
New York.