
continuous and discrete exogenous regressors, and the endogenous regressors enter the model linearly. Then we propose local linear GMM estimates for the functional coefficients.

2.1 Functional Coefficient Representation

We consider the following functional coefficient IV model:
$$
Y_i = g(U_i^c, U_i^d)' X_i + \varepsilon_i = \sum_{j=1}^{d} g_j(U_i^c, U_i^d)\, X_{i,j} + \varepsilon_i, \qquad E(\varepsilon_i \mid Z_i, U_i) = 0 \ \text{a.s.}, \tag{2.1}
$$
where $Y_i$ is a scalar random variable, $g = (g_1, \ldots, g_d)'$, $\{g_j\}_{j=1}^{d}$ are the unknown structural functions of interest, $X_{i,1} = 1$, $X_i = (X_{i,1}, \ldots, X_{i,d})'$ is a $d \times 1$ vector containing $d - 1$ endogenous regressors, $U_i = (U_i^{c\prime}, U_i^{d\prime})'$, where $U_i^c$ and $U_i^d$ denote a $p_c \times 1$ vector of continuous exogenous regressors and a $p_d \times 1$ vector of discrete exogenous regressors, respectively, $Z_i$ is a $q_z \times 1$ vector of IVs, and a.s. abbreviates almost surely. We assume that a random sample $\{Y_i, X_i, Z_i, U_i\}_{i=1}^{n}$ is observed. In the absence of $U_i^d$, (2.1) reduces to the model of CDXW (2006). If none of the variables in $X_i$ are endogenous, the model becomes that of SCU (2009). As the latter authors demonstrated through the estimation of an earnings function, it is important to allow the variables in the functional coefficients to include both continuous and discrete components, where the discrete variables may represent race, profession, region, etc.

2.2 Local Linear GMM Estimation

The orthogonality condition in (2.1) suggests that we can estimate the unknown functional coefficients via the principle of nonparametric generalized method of moments (NPGMM), which is similar to the GMM of Hansen (1982) for parametric models. Let $V_i = (Z_i', U_i')'$. The condition implies that for any $k \times 1$ vector function $Q(V_i)$ we have
$$
E[Q(V_i)\,\varepsilon_i \mid V_i] = E\left[Q(V_i)\left\{Y_i - \sum_{j=1}^{d} g_j(U_i^c, U_i^d)\, X_{i,j}\right\} \,\middle|\, V_i\right] = 0. \tag{2.2}
$$
Following Cai and Li (2008), we propose an estimation procedure that combines the orthogonality condition in (2.2) with the idea of local linear fitting from the nonparametrics literature to estimate the unknown functional coefficients.

Like Racine and Li (2004), we use $U_{i,t}^d$ to denote the $t$th component of $U_i^d$; $U_{i,t}^c$ is defined similarly. Analogously, we let $u_t^d$ and $u_t^c$ denote the $t$th components of $u^d$ and $u^c$, respectively, that is, $u^d = (u_1^d, \ldots, u_{p_d}^d)'$ and $u^c = (u_1^c, \ldots, u_{p_c}^c)'$. We assume that $U_{i,t}^d$ can take $c_t \geq 2$ different values, that is, $U_{i,t}^d \in \{0, 1, \ldots, c_t - 1\}$ for $t = 1, \ldots, p_d$. Let $u = (u^c, u^d) \in \mathbb{R}^{p_c} \times \mathbb{R}^{p_d}$. To define the kernel weight function, we focus on the case for which there is no natural ordering in $U_i^d$. Define
$$
l(U_{i,t}^d, u_t^d, \lambda_t) =
\begin{cases}
1 & \text{if } U_{i,t}^d = u_t^d, \\
\lambda_t & \text{if } U_{i,t}^d \neq u_t^d,
\end{cases} \tag{2.3}
$$
where $\lambda_t$ is a bandwidth that lies in the interval $[0, 1]$. Clearly, when $\lambda_t = 0$, $l(U_{i,t}^d, u_t^d, 0)$ becomes an indicator function, and when $\lambda_t = 1$, $l(U_{i,t}^d, u_t^d, 1)$ becomes a uniform weight function. We define the product kernel for the discrete random variables by
$$
L(U_i^d, u^d, \lambda) = L_\lambda(U_i^d - u^d) = \prod_{t=1}^{p_d} l(U_{i,t}^d, u_t^d, \lambda_t). \tag{2.4}
$$
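For concreteness, the discrete product kernel in (2.3)-(2.4) can be computed as in the following minimal sketch; the function name and array conventions are illustrative assumptions, not part of the original exposition.

```python
import numpy as np

def discrete_product_kernel(Ud, ud, lam):
    """Product kernel (2.4) for unordered discrete regressors.

    Ud  : (n, p_d) array of discrete regressors U_i^d
    ud  : (p_d,) evaluation point u^d
    lam : (p_d,) vector of bandwidths lambda_t, each in [0, 1]
    Returns an (n,) vector of weights L(U_i^d, u^d, lambda).
    """
    Ud, ud, lam = map(np.asarray, (Ud, ud, lam))
    # Eq. (2.3): l equals 1 on a match, lambda_t on a mismatch.
    l = np.where(Ud == ud, 1.0, lam)
    # Eq. (2.4): product over the p_d discrete components.
    return l.prod(axis=1)
```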

For the continuous random variables, we use $w(\cdot)$ to denote a univariate kernel function and define the product kernel function by
$$
W_{h,iu^c} = W_h(U_i^c - u^c) = \prod_{t=1}^{p_c} h_t^{-1}\, w\!\left(\frac{U_{i,t}^c - u_t^c}{h_t}\right),
$$
where $h = (h_1, \ldots, h_{p_c})'$ denotes the $p_c$-vector of smoothing parameters. We then define the kernel weight function $K_{h\lambda,iu}$ by
$$
K_{h\lambda,iu} = W_{h,iu^c}\, L_{\lambda,iu^d}, \tag{2.5}
$$
where $L_{\lambda,iu^d} = L(U_i^d, u^d, \lambda)$.
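The full kernel weight in (2.5) multiplies the continuous and discrete product kernels. A minimal sketch, assuming a Gaussian univariate kernel for $w(\cdot)$ (an illustrative choice; the paper leaves $w$ generic) and reusing `discrete_product_kernel` from the sketch above:

```python
import numpy as np

def gaussian(z):
    """Standard normal density, an illustrative choice for w(.)."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def kernel_weights(Uc, Ud, uc, ud, h, lam):
    """Kernel weights K_{h lambda, iu} of eq. (2.5) for all i.

    Uc : (n, p_c) continuous regressors; uc : (p_c,) evaluation point
    Ud : (n, p_d) discrete regressors;   ud : (p_d,) evaluation point
    h  : (p_c,) continuous bandwidths;   lam : (p_d,) discrete bandwidths
    """
    # W_h(U_i^c - u^c) = prod_t h_t^{-1} w((U_{i,t}^c - u_t^c) / h_t)
    W = (gaussian((Uc - uc) / h) / h).prod(axis=1)
    # K_{h lambda, iu} = W_{h, iu^c} * L_{lambda, iu^d}
    return W * discrete_product_kernel(Ud, ud, lam)
```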

To estimate the unknown functional coefficients in model (2.1) via the local linear regression technique, we assume that $\{g_j(u^c, u^d),\ j = 1, \ldots, d\}$ are twice continuously differentiable with respect to $u^c$. Denote by $\dot{g}_j(u^c, u^d) = \partial g_j(u^c, u^d)/\partial u^c$ the $p_c \times 1$ vector of first-order derivatives of $g_j$ with respect to $u^c$, and by $\ddot{g}_j(u^c, u^d) = \partial^2 g_j(u^c, u^d)/(\partial u^c \partial u^{c\prime})$ the $p_c \times p_c$ matrix of second-order derivatives of $g_j$ with respect to $u^c$. We use $\ddot{g}_{j,ss}(u^c, u^d)$ to denote the $s$th diagonal element of $\ddot{g}_j(u^c, u^d)$. For any given $u^c$ and $U_i^c$ in a neighborhood of $u^c$, it follows from a first-order Taylor expansion of $g_j(U_i^c, u^d)$ around $(u^c, u^d)$ that
$$
\sum_{j=1}^{d} g_j(U_i^c, u^d)\, X_{i,j} \approx \sum_{j=1}^{d} \left\{ g_j(u^c, u^d) + \dot{g}_j(u^c, u^d)'(U_i^c - u^c) \right\} X_{i,j} = \alpha(u)'\, \xi_{i,u}, \tag{2.6}
$$
where $\alpha(u) = (g_1(u), \ldots, g_d(u), \dot{g}_1(u)', \ldots, \dot{g}_d(u)')'$ and $\xi_{i,u} = \big(X_i',\ (X_i \otimes (U_i^c - u^c))'\big)'$ are both $d(1 + p_c) \times 1$ vectors.
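In code, the local regressor vector $\xi_{i,u}$ of (2.6) is simply $X_i$ stacked on the row-wise Kronecker product $X_i \otimes (U_i^c - u^c)$; a sketch under the same illustrative conventions:

```python
import numpy as np

def local_regressors(X, Uc, uc):
    """Rows are xi_{i,u}' = (X_i', (X_i kron (U_i^c - u^c))') of eq. (2.6).

    X  : (n, d) regressors with X[:, 0] = 1
    Uc : (n, p_c) continuous exogenous regressors
    uc : (p_c,) evaluation point
    Returns an (n, d * (1 + p_c)) matrix.
    """
    dev = Uc - uc                                  # (n, p_c) local deviations
    # Row-wise Kronecker product X_i (x) dev_i, shape (n, d * p_c).
    kron = (X[:, :, None] * dev[:, None, :]).reshape(X.shape[0], -1)
    return np.hstack([X, kron])
```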

Motivated by the idea of local linear fitting, for the "global" instrument $Q(V_i)$ we define its associated "local" version as
$$
Q_{h,iu} = \begin{pmatrix} Q(V_i) \\ Q(V_i) \otimes \dfrac{U_i^c - u^c}{h} \end{pmatrix}. \tag{2.7}
$$
Clearly, the dimension of $Q_{h,iu}$ is $k(p_c + 1) \times 1$, as $Q(V_i)$ is a $k \times 1$ vector.
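The local instruments in (2.7) are built in the same way as $\xi_{i,u}$, except that the deviations are scaled by the bandwidths; again a sketch with illustrative names:

```python
import numpy as np

def local_instruments(Q, Uc, uc, h):
    """Rows are Q_{h,iu}' = (Q(V_i)', (Q(V_i) kron (U_i^c - u^c)/h)'), eq. (2.7).

    Q  : (n, k) matrix whose ith row is Q(V_i)'
    Uc : (n, p_c); uc : (p_c,); h : (p_c,) bandwidths
    Returns an (n, k * (1 + p_c)) matrix.
    """
    dev = (Uc - uc) / h                            # bandwidth-scaled deviations
    kron = (Q[:, :, None] * dev[:, None, :]).reshape(Q.shape[0], -1)
    return np.hstack([Q, kron])
```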

In view of the fact that the orthogonality condition in (2.2) continues to hold when we replace $(Q(V_i), V_i)$ by $(Q_{h,iu}, U_i)$, we approximate $E\big[Q_{h,iu}\{Y_i - \sum_{j=1}^{d} g_j(U_i^c, U_i^d) X_{i,j}\} \mid U_i = u\big]$ by its sample analog
$$
\frac{1}{n} \sum_{i=1}^{n} Q_{h,iu} \left\{Y_i - \alpha(u)'\, \xi_{i,u}\right\} K_{h\lambda,iu} = \frac{1}{n}\, Q_h(u)'\, K_{h\lambda}(u)\, [Y - \xi(u)\alpha],
$$
where $Y = (Y_1, \ldots, Y_n)'$, $\xi(u) = (\xi_{1,u}, \ldots, \xi_{n,u})'$, $\alpha = \alpha(u)$, $K_{h\lambda}(u) = \mathrm{diag}(K_{h\lambda,1u}, \ldots, K_{h\lambda,nu})$, and $Q_h(u) = (Q_{h,1u}, \ldots, Q_{h,nu})'$. To obtain estimates of $g_j$ and $\dot{g}_j$, we can choose $\alpha$ to minimize the following local linear GMM criterion function:
$$
\left[\frac{1}{n} Q_h(u)' K_{h\lambda}(u)\, (Y - \xi(u)\alpha)\right]' \Omega_n(u)^{-1} \left[\frac{1}{n} Q_h(u)' K_{h\lambda}(u)\, (Y - \xi(u)\alpha)\right], \tag{2.8}
$$
where $\Omega_n(u)$ is a symmetric $k(p_c + 1) \times k(p_c + 1)$ weight matrix that is positive definite for large $n$. Clearly, the solution to the above minimization problem is given by
$$
\hat{\alpha}(u; h, \lambda) = \left[\xi(u)' K_{h\lambda}(u) Q_h(u)\, \Omega_n(u)^{-1}\, Q_h(u)' K_{h\lambda}(u)\, \xi(u)\right]^{-1} \xi(u)' K_{h\lambda}(u) Q_h(u)\, \Omega_n(u)^{-1}\, Q_h(u)' K_{h\lambda}(u)\, Y. \tag{2.9}
$$
Let $e_{j,d(1+p_c)}$ denote the $d(1 + p_c) \times 1$ unit vector with 1 at the $j$th position and 0 elsewhere, and let $e_{j,p_c,d(1+p_c)}$ denote the $p_c \times d(1 + p_c)$ selection matrix such that $e_{j,p_c,d(1+p_c)}\, \alpha(u) = \dot{g}_j(u)$.

Then the local linear GMM estimator of $g_j(u)$ is given by $\hat{g}_j(u; h, \lambda) = e_{j,d(1+p_c)}'\, \hat{\alpha}(u; h, \lambda)$, and that of $\dot{g}_j(u)$ by $e_{j,p_c,d(1+p_c)}\, \hat{\alpha}(u; h, \lambda)$.
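Putting the pieces together, the closed-form solution (2.9) can be evaluated at a single point $u$ as in the sketch below, which reuses the helper functions sketched earlier and takes $\Omega_n(u) = I$ as one simple admissible weight matrix (the paper requires only symmetry and positive definiteness, so this choice is purely illustrative):

```python
import numpy as np

def local_linear_gmm(Y, X, Q, Uc, Ud, uc, ud, h, lam):
    """Local linear GMM estimate alpha_hat(u; h, lambda) of eq. (2.9).

    The first d entries estimate g_1(u), ..., g_d(u); the remaining
    d * p_c entries stack the derivative estimates.  Identification
    requires k >= d instruments.
    """
    xi = local_regressors(X, Uc, uc)               # (n, d(1+p_c))
    Qh = local_instruments(Q, Uc, uc, h)           # (n, k(1+p_c))
    K = kernel_weights(Uc, Ud, uc, ud, h, lam)     # (n,)

    # Q_h(u)' K_{h lambda}(u) xi(u)  and  Q_h(u)' K_{h lambda}(u) Y
    QKxi = Qh.T @ (K[:, None] * xi)
    QKY = Qh.T @ (K * Y)

    # With Omega_n(u) = I, (2.9) reduces to a least-squares problem in
    # the transformed moments: alpha_hat = (QKxi'QKxi)^{-1} QKxi'QKY.
    return np.linalg.solve(QKxi.T @ QKxi, QKxi.T @ QKY)

# The estimate of g_j(u) is then alpha_hat[j - 1], mirroring the
# selection vector e_{j, d(1+p_c)} applied to alpha_hat(u; h, lambda).
```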