
continuous and discrete exogenous regressors and the endogenous regressors enter the model linearly. Then we propose local linear GMM estimates for the functional coefficients.

2.1 Functional Coefficient Representation

We consider the following functional coefficient IV model:

$$Y_i = g(U_i^c, U_i^d)' X_i + \varepsilon_i = \sum_{j=1}^{d} g_j(U_i^c, U_i^d) X_{i,j} + \varepsilon_i, \qquad E(\varepsilon_i \mid Z_i, U_i) = 0 \ \text{a.s.}, \quad (2.1)$$

where $Y_i$ is a scalar random variable, $g = (g_1, \ldots, g_d)'$, $\{g_j\}_{j=1}^{d}$ are the unknown structural functions of interest, $X_{i,1} = 1$, $X_i = (X_{i,1}, \ldots, X_{i,d})'$ is a $d \times 1$ vector consisting of $d - 1$ endogenous regressors, $U_i = (U_i^{c\prime}, U_i^{d\prime})'$, $U_i^c$ and $U_i^d$ denote a $p_c \times 1$ vector of continuous exogenous regressors and a $p_d \times 1$ vector of discrete exogenous regressors, respectively, $Z_i$ is a $q_z \times 1$ vector of IVs, and a.s. abbreviates almost surely. We assume that a random sample $\{Y_i, X_i, Z_i, U_i\}_{i=1}^{n}$ is observed.

In the absence of $U_i^d$, (2.1) reduces to the model of CDXW (2006). If none of the variables in $X_i$ are endogenous, the model becomes that of SCU (2009). As the latter authors demonstrated through the estimation of an earnings function, it is important to allow the variables in the functional coefficients to include both continuous and discrete variables, where the discrete variables may represent race, profession, region, etc.

2.2 Local Linear GMM Estimation

The orthogonality condition in (2.1) suggests that we can estimate the unknown functional coefficients via the principle of nonparametric generalized method of moments (NPGMM), which is similar to the GMM of Hansen (1982) for parametric models. Let $V_i = (Z_i', U_i')'$. The orthogonality condition indicates that for any $k \times 1$ vector function $Q(V_i)$, we have

$$E[Q(V_i)\,\varepsilon_i \mid V_i] = E\Bigl[Q(V_i)\Bigl\{Y_i - \sum_{j=1}^{d} g_j(U_i^c, U_i^d) X_{i,j}\Bigr\} \,\Big|\, V_i\Bigr] = 0. \quad (2.2)$$

Following Cai and Li (2008), we propose an estimation procedure that combines the orthogonality condition in (2.2) with the idea of local linear fitting in the nonparametrics literature to estimate the unknown functional coefficients.

Like Racine and Li (2004), we use $U_{i,t}^d$ to denote the $t$th component of $U_i^d$; $U_{i,t}^c$ is similarly defined. Analogously, we let $u_t^d$ and $u_t^c$ denote the $t$th components of $u^d$ and $u^c$, respectively, that is, $u^d = (u_1^d, \ldots, u_{p_d}^d)'$ and $u^c = (u_1^c, \ldots, u_{p_c}^c)'$. We assume that $U_{i,t}^d$ can take $c_t \geq 2$ different values, that is, $U_{i,t}^d \in \{0, 1, \ldots, c_t - 1\}$ for $t = 1, \ldots, p_d$. Let $u = (u^c, u^d) \in \mathbb{R}^{p_c} \times \mathbb{R}^{p_d}$.

To define the kernel weight function, we focus on the case in which there is no natural ordering in $U_i^d$. Define

$$l(U_{i,t}^d, u_t^d, \lambda_t) = \begin{cases} 1 & \text{if } U_{i,t}^d = u_t^d, \\ \lambda_t & \text{if } U_{i,t}^d \neq u_t^d, \end{cases} \quad (2.3)$$

where $\lambda_t$ is a bandwidth that lies in the interval $[0, 1]$. Clearly, when $\lambda_t = 0$, $l(U_{i,t}^d, u_t^d, 0)$ becomes an indicator function, and when $\lambda_t = 1$, $l(U_{i,t}^d, u_t^d, 1)$ becomes a uniform weight function.
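The discrete kernel in (2.3) is straightforward to implement. A minimal NumPy sketch follows; the function name and the vectorization over observations are our own, not part of the paper:

```python
import numpy as np

def discrete_kernel(U_d, u_d, lam):
    """Unordered discrete kernel l(U_d, u_d, lambda) from (2.3):
    weight 1 for components equal to u_d, weight lambda otherwise."""
    U_d = np.asarray(U_d)
    return np.where(U_d == u_d, 1.0, lam)

vals = np.array([0, 1, 2, 1])
# lambda = 0 reduces to an indicator function:
print(discrete_kernel(vals, 1, 0.0))  # -> [0. 1. 0. 1.]
# lambda = 1 reduces to a uniform weight function:
print(discrete_kernel(vals, 1, 1.0))  # -> [1. 1. 1. 1.]
# intermediate lambda downweights non-matching cells:
print(discrete_kernel(vals, 1, 0.3))  # -> [0.3 1.  0.3 1. ]
```

The two boundary cases of $\lambda_t$ reproduce exactly the indicator and uniform weights noted in the text.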
We define the product kernel for the discrete random variables by

$$L(U_i^d, u^d, \lambda) = L_\lambda(U_i^d - u^d) = \prod_{t=1}^{p_d} l(U_{i,t}^d, u_t^d, \lambda_t). \quad (2.4)$$

For the continuous random variables, we use $w(\cdot)$ to denote a univariate kernel function and define the product kernel function by $W_{h,iu}$
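A sketch of the two product kernels in NumPy. The discrete part implements (2.4) directly; for the continuous part $W_{h,iu}$ we assume the standard Racine–Li product form $\prod_{t=1}^{p_c} h_t^{-1} w\bigl((U_{i,t}^c - u_t^c)/h_t\bigr)$ with a Gaussian $w(\cdot)$, which is an illustrative choice, not the paper's prescription:

```python
import numpy as np

def L_discrete(U_d, u_d, lam):
    """Product kernel (2.4): prod over t of l(U_d[i,t], u_d[t], lam[t]),
    returning one weight per observation i."""
    U_d, u_d, lam = np.atleast_2d(U_d), np.asarray(u_d), np.asarray(lam)
    return np.prod(np.where(U_d == u_d, 1.0, lam), axis=1)

def W_continuous(U_c, u_c, h):
    """Continuous product kernel, assuming a Gaussian univariate w:
    prod over t of w((U_c[i,t] - u_c[t]) / h[t]) / h[t]."""
    U_c, u_c, h = np.atleast_2d(U_c), np.asarray(u_c), np.asarray(h)
    z = (U_c - u_c) / h
    return np.prod(np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * h), axis=1)

# Combined kernel weight for n = 2 observations at u = (u_c, u_d):
U_c = np.array([[0.1, 0.5], [1.0, -0.2]])  # p_c = 2 continuous regressors
U_d = np.array([[0, 1], [1, 1]])           # p_d = 2 discrete regressors
weights = W_continuous(U_c, [0.0, 0.0], [1.0, 1.0]) \
          * L_discrete(U_d, [0, 1], [0.5, 0.5])
```

The combined weight multiplies the continuous and discrete kernels observation by observation; an observation whose discrete cells all match $u^d$ keeps its full continuous weight, while each mismatched cell scales it down by the corresponding $\lambda_t$.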