Figure 1. Curve estimates from single replications of the simulation study of Example 5.1. The solid curves are the true functions $\eta(X_{it}^\top\theta)$, the dashed curves are the corresponding estimated functions $\hat\eta(X_{it}^\top\hat\theta)$, and the dots denote $Y_{it} - Z_{it}^\top\hat\beta - \hat\alpha_i$ plotted against $X_{it}^\top\hat\theta$. The online version of this figure is in color.

where $Z_{it} = (Z_{it,1}, Z_{it,2})^\top$ are two-dimensional random vectors, iid over both $i$ and $t$, with independent components that follow a binary distribution with $P(Z_{it,j} = 0) = P(Z_{it,j} = 1) = 0.5$, $j = 1, 2$; the index covariate is the vector $\mathbf{X}_{it} = (X_{it}, X_{i,t-1}, X_{i,t-2})^\top$ (bold distinguishes the vector from its scalar components), in which $X_{it} = 0.4X_{i,t-1} + x_{it}$ and the $x_{it}$ are iid over $i$ and $t$ and uniformly distributed with $x_{it} \sim U(-1, 1)$; the $v_{it}$ are iid over $i$ and $t$ with normal distribution $N(0, 0.5^2)$; $\alpha_i = 0.5Z^*_{iA} + u_i$ for $i = 1, \ldots, n-1$, and $\alpha_n = -\sum_{i=1}^{n-1}\alpha_i$, in which $Z^*_{iA} = \frac{1}{2T}\sum_{t=1}^{T}(Z_{it,1} + Z_{it,2})$ and $u_i \overset{iid}{\sim} N(0, 0.2^2)$. The sequences $\{Z_{it}\}$, $\{x_{it}\}$, $\{u_i\}$, and $\{v_{it}\}$ are mutually independent. The true parameters of model (5.4) are $\beta_0 = (2, 1)^\top/\sqrt{5}$ and $\theta_0 = (2, 1, 2)^\top/3$, and the true link function is $\eta(u) = 2\exp\{-3u^2\}$.

The means as well as the MSEs of the estimates of the parameters over 200 replications are given in Table 2. These results indicate that the SMAVE method estimates the parameters accurately, and that its performance in terms of MSE improves as $n$ or $T$ increases. The estimates of the link function $\eta(\cdot)$ from typical realizations of sample sizes of $n, T = 10, 20, 30$ are given in Figure 2.

Table 2. Means and MSEs of the estimates of the parameters in Example 5.2 (MSE values are in units of $10^{-4}$)

                            T = 10               T = 20               T = 30
 n    True value            Mean     MSE         Mean     MSE         Mean     MSE
 10   β0,1 = 0.8944         0.8901   100.0000    0.8787   45.0000     0.8875   44.0000
      β0,2 = 0.4472         0.4422   105.0000    0.4538   48.0000     0.4484   36.0000
      θ0,1 = 0.6667         0.6683   13.0000     0.6612   5.7443      0.6642   4.7835
      θ0,2 = 0.3333         0.3281   27.0000     0.3400   14.0000     0.3320   9.1121
      θ0,3 = 0.6667         0.6635   15.0000     0.6668   7.6202      0.6684   4.3630
 20   β0,1 = 0.8944         0.9036   57.0000     0.8950   26.0000     0.8897   18.0000
      β0,2 = 0.4472         0.4460   52.0000     0.4499   33.0000     0.4473   19.0000
      θ0,1 = 0.6667         0.6651   6.5923      0.6639   4.8670      0.6662   2.2190
      θ0,2 = 0.3333         0.3299   16.0000     0.3308   9.4963      0.3291   4.4093
      θ0,3 = 0.6667         0.6679   4.8119      0.6693   3.8523      0.6686   2.1373
 30   β0,1 = 0.8944         0.9012   47.0000     0.8940   14.0000     0.8932   11.0000
      β0,2 = 0.4472         0.4505   37.0000     0.4484   17.0000     0.4495   14.0000
      θ0,1 = 0.6667         0.6662   4.9653      0.6647   2.3813      0.6671   1.0029
      θ0,2 = 0.3333         0.3299   14.0000     0.3323   4.7590      0.3316   3.3297
      θ0,3 = 0.6667         0.6669   5.2189      0.6685   0.40812     0.6667   1.0682

Figure 2. Curve estimates from single replications of the simulation study of Example 5.2. The solid curves are the true functions $\eta(X_{it}^\top\theta)$, the dashed curves are the corresponding estimated functions $\hat\eta(X_{it}^\top\hat\theta)$, and the dots denote $Y_{it} - Z_{it}^\top\hat\beta - \hat\alpha_i$ plotted against $X_{it}^\top\hat\theta$. The online version of this figure is in color.

Downloaded by [Universitas Maritim Raja Ali Haji] at 22:12 11 January 2016

5.2 Real Data Examples

5.2.1 U.S. Cigarette Demand. The first real data example concerns cigarette demand in 46 states of the United States over the period 1963–1992. The dataset is from the article by Baltagi, Griffin, and Xiong (2000), who used a linear dynamic panel data model of the form

$$\ln C_{it} = \beta_0 + \beta_1 \ln C_{i,t-1} + \theta_1 \ln DI_{it} + \theta_2 \ln P_{it} + \theta_3 \ln PN_{it} + u_{it} \tag{5.4}$$

to analyze the demand for cigarettes, where $i = 1, \ldots, 46$ denotes the $i$-th state, $t = 1, \ldots, 29$ denotes the $t$-th year, $C_{it}$ is the real per capita sales of cigarettes measured in packs, $DI_{it}$ is the real per capita disposable income, $P_{it}$ is the average retail price of a pack of cigarettes measured in real terms, $PN_{it}$ is the minimum real price of cigarettes in any neighboring state, and the disturbance term $u_{it}$ in model (5.4) is specified as

$$u_{it} = \mu_i + \lambda_t + v_{it}, \tag{5.5}$$

where $\mu_i$ denotes a state-specific effect and $\lambda_t$ denotes a year-specific effect, which can also be interpreted as a trend in time $t$.

Figure 3. From top to bottom: the scatterplots of $Y$ against $V_1$, $V_2$, $V_3$, and $V_4$.
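For reference, the data-generating process of the simulation design in Example 5.2 above can be sketched in code. This is a minimal illustration, not the authors' implementation. It assumes that the response follows the partially linear single-index form $Y_{it} = Z_{it}^\top\beta + \eta(X_{it}^\top\theta) + \alpha_i + v_{it}$ used elsewhere in this section; the function name `simulate_panel`, the seed, and the burn-in length used to initialize the AR(1) recursion are hypothetical choices made for the example.

```python
import numpy as np

def simulate_panel(n, T, seed=0, burn_in=100):
    """Draw one panel from the simulation design (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    beta0 = np.array([2.0, 1.0]) / np.sqrt(5.0)   # true beta, unit norm
    theta0 = np.array([2.0, 1.0, 2.0]) / 3.0      # true theta, unit norm

    def eta(u):
        # True link function eta(u) = 2 exp(-3 u^2)
        return 2.0 * np.exp(-3.0 * u ** 2)

    # Z_it: two independent binary components with P(0) = P(1) = 0.5
    Z = rng.integers(0, 2, size=(n, T, 2)).astype(float)

    # Scalar AR(1) series X_it = 0.4 X_{i,t-1} + x_it with x_it ~ U(-1, 1).
    # Assumption: start at zero and discard a burn-in; keep T + 2 periods
    # so that two lags are available for every retained t.
    total = burn_in + T + 2
    innov = rng.uniform(-1.0, 1.0, size=(n, total))
    s = np.zeros((n, total))
    for t in range(1, total):
        s[:, t] = 0.4 * s[:, t - 1] + innov[:, t]
    s = s[:, burn_in:]                             # shape (n, T + 2)

    # Vector covariate (X_it, X_{i,t-1}, X_{i,t-2})
    X = np.stack([s[:, 2:], s[:, 1:-1], s[:, :-2]], axis=2)

    v = rng.normal(0.0, 0.5, size=(n, T))          # v_it ~ N(0, 0.5^2)
    u = rng.normal(0.0, 0.2, size=n)               # u_i ~ N(0, 0.2^2)

    # Fixed effects correlated with Z, constrained to sum to zero
    z_star = Z.sum(axis=(1, 2)) / (2.0 * T)
    alpha = 0.5 * z_star + u
    alpha[-1] = -alpha[:-1].sum()

    Y = Z @ beta0 + eta(X @ theta0) + alpha[:, None] + v
    return Y, Z, X, alpha
```

Calling `simulate_panel(10, 20)` returns arrays of shapes `(10, 20)`, `(10, 20, 2)`, `(10, 20, 3)`, and `(10,)`. Because $\alpha_i$ depends on the time average of $Z_{it}$, the individual effects are correlated with the regressors, which is the feature that makes this a fixed effects design.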
Due to the presence of the time-specific effect (or trend) $\lambda_t$ in all the variables, we first remove the trend from the log-transformed observations as in the article by Mammen, Støve, and Tjøstheim (2009):

$$Y_{it} = \ln C_{it} - \hat s_{C,t}, \quad V_{1,it} = Y_{i,t-1}, \quad V_{2,it} = \ln DI_{it} - \hat s_{DI,t}, \quad V_{3,it} = \ln P_{it} - \hat s_{P,t}, \quad V_{4,it} = \ln PN_{it} - \hat s_{PN,t},$$

where $\hat s_{C,t}$, $\hat s_{DI,t}$, $\hat s_{P,t}$, and $\hat s_{PN,t}$ are the nonparametric estimates of the trends in $\ln C_{it}$, $\ln DI_{it}$, $\ln P_{it}$, and $\ln PN_{it}$ for $i = 1, \ldots, 46$ and $t = 1, \ldots, 29$. In Figure 3, we give the scatterplots of $Y$ against $V_1$, $V_2$, $V_3$, and $V_4$. It is clear from Figure 3 that $Y$ has a strong linear relationship with $V_1$, that is, the lagged variable of $Y$. For the other three covariates, the linear relationship with $Y$ is not as strong as that for the lagged variable. Hence, we define $Z_{it} = V_{1,it}$ and $X_{it} = (V_{2,it}, V_{3,it}, V_{4,it})^\top$, and put $Z_{it}$ in the linear term and $X_{it}$ in the single-index term of the following model:

$$Y_{it} = Z_{it}\beta + g(X_{it}^\top\theta) + \alpha_i + v_{it}, \tag{5.6}$$

where $\theta = (\theta_1, \theta_2, \theta_3)^\top$ and $\alpha_i$ is a state-specific effect that may include religion, race, tourism, tax, and education. $\alpha_i$ corresponds to $\mu_i$ in models (5.4) and (5.5). Furthermore, as we detrended $\ln C_{it}$, $\ln DI_{it}$, $\ln P_{it}$, and $\ln PN_{it}$, the year-specific term $\lambda_t$ that appeared in models (5.4) and (5.5) is eliminated from model (5.6).

After applying the estimation method proposed in Section 2 to the data on $(Y_{it}, Z_{it}, X_{it})$, we obtain the estimates of the parameters in model (5.6), which are summarized in Table 3. The estimated curve of the link function as well as its 95% confidence band is given in Figure 4.

Figure 4. Estimated link function and its 95% confidence band for the cigarette demand data. Dots denote $Y_{it} - Z_{it}\hat\beta - \hat\alpha_i$ plotted against $X_{it}^\top\hat\theta$. The solid line denotes the estimated link function $\hat\eta(X_{it}^\top\hat\theta)$. The dash-dotted lines represent the 95% confidence band. The online version of this figure is in color.
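The detrending step and the construction of $(Y_{it}, Z_{it}, X_{it})$ described above can be sketched as follows. This is only an illustration under stated assumptions: the trend estimator used here (a Gaussian-kernel smooth of the yearly cross-sectional means) and its bandwidth are stand-ins chosen for the example, not necessarily the estimator of Mammen, Støve, and Tjøstheim (2009), and the function names `kernel_trend` and `build_detrended_panel` are hypothetical.

```python
import numpy as np

def kernel_trend(series, bandwidth=2.0):
    """Smooth the cross-sectional mean over time (assumed trend estimator).

    series: array of shape (n_states, n_years).
    Returns an array of length n_years.
    """
    n_years = series.shape[1]
    t = np.arange(n_years, dtype=float)
    yearly_mean = series.mean(axis=0)
    # Nadaraya-Watson weights with a Gaussian kernel on the year index
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    return (w @ yearly_mean) / w.sum(axis=1)

def build_detrended_panel(ln_c, ln_di, ln_p, ln_pn, bandwidth=2.0):
    """Return (Y, Z, X) with Z the lagged response and X the index covariates."""
    detrended = [a - kernel_trend(a, bandwidth)[None, :]
                 for a in (ln_c, ln_di, ln_p, ln_pn)]
    y_full, v2, v3, v4 = detrended
    Y = y_full[:, 1:]            # response for periods with an available lag
    Z = y_full[:, :-1]           # V1: lagged detrended log-sales
    X = np.stack([v2[:, 1:], v3[:, 1:], v4[:, 1:]], axis=2)
    return Y, Z, X
```

With 30 yearly observations per state, taking one lag leaves 29 usable periods, which matches the range $t = 1, \ldots, 29$ in the text.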
A comparison of the results in Table 3 with those in the article by Baltagi, Griffin, and Xiong (2000) indicates that our estimate of $\beta$ is smaller than their estimate of the corresponding coefficient, for which a value of 0.90 from OLS and a value of 0.91 from GLS were obtained. In addition, compared with $\theta = (0.2112, -0.9404, 0.2665)^\top$ from OLS and $\theta = (0.1602, -0.9503, 0.2669)^\top$ from GLS in the article by Baltagi, Griffin, and Xiong (2000), the absolute value of our estimate of $\theta_2$ is smaller, while those of $\theta_1$ and $\theta_3$ are larger (note that, due to the identification condition $\|\theta\| = 1$, one has to normalize the estimates of $\theta$ in model (5.4) before making comparisons). The computed coefficient of determination for model (5.6) is $R^2 = 0.9698$, which indicates a good fit to the data.

Table 3. Estimates of the parameters in the cigarette data example

 Parameter   Variable                                        Estimate   SD
 β           Detr. log-sales in previous year                0.8480     0.0073
 θ1          Detr. log-disposable per capita income          0.2594     0.0217
 θ2          Detr. log-price per pack                        −0.8735    0.0099
 θ3          Detr. log-min price in neighboring states       0.4119     0.0260

5.2.2 Foreign Direct Investment and Economic Growth. There is a vast literature on the effect of foreign direct investment (FDI) on the economic growth of the host country. Some recent research (Kottaridi and Stengos 2010) found that the relationship between FDI and economic growth is nonlinear. In this article, we focus our study on 22 OECD countries over the period 1970–2000. According to data availability, the following countries are selected: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Japan, Republic of Korea, Mexico, Netherlands, Norway, Portugal, Spain, Sweden, Turkey, the United Kingdom, and the United States. As is conventional in the literature, we use 5-year averages to reduce the impact of year-to-year fluctuations in output.

We use the partially linear single-index structure to model the effects on growth of FDI, human capital (measured by average years of schooling), population growth, and domestic investment (measured by gross fixed capital formation, GFCF). This relaxes the restrictions on the form of the effects while retaining much of the ease of interpretation of linear models. More specifically, we use the following model specification:

$$Y_{it} = Z_{it}^\top\beta + \eta(X_{it}^\top\theta) + \alpha_i + v_{it}, \quad 1 \le i \le 21, \ 1 \le t \le 7, \tag{5.7}$$

where $i$ denotes country $i$, $t$ denotes the $t$-th 5-year period, $Y_{it}$ denotes GDP per capita growth, $Z_{it} = (Z_{it,1}, Z_{it,2})^\top$ with $Z_{it,1}$ denoting the log of GFCF in percentages of GDP and $Z_{it,2}$ denoting the log of GDP per capita at the beginning of the $t$-th 5-year period, and $X_{it} = (X_{it,1}, X_{it,2}, X_{it,3})^\top$ with $X_{it,1}$ representing log-FDI in percentages of GDP, $X_{it,2}$ representing average years of schooling, and $X_{it,3}$ representing population growth.

The data on FDI are obtained from the United Nations Conference on Trade and Development (UNCTAD). Data on the other variables are obtained from the World Development Indicators (WDI) of the World Bank, and GDP per capita, FDI, and gross fixed capital formation are all measured in constant 2000 U.S. dollars. Kottaridi and Stengos (2010) applied a partially linear model to the same data, in which a general nonparametric term $g(X_{it})$ is used in place of $\eta(X_{it}^\top\theta)$. They found that FDI inflows and human capital have nonlinear effects on growth in the OECD countries. As $X_{it}$ is three-dimensional, the use of the single-index term $\eta(X_{it}^\top\theta)$ here avoids the "curse of dimensionality" that would arise from the sparsity of the data available for the estimation of $g(\cdot)$ if $g(X_{it})$ were used. Hence, the use of model (5.7) leads to more reliable estimates of the effects of the factors on economic growth. The estimated parameters and their standard deviations are given in Table 4, and the estimated link function together with its 95% confidence band is shown in Figure 5. The confidence band is obtained by the plug-in method. We have also used the wild bootstrap method to calculate the confidence band, and the result is similar.

Table 4. Estimates of the parameters in the FDI data example

 Parameter   Variable             Estimate   SD
 β1          Log-GFCF             4.6605     0.5235
 β2          Log-initial GDP      −1.2880    0.1319
 θ1          Log-FDI              0.6905     0.0280
 θ2          Years schooling      0.3409     0.0195
 θ3          Population growth    0.6380     0.0403

Figure 5. Estimated link function and its 95% confidence band for the FDI data. The solid line denotes the estimated link function $\hat\eta(X_{it}^\top\hat\theta)$. The dash-dotted lines represent the 95% confidence band. The online version of this figure is in color.

The results indicate that GDP growth has a positive relationship with domestic investment, whereas it is negatively related to the initial per capita income. Moreover, FDI, human capital, and population growth have overall positive relationships with GDP growth, except that when the linear combination of these three variables is between 2.3 and 3.2, the relationships are reversed, as can be seen from the trough at around 3.2 in the plot of the estimated link function.

6. CONCLUSIONS AND DISCUSSION

This article has considered a partially linear single-index panel data model with fixed effects. A SMAVE method associated with a dummy variable approach has been proposed to deal with the estimation of both the parametric and nonparametric components of the model.
We have shown that the proposed estimators are all asymptotically normally distributed regardless of whether the effects involved are random or fixed. We have then assessed the finite sample performance of the proposed estimation method using both simulated and real data examples.

In this article, we focus on the case where both $n$ and $T$ are large and establish the asymptotic theory for the case of $n, T \to \infty$ simultaneously. It is quite straightforward to extend our methodology and theory to the case where $n$ is small but $T$ is large, the analysis of which is similar to the time series case. It is also possible to extend our methodology to the case where $n$ is large but $T$ is small. However, for this latter case, the asymptotic theory would be more complicated for the fixed effects model, and the asymptotic variance would depend on $T$.

In terms of asymptotic theory, another important question is whether our estimators in Section 2 could achieve a semiparametric efficiency bound in the panel data setting, as most of the existing semiparametric efficiency results are established under the cross-section framework (see, e.g., Bickel et al. 1993; Carroll et al. 1997). We will address this semiparametric efficiency issue in our future research, as different techniques would be involved.

Another interesting topic is to allow for individual effects in the parameters of both the linear and single-index components of our model. For example, we can consider individual-specific coefficients of the form $\beta + \beta_i$ and $\theta + \theta_i$, where $(\beta_i^\top, \theta_i^\top)^\top$, $1 \le i \le n$, are iid and follow a multivariate normal distribution with zero mean. Under such assumptions, our model is quite similar to the semiparametric mixed effects models discussed by Wu and Zhang (2006). However, to construct consistent estimators of $\beta$, $\theta$, and $\eta(\cdot)$, the estimation methodology in Section 2 would need to be modified substantially.
Hence, we will consider this issue in our future study.

APPENDIX A: ASSUMPTIONS

Let $Z_i = (Z_{it}: 1 \le t \le T)$, $X_i = (X_{it}: 1 \le t \le T)$, and $V_i = (v_{it}: 1 \le t \le T)$. To derive the consistency of the initial estimates of $\beta$ and $\theta$, we need the following set of regularity conditions.

A1. $(Z_i, X_i, V_i)$, $i = 1, \ldots, n$, are iid and $\{(Z_{it}, X_{it}, v_{it}): t \ge 1\}$ is a stationary $\alpha$-mixing sequence with mixing coefficient $\alpha_i(t)$ for each $i$. Furthermore, there exists a positive coefficient function $\alpha(t)$ such that $\sup_i \alpha_i(t) \le \alpha(t)$ with $\alpha(t) \le C_\alpha t^{-\gamma}$, where $C_\alpha > 0$ and $\gamma > \frac{(2+\delta^*)(2+\delta)}{2(\delta - \delta^*)}$, in which $\delta$ is chosen such that $E\|X_{it}\|^{2+\delta} < \infty$ and $\delta^* < \delta$ is chosen such that $E[|v_{it}|^{2+\delta^*}] < \infty$; $\|\cdot\|$ is the $L_2$-distance.

A2. The kernel function $H(\cdot): \mathbb{R}^p \to \mathbb{R}^+$ is a bounded and Lipschitz continuous probability density function with a compact support. Furthermore, $H(x)$ is symmetric and $\int xx^\top H(x)\,dx$ is positive definite.

A3. The density function $f_X(\cdot)$ of $X_{it}$ is second-order continuous and has gradient $f'_X(\cdot)$. Moreover, $f_X(\cdot)$ is positive over its compact support $\mathcal{X}^*$.

A4. Let $g_1(x) := E[Z_{it} \mid X_{it} = x]$ and $g_2(x) := E[Z_{it}Z_{it}^\top \mid X_{it} = x]$. Both $g_1(x)$ and $g_2(x)$ have bounded and continuous derivatives. In addition, $E\|Z_{it}\|^{2+\delta} < \infty$ and $E\{[Z_{it} - E(Z_{it} \mid X_{it})][Z_{it} - E(Z_{it} \mid X_{it})]^\top\}$ is a positive definite matrix, where $\delta$ is the same as defined in A1.

A5. Additionally, suppose that $\{v_{it}\}$ is independent of $\{(Z_{it}, X_{it})\}$ with $E[v_{it}] = 0$ and $0 < \sigma^2 := E[v_{it}^2] < \infty$.

A6. The link function $\eta(\cdot)$ has continuous derivatives of up to the second order.

A7. The bandwidth $h_1$ involved in the multivariate weights satisfies

$$h_1 \to 0, \qquad \frac{\log T}{T h_1^{p+2}} = O(1), \qquad \frac{(nT)^{2\gamma - 4p - 3}\, h_1^{2p\gamma + 4p^2 + 9p + 2}}{\log^{2\gamma - 4p + 1}(nT)} \to \infty,$$

where $p$ is the dimension of $X_{it}$, and $\gamma$ and $\delta$ are the same as defined in A1.

To establish the asymptotic distribution for the final parametric estimators $\hat\beta$ and $\hat\theta$, we further need the following set of regularity conditions.
B1. The kernel function $K(\cdot): \mathbb{R} \to \mathbb{R}^+$ is a bounded and symmetric probability density function with a compact support. Furthermore, $K(\cdot)$ is differentiable and has a continuous derivative.

B2. The density function $f_\theta(\cdot)$ of $X_{it}^\top\theta$ is positive and second-order continuous with respect to $\theta$ in a neighborhood of $\theta_0$. Moreover, $f_\theta(\cdot)$ is positive over its compact support $\mathcal{X}^*_\theta$.

B3. The conditional expectation $g_3(u) := E[Z_{it} \mid X_{it}^\top\theta = u]$ has a bounded and continuous derivative with respect to $\theta$ in a neighborhood of $\theta_0$.

B4. The bandwidth $h_2$ involved in the single-index weights satisfies $\lim_{n,T\to\infty} nTh_2^5 < \infty$. Furthermore, there exists a relationship between $n$ and $T$,

$$\frac{T^{\delta^*\delta + 2\delta + 32p + 16\delta p}\, \log^{5(2+\delta)(2+\delta^*) - 2p}(nT)}{n^{4\delta\delta^* + 10\delta^* - 2\delta - 32p - 16\delta p}} = o(1).$$

In A1, we assume that $(Z_i, X_i, V_i)$, $1 \le i \le n$, are cross-sectionally independent (see, e.g., Su and Ullah 2006; Sun, Carroll, and Li 2009) and that each time series is $\alpha$-mixing dependent, which can be satisfied by many linear and nonlinear time series (see such models discussed in Section 4). Assumption A2 involves some mild conditions on the multivariate kernel function $H(\cdot)$. A3 and A4 are similar to the corresponding conditions by Xia and Härdle (2006). Since the $\alpha_i$ are allowed to be correlated with $(X_{it}, Z_{it})$, $u_{it} = \alpha_i + v_{it}$ may be correlated with $(X_{it}, Z_{it})$ even though the $v_{it}$ are independent of $(X_{it}, Z_{it})$. Assumption A4 is needed to ensure that both $(\beta, \theta)$ and $\eta(\cdot)$ are identifiable and estimable. Meanwhile, the independence between $\{(Z_{it}, X_{it})\}$ and $\{v_{it}\}$ in A5 is imposed to simplify our proofs, and it can be removed at the expense of more tedious proofs. A6 is a common condition for local linear estimation (see, e.g., Fan and Gijbels 1996; Fan and Yao 2003).

We next show that the bandwidth restrictions in A7 are satisfied under mild conditions if we take $h_1 \sim (nT)^{-\vartheta}$, $0 < \vartheta < 1/(p+2)$. It is easy to check that $h_1 \sim (nT)^{-\vartheta} = o(1)$, and the second condition in A7 is also satisfied when $n = O\big(T^{\frac{1}{\vartheta(p+2)} - 1} \log^{-\frac{1}{\vartheta(p+2)}} T\big)$.
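Under $h_1 \sim (nT)^{-\vartheta}$, the third rate condition in A7 reduces, up to logarithmic factors, to requiring that the exponent $(2\gamma - 4p - 3) - (2p\gamma + 4p^2 + 9p + 2)\vartheta$ of $nT$ be positive. The following sketch evaluates that exponent and the implied lower bound on $\gamma$ with exact rational arithmetic. The function names and the example values $p = 3$, $\vartheta = 1/10$ are illustrative assumptions only, and the additional lower bound on $\gamma$ imposed in A1 is not checked here.

```python
from fractions import Fraction

def gamma_lower_bound(p, vartheta):
    """Smallest gamma (exclusive) making the exponent of nT positive."""
    vartheta = Fraction(vartheta)
    if not 0 < vartheta < Fraction(1, p + 2):
        raise ValueError("vartheta must lie in (0, 1/(p+2))")
    # Positive because vartheta < 1/(p+2) implies 2*p*vartheta < 2
    denom = 2 - 2 * p * vartheta
    return ((4 * p**2 + 9 * p + 2) * vartheta + 4 * p + 3) / denom

def nT_exponent(p, vartheta, gamma):
    """Exponent of nT in the third condition of A7 after substituting h_1."""
    vartheta = Fraction(vartheta)
    gamma = Fraction(gamma)
    exp_nT = 2 * gamma - 4 * p - 3                   # power of nT in A7
    exp_h = 2 * p * gamma + 4 * p**2 + 9 * p + 2     # power of h_1 in A7
    return exp_nT - exp_h * vartheta

bound = gamma_lower_bound(3, Fraction(1, 10))
print(bound)                                   # 215/14
print(nT_exponent(3, Fraction(1, 10), 16) > 0) # True: gamma above the bound
print(nT_exponent(3, Fraction(1, 10), 15) > 0) # False: gamma below the bound
```

Exact fractions are used so that the comparison at the boundary is not affected by floating-point rounding.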
If we let $p_1 = 2\gamma - 4p - 3$, $p_2 = 2p\gamma + 4p^2 + 9p + 2$, and $p_3 = 2\gamma - 4p + 1$, the left-hand side of the last term in A7 becomes

$$\frac{(nT)^{p_1} h_1^{p_2}}{\log^{p_3}(nT)} = \frac{(nT)^{p_1 - p_2\vartheta}}{\log^{p_3}(nT)},$$

which tends to $\infty$ when $p_1 > p_2\vartheta$. As $\vartheta < 1/(p+2)$, we have $2 - 2p\vartheta > 0$. By some elementary calculation, it is easy to show that if

$$\gamma > \frac{(4p^2 + 9p + 2)\vartheta}{2 - 2p\vartheta} + \frac{4p + 3}{2 - 2p\vartheta},$$

then $p_1 > p_2\vartheta$ and thus the third condition in A7 holds.

Assumptions B1–B3 are natural extensions of conditions C2, C4, and C5 in the article by Xia and Härdle (2006). The rate of the bandwidth $h_2$ in B4 is optimal for pooled local linear estimators. In particular, if $\delta > \delta^* \gg p$, we can show that the second condition in B4 could include two cases: (i) the time series length $T$ is larger than the cross-sectional dimension $n$, and (ii) the cross-sectional dimension $n$ is larger than the time series length $T$.

APPENDIX B: PROOF OF THEOREM 3.1

Define $a_x = \eta(x^\top\theta)$, $a_{it} = \eta(X_{it}^\top\theta)$, $b_x = \eta'(x^\top\theta)$, and $b_{it} = \eta'(X_{it}^\top\theta)$. Consider the local linear estimators of $a_x$, $a_{it}$, $b_x$, and $b_{it}$ obtained from Equation (2.6) using the set of multivariate weights in Equation (2.8). Let e

x,∗