Figure 1. Curve estimates from single replications of the simulation study of Example 5.1. The solid curves are the true functions $\eta(X_{it}^\top\theta_0)$, the dashed curves are the corresponding estimated functions $\hat\eta(X_{it}^\top\hat\theta)$, and the dots denote $Y_{it} - Z_{it}^\top\hat\beta - \hat\alpha_i$ plotted against $X_{it}^\top\hat\theta$. The online version of this figure is in color.
where $Z_{it} = (Z_{it,1}, Z_{it,2})^\top$ are two-dimensional random vectors, iid over both $i$ and $t$, with independent components that have a binary distribution with $P(Z_{it,j} = 0) = P(Z_{it,j} = 1) = 0.5$, $j = 1, 2$; $X_{it} = (X_{it}, X_{i,t-1}, X_{i,t-2})^\top$, in which $X_{it} = 0.4X_{i,t-1} + x_{it}$ and $x_{it}$ are iid over $i$ and $t$ and uniformly distributed with $x_{it} \sim U(-1, 1)$; $v_{it}$ are iid over $i$ and $t$ with normal distribution $N(0, 0.5^2)$; $\alpha_i = 0.5Z^*_{iA} + u_i$ for $i = 1, \ldots, n-1$, and $\alpha_n = -\sum_{i=1}^{n-1}\alpha_i$, in which $Z^*_{iA} = \frac{1}{2T}\sum_{t=1}^{T}(Z_{it,1} + Z_{it,2})$ and $u_i \overset{iid}{\sim} N(0, 0.2^2)$. $\{Z_{it}\}$, $\{x_{it}\}$, $\{u_i\}$, and $\{v_{it}\}$ are mutually independent. The true parameters of model (5.4) are $\beta_0 = (2, 1)^\top/\sqrt{5}$ and $\theta_0 = (2, 1, 2)^\top/3$, and the true link function is $\eta(u) = 2\exp\{-3u^2\}$. The means as well as the MSEs of the estimates of the parameters over 200 replications are given in Table 2. These results indicate that the SMAVE method estimates the parameters accurately, and its performance in terms of MSE improves as $n$ or $T$ increases. The estimates of the link function $\eta(\cdot)$ from typical realizations of sample sizes of $n, T = 10, 20, 30$ are given in Figure 2.
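The data-generating process described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' simulation code: it assumes the response follows the same additive form as model (5.7), $Y_{it} = Z_{it}^\top\beta + \eta(X_{it}^\top\theta) + \alpha_i + v_{it}$, and it starts the AR(1) recursion at zero with a burn-in period, which the text does not specify. The function name `simulate_panel` and the `burn_in` argument are hypothetical.

```python
import numpy as np

def simulate_panel(n, T, seed=0, burn_in=50):
    """Draw one replication of the simulated panel (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    beta = np.array([2.0, 1.0]) / np.sqrt(5.0)   # true beta, unit norm
    theta = np.array([2.0, 1.0, 2.0]) / 3.0      # true theta, unit norm

    # Z_it: two independent binary components with P(0) = P(1) = 0.5
    Z = rng.integers(0, 2, size=(n, T, 2)).astype(float)

    # Scalar AR(1) series X_it = 0.4 X_{i,t-1} + x_it with x_it ~ U(-1, 1).
    # Assumption: zero start plus burn-in; two extra periods supply the lags.
    total = T + 2 + burn_in
    x = rng.uniform(-1.0, 1.0, size=(n, total))
    s = np.zeros((n, total))
    for t in range(1, total):
        s[:, t] = 0.4 * s[:, t - 1] + x[:, t]
    s = s[:, burn_in:]                            # shape (n, T + 2)
    # Regressor vector (X_it, X_{i,t-1}, X_{i,t-2})
    X = np.stack([s[:, 2:], s[:, 1:-1], s[:, :-2]], axis=2)

    # Fixed effects correlated with Z, constrained to sum to zero
    z_bar = Z.sum(axis=(1, 2)) / (2.0 * T)
    alpha = 0.5 * z_bar + rng.normal(0.0, 0.2, size=n)
    alpha[-1] = -alpha[:-1].sum()

    v = rng.normal(0.0, 0.5, size=(n, T))         # idiosyncratic errors
    index = X @ theta
    Y = Z @ beta + 2.0 * np.exp(-3.0 * index ** 2) + alpha[:, None] + v
    return Y, Z, X, alpha
```

Because the last fixed effect is defined as minus the sum of the others, the returned `alpha` sums to zero, which is the identification constraint used in the design.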
5.2 Real Data Examples
5.2.1 U.S. Cigarette Demand. The first real data example is about the cigarette demand in 46 states of the United States over the period 1963–1992. The dataset is from the article by Baltagi, Griffin, and Xiong (2000), who used a linear dynamic panel data model of the form

$$\ln C_{it} = \beta_0 + \beta_1 \ln C_{i,t-1} + \theta_1 \ln DI_{it} + \theta_2 \ln P_{it} + \theta_3 \ln PN_{it} + u_{it} \qquad (5.4)$$

to analyze the demand for cigarettes, where $i = 1, \ldots, 46$ denotes the $i$-th state, $t = 1, \ldots, 29$ denotes the $t$-th year, $C_{it}$ is the real per capita sales of cigarettes measured in packs, $DI_{it}$ is the real per capita disposable income, $P_{it}$ is the average retail price of a pack of cigarettes measured in real terms, $PN_{it}$ is the minimum real price of cigarettes in any neighboring state, and the disturbance term $u_{it}$ in model (5.4) is specified as

$$u_{it} = \mu_i + \lambda_t + v_{it}, \qquad (5.5)$$
Downloaded by [Universitas Maritim Raja Ali Haji] at 22:12 11 January 2016
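Table 2. Means and MSEs of the estimates of the parameters in Example 5.2

| $n$ | True value | $T = 10$: Mean | $T = 10$: MSE $\times 10^{-4}$ | $T = 20$: Mean | $T = 20$: MSE $\times 10^{-4}$ | $T = 30$: Mean | $T = 30$: MSE $\times 10^{-4}$ |
|---|---|---|---|---|---|---|---|
| 10 | $\beta_{0,1} = 0.8944$ | 0.8901 | 100.0000 | 0.8787 | 45.0000 | 0.8875 | 44.0000 |
| 10 | $\beta_{0,2} = 0.4472$ | 0.4422 | 105.0000 | 0.4538 | 48.0000 | 0.4484 | 36.0000 |
| 10 | $\theta_{0,1} = 0.6667$ | 0.6683 | 13.0000 | 0.6612 | 5.7443 | 0.6642 | 4.7835 |
| 10 | $\theta_{0,2} = 0.3333$ | 0.3281 | 27.0000 | 0.3400 | 14.0000 | 0.3320 | 9.1121 |
| 10 | $\theta_{0,3} = 0.6667$ | 0.6635 | 15.0000 | 0.6668 | 7.6202 | 0.6684 | 4.3630 |
| 20 | $\beta_{0,1} = 0.8944$ | 0.9036 | 57.0000 | 0.8950 | 26.0000 | 0.8897 | 18.0000 |
| 20 | $\beta_{0,2} = 0.4472$ | 0.4460 | 52.0000 | 0.4499 | 33.0000 | 0.4473 | 19.0000 |
| 20 | $\theta_{0,1} = 0.6667$ | 0.6651 | 6.5923 | 0.6639 | 4.8670 | 0.6662 | 2.2190 |
| 20 | $\theta_{0,2} = 0.3333$ | 0.3299 | 16.0000 | 0.3308 | 9.4963 | 0.3291 | 4.4093 |
| 20 | $\theta_{0,3} = 0.6667$ | 0.6679 | 4.8119 | 0.6693 | 3.8523 | 0.6686 | 2.1373 |
| 30 | $\beta_{0,1} = 0.8944$ | 0.9012 | 47.0000 | 0.8940 | 14.0000 | 0.8932 | 11.0000 |
| 30 | $\beta_{0,2} = 0.4472$ | 0.4505 | 37.0000 | 0.4484 | 17.0000 | 0.4495 | 14.0000 |
| 30 | $\theta_{0,1} = 0.6667$ | 0.6662 | 4.9653 | 0.6647 | 2.3813 | 0.6671 | 1.0029 |
| 30 | $\theta_{0,2} = 0.3333$ | 0.3299 | 14.0000 | 0.3323 | 4.7590 | 0.3316 | 3.3297 |
| 30 | $\theta_{0,3} = 0.6667$ | 0.6669 | 5.2189 | 0.6685 | 0.40812 | 0.6667 | 1.0682 |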
Figure 2. Curve estimates from single replications of the simulation study of Example 5.2. The solid curves are the true functions $\eta(X_{it}^\top\theta_0)$, the dashed curves are the corresponding estimated functions $\hat\eta(X_{it}^\top\hat\theta)$, and the dots denote $Y_{it} - Z_{it}^\top\hat\beta - \hat\alpha_i$ plotted against $X_{it}^\top\hat\theta$. The online version of this figure is in color.
Figure 3. From top to bottom: the scatterplots of $Y$ against $V_1$, $V_2$, $V_3$, and $V_4$.
where $\mu_i$ denotes a state-specific effect and $\lambda_t$ denotes a year-specific effect, which can also be interpreted as a trend in time $t$.
Due to the presence of the time-specific effect or trend $\lambda_t$ in all the variables, we first remove the trend from the log-transformed observations as in the article by Mammen, Støve, and Tjøstheim (2009): $Y_{it} = \ln C_{it} - s_C(t)$, $V_{1,it} = Y_{i,t-1}$, $V_{2,it} = \ln DI_{it} - s_{DI}(t)$, $V_{3,it} = \ln P_{it} - s_P(t)$, and $V_{4,it} = \ln PN_{it} - s_{PN}(t)$, where $s_C(t)$, $s_{DI}(t)$, $s_P(t)$, and $s_{PN}(t)$ are the nonparametric estimates of the trends in $\ln C_{it}$, $\ln DI_{it}$, $\ln P_{it}$, and $\ln PN_{it}$ for $i = 1, \ldots, 46$ and $t = 1, \ldots, 29$. In Figure 3, we give the scatterplots of $Y$ against $V_1$, $V_2$, $V_3$, and $V_4$. It is clear from Figure 3 that $Y$ exhibits strong linearity with $V_1$ (i.e., the lagged variable of $Y$). For the other three covariates, the linear relationships with $Y$ are not as strong as that for the lagged variable. Hence, we define $Z_{it} = V_{1,it}$ and $X_{it} = (V_{2,it}, V_{3,it}, V_{4,it})^\top$, and put $Z_{it}$ in the linear term and $X_{it}$ in the single-index term of the following model:

$$Y_{it} = Z_{it}\beta + \eta(X_{it}^\top\theta) + \alpha_i + v_{it}, \qquad (5.6)$$
where $\theta = (\theta_1, \theta_2, \theta_3)^\top$ and $\alpha_i$ is a state-specific effect that may include religion, race, tourism, tax, and education; $\alpha_i$ corresponds to $\mu_i$ in models (5.4) and (5.5). Furthermore, as we detrended $\ln C_{it}$, $\ln DI_{it}$, $\ln P_{it}$, and $\ln PN_{it}$, the year-specific term $\lambda_t$ that appeared in models (5.4) and (5.5) is eliminated from model (5.6).

Figure 4. Estimated link function and its 95% confidence band for the cigarette demand data. Dots denote $Y_{it} - Z_{it}\hat\beta - \hat\alpha_i$ plotted against $X_{it}^\top\hat\theta$. The solid line denotes the estimated link function $\hat\eta(X_{it}^\top\hat\theta)$. The dash-dotted lines represent the 95% confidence band. The online version of this figure is in color.
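The detrending and variable construction for the cigarette data can be illustrated as follows. This is a rough sketch under stated assumptions rather than the procedure actually used: the text only says the trends are estimated nonparametrically, so here the trend is taken to be a Gaussian-kernel Nadaraya-Watson smooth (over $t$) of the cross-sectional mean, with an arbitrary bandwidth. The function names `nw_trend` and `build_variables` are hypothetical, and each input is assumed to be an array of shape (states, years).

```python
import numpy as np

def nw_trend(series, bandwidth=2.0):
    """Kernel-smoothed time trend of the cross-sectional mean.

    series has shape (n_states, n_years). Assumption: Gaussian kernel,
    Nadaraya-Watson smoother; the source does not specify the smoother.
    """
    n_years = series.shape[1]
    t = np.arange(n_years, dtype=float)
    mean_t = series.mean(axis=0)
    weights = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    return (weights @ mean_t) / weights.sum(axis=1)

def build_variables(ln_c, ln_di, ln_p, ln_pn, bandwidth=2.0):
    """Detrend the four log series and form (Y, Z, X) as in model (5.6)."""
    detrended = [a - nw_trend(a, bandwidth)[None, :]
                 for a in (ln_c, ln_di, ln_p, ln_pn)]
    y_full, v2, v3, v4 = detrended
    Y = y_full[:, 1:]          # response, first year lost to the lag
    Z = y_full[:, :-1]         # V1: lagged detrended log-sales
    X = np.stack([v2[:, 1:], v3[:, 1:], v4[:, 1:]], axis=2)
    return Y, Z, X
```

With 30 years of raw data per state, the lag leaves 29 usable periods, which matches the range $t = 1, \ldots, 29$ in the text.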
After applying the estimation method proposed in Section 2 to the data on $(Y_{it}, Z_{it}, X_{it})$, we obtain the estimates of the parameters in model (5.6), which are summarized in Table 3. The estimated curve of the link function as well as its 95% confidence band is given in Figure 4. A comparison of the results in Table 3 with those in the article by Baltagi, Griffin, and Xiong (2000) indicates that our estimate of $\beta$ is smaller than the estimate of the corresponding coefficient by Baltagi, Griffin, and Xiong (2000), where a value of 0.90 from the OLS and a value of 0.91 from the GLS were obtained. In addition, compared with $\theta = (0.2112, -0.9404, 0.2665)^\top$ from the OLS and $\theta = (0.1602, -0.9503, 0.2669)^\top$ from the GLS in the article by Baltagi, Griffin, and Xiong (2000), the absolute value of our estimate of $\theta_2$ is smaller, while those of $\theta_1$ and $\theta_3$ are larger (note that due to the identification condition $\|\theta\| = 1$, one has to normalize the estimates of $\theta$ in model (5.4) before making comparisons). The computed coefficient of determination for model (5.6) is $R^2 = 0.9698$, which indicates a good fit to the data.
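The normalization step mentioned in the comparison can be made concrete with a short check. The sketch below rescales a coefficient vector to unit Euclidean norm and confirms that the three vectors quoted in the text and in Table 3 already satisfy $\|\theta\| = 1$ up to rounding. The helper name `normalize_index` and the dictionary labels are illustrative only.

```python
import numpy as np

def normalize_index(theta):
    """Rescale a coefficient vector to unit Euclidean norm."""
    theta = np.asarray(theta, dtype=float)
    return theta / np.linalg.norm(theta)

# Vectors as reported in the text (already normalized, four decimals)
reported = {
    "OLS, Baltagi et al. (2000)": [0.2112, -0.9404, 0.2665],
    "GLS, Baltagi et al. (2000)": [0.1602, -0.9503, 0.2669],
    "Table 3 estimate": [0.2594, -0.8735, 0.4119],
}
norms = {name: float(np.linalg.norm(v)) for name, v in reported.items()}
for name, value in norms.items():
    print(f"{name}: norm = {value:.4f}")
```

Only after this rescaling are the components of $\theta$ from a linear model and from the single-index model on a common scale.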
5.2.2 Foreign Direct Investment and Economic Growth. There has been a vast literature on the effect of FDI on the economic growth of the hosting country. Some recent research (Kottaridi and Stengos 2010) found that the relationship between FDI and economic growth is nonlinear. In this article, we focus our study on 22 OECD countries over the period 1970–2000. According to data availability, the following countries are
Table 3. Estimates of the parameters in the cigarette data example

| Parameter | Variable | Estimate | SD |
|---|---|---|---|
| $\beta$ | Detr. log-sales in previous year | 0.8480 | 0.0073 |
| $\theta_1$ | Detr. log-disposable per capita income | 0.2594 | 0.0217 |
| $\theta_2$ | Detr. log-price per pack | –0.8735 | 0.0099 |
| $\theta_3$ | Detr. log-min price in neighboring states | 0.4119 | 0.0260 |
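Table 4. Estimates of the parameters in the FDI data example

| Parameter | Variable | Estimate | SD |
|---|---|---|---|
| $\beta_1$ | Log-GFCF | 4.6605 | 0.5235 |
| $\beta_2$ | Log-initial GDP | −1.2880 | 0.1319 |
| $\theta_1$ | Log-FDI | 0.6905 | 0.0280 |
| $\theta_2$ | Years schooling | 0.3409 | 0.0195 |
| $\theta_3$ | Population growth | 0.6380 | 0.0403 |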
selected: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Japan, Republic of Korea, Mexico, Netherlands, Norway, Portugal, Spain, Sweden, Turkey, the United Kingdom, and the United States. As is conventional in the literature, we use 5-year averages to reduce the impact of year-to-year fluctuations in output. We use the partially linear single-index structure to model the effects on growth of FDI, human capital (measured by average years of schooling), population growth, and domestic investment (measured by gross fixed capital formation, GFCF). This relaxes the restrictions on the form of effects while at the same time retaining much of the ease of interpretation of linear models. More specifically, we use the following model specification:
$$Y_{it} = Z_{it}^\top\beta + \eta(X_{it}^\top\theta) + \alpha_i + v_{it}, \quad 1 \le i \le 21, \ 1 \le t \le 7, \qquad (5.7)$$

where $i$ denotes country $i$, $t$ denotes the $t$-th 5-year period, $Y_{it}$ denotes GDP per capita growth, $Z_{it} = (Z_{it,1}, Z_{it,2})^\top$ with $Z_{it,1}$ denoting log of GFCF in percentages of GDP and $Z_{it,2}$ denoting log of GDP per capita at the beginning of the $t$-th 5-year period, and $X_{it} = (X_{it,1}, X_{it,2}, X_{it,3})^\top$ with $X_{it,1}$ representing log-FDI in percentages of GDP, $X_{it,2}$ representing average years of schooling, and $X_{it,3}$ representing population growth. The data on FDI are obtained from the United Nations Conference on Trade and Development (UNCTAD). Data on the other variables are obtained from the World Development Indicators (WDI) of the World Bank, and GDP per capita, FDI, and gross fixed capital formation are all measured in constant 2000 U.S. dollars. Kottaridi and Stengos (2010) applied a partially linear model to the same data, in which a general nonparametric term $g(X_{it})$ is used in place of $\eta(X_{it}^\top\theta)$. They found that FDI inflows and human capital have nonlinear effects on growth in the OECD countries. As $X_{it}$ is three-dimensional, the use of the single-index term $\eta(X_{it}^\top\theta)$ here avoids the "curse of dimensionality" that would arise due to the sparsity of data available for the estimation of $g(\cdot)$ if $g(X_{it})$ were used. Hence, the use of model (5.7) leads to more reliable estimates of the effects of the factors on economic growth. The estimated parameters and their standard deviations are given in Table 4, and the estimated link function together with its 95% confidence band is shown in Figure 5. The confidence band is obtained by the plug-in method. We have also used the wild bootstrap method to calculate the confidence band, and the result is similar.

Figure 5. Estimated link function and its 95% confidence band for the FDI data. The solid line denotes the estimated link function $\hat\eta(X_{it}^\top\hat\theta)$. The dash-dotted lines represent the 95% confidence band. The online version of this figure is in color.
The results indicate that GDP growth has a positive relationship with domestic investment whereas it is negatively related to the initial per capita income. Moreover, FDI, human capital, and population growth have overall positive relationships with GDP growth except that when the linear combination of these three variables is between 2.3 and 3.2, the relationships are reversed, as can be seen from the trough at around 3.2 in the plot of the estimated link function.
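For readers unfamiliar with the wild bootstrap mentioned above, the following sketch shows one common way to build a pointwise band around a local linear fit. It is a generic illustration and not the procedure used in this article: it assumes the index values `u` (playing the role of $X_{it}^\top\hat\theta$) and partial residuals `r` (playing the role of $Y_{it} - Z_{it}^\top\hat\beta - \hat\alpha_i$) are already available, uses an Epanechnikov kernel, Rademacher multipliers, and percentile limits. All function and argument names are hypothetical.

```python
import numpy as np

def local_linear(u, r, grid, h):
    """Local linear fit of r on u, evaluated at each grid point."""
    fit = np.empty(len(grid))
    for k, g in enumerate(grid):
        d = u - g
        w = np.maximum(0.0, 0.75 * (1.0 - (d / h) ** 2))  # Epanechnikov
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * r).sum(), (w * d * r).sum()
        # Intercept of the weighted least-squares line through (d, r)
        fit[k] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
    return fit

def wild_bootstrap_band(u, r, grid, h, n_boot=200, level=0.95, seed=0):
    """Pointwise percentile band from a Rademacher wild bootstrap."""
    rng = np.random.default_rng(seed)
    fit_grid = local_linear(u, r, grid, h)
    fit_obs = local_linear(u, r, u, h)
    resid = r - fit_obs
    boots = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        signs = rng.choice([-1.0, 1.0], size=len(r))
        # Resample by flipping residual signs, then refit
        boots[b] = local_linear(u, fit_obs + resid * signs, grid, h)
    lower, upper = np.quantile(
        boots, [(1.0 - level) / 2.0, (1.0 + level) / 2.0], axis=0)
    return fit_grid, lower, upper
```

Flipping the sign of each residual, rather than resampling residuals across observations, preserves any heteroscedasticity in the errors, which is the usual motivation for the wild bootstrap.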
6. CONCLUSIONS AND DISCUSSION
This article has considered a partially linear single-index panel data model with fixed effects. A SMAVE method associated with a dummy variable approach has been proposed to deal with the estimation of both the parametric and nonparametric components of the model. We have shown that the proposed estimators are all asymptotically normally distributed regardless of whether the effects involved are random or fixed. We have then assessed the finite sample performance of the proposed estimation method using both simulated and real data examples.

In this article, we focus on the case where both $n$ and $T$ are very large and establish the asymptotic theory for the case of $n, T \to \infty$ simultaneously. It is quite straightforward to extend our methodology and theory to the case where $n$ is small but $T$ is large, for which the analysis is similar to the time series case. It is also possible to extend our methodology to the case where $n$ is large but $T$ is small. However, for this latter case, the asymptotic theory would be more complicated for the fixed effects model, and the asymptotic variance would depend on $T$. In terms of asymptotic theory, another important question is whether our estimators in Section 2 could achieve a semiparametric efficiency bound in the panel data setting, as most of the existing semiparametric efficiency results are established under the cross-section framework (see, e.g., Bickel et al. 1993; Carroll et al. 1997). We will address this semiparametric efficiency issue in our future research as different techniques would be involved.
Another interesting topic is to allow the existence of individual effects for the parameters in both the linear and single-index components of our model. For example, we can consider $\beta = \beta_0 + \beta_i$ and $\theta = \theta_0 + \theta_i$, where $(\beta_i^\top, \theta_i^\top)^\top$, $1 \le i \le n$, are iid and follow a multivariate normal distribution with zero mean. Under such assumptions, our model is quite similar to the semiparametric mixed effects models discussed by Wu and Zhang (2006). However, to construct consistent estimators of $\beta_0$, $\theta_0$, and $\eta(\cdot)$, the estimation methodology in Section 2 would need to be modified substantially. Hence, we will consider this issue in our future study.
APPENDIX A: ASSUMPTIONS
Let $Z_i = (Z_{it} : 1 \le t \le T)$, $X_i = (X_{it} : 1 \le t \le T)$, and $V_i = (v_{it} : 1 \le t \le T)$. To derive the consistency of the initial estimates of $\beta$ and $\theta$, we need the following set of regularity conditions.
A1. $(Z_i, X_i, V_i)$, $i = 1, \ldots, n$, are iid and $\{(Z_{it}, X_{it}, v_{it}) : t \ge 1\}$ is a stationary $\alpha$-mixing sequence with mixing coefficient $\alpha_i(t)$ for each $i$. Furthermore, there exists a positive coefficient function $\alpha(t)$ such that $\sup_i \alpha_i(t) \le \alpha(t)$ with $\alpha(t) \le C_\alpha t^{-\gamma}$, where $C_\alpha > 0$ and $\gamma > \frac{(2+\delta^*)(2+\delta)}{2(\delta - \delta^*)}$, in which $\delta$ is chosen such that $E\|X_{it}\|^{2+\delta} < \infty$ and $\delta^* < \delta$ is chosen such that $E[|v_{it}|^{2+\delta^*}] < \infty$; $\|\cdot\|$ is the $L_2$-distance.
A2. The kernel function $H(\cdot): R^p \to R^+$ is a bounded and Lipschitz continuous probability density function with a compact support. Furthermore, $H(x)$ is symmetric and $\int x x^\top H(x)\,dx$ is positive definite.

A3. The density function $f_X(\cdot)$ of $X_{it}$ is second-order continuous and has gradient $f'_X(\cdot)$. Moreover, $f_X(\cdot)$ is positive over its compact support $\mathcal{X}^*$.

A4. Let $g_1(x) := E[Z_{it} \mid X_{it} = x]$ and $g_2(x) := E[Z_{it}Z_{it}^\top \mid X_{it} = x]$. Both $g_1(x)$ and $g_2(x)$ have bounded and continuous derivatives. In addition, $E\|Z_{it}\|^{2+\delta} < \infty$ and $E\{[Z_{it} - E(Z_{it} \mid X_{it})][Z_{it} - E(Z_{it} \mid X_{it})]^\top\}$ is a positive definite matrix, where $\delta$ is the same as defined in A1.

A5. Additionally, suppose that $\{v_{it}\}$ is independent of $\{(Z_{it}, X_{it})\}$ with $E[v_{it}] = 0$ and $0 < \sigma^2 := E[v_{it}^2] < \infty$.

A6. The link function $\eta(\cdot)$ has continuous derivatives of up to the second order.

A7. The bandwidth $h_1$ involved in the multivariate weights satisfies

$$h_1 \to 0, \qquad \frac{\log T}{T h_1^{p+2}} = O(1), \qquad \frac{(nT)^{2\gamma - 4p - 3}\, h_1^{2p\gamma + 4p^2 + 9p + 2}}{\log^{2\gamma - 4p + 1}(nT)} \to \infty,$$

where $p$ is the dimension of $X_{it}$, and $\gamma$ and $\delta$ are the same as defined in A1.
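As a concrete instance of a kernel meeting the requirements of A2, one may take a product of univariate Epanechnikov kernels on $[-1, 1]^p$. This choice is an illustration only and is not taken from the article. The sketch below checks numerically, with a simple midpoint rule, that the kernel integrates to one and that $\int x x^\top H(x)\,dx$ is close to $\frac{1}{5}I_p$, which is positive definite. The function names are hypothetical.

```python
import numpy as np

def product_epanechnikov(x):
    """H(x) = prod_j 0.75 * (1 - x_j^2) on [-1, 1]^p, zero elsewhere."""
    x = np.atleast_2d(x)
    return np.prod(np.maximum(0.0, 0.75 * (1.0 - x ** 2)), axis=1)

def kernel_moments(p, m=41):
    """Midpoint-rule approximations of the mass and second-moment matrix."""
    edges = np.linspace(-1.0, 1.0, m + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    mesh = np.meshgrid(*([mid] * p), indexing="ij")
    pts = np.stack(mesh, axis=-1).reshape(-1, p)
    w = product_epanechnikov(pts) * (2.0 / m) ** p   # kernel times cell volume
    mass = w.sum()
    second = (pts * w[:, None]).T @ pts              # approximates int x x' H(x) dx
    return mass, second

mass, second = kernel_moments(p=2)
print(mass)
print(second)
```

The kernel is bounded, symmetric, compactly supported, and Lipschitz continuous, so it is one admissible choice for the multivariate weights.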
To establish the asymptotic distribution of the final parametric estimators of $\beta$ and $\theta$, we further need the following set of regularity conditions.

B1. The kernel function $K(\cdot): R \to R^+$ is a bounded and symmetric probability density function with a compact support. Furthermore, $K(\cdot)$ is differentiable and has a continuous derivative.

B2. The density function $f_\theta(\cdot)$ of $X_{it}^\top\theta$ is positive and second-order continuous with respect to $\theta$ in a neighborhood of $\theta_0$. Moreover, $f_{\theta_0}(\cdot)$ is positive over its compact support $\mathcal{X}^*_{\theta_0}$.

B3. The conditional expectation $g_3(u) := E[Z_{it} \mid X_{it}^\top\theta = u]$ has a bounded and continuous derivative with respect to $\theta$ in a neighborhood of $\theta_0$.
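B4. The bandwidth $h_2$ involved in the single-index weights satisfies

$$\lim_{n,T \to \infty} nT h_2^5 < \infty.$$

Furthermore, there exists a relationship between $n$ and $T$:

$$\frac{T^{\delta^*\delta + 2\delta + 32p + 16\delta p}\, \log^{5(2+\delta)(2+\delta^*) - 2p}(nT)}{n^{4\delta\delta^* + 10\delta^* - 2\delta - 32p - 16\delta p}} = o(1).$$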
In A1, we assume that $(Z_i, X_i, V_i)$, $1 \le i \le n$, are cross-sectionally independent (see, e.g., Su and Ullah 2006; Sun, Carroll, and Li 2009) and each time series is $\alpha$-mixing dependent, which can be satisfied by many linear and nonlinear time series (see such models discussed in Section 4). Assumption A2 involves some mild conditions on the multivariate kernel function $H(\cdot)$. A3 and A4 are similar to the corresponding conditions by Xia and Härdle (2006). Since $\alpha_i$ are allowed to be correlated with $(X_{it}, Z_{it})$, $u_{it} = \alpha_i + v_{it}$ thus may be correlated with $(X_{it}, Z_{it})$ even though $v_{it}$ are independent of $(X_{it}, Z_{it})$. Assumption A4 is needed to ensure that both $(\beta_0, \theta_0)$ and $\eta(\cdot)$ are identifiable and estimable. Meanwhile, the independence between $\{(Z_{it}, X_{it})\}$ and $\{v_{it}\}$ in A5 is imposed to simplify our proofs and it can be removed at the expense of more tedious proofs. A6 is a common condition for local linear estimation (see, e.g., Fan and Gijbels 1996; Fan and Yao 2003). We next
show that the bandwidth restrictions in A7 are satisfied under mild conditions if we take $h_1 \sim (nT)^{-\vartheta}$, $0 < \vartheta < 1/(p+2)$. It is easy to check that $h_1 \sim (nT)^{-\vartheta} = o(1)$ and the second condition in A7 is also satisfied when

$$n = O\!\left(\frac{T^{\frac{1}{\vartheta(p+2)} - 1}}{\log^{\frac{1}{\vartheta(p+2)}} T}\right).$$

If we let $p_1 = 2\gamma - 4p - 3$, $p_2 = 2p\gamma + 4p^2 + 9p + 2$, and $p_3 = 2\gamma - 4p + 1$, the left-hand side of the last term in A7 becomes

$$\frac{(nT)^{p_1} h_1^{p_2}}{\log^{p_3}(nT)} = \frac{(nT)^{p_1 - p_2\vartheta}}{\log^{p_3}(nT)},$$

which tends to $\infty$ when $p_1 > p_2\vartheta$. As $\vartheta < 1/(p+2)$, $2 - 2p\vartheta > 0$. By some elementary calculation, it is easy to show that if

$$\gamma > \frac{(4p^2 + 9p + 2)\vartheta}{2 - 2p\vartheta} + \frac{4p + 3}{2 - 2p\vartheta},$$

then $p_1 > p_2\vartheta$ and thus the third condition in A7 holds.
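The elementary calculation behind this bound is that $p_1 > p_2\vartheta$ rearranges to $\gamma(2 - 2p\vartheta) > (4p^2 + 9p + 2)\vartheta + 4p + 3$. The short sketch below evaluates the resulting lower bound on $\gamma$ and the exponent $p_1 - p_2\vartheta$ for given $p$ and $\vartheta$, so the condition can be checked numerically. The function names are hypothetical and the values of $p$ and $\vartheta$ in the example are arbitrary.

```python
def gamma_lower_bound(p, vartheta):
    """Smallest gamma (exclusive) for which p1 > p2 * vartheta holds."""
    if not 0.0 < vartheta < 1.0 / (p + 2):
        raise ValueError("vartheta must lie in (0, 1/(p+2))")
    denom = 2.0 - 2.0 * p * vartheta          # positive on this range
    return ((4 * p ** 2 + 9 * p + 2) * vartheta + 4 * p + 3) / denom

def exponent_margin(gamma, p, vartheta):
    """Exponent p1 - p2 * vartheta of (nT) in the third condition of A7."""
    p1 = 2 * gamma - 4 * p - 3
    p2 = 2 * p * gamma + 4 * p ** 2 + 9 * p + 2
    return p1 - p2 * vartheta

# Example with p = 3 regressors and vartheta = 0.1 < 1/5
bound = gamma_lower_bound(3, 0.1)
print(bound)                                   # threshold for gamma
print(exponent_margin(bound + 1.0, 3, 0.1))    # positive above the threshold
```

Any mixing rate $\gamma$ strictly above the returned threshold gives a positive exponent, so the polynomial factor dominates the logarithmic one.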
Assumptions B1–B3 are natural extensions of conditions C2, C4, and C5 in the article by Xia and Härdle (2006). The rate of the bandwidth $h_2$ in B4 is optimal for pooled local linear estimators. In particular, if $\delta > \delta^* \gg p$, we can show that the second condition in B4 could include two cases: (i) the time series length $T$ is larger than the cross-sectional dimension $n$, and (ii) the cross-sectional dimension $n$ is larger than the time series length $T$.

APPENDIX B: PROOF OF THEOREM 3.1
Define $a_x = \eta(x^\top\theta_0)$, $a_{it} = \eta(X_{it}^\top\theta_0)$, $b_x = \eta'(x^\top\theta_0)$, and $b_{it} = \eta'(X_{it}^\top\theta_0)$. Let $\hat a_x$, $\hat a_{it}$, $\hat b_x$, and $\hat b_{it}$ be the local linear estimators obtained from Equation (2.6) using the set of multivariate weights in Equation (2.8)
. Let e
x,∗