$j = 1, \dots, r$ and $i = 1, \dots, N$. Because $r$ is unknown in practice, it is estimated using the BIC-type criterion of Bai and Ng (2002),
$$
\hat r = \arg\min_{0 \le r \le r_{\max}} \left[ \log \hat\sigma_r^2 + r\,\frac{N+T}{NT}\,\log\!\left(\frac{NT}{N+T}\right) \right], \tag{A.5}
$$
where $\hat\sigma_r^2 = (NT)^{-1} \sum_{t=2}^{T} \sum_{i=1}^{N} \hat e_{i,t}^2$, to give an estimate of $r$. Bai and Ng (2002) proved the consistency of $\hat r$ when $r \le r_{\max}$ and $N, T \to \infty$.
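To make the selection rule concrete, the criterion in (A.5) can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the paper: the function name `select_num_factors`, the search grid starting at $r = 0$, and the use of an eigendecomposition of the second-moment matrix of the differenced data are assumptions.

```python
import numpy as np

def select_num_factors(dZ, r_max):
    """Minimize the (A.5)-type criterion
        log(sigma2_r) + r * ((N+T)/(N*T)) * log(N*T/(N+T))
    over r = 0, ..., r_max.  dZ is the (T-1) x N matrix of differenced
    data; all names here are illustrative.
    """
    Tm1, N = dZ.shape
    T = Tm1 + 1
    penalty = (N + T) / (N * T) * np.log(N * T / (N + T))
    # Eigenvectors of dZ'dZ, sorted so the largest eigenvalues come first
    eigval, eigvec = np.linalg.eigh(dZ.T @ dZ)
    order = np.argsort(eigval)[::-1]
    ic = []
    for r in range(r_max + 1):
        W = eigvec[:, order[:r]]             # N x r eigenvector matrix
        dF = dZ @ W                          # estimated (differenced) factors
        E = dZ - dF @ W.T                    # idiosyncratic residuals
        sigma2 = (E ** 2).sum() / (N * T)    # (NT)^{-1} sum of squares, as in (A.5)
        ic.append(np.log(sigma2) + r * penalty)
    return int(np.argmin(ic))
```

With strong factors the minimizer recovers the true $r$; Bai and Ng (2002) give the consistency argument.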
In the case of general deterministics we proceed as follows. Suppose that $x_{i,t}$ is an $m_i \times 1$ vector of regressors. Then model (9) can be written as
$$
\underset{NT \times 1}{y} \;=\; \underset{NT \times m}{X}\;\underset{m \times 1}{\beta} \;+\; \underset{NT \times 1}{z}, \tag{A.6}
$$
where $y = (y_1', \dots, y_N')'$, $X = \operatorname{diag}(X_1, \dots, X_N)$, $\beta = (\beta_1', \dots, \beta_N')'$, $z = (z_1', \dots, z_N')'$, and $m = \sum_{i=1}^{N} m_i$, with $y_i = (y_{i,1}, \dots, y_{i,T})'$, $X_i = (x_{i,1}, \dots, x_{i,T})'$, and $z_i = (z_{i,1}, \dots, z_{i,T})'$. The factor model (10) can be written as
$$
\underset{T \times N}{Z} \;=\; \underset{T \times r}{F}\,\underset{r \times N}{\Lambda} \;+\; \underset{T \times N}{E}, \tag{A.7}
$$
where $Z = (z_1, \dots, z_N)$, $F = (f_1, \dots, f_r)$, $\Lambda = (\lambda_1, \dots, \lambda_N)$, and $E = (e_1, \dots, e_N)$, with $e_i = (e_{i,1}, \dots, e_{i,T})'$ for each $i = 1, \dots, N$.

Estimation of the factor model begins by first-differencing (A.6) and writing
$$
\underset{N(T-1) \times 1}{\Delta y} \;=\; \underset{N(T-1) \times m_C}{\Delta X C}\;\underset{m_C \times 1}{\beta_C} \;+\; \underset{N(T-1) \times 1}{\Delta z},
$$
where $\beta_C = (C'C)^{-1} C' \beta$ and $C$ is an $m \times m_C$ matrix chosen to exclude the columns of $\Delta X$ corresponding to constant terms in $X$, so that $\Delta X C$ has full column rank $m_C \le m$. The residuals from this regression can be written as $\Delta\hat z = \Delta y - \Delta X C \hat\beta_C$, where $\hat\beta_C = (C' \Delta X' \Delta X C)^{-1} C' \Delta X' \Delta y$ is the usual OLS estimator, and the $(T-1) \times N$ matrix $\Delta\hat Z$ is defined to satisfy $\Delta\hat z = \operatorname{vec}(\Delta\hat Z)$. The estimated factors can then be written as $\Delta\hat F = \Delta\hat Z \hat W$, where $\hat W$ is the $N \times r$ matrix of eigenvectors corresponding to the largest $r$ eigenvalues of $\Delta\hat Z' \Delta\hat Z$. The estimated idiosyncratic components are then $\Delta\hat E = \Delta\hat Z - \Delta\hat F \hat\Lambda$, where $\hat\Lambda = (\Delta\hat F' \Delta\hat F)^{-1} \Delta\hat F' \Delta\hat Z$. Taking the partial sums of $\Delta\hat F$ and $\Delta\hat E$ gives the component estimates $\hat F$ and $\hat E$, with corresponding $r(T-1) \times 1$ and $N(T-1) \times 1$ vectors $\hat f = \operatorname{vec}(\hat F)$ and $\hat e = \operatorname{vec}(\hat E)$.

The deterministic regressions for the estimated factors $\hat f$ proceed as follows. Because $\hat f$ can be written as $\hat f = (\hat W' \otimes I_{T-1})\hat z$, it is necessary to regress $\hat f$ on $(\hat W' \otimes I_{T-1})X$. But this regressor matrix may not have full column rank, in which case it is sufficient to regress $\hat f$ on $X_f = (\hat W' \otimes I_{T-1}) X C_f$, where $C_f$ is a matrix chosen such that $X_f$ has full column rank and its columns form a basis for the vector space containing the columns of $(\hat W' \otimes I_{T-1})X$. In practice, if $(\hat W' \otimes I_{T-1})X$ has less than full column rank, then a simple choice for $C_f$ is the matrix of eigenvectors corresponding to the nonzero eigenvalues of $X'(\hat W \hat W' \otimes I_{T-1})X$. The residuals from the regression of $\hat f$ on $X_f$ are denoted by $\check f$, and the corresponding $(T-1) \times r$ matrix $\check F$ is defined to satisfy $\check f = \operatorname{vec}(\check F)$. Each of the $r$ columns of $\check F$ is standardized by its sample standard deviation to give the $(T-1) \times r$ matrix $\tilde F$, whose individual elements are denoted by $\tilde f_{j,t}$, $j = 1, \dots, r$ and $t = 2, \dots, T$.
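The estimation pipeline just described (principal components on the differenced residuals, partial summation, regression on the deterministic terms, and column standardization) can be sketched as follows. This is a hypothetical implementation for illustration only: the function and variable names are not from the paper, and the regressor matrix `X` is assumed to already have full column rank, so the role of the rank-reducing matrix $C_f$ is taken as given.

```python
import numpy as np

def detrend_factors(dZhat, X, r):
    """Sketch of the factor steps: principal components on the (T-1) x N
    matrix of differenced residuals dZhat, partial summation, regression of
    the summed factors on the deterministics, and standardization.
    X is the N(T-1) x p levels regressor matrix (assumed full column rank).
    """
    Tm1, N = dZhat.shape
    # Eigenvectors of dZhat'dZhat for the r largest eigenvalues
    eigval, eigvec = np.linalg.eigh(dZhat.T @ dZhat)
    W = eigvec[:, np.argsort(eigval)[::-1][:r]]          # N x r
    dF = dZhat @ W                                        # (T-1) x r factors
    Lam = np.linalg.solve(dF.T @ dF, dF.T @ dZhat)        # r x N loadings
    dE = dZhat - dF @ Lam                                 # idiosyncratic diffs
    F_hat = np.cumsum(dF, axis=0)                         # partial sums
    f_hat = F_hat.reshape(-1, order="F")                  # vec(F_hat), r(T-1) vector
    Xf = np.kron(W.T, np.eye(Tm1)) @ X                    # (W' kron I_{T-1}) X
    beta, *_ = np.linalg.lstsq(Xf, f_hat, rcond=None)
    f_check = f_hat - Xf @ beta                           # detrended factors
    F_check = f_check.reshape(Tm1, r, order="F")
    F_tilde = F_check / F_check.std(axis=0, ddof=1)       # standardize columns
    return F_tilde, np.cumsum(dE, axis=0)
```

The idiosyncratic components returned by the second output would be detrended and standardized analogously, using the $X_e$ regressor described below.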
The deterministic regressions for the estimated idiosyncratic components $\hat e$ proceed similarly. We can write $\Delta\hat E = \Delta\hat Z - \Delta\hat Z \hat W \hat\Lambda$, because $\Delta\hat F = \Delta\hat Z \hat W$, so $\hat e = (I_{N(T-1)} - (\hat\Lambda' \hat W' \otimes I_{T-1}))\hat z$. Therefore, it is necessary to regress $\hat e$ on $(I_{N(T-1)} - (\hat\Lambda' \hat W' \otimes I_{T-1}))X$. If this regressor matrix does not have full rank, then it is replaced by $X_e = (I_{N(T-1)} - (\hat\Lambda' \hat W' \otimes I_{T-1})) X C_e$, where, like $C_f$, $C_e$ is chosen by principal components or other means so that the columns of $X_e$ provide a basis for those of $(I_{N(T-1)} - (\hat\Lambda' \hat W' \otimes I_{T-1}))X$. The residuals from the regression of $\hat e$ on $X_e$ are denoted by $\check e$, and the columns of the corresponding matrix $\check E$ [i.e., $\check e = \operatorname{vec}(\check E)$] are standardized by their respective standard deviations to give $\tilde E$, with individual elements $\tilde e_{i,t}$.

To calculate $\tilde S_F^k$, define, in (8), $\tilde a_{k,t} = \sum_{j=1}^{r} \tilde f_{j,t} \tilde f_{j,t-k} + \sum_{i=1}^{N} \tilde e_{i,t} \tilde e_{i,t-k}$. Then $\tilde C_k$ and $\hat\omega\{\tilde a_{k,t}\}$ are calculated as described after (8). The bias-correction term $\tilde c$ in (8) is given by
$$
\tilde c = (T-k)^{-1/2} \operatorname{tr}\!\left[ \left( \frac{X_f' X_f}{T} \right)^{-1} \hat\Omega\{w_{f,s}\} + \left( \frac{X_e' X_e}{T} \right)^{-1} \hat\Omega\{w_{e,s}\} \right],
$$
where $w_f = X_f \odot (\check f\,\iota_{m_f}') = \{w_{f,s}'\}_{s=1}^{r(T-1)}$ and $w_e = X_e \odot (\check e\,\iota_{m_e}') = \{w_{e,s}'\}_{s=1}^{N(T-1)}$, where $\odot$ is the Hadamard product, $\iota_m$ is the $m \times 1$ unit vector, and $m_f$ and $m_e$ are the column dimensions of $X_f$ and $X_e$. In application, $r$ is replaced by $\hat r$ obtained from (A.5).
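The sequence $\tilde a_{k,t}$ defined above is a simple sum of cross-products and can be computed in vectorized form. The helper below is hypothetical (the name `a_tilde` and the array layout are assumptions), but it follows the definition term by term.

```python
import numpy as np

def a_tilde(F_tilde, E_tilde, k):
    """a_{k,t} = sum_j f~_{j,t} f~_{j,t-k} + sum_i e~_{i,t} e~_{i,t-k},
    returned for the rows where both t and t-k are available.

    F_tilde : (T-1) x r standardized factor residuals
    E_tilde : (T-1) x N standardized idiosyncratic residuals
    k       : lag, k >= 1
    """
    ff = (F_tilde[k:] * F_tilde[:-k]).sum(axis=1)   # factor cross-products
    ee = (E_tilde[k:] * E_tilde[:-k]).sum(axis=1)   # idiosyncratic cross-products
    return ff + ee
```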
In some applications it may be the case that $x_{i,t} = x_t$ for all $i$, in which case the estimation of the factor model simplifies and begins with the estimation of the OLS regressions of $\Delta y_{i,t}$ on $\Delta x_t$ for each $i = 1, \dots, N$. If $x_t$ contains a constant, then the corresponding element of $\Delta x_t$ is deleted. The OLS residuals, $\Delta\hat z_{i,t}$, are arranged in the $(T-1) \times N$ matrix $\Delta\hat Z$. The model (10) is estimated by principal components as discussed at the start of this section, with $Y$ replaced by $\Delta\hat Z$. The resulting estimated components, $\hat f_{j,t}$ and $\hat e_{i,t}$ for $j = 1, \dots, r$ and $i = 1, \dots, N$, are individually regressed on $x_t$ to give residuals that are then each standardized to have unit standard deviation. This gives the $(N + r) \times 1$ vector $(\tilde f_{1,t}, \dots, \tilde f_{r,t}, \tilde e_{1,t}, \dots, \tilde e_{N,t})'$ of standardized residuals, from which $\tilde S_F^k$ is then calculated.

A.3 Proof of Theorem 2
In the notation defined for the general deterministic regression in Section A.2, the residuals $\Delta\hat z$ can be written as
$$
\Delta\hat z = \Delta z - \Delta X C (C' \Delta X' \Delta X C)^{-1} C' \Delta X' \Delta z.
$$
Taking partial sums of $\Delta\hat z$ gives
$$
\hat z = z - X C (C' \Delta X' \Delta X C)^{-1} C' \Delta X' \Delta z,
$$
apart from some asymptotically negligible initial-value effects, so that $z$ and $X$ now have $N(T-1)$ rows. The partial sum of $\Delta\hat f = \operatorname{vec}(\Delta\hat F) = (\hat W' \otimes I_{T-1})\Delta\hat z$ gives
$$
\hat f = (\hat W' \otimes I_{T-1})z - X_f B_f,
$$
where $X_f = (\hat W' \otimes I_{T-1}) X C_f$ and $B_f = (C_f' C_f)^{-1} C_f' C (C' \Delta X' \Delta X C)^{-1} C' \Delta X' \Delta z$. Thus regressing $\hat f$ on $X_f$ will remove the $X_f B_f$ term from the residuals, giving
$$
\check f = \bar P_f \hat f = \bar P_f (\hat W' \otimes I_{T-1}) z, \tag{A.8}
$$
where $\bar P_f = I_{r(T-1)} - X_f (X_f' X_f)^{-1} X_f' = I_{r(T-1)} - P_f$. The corresponding matrix $\check F$ satisfies $\check f = \operatorname{vec}(\check F)$, and the standardized matrix $\tilde F$ is found as $\tilde F = \check F \hat G_f^{-1}$, where $\hat G_f$ is an $r \times r$ diagonal matrix containing the sample standard deviations of the columns of $\check F$ on its diagonal. Thus $\tilde f = (\hat G_f^{-1} \otimes I_{T-1})\check f$.
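The key mechanical fact used in this step, namely that the residual-maker $\bar P_f$ annihilates the $X_f B_f$ term exactly regardless of $B_f$, is easy to verify numerically. The snippet below is a standalone check with made-up dimensions, not code from the paper.

```python
import numpy as np

# The annihilator P_bar = I - X_f (X_f'X_f)^{-1} X_f' satisfies P_bar @ X_f = 0,
# so residuals from regressing f_hat = g + X_f @ B on X_f do not depend on B.
np.random.seed(2)
n, p = 60, 3
Xf = np.random.randn(n, p)
P_bar = np.eye(n) - Xf @ np.linalg.solve(Xf.T @ Xf, Xf.T)
g = np.random.randn(n)        # the part of f_hat not in the span of X_f
B = np.random.randn(p)
assert np.allclose(P_bar @ (g + Xf @ B), P_bar @ g)
```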
The partial sum of $\Delta\check e = \hat A\,\Delta\hat z$, where $\hat A = I_{N(T-1)} - (\hat\Lambda' \hat W' \otimes I_{T-1})$, gives
$$
\hat e = \hat A z - X_e B_e,
$$
where $X_e = (I_{N(T-1)} - (\hat\Lambda' \hat W' \otimes I_{T-1})) X C_e$ and $B_e = (C_e' C_e)^{-1} C_e' C (C' \Delta X' \Delta X C)^{-1} C' \Delta X' \Delta z$, so regressing $\hat e$ on $X_e$ will remove $X_e B_e$. This leaves
$$
\check e = \bar P_e \hat e = \bar P_e \hat A z, \tag{A.9}
$$
where $\bar P_e$ is the projection onto the orthogonal complement of the columns of $X_e$. The corresponding matrix $\check E$ satisfies $\check e = \operatorname{vec}(\check E)$, and the standardized matrix $\tilde E$ is found as $\tilde E = \check E \hat G_e^{-1}$, where $\hat G_e$ is an $N \times N$ diagonal matrix containing the sample standard deviations of the columns of $\check E$ on its diagonal. These steps show that the appropriate regressions of $\hat f$ on $X_f$ and of $\hat e$ on $X_e$ remove the effects of the initial deterministic regression in first differences.
Because the model is estimated in differences, it follows that under both the null and the alternative,
$$
\frac{\Delta\hat Z' \Delta\hat Z}{T} = \Sigma + O_p(T^{-1/2}),
$$
and hence $\hat W = W + O_p(T^{-1/2})$, where $W$ is the matrix of eigenvectors corresponding to the largest $r$ eigenvalues of $\Sigma$. Thus
$$
\hat\Lambda = (\hat W' \Delta\hat Z' \Delta\hat Z \hat W)^{-1} \hat W' \Delta\hat Z' \Delta\hat Z = (W' \Sigma W)^{-1} W' \Sigma + O_p(T^{-1/2}).
$$
Recalling the definitions of $\check f$ and $\check e$ in (A.8) and (A.9), consider $\bar f = (\hat W' \otimes I_{T-1})z$ and $\bar e = \hat A z$. We can write $(\bar f', \bar e')' = \operatorname{vec}(W \hat C)$, where $W = Z \Sigma^{-1/2}$ and $\hat C = \Sigma^{1/2}(\hat W, \hat P)$, where $\hat P$