J.F. Kiviet, G.D.A. Phillips / Economics Letters 66 (2000) 7–15
estimator is upwards biased. This is the main theoretical result in the paper. Given that an explicit expression for the bias approximation is obtained, a bias correction can routinely be applied.
2. Model and notation
Consider a general static simultaneous equation model containing G equations which may be written as
$$A' y_t + B' x_t = \varepsilon_t, \qquad t = 1, \ldots, T, \tag{1}$$
where $y_t$ is a $G \times 1$ vector of endogenous variables, $x_t$ is a $K \times 1$ vector of strongly exogenous variables which we shall treat as non-stochastic, and $\varepsilon_t$ is a $G \times 1$ vector of structural disturbances. $A'$ and $B'$ are, respectively, $G \times G$ and $G \times K$ matrices of structural coefficients. With $T$ observations on the above system, we may write
$$YA + XB = E \tag{2}$$
where $Y$ is a $T \times G$ matrix of observations on the endogenous variables, $X$ is a $T \times K$ matrix of observations on the exogenous variables, and $E$ is a $T \times G$ matrix of structural disturbances. We shall be particularly concerned with that part of the system (2) which relates to the first equation. The reduced form of the system includes
$$Y_1 = X\Pi_1 + V_1 \tag{3}$$
where $Y_1 = (y_1 : Y_2)$, $X = (X_1 : X_2)$, $\Pi_1 = (\pi_1 : \Pi_2)$ and $V_1 = (v_1 : V_2)$. $\Pi_1$ is a $K \times (g + 1)$ matrix of reduced form parameters and $V_1$ is a $T \times (g + 1)$ matrix of reduced form disturbances. In addition, the following assumptions are made:
• The rows of $V_1$ are independently and normally distributed with mean vector $0'$ and non-singular covariance matrix $\Omega = \{\omega_{ij}\}$.
• The $T \times K$ matrix $X$ is of rank $K$ $(< T)$, and the elements of the $K \times K$ matrix $X'X$ are $O(T)$.
• The first equation of system (1) is overidentified with the order of overidentification, $L$, being at least 2. This ensures that the first two moments of 2SLS exist; see Kinal (1980).
3. Asymptotic approximations
The first equation of (2) may be written as
$$y_1 = Y_2 \beta + X_1 \gamma + \varepsilon_1 \tag{4}$$
where $y_1$ and $Y_2$ are, respectively, a $T \times 1$ vector and a $T \times g$ matrix of observations on $g$ endogenous variables and $X_1$ is a $T \times k$ matrix of observations on $k$ non-stochastic exogenous variables. The vectors of unknown parameters $\beta$ and $\gamma$ are, respectively, $g \times 1$ and $k \times 1$ while $\varepsilon_1$ is a $T \times 1$ vector of independently and identically distributed normal random variables with mean zero and variance $\sigma^2$. The 2SLS estimators of the unknown parameters of (4) are given by
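$$\begin{pmatrix} \hat\beta \\ \hat\gamma \end{pmatrix} = \begin{pmatrix} \hat Y_2' \hat Y_2 & \hat Y_2' X_1 \\ X_1' \hat Y_2 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat Y_2' \\ X_1' \end{pmatrix} y_1 . \tag{5}$$
where $\hat Y_2 = X \hat\Pi_2 = X (X'X)^{-1} X' Y_2$ is the $T \times g$ matrix of fitted values obtained in the regression of $Y_2$ on $X$. From (5) we may write the estimation error as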
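$$\begin{pmatrix} \hat\beta - \beta \\ \hat\gamma - \gamma \end{pmatrix} = \begin{pmatrix} \hat Y_2' \hat Y_2 & \hat Y_2' X_1 \\ X_1' \hat Y_2 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat Y_2' \\ X_1' \end{pmatrix} \varepsilon_1 . \tag{6}$$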
In what follows it will be convenient to re-write (4) in the form
$$y_1 = Z_1 \alpha + \varepsilon_1 \tag{7}$$
where $Z_1 = (Y_2 : X_1)$ and $\alpha = (\beta', \gamma')'$. The 2SLS estimator may then be written as
$$\hat\alpha = (\hat Z_1' \hat Z_1)^{-1} \hat Z_1' y_1 \tag{8}$$
where $\hat Z_1 = (\hat Y_2 : X_1)$ is a $T \times (g + k)$ matrix of regressors at the second stage of the 2SLS procedure.
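The two-stage construction behind (5) and (8) can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the authors' code: the function name `two_sls`, the simulated data and all parameter values are assumptions made for the example only.

```python
import numpy as np

def two_sls(y1, Y2, X1, X):
    """Return (alpha_hat, Z1_hat) for y1 = Y2 @ beta + X1 @ gamma + eps."""
    # First stage: Y2_hat = X (X'X)^{-1} X' Y2, computed by least squares.
    Pi2_hat, *_ = np.linalg.lstsq(X, Y2, rcond=None)
    Y2_hat = X @ Pi2_hat
    # Second stage: regress y1 on Z1_hat = (Y2_hat : X1), as in (8).
    Z1_hat = np.hstack([Y2_hat, X1])
    alpha_hat = np.linalg.solve(Z1_hat.T @ Z1_hat, Z1_hat.T @ y1)
    return alpha_hat, Z1_hat

# Illustrative simulated data (all values below are arbitrary assumptions).
rng = np.random.default_rng(0)
T, K, k, g = 200, 5, 2, 1
X = rng.normal(size=(T, K))          # all exogenous variables
X1 = X[:, :k]                        # exogenous variables included in the equation
eps = rng.normal(size=T)             # structural disturbance
V2 = 0.5 * eps[:, None] + rng.normal(size=(T, g))  # correlated with eps
Y2 = X @ rng.normal(size=(K, g)) + V2
y1 = Y2 @ np.array([1.0]) + X1 @ np.array([0.5, -0.5]) + eps

alpha_hat, Z1_hat = two_sls(y1, Y2, X1, X)
Z1 = np.hstack([Y2, X1])
residuals = y1 - Z1 @ alpha_hat      # 2SLS residuals use Z1, not Z1_hat
```

Because $X_1$ lies in the column space of $X$, the second-stage regressors are orthogonal to the 2SLS residuals, which gives a quick numerical check that the sketch matches (8).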
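Before stating the approximations that are the focus of interest, we shall define the following: $\bar Z_1 = (\bar Y_2 : X_1)$ is a $T \times (g + k)$ non-stochastic matrix where $\bar Y_2 = E(Y_2)$,
$$Q = \begin{pmatrix} \bar Y_2' \bar Y_2 & \bar Y_2' X_1 \\ X_1' \bar Y_2 & X_1' X_1 \end{pmatrix}^{-1} = (\bar Z_1' \bar Z_1)^{-1},$$
$$\frac{1}{T} E[Z_1' \varepsilon_1] = \frac{1}{T} E[\breve V_2' \varepsilon_1] = \sigma^2 [\tau', 0']' = \sigma^2 c$$
where $\breve V_2 = (V_2 : 0)$ has the last $k$ columns zero and $c = [\tau', 0']'$ is $(g + k) \times 1$ with the last $k$ elements zero,
$$C = \frac{1}{T} E(\breve V_2' \breve V_2), \qquad C_1 = \begin{bmatrix} \sigma^2 \tau \tau' & 0 \\ 0 & 0 \end{bmatrix} \qquad \text{and} \qquad C_2 = \begin{bmatrix} \frac{1}{T} E(W'W) & 0 \\ 0 & 0 \end{bmatrix}$$
where $W = V_2 - \varepsilon_1 \tau'$ with $W$ distributed independently of $\varepsilon_1$, see Nagar (1959), and
$$C = C_1 + C_2 .$$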
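The quantities just defined can be assembled mechanically once the population inputs are given. The sketch below is illustrative only: the function name `nagar_quantities` and the argument `Omega_W`, which stands in for the $g \times g$ block $\frac{1}{T} E(W'W)$, are assumptions of this example, as are the numerical values.

```python
import numpy as np

def nagar_quantities(Z1_bar, tau, sigma2, Omega_W):
    """Build Q, c, C1, C2 and C = C1 + C2 from assumed population inputs.

    Z1_bar  : T x (g+k) non-stochastic matrix (Y2_bar : X1)
    tau     : length-g vector
    sigma2  : variance of the structural disturbance
    Omega_W : g x g matrix standing in for (1/T) E(W'W) (assumed input)
    """
    g = tau.shape[0]
    n = Z1_bar.shape[1]                          # n = g + k
    Q = np.linalg.inv(Z1_bar.T @ Z1_bar)
    c = np.concatenate([tau, np.zeros(n - g)])   # last k elements are zero
    C1 = sigma2 * np.outer(c, c)                 # sigma^2 tau tau' in the top-left block
    C2 = np.zeros((n, n))
    C2[:g, :g] = Omega_W                         # zero outside the top-left block
    return Q, c, C1, C2, C1 + C2

# Small illustrative example with g = 1 and k = 2 (arbitrary values).
Z1_bar = np.array([[1.0, 1.0, 0.0],
                   [2.0, 1.0, 1.0],
                   [0.5, 1.0, 2.0],
                   [1.5, 1.0, 3.0]])
tau = np.array([0.5])
sigma2 = 2.0
Omega_W = np.array([[1.0]])
Q, c, C1, C2, C = nagar_quantities(Z1_bar, tau, sigma2, Omega_W)
```

Writing $C_1$ as $\sigma^2 cc'$ is a convenience of the sketch; it gives the same matrix as the partitioned form because the last $k$ elements of $c$ are zero.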
With the above definitions we may state the following:
• 2SLS bias to order $T^{-1}$: Nagar (1959, p. 579).
$$E(\hat\alpha - \alpha) = \sigma^2 (L - 1) Q c + o(T^{-1}) \tag{9}$$
• 2SLS mean squared error to order $T^{-2}$: Nagar (1959, p. 579).
$$E\{(\hat\alpha - \alpha)(\hat\alpha - \alpha)'\} = \sigma^2 Q + \sigma^2 [\mathrm{tr}(CQ) - 2(L - 1)\,\mathrm{tr}(C_1 Q)] Q + \sigma^2 [(L^2 - 3L + 4) Q C_1 Q - (L - 2) Q C Q] + o(T^{-2}) \tag{10}$$
• Bias of the residual variance estimator to order $T^{-1}$: Nagar (1961, p. 240).
$$E(s^2 - \sigma^2) = -\sigma^2 [2(L - 1)\,\mathrm{tr}(Q C_1) - \mathrm{tr}(Q C)] + o(T^{-1}) \tag{11}$$
where $s^2 = \dfrac{e'e}{T - (g + k)}$ and $e = y_1 - Z_1 \hat\alpha$ is a $T \times 1$ vector of 2SLS residuals.
These are slight adaptations of the published results which we shall use later in the paper. In fact Nagar (1961) deflates the sum of squared residuals by $T$ and, as a result, the estimator is biased to order $T^{-1}$. We prefer to use the less biased version: see also Kiviet and Phillips (1998).
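As a hedged illustration, the leading terms of (9) and (11) can be evaluated directly once $Q$, $c$, $C_1$, $C$, $\sigma^2$ and $L$ are available. The function names and the numerical inputs below are hypothetical and chosen only to make the arithmetic easy to follow; in applications these population quantities are unknown and would have to be estimated.

```python
import numpy as np

def bias_alpha(Q, c, sigma2, L):
    """Leading term of the 2SLS bias in (9): sigma^2 (L - 1) Q c."""
    return sigma2 * (L - 1) * (Q @ c)

def bias_s2(Q, C, C1, sigma2, L):
    """Leading term of (11): -sigma^2 [2 (L - 1) tr(Q C1) - tr(Q C)]."""
    return -sigma2 * (2 * (L - 1) * np.trace(Q @ C1) - np.trace(Q @ C))

# Arbitrary illustrative inputs with g = 1 and k = 1.
sigma2, L = 2.0, 3
Q = np.diag([0.01, 0.02])
c = np.array([0.5, 0.0])
C1 = sigma2 * np.outer(c, c)          # [[0.5, 0], [0, 0]]
C2 = np.array([[1.0, 0.0],
               [0.0, 0.0]])           # stands in for the (1/T) E(W'W) block
C = C1 + C2

b_alpha = bias_alpha(Q, c, sigma2, L)    # 2 * 2 * Q c = [0.02, 0.0]
b_s2 = bias_s2(Q, C, C1, sigma2, L)      # -2 * (4 * 0.005 - 0.015) = -0.01
```

With these inputs the coefficient on the endogenous regressor carries the whole first-order bias, since only the first $g$ elements of $c$ are non-zero and $Q$ is diagonal here.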
4. The bias of the asymptotic variance estimator