www.elsevier.com/locate/econbase

The bias of the 2SLS variance estimator

Jan F. Kiviet$^{a}$, Garry D.A. Phillips$^{b,*}$

$^{a}$Tinbergen Institute and Faculty of Economics and Econometrics, University of Amsterdam, Amsterdam, The Netherlands
$^{b}$School of Business and Economics, University of Exeter, Streatham Court, Exeter EX4 4PU, UK

Accepted 8 April 1999
Abstract

In simultaneous equation models the two stage least squares (2SLS) estimator of the coefficients, though consistent, is biased in general and the nature of this bias has given rise to a good deal of research. However, little if any attention has been given to the bias that arises when an estimate of the asymptotic variance is used to approximate the small sample variance. In this paper we use asymptotic expansions to show that, in general, the asymptotic variance estimator has an upwards bias. © 2000 Elsevier Science S.A. All rights reserved.

Keywords: 2SLS estimation; Nagar expansions; Asymptotic variance; Variance estimation bias

JEL classification: C30
1. Introduction
The seminal paper of Nagar (1959) presented approximations for the bias of the 2SLS estimator to order $T^{-1}$ and for the mean squared error to the order of $T^{-2}$, where $T$ is the sample size. By subtracting the square of the bias approximation from the mean squared error approximation we obtain an estimator for the variance to the order of $T^{-2}$. Little use seems to have been made of this particular approximation; indeed, the problem of bias in the estimation of the coefficient estimator variance seems generally to have been neglected. However, this approximation can be used to explore the bias of the estimated asymptotic variance as an estimator of the small sample variance. By finding an approximation for the expectation of the asymptotic variance estimator to the order of $T^{-2}$, and comparing it with the variance approximation of the same order, we may deduce immediately the approximate bias in the asymptotic variance estimator. This bias, which is of order $T^{-2}$, is found to be non-negative for all coefficients in the 2SLS estimator showing that, in general, the traditional estimator is upwards biased. This is the main theoretical result in the paper. Given that an explicit expression for the bias approximation is obtained, a bias correction can routinely be applied.

*Corresponding author. Tel.: +44-1392-263-241; fax: +44-1392-263-242. E-mail address: [email protected] (G.D.A. Phillips)

0165-1765/00/$ – see front matter © 2000 Elsevier Science S.A. All rights reserved. PII: S0165-1765(99)00233-5
2. Model and notation
Consider a general static simultaneous equation model containing $G$ equations which may be written as

$$A'y_t + B'x_t = e_t, \qquad t = 1, \dots, T, \qquad (1)$$

where $y_t$ is a $G \times 1$ vector of endogenous variables, $x_t$ is a $K \times 1$ vector of strongly exogenous variables which we shall treat as non-stochastic, and $e_t$ is a $G \times 1$ vector of structural disturbances. $A'$ and $B'$ are, respectively, $G \times G$ and $G \times K$ matrices of structural coefficients. With $T$ observations on the above system, we may write

$$YA + XB = E \qquad (2)$$

where $Y$ is a $T \times G$ matrix of observations on the endogenous variables, $X$ is a $T \times K$ matrix of observations on the exogenous variables, and $E$ is a $T \times G$ matrix of structural disturbances. We shall be particularly concerned with that part of the system (2) which relates to the first equation. The reduced form of the system includes

$$Y_1 = X\Pi_1 + V_1 \qquad (3)$$

where $Y_1 = (y_1 : Y_2)$, $X = (X_1 : X_2)$, $\Pi_1 = (\pi_1 : \Pi_2)$ and $V_1 = (v_1 : V_2)$. $\Pi_1$ is a $K \times (g+1)$ matrix of reduced form parameters and $V_1$ is a $T \times (g+1)$ matrix of reduced form disturbances. In addition, the following assumptions are made:

• The rows of $V_1$ are independently and normally distributed with zero mean vector and non-singular covariance matrix $\Omega = \{\omega_{ij}\}$.
• The $T \times K$ matrix $X$ is of rank $K$ ($< T$), and the elements of the $K \times K$ matrix $X'X$ are $O(T)$.
• The first equation of system (1) is overidentified with the order of overidentification, $L$, being at least 2. This ensures that the first two moments of 2SLS exist; see Kinal (1980).
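As a concrete numerical illustration of (1)–(3), the following Python sketch simulates a small version of the system and verifies the stacked form (2) and the implied reduced form, $Y = X\Pi + V$ with $\Pi = -BA^{-1}$ and $V = EA^{-1}$, of which (3) is the sub-block for $Y_1$. All dimensions and parameter values below are hypothetical, chosen only so that $A$ is non-singular.

```python
import numpy as np

# Simulate the G-equation system (1)/(2): A'y_t + B'x_t = e_t, i.e. YA + XB = E.
# Solving (2) for Y gives the full reduced form Y = X @ Pi + V with
# Pi = -B A^{-1} and V = E A^{-1}. All numeric values are hypothetical.
rng = np.random.default_rng(0)
T, G, K = 200, 2, 4
A = np.array([[1.0, 0.6],
              [-0.5, 1.0]])            # G x G structural coefficient matrix
B = rng.normal(size=(K, G))            # K x G, so B' is G x K as in (1)
X = rng.normal(size=(T, K))            # exogenous data, treated as fixed
E = rng.normal(scale=0.5, size=(T, G)) # structural disturbances

Y = (E - X @ B) @ np.linalg.inv(A)     # endogenous data solving (2)

Pi = -B @ np.linalg.inv(A)             # reduced-form coefficients
V = E @ np.linalg.inv(A)               # reduced-form disturbances
assert np.allclose(Y @ A + X @ B, E)   # structural form (2) holds
assert np.allclose(Y, X @ Pi + V)      # reduced form, cf. (3)
```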
3. Asymptotic approximations
The first equation of (2) may be written as

$$y_1 = Y_2\beta + X_1\gamma + e_1 \qquad (4)$$

where $y_1$ and $Y_2$ are, respectively, a $T \times 1$ vector and a $T \times g$ matrix of observations on endogenous variables and $X_1$ is a $T \times k$ matrix of observations on $k$ non-stochastic exogenous variables. The elements of $e_1$ are independently and identically distributed normal random variables with mean zero and variance $\sigma^2$. The 2SLS estimators of the unknown parameters of (4) are given by

$$\begin{pmatrix} \beta^* \\ \gamma^* \end{pmatrix} = \begin{pmatrix} \hat Y_2'\hat Y_2 & \hat Y_2'X_1 \\ X_1'Y_2 & X_1'X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat Y_2' \\ X_1' \end{pmatrix} y_1 \qquad (5)$$

where $\hat Y_2 = X\hat\Pi_2 = X(X'X)^{-1}X'Y_2$ is the $T \times g$ matrix of fitted values obtained in the regression of $Y_2$ on $X$. From (5) we may write the estimation error as

$$\begin{pmatrix} \beta^* - \beta \\ \gamma^* - \gamma \end{pmatrix} = \begin{pmatrix} \hat Y_2'\hat Y_2 & \hat Y_2'X_1 \\ X_1'Y_2 & X_1'X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat Y_2' \\ X_1' \end{pmatrix} e_1. \qquad (6)$$

In what follows it will be convenient to re-write (4) in the form

$$y_1 = Z_1\alpha + e_1 \qquad (7)$$

where $Z_1 = (Y_2 : X_1)$ and $\alpha = (\beta', \gamma')'$. The 2SLS estimator may then be written as

$$\alpha^* = (\hat Z_1'\hat Z_1)^{-1}\hat Z_1'y_1 \qquad (8)$$

where $\hat Z_1 = (\hat Y_2 : X_1)$ is a $T \times (g+k)$ matrix of regressors at the second stage of the 2SLS procedure.
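The estimator in (8) can be computed directly from data. The sketch below simulates a single equation with one endogenous regressor and contrasts $\alpha^*$ with OLS on $(Y_2 : X_1)$; the design (g = 1, k = 2, K = 4 and all parameter values) is hypothetical and chosen only for illustration.

```python
import numpy as np

# Minimal numerical sketch of 2SLS, alpha* = (Zhat'Zhat)^{-1} Zhat' y1 as in (8),
# with Zhat = (Yhat2 : X1) and Yhat2 the fitted values from regressing Y2 on X.
rng = np.random.default_rng(1)
T, g, k, K = 500, 1, 2, 4
X = rng.normal(size=(T, K))                  # exogenous variables; X1 = first k cols
X1 = X[:, :k]
Pi2 = rng.normal(size=(K, g))                # reduced-form coefficients for Y2
e1 = rng.normal(size=T)                      # structural disturbance of (4)
V2 = (0.8 * e1 + rng.normal(size=T)).reshape(T, g)  # correlated with e1: endogeneity
Y2 = X @ Pi2 + V2
y1 = Y2 @ np.array([0.5]) + X1 @ np.array([1.0, -1.0]) + e1

P = X @ np.linalg.solve(X.T @ X, X.T)        # projection X(X'X)^{-1}X'
Zhat = np.hstack([P @ Y2, X1])               # second-stage regressors (Yhat2 : X1)
Z = np.hstack([Y2, X1])
alpha_star = np.linalg.solve(Zhat.T @ Zhat, Zhat.T @ y1)  # 2SLS, eq. (8)
alpha_ols = np.linalg.solve(Z.T @ Z, Z.T @ y1)            # OLS, inconsistent here
```

Because $V_2$ is correlated with $e_1$, OLS on $(Y_2 : X_1)$ is inconsistent, while $\alpha^*$ is consistent although biased in finite samples.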
Before stating the approximations that are the focus of interest, we shall define the following:

$\bar Z_1 = (\bar Y_2 : X_1)$ is a $T \times (g+k)$ non-stochastic matrix where $\bar Y_2 = E(Y_2)$,

$$Q = \begin{pmatrix} \bar Y_2'\bar Y_2 & \bar Y_2'X_1 \\ X_1'\bar Y_2 & X_1'X_1 \end{pmatrix}^{-1} = (\bar Z_1'\bar Z_1)^{-1},$$

$$\frac{1}{T}E[Z_1'e_1] = \frac{1}{T}E[\breve V_2'e_1] = \sigma^2[\tau', 0']' = \sigma^2 c$$

where $\breve V_2 = (V_2 : 0)$ has the last $k$ columns zero and $c = [\tau', 0']'$ is $(g+k) \times 1$ with the last $k$ elements zero,

$$C = \frac{1}{T}E(\breve V_2'\breve V_2), \qquad C_1 = \begin{pmatrix} \sigma^2\tau\tau' & 0 \\ 0 & 0 \end{pmatrix} \qquad \text{and} \qquad C_2 = \begin{pmatrix} \frac{1}{T}E(W'W) & 0 \\ 0 & 0 \end{pmatrix}$$

where $W = V_2 - e_1\tau'$ with $W$ distributed independently of $e_1$, see Nagar (1959), and $C = C_1 + C_2$.
With the above definitions we may state the following:
• 2SLS bias to order $T^{-1}$: Nagar (1959).

$$E(\alpha^* - \alpha) = \sigma^2(L-1)Qc + o(T^{-1}) \qquad (9)$$

• 2SLS mean squared error to order $T^{-2}$: Nagar (1959, p. 579).

$$E\{(\alpha^* - \alpha)(\alpha^* - \alpha)'\} = \sigma^2 Q + \sigma^2[\operatorname{tr}(CQ) - 2(L-1)\operatorname{tr}(C_1 Q)]Q + \sigma^2[(L^2 - 3L + 4)QC_1Q - (L-2)QCQ] + o(T^{-2}) \qquad (10)$$

• Bias of the residual variance estimator to order $T^{-1}$: Nagar (1961, p. 240).

$$E(s^{*2} - \sigma^2) = -\sigma^2[2(L-1)\operatorname{tr}(QC_1) - \operatorname{tr}(QC)] + o(T^{-1}) \qquad (11)$$

where $s^{*2} = e^{*\prime}e^{*}/(T-(g+k))$ and $e^* = y_1 - Z_1\alpha^*$ is a $T \times 1$ vector of 2SLS residuals.

These are slight adaptations of the published results which we shall use later in the paper. In fact Nagar (1961) deflates the sum of squared residuals by $T$ and, as a result, the estimator is biased to order $T^{-1}$. We prefer to use the less biased version: see also Kiviet and Phillips (1998).
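The two deflations of the residual sum of squares discussed above differ only in the divisor. A short Python sketch (the residual vector is a hypothetical toy input):

```python
import numpy as np

# The two residual-variance estimators: the version used in (11) divides the
# 2SLS residual sum of squares e*'e* by T - (g + k), while Nagar (1961)
# deflates by T. The input residuals here are purely illustrative.
def s2_star(e_star, g, k):
    """Degrees-of-freedom-corrected 2SLS residual variance, e*'e*/(T-(g+k))."""
    T = e_star.shape[0]
    return float(e_star @ e_star) / (T - (g + k))

def s2_nagar(e_star):
    """Version deflating by T, as in Nagar (1961)."""
    return float(e_star @ e_star) / e_star.shape[0]

e_star = np.array([0.5, -1.0, 0.25, 0.8, -0.3, 1.1])  # toy residuals, T = 6
assert s2_star(e_star, g=1, k=1) > s2_nagar(e_star)   # since T/(T-(g+k)) > 1
```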
4. The bias of the asymptotic variance estimator
Subtracting the outer product of (9) from (10), we may deduce an approximation to the variance of 2SLS as follows:

$$\operatorname{Var}(\alpha^*) = \sigma^2 Q + \sigma^2\{[\operatorname{tr}(CQ) - 2(L-1)\operatorname{tr}(C_1 Q)]Q - (L-3)QC_1Q - (L-2)QCQ\} + o(T^{-2}). \qquad (12)$$

In practice the estimated asymptotic variance, $\widehat{\operatorname{Var}}(\alpha^*) = s^{*2}(\hat Z_1'\hat Z_1)^{-1}$, where $s^{*2}$ is defined in (11), is used to estimate the variance in finite samples and it is the bias of this estimator which is the main focus of interest in this paper. However, we shall first consider the bias of a non-operational estimator of the variance given by $\widetilde{\operatorname{Var}}(\alpha^*) = \sigma^2(\hat Z_1'\hat Z_1)^{-1}$ where $\sigma^2$ is known. Since none of the resulting bias can be attributed to the estimator of $\sigma^2$, a consideration of this case will be helpful in analysing the source of the bias in the estimated asymptotic variance. In Appendix A the following result is proved:

Lemma 1. The expected value of the non-operational variance estimator $\widetilde{\operatorname{Var}}(\alpha^*) = \sigma^2(\hat Z_1'\hat Z_1)^{-1}$ can be approximated as

$$E[\widetilde{\operatorname{Var}}(\alpha^*)] = \sigma^2 Q + \sigma^2[\operatorname{tr}(CQ)Q - (L-2)QCQ] + o(T^{-2}). \qquad (13)$$

The result of this lemma, combined with (12), leads to the following.

Theorem 1. The bias of the non-operational variance estimator $\widetilde{\operatorname{Var}}(\alpha^*)$, to order $T^{-2}$, is given by

$$E[\widetilde{\operatorname{Var}}(\alpha^*)] - \operatorname{Var}(\alpha^*) = \sigma^2[2(L-1)\operatorname{tr}(C_1 Q)Q + (L-3)QC_1Q] + o(T^{-2}). \qquad (14)$$

Notice that $\operatorname{tr}(C_1 Q)Q$ and $QC_1Q$ are both positive semi-definite matrices where $\operatorname{tr}(C_1 Q)Q \geq QC_1Q$; see Kadane (1971, p. 728). Hence the bias matrix above is positive semi-definite for $L \geq 2$. However, $L \geq 2$ is a requirement for the variances to exist so that, in general, the non-operational variance estimator is biased upwards to order $T^{-2}$.

Next we examine the bias of the asymptotic variance estimator. In Appendix B we show the following result.
Lemma 2. The expected value of the asymptotic variance estimator $\widehat{\operatorname{Var}}(\alpha^*) = s^{*2}(\hat Z_1'\hat Z_1)^{-1}$ to the order of $T^{-2}$ is given by

$$E[\widehat{\operatorname{Var}}(\alpha^*)] = \sigma^2 Q + \sigma^2[-(L-2)QCQ + 4QC_1Q - 2(L-1)\operatorname{tr}(QC_1)Q + 2\operatorname{tr}(QC)Q] + o(T^{-2}). \qquad (15)$$

An approximation for the bias can now be readily obtained. Combining the result in Lemma 2 with the approximation in (12) gives the next theorem.

Theorem 2. The bias of the asymptotic variance estimator $\widehat{\operatorname{Var}}(\alpha^*) = s^{*2}(\hat Z_1'\hat Z_1)^{-1}$, to the order of $T^{-2}$, is given by

$$E[\widehat{\operatorname{Var}}(\alpha^*)] - \operatorname{Var}(\alpha^*) = \sigma^2[\operatorname{tr}(QC)Q + (L+1)QC_1Q] + o(T^{-2}). \qquad (16)$$

Noting that both $\operatorname{tr}(QC)Q$ and $QC_1Q$ are positive semi-definite, it is clear that the estimated asymptotic variance is, in general, biased upwards to the order of the approximation. Comparing Theorems 1 and 2, it is seen that the direction of the bias is unaltered by the need to estimate $\sigma^2$ although the bias expression itself changes. It does not appear possible to make general statements about the relative magnitudes of the two biases. This is partly because the bias of $\widehat{\operatorname{Var}}(\alpha^*)$ depends on the matrix $C_2$ whereas the bias of $\widetilde{\operatorname{Var}}(\alpha^*)$ does not.

Given that an explicit expression for the bias of the asymptotic variance estimator $\widehat{\operatorname{Var}}(\alpha^*)$ has been found, a bias corrected estimator can be obtained straightforwardly. Estimates are available for the relevant terms in the bias approximation so that an estimate of the bias can be obtained which is then subtracted from the original estimator. We shall not pursue the matter further in this paper however.
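As a sketch of such a correction, one can estimate the bias expression in Theorem 2, $\sigma^2[\operatorname{tr}(QC)Q + (L+1)QC_1Q]$, with plug-in quantities and subtract it from the usual estimator. The implementation choices below (reduced-form residuals for $V_2$, $s^{*2}$ for $\sigma^2$, $(\hat Z_1'\hat Z_1)^{-1}$ for $Q$) are obvious plug-ins rather than a prescription from the paper, and the simulated data are hypothetical.

```python
import numpy as np

def corrected_2sls_variance(y1, Y2, X1, X):
    """Usual 2SLS variance estimate s*^2 (Zhat'Zhat)^{-1} and a bias-corrected
    version subtracting a plug-in estimate of sigma^2[tr(QC)Q + (L+1)QC1Q]."""
    T, g = Y2.shape
    k = X1.shape[1]
    K = X.shape[1]
    L = K - g - k                                   # order of overidentification
    P = X @ np.linalg.solve(X.T @ X, X.T)           # projection X(X'X)^{-1}X'
    Zhat = np.hstack([P @ Y2, X1])
    Z = np.hstack([Y2, X1])
    alpha = np.linalg.solve(Zhat.T @ Zhat, Zhat.T @ y1)
    e = y1 - Z @ alpha                              # 2SLS residuals e*
    s2 = float(e @ e) / (T - (g + k))               # s*^2
    Q = np.linalg.inv(Zhat.T @ Zhat)                # plug-in for Q
    Vb = np.hstack([Y2 - P @ Y2, np.zeros((T, k))]) # plug-in for (V2 : 0)
    C = (Vb.T @ Vb) / T                             # plug-in for C
    c = (Vb.T @ e) / (T * s2)                       # plug-in for c
    C1 = s2 * np.outer(c, c)                        # plug-in for C1 = sigma^2 cc'
    bias = s2 * (np.trace(Q @ C) * Q + (L + 1) * Q @ C1 @ Q)
    return s2 * Q - bias, s2 * Q

# hypothetical simulated data with g = 1, k = 2, K = 5 (so L = 2)
rng = np.random.default_rng(2)
T = 200
X = rng.normal(size=(T, 5))
X1 = X[:, :2]
e1 = rng.normal(size=T)
Y2 = X @ rng.normal(size=(5, 1)) + (0.7 * e1 + rng.normal(size=T)).reshape(T, 1)
y1 = Y2 @ np.array([0.4]) + X1 @ np.array([1.0, -0.5]) + e1
var_c, var_u = corrected_2sls_variance(y1, Y2, X1, X)
assert np.all(np.diag(var_c) <= np.diag(var_u))  # subtracted bias estimate is psd
```

Since the estimated bias matrix is positive semi-definite by construction, the correction always shrinks the reported variances, in line with the upward bias established above.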
5. Conclusion
This paper shows that the traditional variance estimator for the 2SLS coefficient estimator is, in general, biased upwards to order $T^{-2}$. This is a surprisingly strong result for it applies whatever the data and parameter set. The magnitude and the effects of this bias have not been studied. We can, of course, speculate that 2SLS confidence intervals will be conservative in small samples but this is not a clear cut matter since there is also a coefficient estimator bias. In fact, given this bias, it is possible that the upward bias in the variance might lead to improved confidence interval coverage. This issue, the option to correct for the coefficient bias, and the effects on hypothesis tests etc., require further study.
Appendix A
Here we derive the expected value of $\widetilde{\operatorname{Var}}(\alpha^*) = \sigma^2(\hat Z_1'\hat Z_1)^{-1}$ to the order $T^{-2}$. We need only consider the inverse matrix since $\sigma^2$ is constant. First we examine the matrix $\hat Z_1 = (\hat Y_2 : X_1)$ where $\hat Y_2 = X(X'X)^{-1}X'Y_2 = X\Pi_2 + X(X'X)^{-1}X'V_2$. Noting that this last term is stochastic, it is seen that we may write the matrix $\hat Z_1$ as the sum of two matrices, one stochastic and one non-stochastic, as follows:

$$\hat Z_1 = (X(X'X)^{-1}X'V_2 : O) + (X\Pi_2 : X_1) = X(X'X)^{-1}X'\breve V_2 + \bar Z_1$$

where we have put $\breve V_2 = (V_2 : O)$ and $\bar Z_1 = (X\Pi_2 : X_1)$. Noting that $X(X'X)^{-1}X'\bar Z_1 = \bar Z_1$ we may write:

$$\hat Z_1'\hat Z_1 = \bar Z_1'\bar Z_1 + \breve V_2'\bar Z_1 + \bar Z_1'\breve V_2 + \breve V_2'X(X'X)^{-1}X'\breve V_2 = Q^{-1} + \breve V_2'\bar Z_1 + \bar Z_1'\breve V_2 + \breve V_2'M\breve V_2 = [I + (\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q + \breve V_2'M\breve V_2 Q]Q^{-1}$$

where $Q^{-1} = \bar Z_1'\bar Z_1$ and $M = X(X'X)^{-1}X'$. Inverting both sides yields:

$$(\hat Z_1'\hat Z_1)^{-1} = Q[I + (\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q + \breve V_2'M\breve V_2 Q]^{-1} = Q[I - (\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q - \breve V_2'M\breve V_2 Q] + Q[(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q] + o_p(T^{-2}) \qquad (A.1)$$

where $(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q$ is $O_p(T^{-1/2})$ and $\breve V_2'M\breve V_2 Q$ is $O_p(T^{-1})$. Here the inverse matrix has been expanded in terms of decreasing orders of smallness. To obtain $E[(\hat Z_1'\hat Z_1)^{-1}]$ we shall take expectations of each of the relevant terms in (A.1) as follows:

(i) $E[Q(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q] = 0$ since $Q$ and $\bar Z_1$ are fixed and $E(\breve V_2) = 0$.

(ii) $E[Q\breve V_2'M\breve V_2 Q] = QE(\breve V_2'M\breve V_2)Q = \operatorname{tr}(M)QCQ = KQCQ$ where $\operatorname{tr}(M) = K$; see Kiviet and Phillips (1996, p. 166).

(iii) $E[Q(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q] = E[Q(\breve V_2'\bar Z_1)Q(\breve V_2'\bar Z_1)Q + Q(\breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2)Q + Q(\bar Z_1'\breve V_2)Q(\breve V_2'\bar Z_1)Q + Q(\bar Z_1'\breve V_2)Q(\bar Z_1'\breve V_2)Q]$.

On noting that we may write $\breve V_2 = (W + e_1\tau' : 0)$ where $W$ and $e_1$ are independent, see Nagar (1959), we may use the results in Mikhail (1972) or Kiviet and Phillips (1996, p. 166) to show directly that:

$$E[Q(\breve V_2'\bar Z_1)Q(\breve V_2'\bar Z_1)Q] = QCQ, \qquad E[Q(\breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2)Q] = (g+k)QCQ,$$
$$E[Q(\bar Z_1'\breve V_2)Q(\breve V_2'\bar Z_1)Q] = \operatorname{tr}(QC)Q, \qquad E[Q(\bar Z_1'\breve V_2)Q(\bar Z_1'\breve V_2)Q] = QCQ.$$

Adding terms we find that in (iii)

$$E[Q(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q(\breve V_2'\bar Z_1 + \bar Z_1'\breve V_2)Q] = (g+k+2)QCQ + \operatorname{tr}(QC)Q. \qquad (A.2)$$

Finally, using (i)–(iii) above, we have found that the expected value of $\sigma^2$ times (A.1) is given by

$$E[\sigma^2(\hat Z_1'\hat Z_1)^{-1}] = \sigma^2 Q + \sigma^2[\operatorname{tr}(QC)Q - (L-2)QCQ] + o(T^{-2}),$$

which, since $K = L + g + k$, is the result stated in Lemma 1.
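The exact decomposition of $\hat Z_1'\hat Z_1$ used at the start of this appendix can be checked numerically. The Python sketch below does so with hypothetical dimensions:

```python
import numpy as np

# Numerical check of the exact identity used above:
# Zhat1'Zhat1 = Zbar1'Zbar1 + Vb'Zbar1 + Zbar1'Vb + Vb'M Vb, with Vb = (V2 : O),
# Zbar1 = (X Pi2 : X1) and M = X(X'X)^{-1}X'. All dimensions are hypothetical.
rng = np.random.default_rng(3)
T, g, k, K = 50, 2, 3, 6
X = rng.normal(size=(T, K))
X1 = X[:, :k]
Pi2 = rng.normal(size=(K, g))
V2 = rng.normal(size=(T, g))
Y2 = X @ Pi2 + V2

M = X @ np.linalg.solve(X.T @ X, X.T)       # projection matrix M
Zhat1 = np.hstack([M @ Y2, X1])             # (Yhat2 : X1)
Zbar1 = np.hstack([X @ Pi2, X1])            # non-stochastic part
Vb = np.hstack([V2, np.zeros((T, k))])      # V2 padded with k zero columns
lhs = Zhat1.T @ Zhat1
rhs = Zbar1.T @ Zbar1 + Vb.T @ Zbar1 + Zbar1.T @ Vb + Vb.T @ M @ Vb
assert np.allclose(lhs, rhs)                # identity holds exactly
```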
Appendix B
In this Appendix we prove Lemma 2. Thus we derive the expected value of $\widehat{\operatorname{Var}}(\alpha^*) = s^{*2}(\hat Z_1'\hat Z_1)^{-1}$ to $o(T^{-2})$ where $s^{*2}$ is given in (11). To find the required expansion we first note that the 2SLS residual vector is given by

$$e^* = y_1 - Z_1\alpha^* = e_1 - Z_1(\alpha^* - \alpha)$$

and the sum of squared residuals is then

$$e^{*\prime}e^{*} = e_1'e_1 - 2(\alpha^* - \alpha)'Z_1'e_1 + (\alpha^* - \alpha)'Z_1'Z_1(\alpha^* - \alpha).$$

The disturbance variance estimator is

$$s^{*2} = \frac{e^{*\prime}e^{*}}{T-(g+k)} = \frac{e_1'e_1}{T-(g+k)} - 2\frac{(\alpha^* - \alpha)'Z_1'e_1}{T-(g+k)} + \frac{(\alpha^* - \alpha)'Z_1'Z_1(\alpha^* - \alpha)}{T-(g+k)} \qquad (B.1)$$

where the first term is $O_p(1)$, the second is $O_p(T^{-1/2})$ and the last is $O_p(T^{-1})$.

The inverse matrix $(\hat Z_1'\hat Z_1)^{-1}$ is similarly expanded in terms of decreasing stochastic order of magnitude in (A.1) as

$$(\hat Z_1'\hat Z_1)^{-1} = Q - Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q - Q\breve V_2'M\breve V_2 Q + Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q + o_p(T^{-2}) \qquad (B.2)$$

where the successive terms are $O(T^{-1})$, $O_p(T^{-3/2})$ and $O_p(T^{-2})$ respectively. To find the appropriate expansion for $s^{*2}(\hat Z_1'\hat Z_1)^{-1}$ we combine (B.1) and (B.2) to yield

$$s^{*2}(\hat Z_1'\hat Z_1)^{-1} = \frac{e_1'e_1}{T-(g+k)}[Q - Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q - Q\breve V_2'M\breve V_2 Q + Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q]$$
$$\quad - 2\frac{(\alpha^* - \alpha)'Z_1'e_1}{T-(g+k)}[Q - Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q] + \frac{(\alpha^* - \alpha)'Z_1'Z_1(\alpha^* - \alpha)}{T-(g+k)}Q + o_p(T^{-2}). \qquad (B.3)$$

We shall take expectations of each of these terms. However the analysis of the first is simplified by noting that

$$\frac{e_1'e_1}{T-(g+k)} = \left(\frac{e_1'e_1}{T-(g+k)} - \sigma^2\right) + \sigma^2$$

where the first part on the right is $O_p(T^{-1/2})$. Hence when $e_1'e_1/(T-(g+k))$ multiplies terms which are $O_p(T^{-2})$, it can be replaced by $\sigma^2$ and we shall still get an approximation to the expectation of the desired order. We thus consider

$$E\left\{\frac{e_1'e_1}{T-(g+k)}[Q - Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q - Q\breve V_2'M\breve V_2 Q]\right\} + E\left\{\frac{e_1'e_1}{T-(g+k)}Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q\right\}$$
$$= E\left\{\frac{e_1'e_1}{T-(g+k)}Q\right\} + E\{\sigma^2[-Q\breve V_2'M\breve V_2 Q]\} + E\{\sigma^2[Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q]\}$$

where terms involving products of an odd number of zero mean normal random variables have been ignored. The first two expected values are

$$E\left\{\frac{e_1'e_1}{T-(g+k)}Q\right\} = \sigma^2\left(1 + \frac{g+k}{T-(g+k)}\right)Q, \qquad E\{\sigma^2[-Q\breve V_2'M\breve V_2 Q]\} = -\sigma^2\operatorname{tr}(M)QCQ = -\sigma^2 KQCQ, \qquad (B.4)$$

see Kiviet and Phillips (1996, p. 166). The third term can be readily evaluated as in (A.2) so that

$$E\{\sigma^2[Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q]\} = \sigma^2(g+k+2)QCQ + \sigma^2\operatorname{tr}(QC)Q. \qquad (B.5)$$

Gathering terms from (B.4) and (B.5), we have found the expectation of the first term of (B.3).

Next we consider the second term in (B.3). To analyse this term we first note the expansion

$$(\alpha^* - \alpha) = Q(\bar Z_1'e_1 + \breve V_2'Me_1) - Q\bar Z_1'\breve V_2 Q\bar Z_1'e_1 - Q\breve V_2'\bar Z_1 Q\bar Z_1'e_1 + o_p(T^{-1}) \qquad (B.6)$$

which is given in Nagar (1959, p. 582). Using this result, and noting that $Z_1'e_1 = \bar Z_1'e_1 + \breve V_2'e_1$ where the right hand side terms are, respectively, $O_p(T^{1/2})$ and $O_p(T)$, we may write

$$\frac{(\alpha^* - \alpha)'Z_1'e_1}{T-(g+k)}[Q - Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q] = \frac{e_1'\breve V_2 Q\bar Z_1'e_1}{T-(g+k)}Q + \frac{e_1'\bar Z_1 Q\bar Z_1'e_1}{T-(g+k)}Q + \frac{e_1'\breve V_2 Q\breve V_2'Me_1}{T-(g+k)}Q$$
$$\quad - \frac{e_1'\breve V_2 Q\bar Z_1'\breve V_2 Q\bar Z_1'e_1}{T-(g+k)}Q - \frac{e_1'\breve V_2 Q\breve V_2'\bar Z_1 Q\bar Z_1'e_1}{T-(g+k)}Q - \frac{e_1'\breve V_2 Q\bar Z_1'e_1}{T-(g+k)}Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q + o_p(T^{-2}). \qquad (B.7)$$

The first of these terms involves a product of an odd number of normal random variables with zero mean and so has expected value zero. Taking expected values for the remaining terms we have

$$E\left\{\frac{e_1'\bar Z_1 Q\bar Z_1'e_1}{T-(g+k)}Q\right\} = \sigma^2\frac{g+k}{T-(g+k)}Q$$
$$E\left\{\frac{e_1'\breve V_2 Q\breve V_2'Me_1}{T-(g+k)}Q\right\} = \sigma^2\frac{TK\operatorname{tr}(QC_1)}{T-(g+k)}Q$$
$$E\left\{-\frac{e_1'\breve V_2 Q\bar Z_1'\breve V_2 Q\bar Z_1'e_1}{T-(g+k)}Q\right\} = -\sigma^2\frac{T\operatorname{tr}(QC_1)}{T-(g+k)}Q$$
$$E\left\{-\frac{e_1'\breve V_2 Q\breve V_2'\bar Z_1 Q\bar Z_1'e_1}{T-(g+k)}Q\right\} = -\sigma^2\frac{T(g+k)\operatorname{tr}(QC_1)}{T-(g+k)}Q$$
$$E\left\{-\frac{e_1'\breve V_2 Q\bar Z_1'e_1}{T-(g+k)}Q(\bar Z_1'\breve V_2 + \breve V_2'\bar Z_1)Q\right\} = -\frac{2\sigma^2}{T-(g+k)}[(T+2)QC_1Q + QC_2Q] = -\frac{2\sigma^2 T}{T-(g+k)}QC_1Q + o_p(T^{-2}). \qquad (B.8)$$

Here the results for evaluating the second, third and fourth terms are given in Nagar (1961, p. 242). An evaluation of the last term proceeds from putting $V_2 = W + e_1\tau'$ where $W$ and $e_1$ are independent. The required analysis is straightforward but lengthy and so is not included here. The authors will provide details on request. Collecting the terms in (B.8) and multiplying by $-2$ yields the expectation of the second term of (B.3).

To complete the analysis we need the expected value of the last term in (B.3). From Nagar (1961, p. 243) we find that

$$E\left\{\frac{(\alpha^* - \alpha)'Z_1'Z_1(\alpha^* - \alpha)}{T-(g+k)}Q\right\} = \frac{\sigma^2[g+k+T\operatorname{tr}(QC)]}{T-(g+k)}Q + o(T^{-2}). \qquad (B.9)$$

Collecting the various terms we have the result given in Lemma 2.
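The residual-sum-of-squares decomposition that underlies (B.1) is an exact algebraic identity, as a quick numerical sketch confirms (all inputs below are hypothetical):

```python
import numpy as np

# Exact check of the decomposition behind (B.1):
# e*'e* = e1'e1 - 2(a*-a)'Z1'e1 + (a*-a)'Z1'Z1(a*-a),
# which holds for any estimate a* since it is pure algebra.
rng = np.random.default_rng(4)
T, n = 40, 3
Z1 = rng.normal(size=(T, n))
alpha = np.array([0.5, -1.0, 2.0])        # "true" coefficients (hypothetical)
e1 = rng.normal(size=T)
y1 = Z1 @ alpha + e1
alpha_star = rng.normal(size=n)           # an arbitrary estimate
e_star = y1 - Z1 @ alpha_star             # residuals at alpha_star
d = alpha_star - alpha
lhs = e_star @ e_star
rhs = e1 @ e1 - 2 * d @ (Z1.T @ e1) + d @ (Z1.T @ Z1) @ d
assert np.isclose(lhs, rhs)               # identity holds exactly
```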
References
Kadane, J.B., 1971. Comparison of k-class estimators when the disturbances are small. Econometrica 39, 723–737. Kinal, T.W., 1980. The existence of moments of k-class estimators. Econometrica 48, 241–249.
Kiviet, J.F., Phillips, G.D.A., 1996. The bias of the ordinary least squares estimator in simultaneous equation models. Economics Letters 53, 161–167.
Kiviet, J.F., Phillips, G.D.A., 1998. Degrees of freedom adjustment for disturbance variance estimators in dynamic regression models. Econometrics Journal 1, 44–70.
Mikhail, W.M., 1972. The bias of the two stage least squares estimator. Journal of the American Statistical Association 67, 625–627.
Nagar, A.L., 1959. The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica 27, 575–595.
Nagar, A.L., 1961. A note on the residual variance estimation in simultaneous equations. Econometrica 29, 238–243.