
Journal of Business & Economic Statistics

ISSN: 0735-0015 (Print) 1537-2707 (Online) Journal homepage: http://www.tandfonline.com/loi/ubes20

Identification and Efficient Estimation of
Simultaneous Equations Network Models
Xiaodong Liu
To cite this article: Xiaodong Liu (2014) Identification and Efficient Estimation of Simultaneous
Equations Network Models, Journal of Business & Economic Statistics, 32:4, 516-536, DOI:
10.1080/07350015.2014.907093
To link to this article: http://dx.doi.org/10.1080/07350015.2014.907093

Accepted author version posted online: 24
Apr 2014.


Identification and Efficient Estimation of
Simultaneous Equations Network Models
Xiaodong LIU
Department of Economics, University of Colorado, Boulder, CO 80309 (xiaodong.liu@colorado.edu)
This article considers identification and estimation of social network models in a system of simultaneous
equations. We show that, with or without row-normalization of the social adjacency matrix, the network
model has different equilibrium implications, needs different identification conditions, and requires different estimation strategies. When the adjacency matrix is not row-normalized, the variation in the Bonacich
centrality across nodes in a network can be used as an IV to identify social interaction effects and improve
estimation efficiency. The number of such IVs depends on the number of networks. When there are many
networks in the data, the proposed estimators may have an asymptotic bias due to the presence of many
IVs. We propose a bias-correction procedure for the many-instrument bias. Simulation experiments show
that the bias-corrected estimators perform well in finite samples. We also provide an empirical example to
illustrate the proposed estimation procedure.
KEY WORDS: Many-instrument bias; Social networks.

1. INTRODUCTION

Since the seminal work by Manski (1993), social network
models have attracted a lot of attention (see Blume et al. 2011,
for a recent survey). In a social network model, agents interact with each other through network connections, which are
captured by a social adjacency matrix. According to Manski
(1993), an agent’s choice may be influenced by peers’ choices
(the endogenous effect), by peers’ exogenous characteristics (the
contextual effect), and/or by the common environment of the
network (the correlated effect). It is the main interest of social
network research to identify and estimate the different social
interaction effects.

Manski (1993) considered a linear-in-means model, where
the endogenous effect is based on the rational expectation of the
choices of all agents in the network. Manski showed that the
linear-in-means specification suffers from the “reflection problem” so that endogenous and contextual effects cannot be separately identified. Lee (2007) introduced a model with multiple
networks where an agent is equally influenced by all the other
agents in the same network. Lee’s social network model can be
identified by the variation in network sizes. The identification,
however, can be weak if all of the networks are large. Bramoullé,
Djebbari, and Fortin (2009) generalized Lee’s social network
model to a general local-average model, where endogenous and
contextual effects are represented, respectively, by the average
choice and average characteristics of an agent’s connections (or
friends).1 Based on the important observation that in a social
network, an agent’s friend’s friend may not be a (direct) friend
of that agent, Bramoullé, Djebbari, and Fortin (2009) used the
intransitivity in network connections as an exclusion restriction
to identify different social interaction effects.
Liu and Lee (2010) considered the estimation of the local-aggregate social network model, where the endogenous effect
is given by the aggregate choice of an agent’s friends. They
showed that, for the local-aggregate model, different positions
of the agents in a network captured by the Bonacich (1987) centrality can be used as an additional instrumental variable (IV)
to improve estimation efficiency. Liu, Patacchini, and Zenou
(2014) gave the identification condition for the local-aggregate
model and showed that the condition is in general weaker than
that for the local-average model derived by Bramoullé, Djebbari, and Fortin (2009). They also proposed a J test for the
specification of network models.
The aforementioned papers focus on single-equation network
models involving only one activity. However, in real life, an
agent’s decision usually involves more than one activity. For example, a student may need to balance time between study and extracurriculars and a firm may need to allocate resources between
production and R&D (König, Liu, and Zenou 2014). In a recent
article, Cohen-Cole, Liu, and Zenou (2012) considered the identification and estimation of local-average network models in
the framework of simultaneous equations. Besides endogenous,
contextual, and correlated effects as in single-equation network
models, the simultaneous equations network model also incorporates the simultaneity effect, where an agent’s choice in a
certain activity may depend on his/her choice in a related activity, and the cross-activity peer effect, where an agent’s choice
in a certain activity may depend on peers’ choices in a related
activity. Cohen-Cole, Liu, and Zenou (2012) derived the identification conditions for the various social interaction effects and
generalized the spatial two-stage least-squares (2SLS) and three-stage least-squares (3SLS) estimators in Kelejian and Prucha
(2004) to estimate the simultaneous equations network model.

In this article, we consider the identification and efficient
estimation of the local-aggregate network model in a system of simultaneous equations. We show that, similar to the
single-equation network model, the Bonacich centrality provides additional information to achieve model identification and
to improve estimation efficiency. We derive the identification
condition for the local-aggregate simultaneous equations network model, and show that the condition can be weaker than
that for the local-average model. For efficient estimation, we
suggest 2SLS and 3SLS estimators using additional IVs based
on the Bonacich centrality of each network. As the number of
such IVs depends on the number of networks, the 2SLS and
3SLS estimators would have an asymptotic many-instrument
bias (Bekker 1994) when there are many networks in the data.
Hence, we propose a bias-correction procedure based on the estimated leading-order term of the asymptotic bias. Monte Carlo
experiments show that the bias-corrected estimators perform
well in finite samples. We also provide an empirical example to
illustrate the proposed estimation procedure.

1 Bramoullé, Djebbari, and Fortin (2009) indicated that the first part of their Proposition 1 applies to arbitrary (non-row-normalized) adjacency matrices.

© 2014 American Statistical Association
Journal of Business & Economic Statistics
October 2014, Vol. 32, No. 4
DOI: 10.1080/07350015.2014.907093

The rest of the article is organized as follows. Section 2
introduces a network game which motivates the specification of
the econometric model presented in Section 3. Section 4 derives
the identification conditions and Section 5 proposes 2SLS and
3SLS estimators for the model. The regularity assumptions and
detailed proofs are given in the Appendix. Monte Carlo evidence
on the finite sample performance of the proposed estimators is
given in Section 6. Section 7 provides an empirical example.
Section 8 briefly concludes.
2. THEORETICAL MODEL

2.1 The Network Game

Suppose there is a finite set of agents $N = \{1, \ldots, n\}$ in a
network. We keep track of social connections in the network
through its adjacency matrix $G = [g_{ij}]$. If the network is undirected, then $G$ is symmetric with $g_{ij} = 1$ if $i$ and $j$ are connected
and $g_{ij} = 0$ otherwise. If the network is directed, then $G$ may
be asymmetric with $g_{ij} = 1$ if an arc is directed from $i$ to $j$ and
$g_{ij} = 0$ otherwise. Note that the undirected network can be considered as a special case of the directed network. Let $G^* = [g^*_{ij}]$,
with $g^*_{ij} = g_{ij} / \sum_{j=1}^{n} g_{ij}$, denote the row-normalized adjacency
matrix such that each row of $G^*$ adds up to 1. Figure 1 gives an
example of $G$ and $G^*$ for an undirected star-shaped network.

Figure 1. An example of $G$ and $G^*$ for an undirected star-shaped network. [Figure: star-shaped network with nodes 1, 2, 3, 4.]

Given the network structure represented by $G$, agent $i$ chooses
$y_{1i}$ and $y_{2i}$, the respective efforts of two related activities, to
maximize the following linear quadratic utility function:

$$
\begin{aligned}
u(y_{1i}, y_{2i}) ={}& \pi^*_{1i} y_{1i} + \pi^*_{2i} y_{2i} - \tfrac{1}{2}\phi^*_1 y_{1i}^2 - \tfrac{1}{2}\phi^*_2 y_{2i}^2 + \phi^* y_{1i} y_{2i} \\
&+ \lambda^*_{11} \sum_{j=1}^{n} g_{ij} y_{1i} y_{1j} + \lambda^*_{22} \sum_{j=1}^{n} g_{ij} y_{2i} y_{2j} \\
&+ \lambda^*_{21} \sum_{j=1}^{n} g_{ij} y_{1i} y_{2j} + \lambda^*_{12} \sum_{j=1}^{n} g_{ij} y_{2i} y_{1j}.
\end{aligned}
\tag{1}
$$

As in the standard linear-quadratic utility for a single activity
model (Ballester, Calvó-Armengol, and Zenou 2006), $\pi^*_{1i}$ and
$\pi^*_{2i}$ capture ex ante individual heterogeneity. The cross-effects
between own efforts for different activities are given by

$$\frac{\partial^2 u(y_{1i}, y_{2i})}{\partial y_{1i}\,\partial y_{2i}} = \phi^*.$$

The cross-effects between own and peer efforts for the same
activity are

$$\frac{\partial^2 u(y_{1i}, y_{2i})}{\partial y_{1i}\,\partial y_{1j}} = \lambda^*_{11} g_{ij} \quad\text{and}\quad \frac{\partial^2 u(y_{1i}, y_{2i})}{\partial y_{2i}\,\partial y_{2j}} = \lambda^*_{22} g_{ij},$$

which may indicate strategic substitutability or complementarity
depending on the signs of $\lambda^*_{11}$ and $\lambda^*_{22}$. The cross-effects between
own and peer efforts for different activities are given by

$$\frac{\partial^2 u(y_{1i}, y_{2i})}{\partial y_{1i}\,\partial y_{2j}} = \lambda^*_{21} g_{ij} \quad\text{and}\quad \frac{\partial^2 u(y_{1i}, y_{2i})}{\partial y_{2i}\,\partial y_{1j}} = \lambda^*_{12} g_{ij},$$

which may indicate strategic substitutability or complementarity
depending on the signs of $\lambda^*_{21}$ and $\lambda^*_{12}$.
From the first-order conditions of utility maximization, we
have the best-response functions:

$$y_{1i} = \phi_1 y_{2i} + \lambda_{11} \sum_{j=1}^{n} g_{ij} y_{1j} + \lambda_{21} \sum_{j=1}^{n} g_{ij} y_{2j} + \pi_{1i}, \tag{2}$$

$$y_{2i} = \phi_2 y_{1i} + \lambda_{22} \sum_{j=1}^{n} g_{ij} y_{2j} + \lambda_{12} \sum_{j=1}^{n} g_{ij} y_{1j} + \pi_{2i}, \tag{3}$$

where $\phi_1 = \phi^*/\phi^*_1$, $\phi_2 = \phi^*/\phi^*_2$, $\lambda_{11} = \lambda^*_{11}/\phi^*_1$, $\lambda_{22} = \lambda^*_{22}/\phi^*_2$,
$\lambda_{21} = \lambda^*_{21}/\phi^*_1$, $\lambda_{12} = \lambda^*_{12}/\phi^*_2$, $\pi_{1i} = \pi^*_{1i}/\phi^*_1$, $\pi_{2i} = \pi^*_{2i}/\phi^*_2$. In
(2) and (3), agent $i$’s best-response effort of a certain activity depends on the aggregate efforts of his/her friends of
that activity and a related activity. Therefore, we call this
model the local-aggregate network game. In matrix form, the
best-response functions are

$$Y_1 = \phi_1 Y_2 + \lambda_{11} G Y_1 + \lambda_{21} G Y_2 + \Pi_1, \tag{4}$$

$$Y_2 = \phi_2 Y_1 + \lambda_{22} G Y_2 + \lambda_{12} G Y_1 + \Pi_2, \tag{5}$$

where $Y_k = (y_{k1}, \ldots, y_{kn})'$ and $\Pi_k = (\pi_{k1}, \ldots, \pi_{kn})'$ for $k = 1, 2$.
The reduced-form equations of (4) and (5) are

$$S Y_1 = (I - \lambda_{22} G)\Pi_1 + (\phi_1 I + \lambda_{21} G)\Pi_2,$$

$$S Y_2 = (I - \lambda_{11} G)\Pi_2 + (\phi_2 I + \lambda_{12} G)\Pi_1,$$


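where $I$ is a conformable identity matrix and

$$S = (1 - \phi_1\phi_2) I - (\lambda_{11} + \lambda_{22} + \phi_1\lambda_{12} + \phi_2\lambda_{21}) G + (\lambda_{11}\lambda_{22} - \lambda_{12}\lambda_{21}) G^2. \tag{6}$$

If $S$ is nonsingular (a sufficient condition for the nonsingularity of $S$ is $|\phi_1\phi_2| + |\lambda_{11} + \lambda_{22} + \phi_1\lambda_{12} + \phi_2\lambda_{21}| \cdot \|G\|_\infty + |\lambda_{11}\lambda_{22} - \lambda_{12}\lambda_{21}| \cdot \|G\|_\infty^2 < 1$, where $\|\cdot\|_\infty$ is the row-sum
matrix norm), then the local-aggregate network game has a
unique Nash equilibrium in pure strategies with the equilibrium
efforts given by

$$Y_1^* = S^{-1}[(I - \lambda_{22} G)\Pi_1 + (\phi_1 I + \lambda_{21} G)\Pi_2], \tag{7}$$

$$Y_2^* = S^{-1}[(I - \lambda_{11} G)\Pi_2 + (\phi_2 I + \lambda_{12} G)\Pi_1]. \tag{8}$$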
2.2 Local Aggregate Versus Local Average: Equilibrium
Comparison
In a recent article, Cohen-Cole, Liu, and Zenou (2012) considered a network game with utility function
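
$$
\begin{aligned}
u(y_{1i}, y_{2i}) ={}& \pi^*_{1i} y_{1i} + \pi^*_{2i} y_{2i} - \tfrac{1}{2}\phi^*_1 y_{1i}^2 - \tfrac{1}{2}\phi^*_2 y_{2i}^2 + \phi^* y_{1i} y_{2i} \\
&+ \lambda^*_{11} \sum_{j=1}^{n} g^*_{ij} y_{1i} y_{1j} + \lambda^*_{22} \sum_{j=1}^{n} g^*_{ij} y_{2i} y_{2j} \\
&+ \lambda^*_{21} \sum_{j=1}^{n} g^*_{ij} y_{1i} y_{2j} + \lambda^*_{12} \sum_{j=1}^{n} g^*_{ij} y_{2i} y_{1j}.
\end{aligned}
\tag{9}
$$
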

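From the first-order conditions of maximizing (9), the best-response functions of the network game are

$$y_{1i} = \phi_1 y_{2i} + \lambda_{11} \sum_{j=1}^{n} g^*_{ij} y_{1j} + \lambda_{21} \sum_{j=1}^{n} g^*_{ij} y_{2j} + \pi_{1i},$$

$$y_{2i} = \phi_2 y_{1i} + \lambda_{22} \sum_{j=1}^{n} g^*_{ij} y_{2j} + \lambda_{12} \sum_{j=1}^{n} g^*_{ij} y_{1j} + \pi_{2i},$$

or, in matrix form,

$$Y_1 = \phi_1 Y_2 + \lambda_{11} G^* Y_1 + \lambda_{21} G^* Y_2 + \Pi_1, \tag{10}$$

$$Y_2 = \phi_2 Y_1 + \lambda_{22} G^* Y_2 + \lambda_{12} G^* Y_1 + \Pi_2. \tag{11}$$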


As $G^*$ is row-normalized, in (10) and (11), agent $i$’s best-response effort of a certain activity depends on the average
efforts of his/her friends of that activity and a related activity.
Therefore, we call this model the local-average network game.
Cohen-Cole, Liu, and Zenou (2012) showed that, if $S^*$ is nonsingular, where

$$S^* = (1 - \phi_1\phi_2) I - (\lambda_{11} + \lambda_{22} + \phi_1\lambda_{12} + \phi_2\lambda_{21}) G^* + (\lambda_{11}\lambda_{22} - \lambda_{12}\lambda_{21}) G^{*2},$$

then the network game with payoffs (9) has a unique Nash
equilibrium in pure strategies given by

$$Y_1^* = S^{*-1}[(I - \lambda_{22} G^*)\Pi_1 + (\phi_1 I + \lambda_{21} G^*)\Pi_2], \tag{12}$$

$$Y_2^* = S^{*-1}[(I - \lambda_{11} G^*)\Pi_2 + (\phi_2 I + \lambda_{12} G^*)\Pi_1]. \tag{13}$$

Although the best-response functions of the local-aggregate
and local-average network games share similar functional forms,
they have different implications. As pointed out by Liu, Patacchini, and Zenou (2014), in the local-aggregate game, even
if agents are ex ante identical in terms of individual attributes
$\Pi_1$ and $\Pi_2$, agents with different positions in the network would
have different equilibrium payoffs. On the other hand, the local-average game is based on the mechanism of social conformism.
Liu, Patacchini, and Zenou (2014) showed that the best-response
function of the local-average network game can be derived from
a setting where an agent will be punished if he deviates from the
“social norm” (the average behavior of his friends). Therefore,
if the agents are identical ex ante, they would behave the same
in equilibrium.
To illustrate this point, suppose the agents in a network are
ex ante identical such that $\Pi_1 = \pi_1 l_n$ and $\Pi_2 = \pi_2 l_n$, where
$\pi_1, \pi_2$ are constant scalars and $l_n$ is an $n \times 1$ vector of ones. As
$G^* l_n = G^{*2} l_n = l_n$, it follows from (12) and (13) that $Y_1^* = c_1 l_n$
and $Y_2^* = c_2 l_n$, where

$$c_1 = \frac{(1 - \lambda_{22})\pi_1 + (\phi_1 + \lambda_{21})\pi_2}{(1 - \phi_1\phi_2) - (\lambda_{11} + \lambda_{22} + \phi_1\lambda_{12} + \phi_2\lambda_{21}) + (\lambda_{11}\lambda_{22} - \lambda_{12}\lambda_{21})},$$

$$c_2 = \frac{(1 - \lambda_{11})\pi_2 + (\phi_2 + \lambda_{12})\pi_1}{(1 - \phi_1\phi_2) - (\lambda_{11} + \lambda_{22} + \phi_1\lambda_{12} + \phi_2\lambda_{21}) + (\lambda_{11}\lambda_{22} - \lambda_{12}\lambda_{21})}.$$

Thus, for the local-average network game, the equilibrium efforts and payoffs are the same for all agents. On the other hand,
for the local-aggregate network game, it follows from (7) and
(8) that

$$Y_1^* = S^{-1}[(I - \lambda_{22} G)\pi_1 + (\phi_1 I + \lambda_{21} G)\pi_2] l_n,$$

$$Y_2^* = S^{-1}[(I - \lambda_{11} G)\pi_2 + (\phi_2 I + \lambda_{12} G)\pi_1] l_n.$$

Thus, the agents would have different equilibrium efforts and
payoffs if $G l_n$ is not proportional to $l_n$, that is, if agents have
different numbers of friends.
Therefore, the local-aggregate and local-average network
games have different equilibrium and policy implications. Understanding the different microfoundations of the two models,
including their different equilibrium implications, may help
us to understand which model is more appropriate for a certain economic context.2 The following sections show that the
econometric model for the local-aggregate network game has
some interesting features that require different identification
conditions and estimation methods from those for the local-average model studied by Cohen-Cole, Liu, and Zenou (2012).

2 It may not always be obvious to see whether the local-average or local-aggregate
model is more appropriate just by economic intuition. Sometimes, a specification
test is needed. Liu, Patacchini, and Zenou (2014) extended the J test to network
models.

3. ECONOMETRIC MODEL

3.1 The Local-Aggregate Simultaneous Equations
Network Model

The econometric network model follows the best-response
functions (4) and (5). Suppose the $n$ observations in the data are
partitioned into $\bar{r}$ networks, with $n_r$ agents in the $r$th network.
For the $r$th network, let

$$\Pi_{1,r} = X_r\beta_1 + G_r X_r\gamma_1 + \alpha_{1,r} l_{n_r} + \epsilon_{1,r},$$

$$\Pi_{2,r} = X_r\beta_2 + G_r X_r\gamma_2 + \alpha_{2,r} l_{n_r} + \epsilon_{2,r},$$


where $X_r$ is an $n_r \times k_x$ matrix of exogenous variables, $G_r$ is the
adjacency matrix of network $r$, $l_{n_r}$ is an $n_r \times 1$ vector of ones,
and $\epsilon_{1,r}, \epsilon_{2,r}$ are $n_r \times 1$ vectors of disturbances. The specification of $\Pi_{1,r}$ and $\Pi_{2,r}$ is quite common for network models (see,
e.g., Bramoullé, Djebbari, and Fortin 2009; Calvó-Armengol,
Patacchini, and Zenou 2009; Liu and Lee 2010). In the spatial
econometrics literature, this specification is known as the spatial
Durbin model (LeSage and Pace 2009). Essentially, this model
specification imposes exclusion restrictions such that $\Pi_{1,r}$ and
$\Pi_{2,r}$ do not depend on the characteristics of indirect connections. It follows by (4) and (5) that $Y_{1,r}$ and $Y_{2,r}$, which are
$n_r \times 1$ vectors of observed choices of two related activities for
the agents in the $r$th network, are given by

$$Y_{1,r} = \phi_1 Y_{2,r} + \lambda_{11} G_r Y_{1,r} + \lambda_{21} G_r Y_{2,r} + X_r\beta_1 + G_r X_r\gamma_1 + \alpha_{1,r} l_{n_r} + \epsilon_{1,r},$$

$$Y_{2,r} = \phi_2 Y_{1,r} + \lambda_{22} G_r Y_{2,r} + \lambda_{12} G_r Y_{1,r} + X_r\beta_2 + G_r X_r\gamma_2 + \alpha_{2,r} l_{n_r} + \epsilon_{2,r}.$$
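Let $\mathrm{diag}\{A_s\}$ denote a “generalized” block diagonal matrix with diagonal blocks being $n_s \times m_s$ matrices $A_s$. For
$k = 1, 2$, let $Y_k = (Y_{k,1}', \ldots, Y_{k,\bar{r}}')'$, $X = (X_1', \ldots, X_{\bar{r}}')'$, $\alpha_k = (\alpha_{k,1}, \ldots, \alpha_{k,\bar{r}})'$, $\epsilon_k = (\epsilon_{k,1}', \ldots, \epsilon_{k,\bar{r}}')'$, $L = \mathrm{diag}\{l_{n_r}\}_{r=1}^{\bar{r}}$ and
$G = \mathrm{diag}\{G_r\}_{r=1}^{\bar{r}}$. Then, for all the $\bar{r}$ networks,

$$Y_1 = \phi_1 Y_2 + \lambda_{11} G Y_1 + \lambda_{21} G Y_2 + X\beta_1 + G X\gamma_1 + L\alpha_1 + \epsilon_1, \tag{14}$$

$$Y_2 = \phi_2 Y_1 + \lambda_{22} G Y_2 + \lambda_{12} G Y_1 + X\beta_2 + G X\gamma_2 + L\alpha_2 + \epsilon_2. \tag{15}$$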




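For $\epsilon_1 = (\epsilon_{11}, \ldots, \epsilon_{1n})'$ and $\epsilon_2 = (\epsilon_{21}, \ldots, \epsilon_{2n})'$, we assume
$E(\epsilon_{1i}) = E(\epsilon_{2i}) = 0$, $E(\epsilon_{1i}^2) = \sigma_1^2$, and $E(\epsilon_{2i}^2) = \sigma_2^2$. Furthermore, we allow the disturbances of the same agent to be
correlated across equations by assuming $E(\epsilon_{1i}\epsilon_{2i}) = \sigma_{12}$ and
$E(\epsilon_{1i}\epsilon_{2j}) = 0$ for $i \neq j$. When $S$ given by (6) is nonsingular, the
reduced-form equations of the model are

$$
\begin{aligned}
Y_1 ={}& S^{-1}[X(\phi_1\beta_2 + \beta_1) + G X(\lambda_{21}\beta_2 - \lambda_{22}\beta_1 + \phi_1\gamma_2 + \gamma_1) \\
&+ G^2 X(\lambda_{21}\gamma_2 - \lambda_{22}\gamma_1) + L(\phi_1\alpha_2 + \alpha_1) \\
&+ G L(\lambda_{21}\alpha_2 - \lambda_{22}\alpha_1)] + S^{-1} u_1,
\end{aligned}
\tag{16}
$$

$$
\begin{aligned}
Y_2 ={}& S^{-1}[X(\phi_2\beta_1 + \beta_2) + G X(\lambda_{12}\beta_1 - \lambda_{11}\beta_2 + \phi_2\gamma_1 + \gamma_2) \\
&+ G^2 X(\lambda_{12}\gamma_1 - \lambda_{11}\gamma_2) + L(\phi_2\alpha_1 + \alpha_2) \\
&+ G L(\lambda_{12}\alpha_1 - \lambda_{11}\alpha_2)] + S^{-1} u_2,
\end{aligned}
\tag{17}
$$

where

$$u_1 = (I - \lambda_{22} G)\epsilon_1 + (\phi_1 I + \lambda_{21} G)\epsilon_2, \tag{18}$$

$$u_2 = (I - \lambda_{11} G)\epsilon_2 + (\phi_2 I + \lambda_{12} G)\epsilon_1. \tag{19}$$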
In this model, we allow network-specific effects $\alpha_{1,r}$ and $\alpha_{2,r}$
to depend on $X$ and $G$ by treating $\alpha_1$ and $\alpha_2$ as $\bar{r} \times 1$ vectors of unknown parameters (as in a fixed-effect panel data
model). When the number of networks $\bar{r}$ is large, we may have
the “incidental parameter” problem (Neyman and Scott 1948).
To avoid this problem, we transform (14) and (15) using a
deviation from group mean projector $J = \mathrm{diag}\{J_r\}_{r=1}^{\bar{r}}$, where
$J_r = I_{n_r} - \frac{1}{n_r} l_{n_r} l_{n_r}'$. This transformation is analogous to the
“within” transformation for fixed-effect panel data models. As
$JL = 0$, the transformed equations are

$$JY_1 = \phi_1 JY_2 + \lambda_{11} JGY_1 + \lambda_{21} JGY_2 + JX\beta_1 + JGX\gamma_1 + J\epsilon_1, \tag{20}$$

$$JY_2 = \phi_2 JY_1 + \lambda_{22} JGY_2 + \lambda_{12} JGY_1 + JX\beta_2 + JGX\gamma_2 + J\epsilon_2. \tag{21}$$


Our identification results and estimation methods are based on
the transformed model.
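
The within transformation can be sketched numerically as follows. The example builds the generalized block-diagonal matrices L and J for three networks and checks that J removes the network fixed effects. The network sizes, the random draws, and the helper name `block_diag` are assumptions introduced for illustration.

```python
import numpy as np

def block_diag(blocks):
    """Generalized block-diagonal matrix; the blocks may be rectangular."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

sizes = [3, 4, 5]   # hypothetical network sizes n_r

# L stacks one column of ones per network; J stacks the projectors J_r.
L = block_diag([np.ones((m, 1)) for m in sizes])
J = block_diag([np.eye(m) - np.ones((m, m)) / m for m in sizes])

rng = np.random.default_rng(0)
alpha = rng.normal(size=len(sizes))   # network fixed effects
eps = rng.normal(size=sum(sizes))     # disturbances
y = L @ alpha + eps                   # outcome containing only fixed effects and noise

print(np.allclose(J @ L, 0))          # the fixed-effect dummies are swept out
print(np.allclose(J @ y, J @ eps))    # only the demeaned disturbance remains
```

Because J is a projector, applying it twice has no further effect, which is why the transformed equations can be treated like any other linear system in the demeaned variables.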
3.2 Identification Challenges

Analogous to the local-average simultaneous equations network model studied by Cohen-Cole, Liu, and Zenou (2012), the
local-aggregate simultaneous equations network model given
by (14) and (15) incorporates (within-activity) endogenous effects, contextual effects, simultaneity effects, cross-activity peer
effects, network-correlated effects, and cross-activity correlated
effects. It is the main purpose of this article to establish identification conditions and propose efficient estimation methods for
the various social interaction effects.
3.2.1 Endogenous Effect and Contextual Effect. The endogenous effect, where an agent’s choice may depend on choices
of his/her friends in the same activity, is captured by the coefficients λ11 and λ22 . The contextual effect, where an agent’s
choice may depend on the exogenous characteristics of his/her
friends, is captured by γ1 and γ2 .
The nonidentification of a social interaction model caused by
the coexistence of those two effects is known as the “reflection
problem” (Manski 1993). For example, in a linear-in-means
model, where an agent is equally affected by all the other agents
in the network and by nobody outside the network, the mean of
endogenous regressor is perfectly collinear with the exogenous
regressors. Hence, endogenous and contextual effects cannot be
separately identified.
In reality, an agent may not be evenly influenced by all the
other agents in a network. In a network model, it is usually assumed that an agent is directly influenced only by his/her friends.
Note that, if individuals i, j are friends and j, k are friends, it
does not necessarily imply that i, k are also friends. Thus, the
intransitivity in network connections provides an exclusion restriction to identify the model. Bramoullé, Djebbari, and Fortin
(2009) showed that if intransitivities exist in a network so that
$I, G^*, G^{*2}$ are linearly independent, then the characteristics of
an agent’s second-order (indirect) friends $G^{*2}X$ can be used as
IVs to identify the endogenous effect from the contextual effect
in the local-average model.3
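
The linear-independence condition on matrix powers can be checked numerically by stacking the vectorized matrices as columns and computing the rank. The sketch below compares a fully transitive network (a complete network of three agents) with an intransitive one (a line network). Both networks and the helper names `n_independent` and `row_normalize` are assumptions introduced for illustration.

```python
import numpy as np

def n_independent(mats, tol=1e-10):
    """Number of linearly independent matrices, via the rank of their vectorizations."""
    V = np.column_stack([M.ravel() for M in mats])
    return np.linalg.matrix_rank(V, tol=tol)

def row_normalize(G):
    """Divide each row by its sum (every agent is assumed to have at least one friend)."""
    return G / G.sum(axis=1, keepdims=True)

I3 = np.eye(3)

# Complete network: every friend of a friend is also a friend.
G_complete = np.ones((3, 3)) - I3

# Line network 1 - 2 - 3: agents 1 and 3 share a friend but are not friends.
G_line = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]], dtype=float)

Gs_complete = row_normalize(G_complete)
Gs_line = row_normalize(G_line)

rank_complete = n_independent([I3, Gs_complete, Gs_complete @ Gs_complete])
rank_line = n_independent([I3, Gs_line, Gs_line @ Gs_line])

print(rank_complete, rank_line)
```

A rank of 3 means the three matrices are linearly independent, so the condition holds; a smaller rank means the square of the row-normalized matrix is a linear combination of the other two.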
On the other hand, when $G_r$ does not have constant row
sums, the number of friends, given by the corresponding element of $G_r l_{n_r}$,
varies across agents. For a local-aggregate model, Liu and Lee
(2010) showed that the Bonacich (1987) centrality, which has
$G_r l_{n_r}$ as the leading-order term, can also be used as an IV
for the endogenous effect. For the local-aggregate seemingly
unrelated regression (SUR) network model with fixed network
effect, we show in the following section that identification is
possible through the intransitivity in network connections and/or
the variation in Bonacich centrality.

3 A stronger identification condition is needed if the network fixed effect is also
included in the model.

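
To make the role of the Bonacich centrality concrete, the sketch below computes the centrality term $G(I - \lambda G)^{-1} l$ for a four-node star network, compares it with its series expansion, and shows that it survives the within transformation only when the adjacency matrix is not row-normalized. The network, the value of λ, and the truncation of the series at 200 terms are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 4-node undirected star network; the first node is taken as the hub.
G = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]], dtype=float)
G_star = G / G.sum(axis=1, keepdims=True)
n = G.shape[0]
I = np.eye(n)
l = np.ones(n)
lam = 0.1                            # illustrative value, small enough for I - lam*G to be invertible

# Centrality term used as an instrument: G (I - lam G)^{-1} l.
b = G @ np.linalg.solve(I - lam * G, l)

# Series form: sum over k >= 0 of lam^k G^(k+1) l, truncated at 200 terms.
series = sum(lam ** k * np.linalg.matrix_power(G, k + 1) @ l for k in range(200))

# Leading-order term: the number of friends of each agent.
degree = G @ l

# Within transformation for a single network.
J = I - np.ones((n, n)) / n

# Same centrality term with the row-normalized matrix.
b_star = G_star @ np.linalg.solve(I - lam * G_star, l)

print(degree)        # number of friends per agent
print(J @ b)         # nonzero: centrality varies across agents
print(J @ b_star)    # numerically zero: no variation left after demeaning
```

With the row-normalized matrix the centrality term is constant within the network, so the network fixed effect absorbs it entirely; this is the sense in which the extra instrument is specific to the local-aggregate model.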
3.2.2 Simultaneity Effect and Cross-Activity Peer Effect.
The simultaneity effect, where an agent’s choice in an activity may depend on his/her choice in a related activity, can be
seen in the coefficients φ1 and φ2 . The cross-activity peer effect,
where an agent’s choice may depend on those of his/her friends
in a related activity, is represented by the coefficients λ21 and
λ12 .
For a standard simultaneous equations model without social
interaction effects, simultaneity leads to a well-known identification problem and the usual remedy is to impose exclusion restrictions on the exogenous variables. Cohen-Cole, Liu,
and Zenou (2012) showed that, with the simultaneity effect or
the cross-activity peer effect (but not both), the local-average
network model can be identified without imposing any exclusion restrictions on $X$, as long as $J, JG^*, JG^{*2}, JG^{*3}$ are linearly independent. In this article, we show that, by exploiting
the variation in Bonacich centrality, the local-aggregate network model with the simultaneity effect or the cross-activity
peer effect (but not both) can be identified under weaker
conditions.
However, neither the intransitivity in G nor the variation in
Bonacich centrality would be enough to identify the simultaneous equations network model with both simultaneity and cross-activity peer effects. One possible approach to achieve identification is to impose exclusion restrictions on X. We show that,
with exclusion restrictions on X, the local-aggregate network
model with both simultaneity and cross-activity peer effects can
be identified under weaker conditions than the local-average
model.
3.2.3 Network Correlated Effect and Cross-Activity Correlated Effect. Furthermore, the structure of the simultaneous
equations network model is flexible enough to allow us to incorporate two types of correlated effects.
First, the network fixed effect given by α1,r and α2,r captures the network correlated effect where agents in the same
network may behave similarly as they have similar unobserved
individual characteristics or they face a similar institutional environment. Therefore, the network fixed effect serves as a (partial)
remedy for the selection bias that originates from the possible
sorting of agents with similar unobserved characteristics into a
network.
Second, in the simultaneous equations network model, the
error terms of the same agent are allowed to be correlated across
equations. The correlation structure of the error term captures
the cross-activity correlated effect so that the choices of the
same agent in related activities could be correlated. As our identification results are based on the mean of reduced-form equations, they are not affected by the correlation structure of the
error term. However, for estimation efficiency, it is important
to take into account the correlation in the disturbances. The
estimators proposed in this article extend the generalized spatial 3SLS estimator in Kelejian and Prucha (2004) to estimate
the simultaneous equations network model in the presence of
many IVs.


4. IDENTIFICATION RESULTS

Among the regularity assumptions listed in Appendix A, Assumption 4 is a sufficient condition for identification of the
simultaneous equations network model. Let Z1 and Z2 denote
the matrices of right-hand side (RHS) variables of (14) and (15).
For Assumption 4 to hold, E(J Z1 ) and E(J Z2 ) need to have full
column rank for large enough n. In this section, we provide
sufficient conditions for E(J Z1 ) to have full column rank. The
sufficient conditions for E(J Z2 ) to have full column rank can
be analogously derived.
In this article, we focus on the case where $G_r$ does not have
constant row sums for some network $r$. When $G_r$ has constant
row sums for all $r$, the equilibrium implication of the local-aggregate network game is similar to that of the local-average
network game and the identification conditions are analogous to
those given in Cohen-Cole, Liu, and Zenou (2012). Henceforth,
let ρ and η (possibly with subscripts) denote some generic constant scalars that may take different values for different uses.
For ease of presentation, we assume G and X are nonstochastic.
Furthermore, in this section, we assume X is a column vector.

4.1 Identification of the SUR Network Model
First, we consider the seemingly unrelated regression (SUR)
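network model where $\phi_1 = \phi_2 = \lambda_{21} = \lambda_{12} = 0$. Thus, (14) and
(15) become

$$Y_1 = \lambda_{11} G Y_1 + X\beta_1 + G X\gamma_1 + L\alpha_1 + \epsilon_1, \tag{22}$$

$$Y_2 = \lambda_{22} G Y_2 + X\beta_2 + G X\gamma_2 + L\alpha_2 + \epsilon_2. \tag{23}$$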

For the SUR network model, an agent’s choice is still allowed
to be correlated with his/her own choices in related activities
through the correlation structure of the disturbances. When φ1 =
φ2 = λ21 = λ12 = 0, it follows from the reduced-form equation
(16) that

$$E(Y_1) = (I - \lambda_{11} G)^{-1}(X\beta_1 + G X\gamma_1 + L\alpha_1). \tag{24}$$

For (22), let $Z_1 = [GY_1, X, GX]$. Identification of (22) requires $E(JZ_1) = [E(JGY_1), JX, JGX]$ to have full column
rank. As $(I - \lambda_{11} G)^{-1} = I + \lambda_{11} G(I - \lambda_{11} G)^{-1}$ and $(I - \lambda_{11} G)^{-1} G = G(I - \lambda_{11} G)^{-1}$, it follows from (24) that

$$E(JGY_1) = JGX\beta_1 + JG^2(I - \lambda_{11} G)^{-1} X(\lambda_{11}\beta_1 + \gamma_1) + JG(I - \lambda_{11} G)^{-1} L\alpha_1.$$
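
The intermediate algebra behind this expression can be sketched as follows, using only (24) and the two matrix identities just stated. The shorthand $M = (I - \lambda_{11}G)^{-1}$ is introduced here for readability and is not the article's notation.

```latex
\begin{align*}
G\,E(Y_1) &= G M (X\beta_1 + G X\gamma_1 + L\alpha_1),
  \qquad M = (I - \lambda_{11} G)^{-1} \\
 &= G (I + \lambda_{11} G M) X\beta_1 + G^2 M X\gamma_1 + G M L\alpha_1 \\
 &= G X\beta_1 + G^2 M X(\lambda_{11}\beta_1 + \gamma_1) + G M L\alpha_1 .
\end{align*}
```

The second line applies $M = I + \lambda_{11} G M$ to the $X\beta_1$ term and $MG = GM$ to the $GX\gamma_1$ term; premultiplying the last line by $J$ gives the stated expression, since $G$ and $X$ are treated as nonstochastic.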

If $\lambda_{11}\beta_1 + \gamma_1 \neq 0$, $JG^2(I - \lambda_{11} G)^{-1} X$, with the leading order
term $JG^2 X$, can be used as IVs for the endogenous regressor
$JGY_1$. On the other hand, if $G_r$ does not have constant row sums
for all $r = 1, \ldots, \bar{r}$, then $JG(I - \lambda_{11} G)^{-1} L \neq 0$. As pointed
out by Liu and Lee (2010), the Bonacich centrality given by
$G(I - \lambda_{11} G)^{-1} L$, with the leading order term $GL$, can be used
as additional IVs for identification. The following proposition
gives a sufficient condition for $E(JZ_1)$ to have full column
rank.

Figure 2. An example where the local-aggregate model can be identified by Proposition 1(ii). [Figure: directed graph with nodes 1, 2, 3.]

Proposition 1. Suppose $G_r$ has nonconstant row sums for
some network $r$. For Equation (22), $E(JZ_1)$ has full column
rank if
(i) $\alpha_{1,r} \neq 0$ or $\lambda_{11}\beta_1 + \gamma_1 \neq 0$, and $I_{n_r}, G_r, G_r^2$ are linearly
independent; or
(ii) $G_r^2 = \rho_1 I_{n_r} + \rho_2 G_r$ and $\alpha_{1,r}(1 - \rho_2\lambda_{11} - \rho_1\lambda_{11}^2) \neq 0$.
The identification condition for the local-aggregate SUR
model given in Proposition 1 is, in general, weaker than that for
the local-average SUR model. As pointed out by Bramoullé,
Djebbari, and Fortin (2009),4 identification of the local-average
model requires the linear independence of $I, G^*, G^{*2}, G^{*3}$. Consider a dataset with $\bar{r}$ networks, where all networks in the data
are represented by the graph in Figure 1. For the $\bar{r}$ networks,
$G = \mathrm{diag}\{G_r\}_{r=1}^{\bar{r}}$ where $G_r$ is given by the adjacency matrix
in Figure 1. For the row-normalized adjacency matrix $G^*$, it is
easy to see that $G^{*3} = G^*$. Therefore, it follows by Proposition 5 of Bramoullé, Djebbari, and Fortin (2009) that the local-average SUR model is not identified. On the other hand, for
the network in Figure 1, $G_r$ has nonconstant row sums and
$I_4, G_r, G_r^2$ are linearly independent. Hence, by Proposition 1(i),
the local-aggregate SUR model can be identified if $\alpha_{1,r} \neq 0$ or
$\lambda_{11}\beta_1 + \gamma_1 \neq 0$.
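
The contrast between the two models in this example can be checked numerically. The sketch below builds a dataset of identical four-node star networks, evaluates $E(JZ_1) = [E(JGY_1), JX, JGX]$ from (24) with either the unnormalized or the row-normalized adjacency matrix, and reports the numerical rank. The number of networks, the parameter values, the random regressor, and the helper name `rank_EJZ1` are assumptions introduced for illustration.

```python
import numpy as np

R = 5                                          # hypothetical number of identical star networks
G_r = np.array([[0, 1, 1, 1],
                [1, 0, 0, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 0]], dtype=float)    # the first node is taken as the hub
Gs_r = G_r / G_r.sum(axis=1, keepdims=True)    # row-normalized version

n_r = G_r.shape[0]
J = np.kron(np.eye(R), np.eye(n_r) - np.ones((n_r, n_r)) / n_r)   # within projector
L = np.kron(np.eye(R), np.ones((n_r, 1)))                         # network dummies

rng = np.random.default_rng(0)
X = rng.normal(size=(R * n_r, 1))              # single exogenous regressor
lam11, beta1, gamma1 = 0.1, 1.0, 0.5           # illustrative values with lam11*beta1 + gamma1 != 0
alpha1 = np.ones((R, 1))                       # nonzero network fixed effects

def rank_EJZ1(W_r):
    """Numerical rank of E(J Z1) = [E(J W Y1), J X, J W X], with E(Y1) taken from (24)."""
    W = np.kron(np.eye(R), W_r)
    n = W.shape[0]
    EY1 = np.linalg.solve(np.eye(n) - lam11 * W,
                          X * beta1 + (W @ X) * gamma1 + L @ alpha1)
    EJZ1 = np.hstack([J @ W @ EY1, J @ X, J @ W @ X])
    return np.linalg.matrix_rank(EJZ1, tol=1e-8)

rank_aggregate = rank_EJZ1(G_r)    # local-aggregate model
rank_average = rank_EJZ1(Gs_r)     # local-average model
print(rank_aggregate, rank_average)
```

Full column rank here means a rank of 3. In this setup the local-aggregate matrix reaches it, while the local-average matrix does not, because the demeaned second-order term is collinear with the demeaned first-order term within each star and the fixed-effect term is removed by the within transformation.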
Figure 2 gives an example where the condition in Proposition
1(ii) is satisfied. For the directed graph in Figure 2, it is easy to
see that $G_r^2 = \rho_1 I_3 + \rho_2 G_r$ for $\rho_1 = \rho_2 = 0$. As the row sums
of $G_r$ are not constant, by Proposition 1(ii), the local-aggregate
SUR model is identified if $\alpha_{1,r} \neq 0$.
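
A directed network with the stated property can be sketched as follows. The arcs used below (from node 1 to nodes 2 and 3) are a hypothetical choice that satisfies the relation with both coefficients equal to zero; they are not read off Figure 2, which is not reproduced in this text.

```python
import numpy as np

# Hypothetical directed network on 3 nodes with arcs 1 -> 2 and 1 -> 3.
G = np.array([[0, 1, 1],
              [0, 0, 0],
              [0, 0, 0]], dtype=float)

l = np.ones(3)
lam = 0.3                         # illustrative value

G2_is_zero = np.allclose(G @ G, 0)
row_sums = G.sum(axis=1)

# When the square of G vanishes, (I - lam*G)^{-1} equals I + lam*G exactly,
# so the centrality term G (I - lam*G)^{-1} l reduces to G l.
b = G @ np.linalg.solve(np.eye(3) - lam * G, l)

print(G2_is_zero, row_sums, b)
```

With both coefficients equal to zero, the factor multiplying the fixed effect in Proposition 1(ii) equals one, so the condition reduces to a nonzero fixed effect.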
The following corollary shows that for a network with symmetric adjacency matrix Gr , the local-aggregate SUR model can
be identified if Gr has nonconstant row sums.
Corollary 1. Suppose $\lambda_{11}\beta_1 + \gamma_1 \neq 0$ or $\alpha_{1,r} \neq 0$ for some
network $r$. Then, for Equation (22), $E(JZ_1)$ has full column
rank if $G_r$ is symmetric and has nonconstant row sums.
4 As the identification conditions given in Bramoullé, Djebbari, and Fortin (2009)
are based on the mean of reduced-form equations, they are not affected by the
correlation structure of the error term. Hence, they can be applied to the SUR
model.

4.2 Identification of the Simultaneous Equations
Network Model
4.2.1 Identification Under the Restrictions $\lambda_{21} = \lambda_{12} = 0$.
For the simultaneous equations network model, besides the endogenous, contextual, and correlated effects, first we incorporate
the simultaneity effect so that an agent’s choice is allowed to depend on his/her own choice in a related activity. However, we assume that an agent’s choice is not affected by the friends’ choices
in related activities. Under the restrictions $\lambda_{21} = \lambda_{12} = 0$, (14)
and (15) become

$$Y_1 = \phi_1 Y_2 + \lambda_{11} G Y_1 + X\beta_1 + G X\gamma_1 + L\alpha_1 + \epsilon_1, \tag{25}$$

$$Y_2 = \phi_2 Y_1 + \lambda_{22} G Y_2 + X\beta_2 + G X\gamma_2 + L\alpha_2 + \epsilon_2. \tag{26}$$

For (25), let Z1 = [Y2 , GY1 , X, GX]. For identification of the
simultaneous equations model, E(J Z1 ) is required to have full
column rank for large enough n. The following proposition gives
sufficient conditions for E(J Z1 ) to have full column rank.
Proposition 2. Suppose $G_r$ has nonconstant row sums and
$I_{n_r}, G_r, G_r^2$ are linearly independent for some network $r$.
• When $[l_{n_r}, G_r l_{n_r}, G_r^2 l_{n_r}]$ has full column rank, $E(JZ_1)$ of
Equation (25) has full column rank if
(i) $I_{n_r}, G_r, G_r^2, G_r^3$ are linearly independent and $D_1$ given
by (B.1) has full rank; or
(ii) $G_r^3 = \rho_1 I_{n_r} + \rho_2 G_r + \rho_3 G_r^2$ and $D_1^*$ given by (B.2)
has full rank.
• When $G_r^2 l_{n_r} = \eta_1 l_{n_r} + \eta_2 G_r l_{n_r}$, $E(JZ_1)$ of Equation (25)
has full column rank if
(iii) $I_{n_r}, G_r, G_r^2, G_r^3$ are linearly independent and $D_1$
given by (B.3) has full rank; or
(iv) $G_r^3 = \rho_1 I_{n_r} + \rho_2 G_r + \rho_3 G_r^2$ and $D_1$ given by (B.4)
has full rank.
Cohen-Cole, Liu, and Zenou (2012) showed that a sufficient condition to identify the local-average simultaneous equations model under the restrictions $\lambda_{21} = \lambda_{12} = 0$ requires that $J, JG^*, JG^{*2}, JG^{*3}$ are linearly independent. The sufficient conditions to identify the restricted local-aggregate simultaneous equations model given by Proposition 2 are weaker in general. Consider a dataset where all networks are given by the graph in Figure 3. It is easy to see that, for the row-normalized adjacency matrix $G^* = \mathrm{diag}\{G_r^*\}_{r=1}^{\bar r}$, where $G_r^*$ is given in Figure 3, $G^{*3} = -\frac{1}{4} I + \frac{1}{4} G^* + G^{*2}$. Therefore, the condition to identify the local-average model does not hold. On the other hand, for the rth network in the data, $G_r^2 l_5 = 4 l_5 + G_r l_5$. As the row sums of $G_r$ are not constant and $I_5, G_r, G_r^2, G_r^3$ are linearly independent, the local-aggregate model can be identified according to Proposition 2(iii).
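The claims made for Figure 3 can be checked numerically. The sketch below assumes that Figure 3 is the five-node graph in which node 1 is linked to every other node and the remaining nodes form two linked pairs. This edge list is an assumption inferred from the identities stated in the text, not read from the figure, and the helper name `n_independent` is illustrative.

```python
import numpy as np

# ASSUMPTION: Figure 3 is the five-node graph with node 1 linked to all others
# and two extra links among the remaining nodes. The pairing (2-3, 4-5) is
# hypothetical; by symmetry it does not affect the checks.
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
n = 5
G = np.zeros((n, n))
for i, j in edges:
    G[i - 1, j - 1] = G[j - 1, i - 1] = 1.0

ones = np.ones(n)
I = np.eye(n)
G_star = G / G.sum(axis=1, keepdims=True)  # row-normalized adjacency matrix
mpow = np.linalg.matrix_power


def n_independent(mats):
    """Number of linearly independent matrices (rank of their stacked vec's)."""
    return np.linalg.matrix_rank(np.column_stack([m.ravel() for m in mats]))


# Local-average side: G*^3 is a linear combination of I, G*, G*^2.
local_average_identity = np.allclose(
    mpow(G_star, 3), -0.25 * I + 0.25 * G_star + mpow(G_star, 2)
)
rank_star = n_independent([I, G_star, mpow(G_star, 2), mpow(G_star, 3)])

# Local-aggregate side: conditions used in Proposition 2(iii).
row_sums = G.sum(axis=1)
nonconstant_row_sums = row_sums.max() > row_sums.min()
row_sum_identity = np.allclose(mpow(G, 2) @ ones, 4 * ones + G @ ones)
rank_powers = n_independent([I, G, mpow(G, 2), mpow(G, 3)])
```

Under this assumed graph, the powers of the row-normalized matrix are linearly dependent while $I_5, G_r, G_r^2, G_r^3$ are not, which is the contrast the paragraph draws.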
For another example where identification is possible for the local-aggregate model but not for the local-average model, let us revisit the network given by the graph in Figure 1. For a dataset with $\bar r$ such networks, as $G^{*3} = G^*$, the condition to identify the local-average simultaneous equations model given by Cohen-Cole, Liu, and Zenou (2012) does not hold. On the other hand, for the adjacency matrix without row-normalization, $G_r^3 = 3G_r$ and $G_r^2 l_4 = 3 l_4$. As the row sums of $G_r$ are not constant and $I_4, G_r, G_r^2$ are linearly independent, the local-aggregate simultaneous equations model can be identified according to Proposition 2(iv).
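The same kind of check applies to Figure 1. The sketch below assumes that Figure 1 is the four-node star (one hub linked to three otherwise unconnected nodes); that shape is an assumption inferred from the identities stated in the text, and the variable names are illustrative.

```python
import numpy as np

# ASSUMPTION: Figure 1 is the four-node star with node 1 as the hub.
n = 4
G = np.zeros((n, n))
G[0, 1:] = 1.0
G[1:, 0] = 1.0

ones = np.ones(n)
I = np.eye(n)
G_star = G / G.sum(axis=1, keepdims=True)  # row-normalized adjacency matrix
mpow = np.linalg.matrix_power

# Local-average side: G*^3 = G*, so J, JG*, JG*^2, JG*^3 cannot be independent.
star_cubed_equals_star = np.allclose(mpow(G_star, 3), G_star)

# Local-aggregate side: conditions used in Proposition 2(iv).
cube_identity = np.allclose(mpow(G, 3), 3 * G)
row_sum_identity = np.allclose(mpow(G, 2) @ ones, 3 * ones)
row_sums = G.sum(axis=1)
nonconstant_row_sums = row_sums.max() > row_sums.min()
rank_I_G_G2 = np.linalg.matrix_rank(
    np.column_stack([I.ravel(), G.ravel(), mpow(G, 2).ravel()])
)
```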

Figure 3. An example where the local-aggregate model can be identified by Proposition 2(iii).

Figure 4 provides an example where the condition in Proposition 2(ii) is satisfied. For the directed network in Figure 4, $G_r^3 = 0$. As $[l_3, G_r l_3, G_r^2 l_3]$ has full column rank and $I_3, G_r, G_r^2$ are linearly independent, the local-aggregate simultaneous equations model can be identified according to Proposition 2(ii).
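A directed three-node chain is one graph with the properties attributed to Figure 4. The sketch below assumes that shape (links 1→2 and 2→3); the figure itself may differ, so this is only an illustration of the conditions in Proposition 2(ii).

```python
import numpy as np

# ASSUMPTION: a directed chain 1 -> 2 -> 3 stands in for Figure 4.
G = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
ones = np.ones(3)
G2 = G @ G

# G^3 = 0, i.e. G^3 = rho1*I + rho2*G + rho3*G^2 with all rho equal to zero.
cube_is_zero = np.allclose(G2 @ G, 0.0)

# [l, G l, G^2 l] has full column rank.
rank_l = np.linalg.matrix_rank(np.column_stack([ones, G @ ones, G2 @ ones]))

# I, G, G^2 are linearly independent.
rank_I_G_G2 = np.linalg.matrix_rank(
    np.column_stack([np.eye(3).ravel(), G.ravel(), G2.ravel()])
)
```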
4.2.2 Identification Under the Restrictions φ1 = φ2 = 0.
Next, let us consider the simultaneous equations model where an
agent’s choice is allowed to depend on his/her friends’ choices
of the same activity and a related activity. This specification
incorporates the endogenous, contextual, correlated, and cross-activity peer effects, but excludes the simultaneity effect. Under the restrictions $\phi_1 = \phi_2 = 0$, (14) and (15) become
$$Y_1 = \lambda_{11} G Y_1 + \lambda_{21} G Y_2 + X\beta_1 + GX\gamma_1 + L\alpha_1 + \epsilon_1, \tag{27}$$
$$Y_2 = \lambda_{22} G Y_2 + \lambda_{12} G Y_1 + X\beta_2 + GX\gamma_2 + L\alpha_2 + \epsilon_2. \tag{28}$$
For (27), let Z1 = [GY1 , GY2 , X, GX]. The following proposition gives sufficient conditions for E(J Z1 ) to have full column
rank.
Proposition 3. Suppose $G_r$ has nonconstant row sums and $I_{n_r}, G_r, G_r^2$ are linearly independent for some network r.
• When $[l_{n_r}, G_r l_{n_r}, G_r^2 l_{n_r}]$ has full column rank, $E(JZ_1)$ of Equation (27) has full column rank if
(i) $I_{n_r}, G_r, G_r^2, G_r^3$ are linearly independent and $D_2$ given by (B.5) has full rank; or
(ii) $G_r^3 = \rho_1 I_{n_r} + \rho_2 G_r + \rho_3 G_r^2$ and $D_2^*$ given by (B.6) has full rank.
• When $G_r^2 l_{n_r} = \eta_1 l_{n_r} + \eta_2 G_r l_{n_r}$, $E(JZ_1)$ of Equation (27) has full column rank if
(iii) $I_{n_r}, G_r, G_r^2, G_r^3$ are linearly independent and D2 given by (B.7) has full rank; or
(iv) $G_r^3 = \rho_1 I_{n_r} + \rho_2 G_r + \rho_3 G_r^2$ and D2 given by (B.8) has full rank.

For the local-average simultaneous equations model under the restrictions $\phi_1 = \phi_2 = 0$, Cohen-Cole, Liu, and Zenou (2012) gave a sufficient identification condition that requires $J, JG^*, JG^{*2}, JG^{*3}$ to be linearly independent. The sufficient identification conditions for the local-aggregate model given by Proposition 3 are weaker in general. As explained in the preceding section, for the network given by the graph in Figure 1 or Figure 3, the identification condition for the local-average model does not hold. On the other hand, if $G_r$ is given by Figure 3, the local-aggregate model can be identified according to Proposition 3(iii) since the row sums of $G_r$ are not constant, $G_r^2 l_5 = 4 l_5 + G_r l_5$, and $I_5, G_r, G_r^2, G_r^3$ are linearly independent. Similarly, if $G_r$ is given by Figure 1, the local-aggregate model can be identified according to Proposition 3(iv) since the row sums of $G_r$ are not constant, $G_r^2 l_4 = 3 l_4$, $G_r^3 = 3G_r$, and $I_4, G_r, G_r^2$ are linearly independent.
4.2.3 Nonidentification of the General Simultaneous Equations Model. For the general simultaneous equations model given by (14) and (15), the various social interaction effects cannot be separately identified through the mean of the RHS variables without imposing any exclusion restrictions. This is because $E(\bar Z_1)$ and $E(\bar Z_2)$, where $\bar Z_1 = [Y_2, GY_1, GY_2, X, GX, L]$ and $\bar Z_2 = [Y_1, GY_2, GY_1, X, GX, L]$, do not have full column rank as shown in the following proposition.
Proposition 4. For (14) and (15), E(Z̄1 ) and E(Z̄2 ) do not
have full column rank.
Proposition 4 shows that, for the general simultaneous equations model with both simultaneity and cross-activity peer effects, exploiting the intransitivities in social connections and/or
variations in Bonacich centrality does not provide enough
exclusion restrictions for identification. One way to achieve
identification is to impose exclusion restrictions on the exogenous variables. Consider the following model
$$Y_1 = \phi_1 Y_2 + \lambda_{11} G Y_1 + \lambda_{21} G Y_2 + X_1\beta_1 + G X_1\gamma_1 + L\alpha_1 + \epsilon_1, \tag{29}$$
$$Y_2 = \phi_2 Y_1 + \lambda_{22} G Y_2 + \lambda_{12} G Y_1 + X_2\beta_2 + G X_2\gamma_2 + L\alpha_2 + \epsilon_2, \tag{30}$$
where, for ease of presentation, we assume $X_1, X_2$ are vectors and $[X_1, X_2]$ has full column rank. From the reduced-form Equations (16) and (17), we have
$$E(Y_1) = S^{-1}[X_1\beta_1 + GX_1(\gamma_1 - \lambda_{22}\beta_1) - G^2 X_1\lambda_{22}\gamma_1 + X_2\phi_1\beta_2 + GX_2(\lambda_{21}\beta_2 + \phi_1\gamma_2) + G^2 X_2\lambda_{21}\gamma_2 + L(\alpha_1 + \phi_1\alpha_2) + GL(\lambda_{21}\alpha_2 - \lambda_{22}\alpha_1)], \tag{31}$$
$$E(Y_2) = S^{-1}[X_2\beta_2 + GX_2(\gamma_2 - \lambda_{11}\beta_2) - G^2 X_2\lambda_{11}\gamma_2 + X_1\phi_2\beta_1 + GX_1(\lambda_{12}\beta_1 + \phi_2\gamma_1) + G^2 X_1\lambda_{12}\gamma_1 + L(\alpha_2 + \phi_2\alpha_1) + GL(\lambda_{12}\alpha_1 - \lambda_{11}\alpha_2)], \tag{32}$$
where $S$ is given by (6). For (29), let $Z_1 = [Y_2, GY_1, GY_2, X_1, GX_1]$. The following proposition gives sufficient conditions for $E(JZ_1)$ to have full column rank.
Proposition 5. Suppose $G_r$ has nonconstant row sums for some network r.
• When $[l_{n_r}, G_r l_{n_r}, G_r^2 l_{n_r}]$ has full column rank, $E(JZ_1)$ of Equation (29) has full column rank if
(i) $I_{n_r}, G_r, G_r^2, G_r^3$ are linearly independent and $D_3$ given by (B.9) has full rank; or
(ii) $I_{n_r}, G_r, G_r^2$ are linearly independent, $G_r^3 = \rho_1 I_{n_r} + \rho_2 G_r + \rho_3 G_r^2$ and $D_3^*$ given by (B.10) has full rank.
• When $G_r^2 l_{n_r} = \eta_1 l_{n_r} + \eta_2 G_r l_{n_r}$, $E(JZ_1)$ of Equation (29) has full column rank if
(iii) $I_{n_r}, G_r, G_r^2, G_r^3$ are linearly independent and D3 given by (B.11) has full rank;
(iv) $I_{n_r}, G_r, G_r^2$ are linearly independent, $G_r^3 = \rho_1 I_{n_r} + \rho_2 G_r + \rho_3 G_r^2$ and D3 given by (B.12) has full rank; or
(v) $G_r^2 = \eta_1 I_{n_r} + \eta_2 G_r$ and D3 given by (B.13) has full rank.
For the general local-average simultaneous equations model, Cohen-Cole, Liu, and Zenou (2012) provided a sufficient identification condition that requires $J, JG^*, JG^{*2}$ to be linearly independent. Suppose $G^* = \mathrm{diag}\{G_r^*\}_{r=1}^{\bar r}$ where $G_r^*$ is given by Figure 1. It is easy to see that $JG^{*2} = -JG^*$. Therefore, the identification condition for the local-average model does not hold. On the other hand, as $G_r$ given by Figure 1 has nonconstant row sums and $I_4, G_r, G_r^2$ are linearly independent, the local-aggregate model given by (29) and (30) can be identified according to Proposition 5(iv).

Figure 4. An example where the local-aggregate model can be identified by Proposition 2(ii).

5. ESTIMATION

5.1 The 2SLS Estimator With Many IVs
The general simultaneous equations model given by (29) and (30) can be written more compactly as
$$Y_1 = Z_1\delta_1 + L\alpha_1 + \epsilon_1 \quad\text{and}\quad Y_2 = Z_2\delta_2 + L\alpha_2 + \epsilon_2, \tag{33}$$
where $Z_1 = [Y_2, GY_1, GY_2, X_1, GX_1]$, $Z_2 = [Y_1, GY_2, GY_1, X_2, GX_2]$, $\delta_1 = (\phi_1, \lambda_{11}, \lambda_{21}, \beta_1', \gamma_1')'$, and $\delta_2 = (\phi_2, \lambda_{22}, \lambda_{12}, \beta_2', \gamma_2')'$. As $JL = 0$, the within transformation with projector $J$ gives $JY_1 = JZ_1\delta_1 + J\epsilon_1$ and $JY_2 = JZ_2\delta_2 + J\epsilon_2$. From the reduced-form Equations (16) and (17), we have
$$JZ_1 = E(JZ_1) + U_1 = F_1 + U_1, \qquad JZ_2 = E(JZ_2) + U_2 = F_2 + U_2, \tag{34}$$
where $F_1 = J[E(Y_2), GE(Y_1), GE(Y_2), X_1, GX_1]$, $F_2 = J[E(Y_1), GE(Y_2), GE(Y_1), X_2, GX_2]$, $U_1 = J[S^{-1}u_2, GS^{-1}u_1, GS^{-1}u_2, 0]$, and $U_2 = J[S^{-1}u_1, GS^{-1}u_2, GS^{-1}u_1, 0]$, with $E(Y_1), E(Y_2)$ given by (31) and (32), $S$ given by (6), and $u_1, u_2$ given by (18) and (19).
Based on (34), the best IVs for J Z1 and J Z2 are F1 and F2 ,
respectively (Lee 2003). However, both F1 and F2 are infeasible
as they involve unknown parameters. Hence, we use linear combinations of feasible IVs to approximate F1 and F2 as in Kelejian
and Prucha (2004) and Liu and Lee (2010). Let $\bar G = \phi_1\phi_2 I + (\lambda_{11} + \lambda_{22} + \phi_1\lambda_{12} + \phi_2\lambda_{21})G - (\lambda_{11}\lambda_{22} - \lambda_{12}\lambda_{21})G^2$. Under some regularity conditions, we have $\|\bar G\|_\infty < 1$. Then, $S^{-1} = (I - \bar G)^{-1} = \sum_{j=0}^{\infty} \bar G^j = \sum_{j=0}^{p} \bar G^j + \bar G^{p+1} S^{-1}$. It follows that $\|S^{-1} - \sum_{j=0}^{p} \bar G^j\|_\infty \le \|\bar G\|_\infty^{p+1} \|S^{-1}\|_\infty$. As $\|\bar G\|_\infty < 1$, the approximation error of $\sum_{j=0}^{p} \bar G^j$ diminishes at a geometric rate as $p \to \infty$. Since $\sum_{j=0}^{p} \bar G^j$ can be considered as a linear combination of $[I, G, \ldots, G^{2p}]$, the best IVs $F_1$ and $F_2$ can be approximated by a linear combination of an $n \times K$ IV matrix
$$Q_K = J[X_1, GX_1, \ldots, G^{2p+3}X_1, X_2, GX_2, \ldots, G^{2p+3}X_2, GL, \ldots, G^{2p+2}L] \tag{35}$$
with an approximation error diminishing very fast when $K$ (or $p$) goes to infinity, as required by Assumption 5 in Appendix A. Let $P_K = Q_K(Q_K'Q_K)^{-}Q_K'$. The many-instrument 2SLS estimators for $\delta_1$ and $\delta_2$ are $\hat\delta_{1,2sls} = (Z_1'P_KZ_1)^{-1}Z_1'P_KY_1$ and $\hat\delta_{2,2sls} = (Z_2'P_KZ_2)^{-1}Z_2'P_KY_2$.⁵
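The truncation bound on $S^{-1}$ used above can be illustrated numerically. The sketch below draws a hypothetical random adjacency matrix and hypothetical parameter values, chosen only so that $\|\bar G\|_\infty < 1$, and compares the truncation error with the stated bound. None of these numbers come from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# Hypothetical sparse directed 0/1 adjacency matrix with an empty diagonal.
G = (rng.random((n, n)) < 0.1).astype(float)
np.fill_diagonal(G, 0.0)

# Hypothetical parameter values, small enough that ||G_bar||_inf < 1.
phi1, phi2 = 0.2, 0.1
lam11, lam22, lam12, lam21 = 0.03, 0.02, 0.01, 0.01

G_bar = (phi1 * phi2 * np.eye(n)
         + (lam11 + lam22 + phi1 * lam12 + phi2 * lam21) * G
         - (lam11 * lam22 - lam12 * lam21) * (G @ G))


def inf_norm(A):
    """Maximum absolute row sum."""
    return np.linalg.norm(A, ord=np.inf)


S_inv = np.linalg.inv(np.eye(n) - G_bar)


def truncation_error(p):
    """|| S^{-1} - sum_{j=0}^{p} G_bar^j ||_inf."""
    partial = sum(np.linalg.matrix_power(G_bar, j) for j in range(p + 1))
    return inf_norm(S_inv - partial)


def bound(p):
    """|| G_bar ||_inf^{p+1} * || S^{-1} ||_inf."""
    return inf_norm(G_bar) ** (p + 1) * inf_norm(S_inv)


errors = [truncation_error(p) for p in range(6)]
bounds = [bound(p) for p in range(6)]
```

Each entry of `errors` should sit below the matching entry of `bounds`, and both shrink geometrically in `p`, which is why a moderate number of powers of $G$ in $Q_K$ can approximate the best IVs well.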
Let $H_{11} = \lim_{n\to\infty} \frac{1}{n} F_1'F_1$ and $H_{22} = \lim_{n\to\infty} \frac{1}{n} F_2'F_2$. The following proposition establishes the consistency and asymptotic normality of the many-instrument 2SLS estimator.
Proposition 6. Under Assumptions 1–5, if $K \to \infty$ and $K/n \to 0$, then $\sqrt{n}(\hat\delta_{1,2sls} - \delta_1 - b_{1,2sls}) \xrightarrow{d} N(0, \sigma_1^2 H_{11}^{-1})$ and $\sqrt{n}(\hat\delta_{2,2sls} - \delta_2 - b_{2,2sls}) \xrightarrow{d} N(0, \sigma_2^2 H_{22}^{-1})$, where $b_{1,2sls} = (Z_1'P_KZ_1)^{-1}E(U_1'P_K\epsilon_1) = O_p(K/n)$ and $b_{2,2sls} = (Z_2'P_KZ_2)^{-1}E(U_2'P_K\epsilon_2) = O_p(K/n)$.
From Proposition 6, when the number of IVs $K$ grows at a rate slower than the sample size $n$, the 2SLS estimators are consistent and asymptotically normal. However, the asymptotic distribution of the 2SLS estimator may not center around the true parameter value due to the presence of many-instrument bias (see, e.g., Bekker 1994). If $K^2/n \to 0$, then $\sqrt{n}\, b_{1,2sls} \xrightarrow{p} 0$, $\sqrt{n}\, b_{2,2sls} \xrightarrow{p} 0$ and the 2SLS estimators are properly centered.
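The construction of $Q_K$ in (35) and the projection form of the 2SLS estimator can be sketched as follows. This is a minimal illustration, assuming that $J$ is the block-diagonal deviation-from-network-mean projector and using the Moore-Penrose pseudoinverse as the generalized inverse; the function names are hypothetical.

```python
import numpy as np


def within_projector(group_sizes):
    """ASSUMED form of J: block diagonal with blocks I - l l'/n_r."""
    n = sum(group_sizes)
    J = np.zeros((n, n))
    start = 0
    for m in group_sizes:
        J[start:start + m, start:start + m] = np.eye(m) - np.ones((m, m)) / m
        start += m
    return J


def iv_matrix(J, G, X1, X2, L, p):
    """Q_K of (35): J[X1..G^(2p+3)X1, X2..G^(2p+3)X2, GL..G^(2p+2)L]."""
    cols = []
    for X in (X1, X2):
        block = X
        for _ in range(2 * p + 4):   # powers 0, ..., 2p+3 of G
            cols.append(block)
            block = G @ block
    block = G @ L
    for _ in range(2 * p + 2):       # powers 1, ..., 2p+2 of G
        cols.append(block)
        block = G @ block
    return J @ np.column_stack(cols)


def tsls(Z, Y, Q):
    """(Z' P Z)^{-1} Z' P Y with P = Q (Q'Q)^- Q'."""
    PZ = Q @ (np.linalg.pinv(Q) @ Z)  # projection of Z on the column space of Q
    return np.linalg.solve(PZ.T @ Z, PZ.T @ Y)
```

Because every column of `Q` is premultiplied by `J`, the projection annihilates the network dummies `L`, so the fixed effects drop out without transforming `Y` and `Z` separately.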
5 The finite sample properties of IV-based estimators can be sensitive to the number of IVs. In a recent article, Liu and Lee (2013) derived the Nagar-type approximate MSE of the 2SLS estimator for spatial models, which can be minimized to choose the optimal number of IVs as in Donald and Newey (2001). The approximate MSE can be derived in a similar way for the proposed estimators in the article.


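The condition that $K/n \to 0$ is crucial for the 2SLS estimator to be consistent. To see this, we look at the first-order conditions of the 2SLS, $\frac{1}{n} Z_1'P_K(Y_1 - Z_1\hat\delta_{1,2sls}) = 0$ and $\frac{1}{n} Z_2'P_K(Y_2 - Z_2\hat\delta_{2,2sls}) = 0$. At the true parameter values, $E[\frac{1}{n} Z_1'P_K(Y_1 - Z_1\delta_1)] = \frac{1}{n} E(U_1'P_K\epsilon_1)$ and $E[\frac{1}{n} Z_2'P_K(Y_2 - Z_2\delta_2)] = \frac{1}{n} E(U_2'P_K\epsilon_2)$, where
$$E(U_1'P_K\epsilon_1) = \begin{bmatrix} (\sigma_{12} + \phi_2\sigma_1^2)\,\mathrm{tr}(P_K S^{-1}) + (\lambda_{12}\sigma_1^2 - \lambda_{11}\sigma_{12})\,\mathrm{tr}(P_K S^{-1} G) \\ (\sigma_1^2 + \phi_1\sigma_{12})\,\mathrm{tr}(P_K G S^{-1}) + (\lambda_{21}\sigma_{12} - \lambda_{22}\sigma_1^2)\,\mathrm{tr}(P_K G S^{-1} G) \\ (\sigma_{12} + \phi_2\sigma_1^2)\,\mathrm{tr}(P_K G S^{-1}) + (\lambda_{12}\sigma_1^2 - \lambda_{11}\sigma_{12})\,\mathrm{tr}(P_K G S^{-1} G) \\ 0_{2k_x \times 1} \end{bmatrix} = O(K) \tag{36}$$
and
$$E(U_2'P_K\epsilon_2) = \begin{bmatrix} (\sigma_{12} + \phi_1\sigma_2^2)\,\mathrm{tr}(P_K S^{-1}) + (\lambda_{21}\sigma_2^2 - \lambda_{22}\sigma_{12})\,\mathrm{tr}(P_K S^{-1} G) \\ (\sigma_2^2 + \phi_2\sigma_{12})\,\mathrm{tr}(P_K G S^{-1}) + (\lambda_{12}\sigma_{12} - \lambda_{11}\sigma_2^2)\,\mathrm{tr}(P_K G S^{-1} G) \\ (\sigma_{12} + \phi_1\sigma_2^2)\,\mathrm{tr}(P_K G S^{-1}) + (\lambda_{21}\sigma_2^2 - \lambda_{22}\sigma_{12})\,\mathrm{tr}(P_K G S^{-1} G) \\ 0_{2k_x \times 1} \end{bmatrix} = O(K) \tag{37}$$
by Lemma C.2 in the Appendix. Therefore, $E[\frac{1}{n} Z_1'P_K(Y_1 - Z_1\delta_1)]$ and $E[\frac{1}{n} Z_2'P_K(Y_2 - Z_2\delta_2)]$ may not converge to zero and, thus, the 2SLS estimators may not be consistent, if the number of IVs grows at the same or a faster rate than the sample size.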
Note that the submatrix $GL$ in the IV matrix $Q_K$ given by (35) has $\bar r$ columns, where $\bar r$ is the number of networks. Hence, $K/n \to 0$ implies $\bar r/n = 1/\bar n_r \to 0$, where $\bar n_r = n/\bar r$ is the average network size. So, for the 2SLS estimator with the IV matrix $Q_K$ to be consistent, the average network size needs to be large. On the other hand, $K^2/n \to 0$ implies $\bar r^2/n = \bar r/\bar n_r \to 0$. So, for the 2SLS estimator to be properly centered, the average network size needs to be large relative to the number of networks.
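A small calculation makes the two rate conditions concrete. The sketch below counts the columns of $Q_K$ in (35) when $X_1$ and $X_2$ are single columns, and evaluates $K/n$ and $K^2/n$ for two hypothetical designs with the same sample size. The function names and the designs are illustrative, and the count is of columns, not of the rank of $Q_K$.

```python
def iv_count(p, r_bar, k1=1, k2=1):
    """Columns of Q_K in (35): 2p+4 blocks for X1 and for X2, 2p+2 blocks from GL."""
    return (2 * p + 4) * (k1 + k2) + (2 * p + 2) * r_bar


def rate_ratios(n_bar, r_bar, p=0):
    """Return (K/n, K^2/n) for r_bar networks of common size n_bar."""
    n = n_bar * r_bar
    K = iv_count(p, r_bar)
    return K / n, K ** 2 / n


# Same n = 1000, split into many small or few large networks (hypothetical).
many_small = rate_ratios(n_bar=10, r_bar=100)   # K = 208
few_large = rate_ratios(n_bar=100, r_bar=10)    # K = 28
```

With many small networks $K^2/n$ is far from zero even though $K/n$ is moderate, whereas with few large networks both ratios are small.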
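The many-instrument bias of the 2SLS estimator can be corrected by the estimated leading-order biases $b_{1,2sls}$ and $b_{2,2sls}$ given in Proposition 6. Let $\tilde\delta_1 = (\tilde\phi_1, \tilde\lambda_{11}, \tilde\lambda_{21}, \tilde\beta_1', \tilde\gamma_1')'$ and $\tilde\delta_2 = (\tilde\phi_2, \tilde\lambda_{22}, \tilde\lambda_{12}, \tilde\beta_2', \tilde\gamma_2')'$ be $\sqrt{n}$-consistent preliminary 2SLS estimators based on a fixed number of IVs (e.g., $Q = J[X_1, GX_1, G^2X_1, X_2, GX_2, G^2X_2]$). Let $\tilde\epsilon_1 = J(Y_1 - Z_1\tilde\delta_1)$ and $\tilde\epsilon_2 = J(Y_2 - Z_2\tilde\delta_2)$. The leading-order biases can be estimated by $\hat b_{1,2sls} = (Z_1'P_KZ_1)^{-1}\hat E(U_1'P_K\epsilon_1)$ and $\hat b_{2,2sls} = (Z_2'P_KZ_2)^{-1}\hat E(U_2'P_K\epsilon_2)$, where $\hat E(U_1'P_K\epsilon_1)$ and $\hat E(U_2'P_K\epsilon_2)$ are obtained by replacing the unknown parameters in (36) and (37) by $\tilde\delta_1$, $\tilde\delta_2$ and
$$\tilde\sigma_1^2 = \tilde\epsilon_1'\tilde\epsilon_1/(n - \bar r), \quad \tilde\sigma_2^2 = \tilde\epsilon_2'\tilde\epsilon_2/(n - \bar r), \quad \tilde\sigma_{12} = \tilde\epsilon_1'\tilde\epsilon_2/(n - \bar r). \tag{38}$$
The bias-corrected 2SLS (BC2SLS) estimators are given by $\hat\delta_{1,bc2sls} = \hat\delta_{1,2sls} - \hat b_{1,2sls}$ and $\hat\delta_{2,bc2sls} = \hat\delta_{2,2sls} - \hat b_{2,2sls}$.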
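The plug-in step for the first equation can be sketched as below. The code evaluates the variance estimates in (38) and the vector in (36) at given parameter values, using $S = I - \bar G$ as in the truncation argument of Section 5.1. Function names and argument order are hypothetical, and the second equation would follow (37) in the same way.

```python
import numpy as np


def sigma_tilde(e1, e2, n, r_bar):
    """Variance and covariance estimates of (38) from the residual vectors."""
    d = n - r_bar
    return e1 @ e1 / d, e2 @ e2 / d, e1 @ e2 / d


def plug_in_bias_moment_eq1(P, G, phi, lam, s11, s12, kx):
    """Vector in (36) evaluated at phi=(phi1, phi2), lam=(lam11, lam22, lam12, lam21)."""
    phi1, phi2 = phi
    lam11, lam22, lam12, lam21 = lam
    I = np.eye(G.shape[0])
    G_bar = (phi1 * phi2 * I
             + (lam11 + lam22 + phi1 * lam12 + phi2 * lam21) * G
             - (lam11 * lam22 - lam12 * lam21) * (G @ G))
    S_inv = np.linalg.inv(I - G_bar)
    t_S = np.trace(P @ S_inv)
    t_SG = np.trace(P @ S_inv @ G)
    t_GS = np.trace(P @ G @ S_inv)
    t_GSG = np.trace(P @ G @ S_inv @ G)
    a = s12 + phi2 * s11            # multiplies tr(P S^-1) and tr(P G S^-1)
    b = lam12 * s11 - lam11 * s12   # multiplies the traces that end in G
    c = s11 + phi1 * s12
    d = lam21 * s12 - lam22 * s11
    top = np.array([a * t_S + b * t_SG, c * t_GS + d * t_GSG, a * t_GS + b * t_GSG])
    return np.concatenate([top, np.zeros(2 * kx)])  # zeros for the X1, GX1 columns


def bias_corrected(delta_2sls, Z1, P, moment):
    """delta_2sls minus (Z1' P Z1)^{-1} times the estimated moment."""
    return delta_2sls - np.linalg.solve(Z1.T @ P @ Z1, moment)
```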
Proposition 7. Under Assumptions 1–5, if $K \to \infty$ and $K/n \to 0$, then $\sqrt{n}(\hat\delta_{1,bc2sls} - \delta_1) \xrightarrow{d} N(0, \sigma_1^2 H_{11}^{-1})$ and $\sqrt{n}(\hat\delta_{2,bc2sls} - \delta_2) \xrightarrow{d} N(0, \sigma_2^2 H_{22}^{-1})$.
For the IV matrix $Q_K$, $K/n \to 0$ implies $1/\bar n_r \to 0$. It follows that the BC2SLS estimators have properly centered asymptotic normal distributions as long as the average network size $\bar n_r$ is large.
5.2 The 3SLS Estimator With Many IVs
The 2SLS and BC2SLS estimators consider equation-by-equation estimation and are inefficient as they do not make use of the cross-equation correlation in the disturbances. To fully use the information in the system, we extend the 3SLS estimator proposed by Kelejian and Prucha (2004) to estimate the local-aggregate network model in the presence of many IVs. We stack the equations in the system (33) as
$$Y = Z\delta + (I_2 \otimes L)\alpha + \epsilon,$$
where $Y = (Y_1', Y_2')'$, $Z = \mathrm{diag}\{Z_1, Z_2\}$, $\delta = (\delta_1', \delta_2')'$, $\alpha = (\alpha_1', \alpha_2')'$, and $\epsilon = (\epsilon_1', \epsilon_2')'$. As $(I_2 \otimes J)(I_2 \otimes L) = 0$, the within transformation with projector $J$ gives $(I_2 \otimes J)Y = (I_2 \otimes J)Z\delta + (I_2 \otimes J)\epsilon$. Let






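$$\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix} \quad\text{and}\quad \tilde\Sigma = \begin{bmatrix} \tilde\sigma_1^2 & \tilde\sigma_{12} \\ \tilde\sigma_{12} & \tilde\sigma_2^2 \end{bmatrix}, \tag{39}$$
where $\tilde\sigma_1^2, \tilde\sigma_2^2, \tilde\sigma_{12}$ are given by (38). As $E(\epsilon\epsilon') = \Sigma \otimes I$, the 3SLS estimator with the IV matrix $Q_K$ is given by $\hat\delta_{3sls} = [Z'(\tilde\Sigma^{-1} \otimes P_K)Z]^{-1} Z'(\tilde\Sigma^{-1} \otimes P_K)Y$.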
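The 3SLS formula can be evaluated without forming the $2n \times 2n$ Kronecker product, because $Z$ is block diagonal. The sketch below is one way to do this, assuming an estimate of the $2 \times 2$ covariance matrix and the IV matrix are already available; the function name is hypothetical and the pseudoinverse stands in for the generalized inverse in $P_K$.

```python
import numpy as np


def three_sls(Z1, Z2, Y1, Y2, Q, Sigma):
    """[Z'(Sigma^-1 kron P)Z]^-1 Z'(Sigma^-1 kron P)Y with Z = diag{Z1, Z2}."""
    P = Q @ np.linalg.pinv(Q)          # projector on the column space of Q
    W = np.linalg.inv(Sigma)           # 2 x 2 weights from the error covariance
    Zs = [Z1, Z2]
    PZ = [P @ Z1, P @ Z2]
    PY = [P @ Y1, P @ Y2]
    # Block (i, j) of Z'(W kron P)Z is W[i, j] * Zi' P Zj.
    A = np.block([[W[i, j] * (Zs[i].T @ PZ[j]) for j in range(2)]
                  for i in range(2)])
    # Block i of Z'(W kron P)Y is sum_j W[i, j] * Zi' P Yj.
    b = np.concatenate([sum(W[i, j] * (Zs[i].T @ PY[j]) for j in range(2))
                        for i in range(2)])
    delta = np.linalg.solve(A, b)
    k1 = Z1.shape[1]
    return delta[:k1], delta[k1:]
```

When `Sigma` is diagonal the off-diagonal blocks vanish and the result coincides with equation-by-equation 2SLS, which is the sense in which 3SLS only gains from cross-equation correlation.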
Let $F = \mathrm{diag}\{F_1, F_2\}$, $U = \mathrm{diag}\{U_1, U_2\}$, and $H = \lim_{n\to\infty} \frac{1}{n} F'(\Sigma^{-1} \otimes I)F$. The following proposition gives the asymptotic distribution of the many-instrument 3SLS estimator.
Proposition 8. Under Assumptions 1–5, if $K \to \infty$ and $K/n \to 0$, then
$$\sqrt{n}(\hat\delta_{3sls} - \delta - b_{3sls}) \xrightarrow{d} N(0, H^{-1}),$$
where b3sls = [Z ′ (