204 S.G. Rivkin Economics of Education Review 20 2001 201–209

3. Empirical model

The acquisition of knowledge and skills is a cumulative process that takes place over many years. Eq. 1 describes the relationship between achievement $A$, family background $X$ and peer group quality $P$ for student $i$ who attends school $j$ in year $T$. The effects of both family background and peer group quality are allowed to accumulate over time, but there is no serial correlation in the error term $\varepsilon$.

$$A_{ijT} = \sum_{t=1}^{T-1} f_t(X_{ij}, P_{jt}) + \varepsilon_{ijT} \tag{1}$$

Estimates of peer effects for a single time period almost certainly confound the influences of current peers with those of peer groups and families from prior years. In order to isolate the effects of current peers, a pretest score is included as a control. Eq. 2 presents a linear, value added specification in which the peer group effect is presumed to depend in part upon family characteristics relative to those of peers.

$$A_{ijT} = W_{ijT-1}\alpha + X_{ij}\xi + P_{jT-1}\delta + (P_{jT-1} - X_{ij})\Delta + \varepsilon_{ijT} \tag{2}$$

Despite the inclusion of a measure of academic achievement in the sophomore year of high school, the estimation of Eq. 2 may still generate biased peer effect coefficients if relevant family background or school variables are omitted. As previously discussed, one treatment for this problem has been the use of more aggregate information as instruments for the school or neighborhood data. If valid instruments are identified, i.e. instruments that are uncorrelated with $\varepsilon_{ijT}$ and highly correlated with $P_{jT-1}$, consistent peer effect estimates can be obtained.

The key issue is whether the use of aggregate information as instruments reduces endogeneity bias. This question is considered with the following simple model in which exogenous explanatory variables are omitted without loss of generality.11

$$A_{ijT} = (P_{cT-1} + P_{dT-1})\beta + u_{ijT} \tag{3}$$

In Eq. 3, $\beta$ is the combined peer effect, $P_c$ is the average peer group quality in county $c$ and $P_d$ is the deviation of school peer group quality from the county average. The decomposition of the variation in peer group quality into orthogonal within county and between county components makes explicit the fact that $\beta_{ols}$ is determined by both sources of variation, and that the bias is proportional to the sum of the covariation between the county average peer group quality and the error and the covariation between within county deviations in peer group quality and the error:

$$\operatorname{plim} \hat{\beta}_{ols} = \beta + \frac{\sigma_{u,P_c} + \sigma_{u,P_d}}{\sigma^2_{P_c} + \sigma^2_{P_d}} \tag{4}$$

By comparison, $\beta_{IV}$ is identified solely by between county variation:

$$\operatorname{plim} \hat{\beta}_{IV} = \beta + \frac{\sigma_{u,P_c}}{\sigma^2_{P_c}} \tag{5}$$

The instrumental variable estimate is consistent as long as $\sigma_{u,P_c}$ equals 0. However, if $\sigma_{u,P_c}$ does not equal 0, the use of aggregate data as instruments may move the estimate away from its true value. This will occur if aggregation reduces the denominator of the second term of Eq. 4 and Eq. 5 proportionately more than the numerator, i.e. if the between county variation is relatively more contaminated by endogeneity bias than the within county variation.

Note that in a linear specification, the coefficient that captures the common peer effect for all students, $\delta$, cannot be separately identified from the coefficient that captures peer influences that depend upon student background relative to schoolmates, $\Delta$.

11 It is straightforward to show that IV estimation using the aggregate information as an instrument and OLS estimation that substitutes the aggregate information in place of the school level measure of peer group quality produce estimates that have the same expected value, because within county deviations in peer group quality are orthogonal to county average peer group quality by construction.
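The comparison between Eq. 4 and Eq. 5 can be illustrated with a small simulation. The sketch below is not from the paper: it assumes one observation per school, 200 hypothetical counties with 10 schools each, a true $\beta$ of 1, and an error that loads more heavily on the county-level component of peer quality (0.8) than on the within-county deviation (0.1). All function names and parameter values are illustrative choices. Under those assumptions the IV estimate that uses the county average as the instrument should lie further from $\beta$ than OLS, and it should coincide with OLS on the county average, as footnote 11 indicates.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n); cov(a, a) is the variance."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def simulate(n_counties=200, schools_per_county=10, beta=1.0, seed=0):
    """Hypothetical data: the error u is correlated with both parts of P."""
    rng = random.Random(seed)
    A, P, Pc, Pd, u = [], [], [], [], []
    for _ in range(n_counties):
        g = rng.gauss(0, 1)  # county-level component of peer quality
        schools = [g + rng.gauss(0, 1) for _ in range(schools_per_county)]
        county_avg = mean(schools)
        for p in schools:
            dev = p - county_avg  # orthogonal to county_avg by construction
            # Assumed contamination: stronger between counties than within.
            err = 0.8 * g + 0.1 * dev + rng.gauss(0, 1)
            P.append(p)
            Pc.append(county_avg)
            Pd.append(dev)
            u.append(err)
            A.append(beta * p + err)
    var_pc, var_pd = cov(Pc, Pc), cov(Pd, Pd)
    return {
        "ols": cov(A, P) / cov(P, P),
        "iv": cov(A, Pc) / cov(P, Pc),  # county average as instrument
        "ols_on_county_mean": cov(A, Pc) / var_pc,
        # Sample analogues of Eq. 4 and Eq. 5 using the simulated error.
        "eq4": beta + (cov(u, Pc) + cov(u, Pd)) / (var_pc + var_pd),
        "eq5": beta + cov(u, Pc) / var_pc,
    }


results = simulate()
for name, value in results.items():
    print(f"{name:>20s}: {value:.3f}")
```

Because the within-county deviations are exactly orthogonal to the county averages in the sample, the `eq4` and `eq5` entries reproduce the OLS and IV estimates up to floating-point error. Swapping the two loadings (0.1 on `g`, 0.8 on `dev`) gives the opposite case, in which aggregation moves the estimate toward $\beta$.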
Unfortunately, it is not possible to observe the covariations between the within and between county components of the peer group characteristics and the error. In the case where there are more instruments than endogenous explanatory variables, tests of over-identifying restrictions such as the Sargan test can be used to test the hypothesis that the instruments are uncorrelated with the structural error.12 However, this test lacks power against some alternative hypotheses,13 and it is difficult to interpret the results. Because the instruments are valid only in the case where they have zero explanatory power, the inability to reject the null of zero explanatory power at the 95, 90 or even 50 percent significance levels is certainly not evidence that the true correlation between the instruments and the error is zero. Moreover, the addition of instruments orthogonal to both the endogenous explanatory variable and the error increases the probability that the null of zero is not rejected without reducing the correlation between the instrument and the error, raising doubts about the value of information gained from this test. Despite these problems, the Sargan test statistic will be reported where appropriate.

12 The Sargan test statistic equals

$$\text{Sargan} = (T - k)R^2 \sim \chi^2_r$$

where $R^2$ = the value of $R^2$ from a regression of the IV residuals from the second stage on the exogenous explanatory variables and the instruments; $T$ = number of observations; $k$ = number of parameters in the outcome equation; and $r$ = number of over-identifying restrictions (instruments minus endogenous explanatory variables).

13 For example, the test cannot distinguish between a set of instruments that is truly unrelated to the structural error and a set in which each instrument has the same relationship with the structural error.
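The statistic in footnote 12 and the weakness described in footnote 13 can be sketched in a few lines of standard-library Python. Everything in the example is an assumption made for illustration: one endogenous regressor, two instruments, no other exogenous regressors apart from a constant (handled by demeaning, so $k = 2$ and $r = 1$), and hypothetical loadings `theta1` and `theta2` of the structural error on the two instruments. It is a minimal sketch, not the estimation code used in the paper.

```python
import random


def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def ols2(y, z1, z2):
    """OLS coefficients of y on z1 and z2 (inputs already demeaned)."""
    s11, s22, s12 = dot(z1, z1), dot(z2, z2), dot(z1, z2)
    s1y, s2y = dot(z1, y), dot(z2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def sargan(y, x, z1, z2, k=2):
    """Return the 2SLS slope and (T - k) * R^2 as in footnote 12."""
    y, x, z1, z2 = (demean(v) for v in (y, x, z1, z2))
    p1, p2 = ols2(x, z1, z2)  # first stage
    x_hat = [p1 * a + p2 * b for a, b in zip(z1, z2)]
    beta_hat = dot(x_hat, y) / dot(x_hat, x)  # second stage
    resid = [yi - beta_hat * xi for yi, xi in zip(y, x)]  # IV residuals
    g1, g2 = ols2(resid, z1, z2)  # residuals on the instruments
    fitted = [g1 * a + g2 * b for a, b in zip(z1, z2)]
    r2 = dot(fitted, fitted) / dot(resid, resid)
    return beta_hat, (len(y) - k) * r2


def run_case(theta1, theta2, n=5000, beta=1.0, seed=0):
    """Simulate y = beta * x + u, with u loading on the instruments."""
    rng = random.Random(seed)
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, v, eta = (rng.gauss(0, 1) for _ in range(4))
        u = theta1 * a + theta2 * b + 0.5 * v + eta  # 0.5 * v makes x endogenous
        xi = a + b + v
        z1.append(a)
        z2.append(b)
        x.append(xi)
        y.append(beta * xi + u)
    return sargan(y, x, z1, z2)


CHI2_1_CRITICAL_5PCT = 3.841  # 5 percent critical value, 1 degree of freedom
cases = {"both valid": (0.0, 0.0), "one invalid": (0.5, 0.0), "equally invalid": (0.5, 0.5)}
for label, thetas in cases.items():
    beta_hat, stat = run_case(*thetas)
    print(f"{label:>16s}: beta_hat={beta_hat:.3f}  sargan={stat:.1f}")
```

In this design the population value of the 2SLS slope is 1 when both instruments are valid, 1.25 when only the first is invalid, and 1.5 when both are invalid to the same degree. Only the middle case leaves IV residuals that are correlated with the instruments, so the statistic should be large there and small in the third case even though that estimate is the most biased of the three.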
In cases where tests of over-identifying restrictions cannot be used, such as just-identified specifications or models with binary dependent variables, there is no asymptotically consistent test of the relationship between the structural error and the instruments. One potential way to obtain information on the correlation between the instrument and the error is to examine the explanatory power of the instruments in a regression of the outcome variable on the endogenous explanatory variable, the instruments, and the included exogenous variables. This is the evidence offered by Evans, Oates, and Schwab in support of the validity of their instruments. However, it is straightforward to show that in both the just-identified and over-identified cases, the explanatory power of the instruments in such a regression offers no useful information concerning the relationship between the instruments and the structural error (Rivkin & Woglom, 1999).

Though direct tests of instrument validity are either weak or uninformative, the OLS and IV estimates themselves can provide information on the desirability of using aggregate information as instruments. Theory in favor of aggregation makes explicit the argument that the within county or metropolitan area deviation in peer group quality is contaminated by unobserved family characteristics related to the choice of neighborhoods and schools, and that such contamination introduces an upward bias into estimates of peer group effects. Therefore, precisely estimated IV coefficients that are smaller than single equation estimates are consistent with the view that aggregation reduces endogeneity bias, while precisely estimated IV coefficients that are larger than the single equation estimates contradict that view. Of course, imprecise IV estimates provide little useful information.
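One way to see why the explanatory power of an instrument in a regression of the outcome on the endogenous variable and the instrument is uninformative is the just-identified simulation below. It is an illustrative sketch under assumed values, not necessarily the argument made in Rivkin & Woglom (1999): the model is $y = \beta x + u$ with $x = z + v$ and $u = \rho v + \theta z + \eta$, where `rho`, `theta` and the function names are hypothetical. With $\theta = 0$ the instrument is valid, yet its population coefficient in the regression of $y$ on $x$ and $z$ is $-\rho$. With $\theta = \rho$ the instrument is invalid, yet that coefficient is zero.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def coef_on_instrument(y, x, z):
    """Coefficient on z in an OLS regression of y on x, z and a constant."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (sxx * szy - sxz * sxy) / (sxx * szz - sxz ** 2)


def simulate(theta, n=20000, beta=1.0, rho=0.5, seed=0):
    """Draw (y, x, z); theta is the assumed loading of the error on z."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi, v, eta = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = rho * v + theta * zi + eta  # structural error
        xi = zi + v                     # endogenous because u depends on v
        z.append(zi)
        x.append(xi)
        y.append(beta * xi + u)
    return y, x, z


def summarize(theta):
    y, x, z = simulate(theta)
    iv_estimate = cov(z, y) / cov(z, x)  # just-identified IV slope
    return iv_estimate, coef_on_instrument(y, x, z)


valid_iv, valid_coef = summarize(theta=0.0)      # cov(z, u) = 0
invalid_iv, invalid_coef = summarize(theta=0.5)  # cov(z, u) = 0.5

print(f"valid instrument:   IV={valid_iv:.3f}  coef on z={valid_coef:.3f}")
print(f"invalid instrument: IV={invalid_iv:.3f}  coef on z={invalid_coef:.3f}")
```

In this design the valid instrument shows substantial explanatory power in the auxiliary regression while the IV estimate is consistent, and the invalid instrument shows essentially none while the IV estimate converges to $\beta + 0.5$. The auxiliary coefficient therefore cannot be read as evidence about the correlation between the instrument and the structural error.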

4. Results