School sample fixed effects and community effects

M. Binder / Economics of Education Review 18 (1999) 311–325

Table 3
Models of expected schooling using current attainment of all siblings in a Tobit model (a) (standard errors in parentheses)
Dependent variable: current attained schooling for all siblings (Tobit)

                                        Desired schooling of:
                                        Children          Parents
Desired schooling of parent or child     0.040 (0.028)     0.140 (0.028)
Log of household spending                0.536 (0.181)     0.317 (0.181)
No response for household spending       0.737 (0.328)     1.058 (0.345)
Parent schooling                         0.121 (0.038)     0.097 (0.038)
Home-owners without legal title         -0.731 (0.234)    -0.586 (0.225)
Siblings                                -0.342 (0.042)    -0.340 (0.042)
Nuclear                                  0.941 (0.194)     0.821 (0.193)
Log likelihood                          -1441             -1399
N                                        1531              1509
Average gap (standard deviation)         4.2 (4.3)         4.9 (4.0)

(a) Models also include dummy variables for renters and those living in rent-free arrangements with relatives or others, so that coefficients for home-owners without legal title are relative to home-owners with legal title. Models also control for age, age-squared and gender. Significance is reported at the 5%, 10% and 20% levels.

and, worse, the sample child's desired schooling for himself or herself to his or her siblings. Nevertheless, parents' desired schooling is a significant and positive predictor of schooling attainment [13]; children's desired schooling is a positive but weak predictor. Likely measures of binding liquidity constraints are negatively associated with schooling attainment. Spending (a proxy for permanent income) enters the model positively and significantly, at the 5% and 10% levels for children and parents, respectively. Dummy variables for homeowners without legal title, who would presumably be unable to use their property as collateral for loans, have a statistically significant negative effect at standard confidence levels [14]. In sum, the model performs as expected, with desired schooling of parents playing a significant positive role in determining eventual schooling attainment.

[13] The relationship between aspirations and attainment has been studied widely in the sociological literature. In US data, schooling "aspirations" are found to be statistically significant predictors of eventual educational attainment (Portes & Wilson, 1976; Thomas, 1980; Sewell, Hauser & Wolf, 1980). Jamison and Lockheed (1987), in a rare study that considers the effect of desired schooling and attitudes on schooling in a developing country, find that parents' desired schooling has a significant effect on their children's school enrollment in Nepal.

5. School sample fixed effects and community effects

I now estimate the extent to which the variation in desired schooling is systematically related to personal, family and community characteristics of children and parents with the following linear model:

\[ S_{i,j,k} = \beta + \beta_1 I_j + \beta_2 F_j + \beta_3 C_k + c_k + u_{i,j} \tag{2} \]

where desired schooling $S_{i,j,k}$ of student or parent $i$ in family $j$ and community $k$ is a linear combination of vectors $I$ and $F$, which contain child and family characteristics, respectively, and a vector of dummy variables $C$ that take the value of 1 if the observation is drawn from a given school sample, and 0 otherwise; $\beta_1$, $\beta_2$ and $\beta_3$ are vectors of coefficients and $\beta$ is a constant term; $c_k$ is a common error term for all observations in a given school sample, and $u_{i,j}$ is an error term for person $i$ in family $j$ [15][16].

Child characteristics include age, sex, whether the child was born in the survey city (native) or not, and birth order. Family background variables include parent schooling and age, a measure of income, number of siblings, family structure (nuclear or extended) and dummy indicators for the occupation of the household head. Weekly household spending, rather than earnings, is used as the income measure because it should be a better proxy for permanent income [17]. Permanent income, in turn, is a better measure of a household's economic status than current income, and is especially relevant for investment behavior [18]. In addition, several variables that measure parent attitudes toward schooling are included. Parents were asked whether their children would receive more schooling (1) if they enjoyed higher incomes and (2) if they could borrow to finance schooling. Finally, parents' desired schooling is also used as an independent variable in models of children's desired schooling.

[14] A model that controls for many more family traits, as well as community residence, gives much the same results. See Binder (1998) for a more thorough analysis of schooling determinants.
[15] The investment model implicitly relies on investors' perceived, and likely uncertain, benefits of schooling. The survey did not produce adequate measures of access to accurate information about future earnings. The survey did include a question about how difficult parents and children felt it would be to find employment in the desired fields. Presumably, difficulty in finding employment would lead to lower expected benefits. The measure of the probability of employment, though, had no statistical significance when it was included in specifications not reported here.
[16] This specification does not allow the model parameters to vary among schools. Although results reported in what follows suggest that the parameters do in fact vary, a comprehensive multi-level analysis is not possible, given the limited number of schools in the sample.
[17] The most common occupation reported for household heads was construction worker. Construction jobs are usually temporary and often seasonal. The wage reported in any given week is unlikely to be an accurate measure of permanent income.
[18] The variables included are standard in the determinants of schooling literature; see footnote 2 for research that follows this approach.

Table 4 shows that the community dummies are significant predictors of desired schooling for both parents and children, even in the presence of detailed family characteristics. Relative to the omitted low-income Guadalajara school sample, being in a middle-income, high-income or Tijuana school has a large, positive and significant effect on years of desired schooling for both parents and children, even after controlling for income and other individual and family characteristics. Desired schooling for children in the middle-income Guadalajara school is 1.7 years higher than for children in low-income schools in Guadalajara and Arandas. Desired schooling for Tijuana children is between 1.6 and 1.9 years higher. Coefficients on school dummy variables are even larger for parents: parents in the high-income communities desire three more years of schooling for their children than do parents in low-income communities.

What is causing these large estimates for school sample fixed effects? As summarized earlier, one interpretation is that the fixed effects are community effects that exert an independent effect on the behavior of community members. The alternative hypothesis is that the large estimated fixed effects are simply proxies for unobserved, but shared, family characteristics. The fixed effect will then be caused by Tiebout behavior in which "like" people group themselves in communities. I first consider some spurious possible causes of the fixed effects and then examine how well the data fit the implications of each hypothesis.

I begin by establishing the overall significance of the school dummy variables. F-tests reject the hypothesis that the coefficients on the school dummy variables are jointly zero for both children and parents (see Table 4). F-tests also reject pooling of all samples: pooling is accepted within the low-income school samples in Guadalajara and Arandas, and within the remaining schools, which include the higher-income Guadalajara groups and the Tijuana schools [19]. This division roughly follows the distinction between schools that have significantly different coefficients on the school sample dummies and those that do not.

The school dummies do seem to be providing information. It is possible, though, that the school dummies are providing information about imprecisely measured or unobserved family characteristics. Measurement error is unlikely, due to the high degree of control I had in collecting data.
I nevertheless consider its presence by calculating what bias it would introduce in the estimated coefficients, following the approach of Borjas (1992).

[19] In order to increase the degrees of freedom and so have meaningful F-tests, I estimated reduced-form models that included only age, parent schooling, log of household spending, and dummy variables for sex and native to sample city.

Table 4
Point estimates of determinants of desired schooling for parents and children (standard errors in parentheses)

                                                            Desired schooling of:
                                                            Parents           Children          Children
                                                            I                 II                III
Child characteristics
Age of student                                              -0.123 (0.161)    -0.202 (0.151)    -0.178 (0.148)
Student is male                                              0.419 (0.386)    -0.194 (0.360)    -0.277 (0.353)
Student born in survey city                                 -0.050 (0.464)    -0.899 (0.433)    -0.889 (0.424)
Family background
Parents' desired schooling                                   -                 -                 0.199 (0.057)
Parents' schooling                                           0.115 (0.202)     0.146 (0.189)     0.123 (0.185)
Log weekly spending                                          1.006 (0.408)    -0.622 (0.381)    -0.822 (0.378)
Parents willing to spend on schooling out of extra income    0.450 (0.835)     1.549 (0.779)     1.460 (0.763)
Parents willing to borrow to finance schooling               0.824 (0.622)    -0.876 (0.580)    -1.040 (0.570)
Number of siblings                                           0.096 (0.102)    -0.149 (0.095)    -0.168 (0.094)
Nuclear family                                               0.858 (0.446)     0.492 (0.417)     0.321 (0.411)
Community fixed effects (a)
Low-income school in Guadalajara (ID = 1)                    0.041 (0.763)    -0.806 (0.713)    -0.815 (0.698)
Low-income night school in Guadalajara (ID = 2)              0.717 (0.843)    -0.754 (0.786)    -0.897 (0.771)
Low–middle-income school in Guadalajara (ID = 3)             1.203 (0.849)     0.452 (0.793)     0.213 (0.779)
Middle-income school in Guadalajara (ID = 4)                 2.756 (0.975)     2.259 (0.910)     1.711 (0.905)
Private high-income school in Guadalajara (ID = 6)           3.529 (1.504)     2.462 (1.404)     1.759 (1.389)
Low-income school in Arandas (ID = 7)                        0.131 (0.759)    -0.443 (0.709)    -0.469 (0.694)
Low-income school in Tijuana (ID = 8)                        2.333 (0.839)     2.363 (0.784)     1.899 (0.779)
Low-income school in Tijuana (ID = 9)                        1.129 (0.826)     1.857 (0.771)     1.632 (0.758)
Adjusted R^2                                                 0.295             0.366             0.392
Joint significance of school dummy variables
F-statistic                                                  2.095             3.558             2.722
Degrees of freedom                                           8, 255            8, 255            8, 254
P-value                                                      0.037             0.001             0.007

Models were estimated with PROC REG in SAS for 286 observations, and included a constant term and a dummy variable for each school, as well as the following controls: the square of parents' schooling, average of parents' years of age, relative birth order (early, middle or late, relative to only-children), a dummy variable for observations where no spending value was reported (if so, average spending for the school sample was used), and dummy variables for seven occupational groups. Significance is reported at the 5% and 10% levels.
(a) Omitted school is a low-income school in Guadalajara (ID = 0).

Assume that the true model depends only on a family background measure and no community measures. In this case, if the available family and community variables are imperfect proxies for the true family measure, then we estimate:

\[ S_{j,k} = \beta + \theta_1 F_j + \theta_2 C_k + u_{j,k} \tag{3} \]

where $F$ is a family characteristic measured with error and $C$ is a community characteristic, when the real model is

\[ S_j = \beta + \beta_1 F^*_j + u_j \tag{4} \]

Writing $F^*$ for the true family trait, the measured $F$ is equal to $F^* + v_2$, and $C$ imperfectly measures the same trait, so that $C$ is equal to $F^* + v_3$; $v_2$ and $v_3$ are random error terms. An OLS estimate of Eq. (3) will give us the following biased estimates:

\[ \hat{\theta}_1 = \beta_1 \left[ \frac{\sigma^2_F \, \sigma^2_{v_3}}{\sigma^2_F (\sigma^2_{v_2} + \sigma^2_{v_3}) + \sigma^2_{v_2} \sigma^2_{v_3}} \right] \]

and

\[ \hat{\theta}_2 = \beta_1 \left[ \frac{\sigma^2_F \, \sigma^2_{v_2}}{\sigma^2_F (\sigma^2_{v_2} + \sigma^2_{v_3}) + \sigma^2_{v_2} \sigma^2_{v_3}} \right] \]

where $\sigma^2_F$ is the variance of the true family trait and $\sigma^2_{v_2}$ and $\sigma^2_{v_3}$ are the variances of the error terms $v_2$ and $v_3$, respectively. The two estimates share a denominator, so it follows that if $\sigma^2_{v_3} > \sigma^2_{v_2}$ then $\hat{\theta}_1 > \hat{\theta}_2$. If the measurement error of the community proxy for the true family trait is larger than the measurement error of the family proxy for this trait, as might be expected, then the coefficient of the family characteristic will be larger than the coefficient of the community characteristic.
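The measurement-error argument around Eqs. (3) and (4) can be checked numerically. The sketch below is only an illustration under stated assumptions: the true trait and both error terms are drawn as independent normal variables, all parameter values are hypothetical, and the function names are invented for this example. It is not the estimation code behind the reported results.

```python
import random


def simulate_proxy_regression(beta1=1.0, sd_f=1.0, sd_v2=0.5, sd_v3=1.0,
                              sd_u=1.0, n=20000, seed=0):
    """Regress S on two noisy proxies of one true family trait.

    Assumption (not from the paper): the true trait, v2, v3 and u are
    independent normal draws with the given standard deviations.
    Returns the OLS slopes on the family proxy and the community proxy.
    """
    rng = random.Random(seed)
    fam, com, sch = [], [], []
    for _ in range(n):
        true_f = rng.gauss(0.0, sd_f)                      # true family trait
        fam.append(true_f + rng.gauss(0.0, sd_v2))         # family proxy, error v2
        com.append(true_f + rng.gauss(0.0, sd_v3))         # community proxy, error v3
        sch.append(beta1 * true_f + rng.gauss(0.0, sd_u))  # Eq. (4): no community role

    def cov(a, b):
        mean_a, mean_b = sum(a) / n, sum(b) / n
        return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n

    var_f, var_c, cov_fc = cov(fam, fam), cov(com, com), cov(fam, com)
    cov_fs, cov_cs = cov(fam, sch), cov(com, sch)
    det = var_f * var_c - cov_fc ** 2
    # Closed-form OLS slopes for a regression with an intercept and two regressors.
    theta1 = (cov_fs * var_c - cov_cs * cov_fc) / det
    theta2 = (cov_cs * var_f - cov_fs * cov_fc) / det
    return theta1, theta2


def probability_limits(beta1=1.0, sd_f=1.0, sd_v2=0.5, sd_v3=1.0):
    """Large-sample values of the two slopes under the same assumptions."""
    var_f, var_2, var_3 = sd_f ** 2, sd_v2 ** 2, sd_v3 ** 2
    denom = var_f * (var_2 + var_3) + var_2 * var_3
    return beta1 * var_f * var_3 / denom, beta1 * var_f * var_2 / denom


theta1_hat, theta2_hat = simulate_proxy_regression()
theta1_lim, theta2_lim = probability_limits()
print("simulated slopes:", round(theta1_hat, 3), round(theta2_hat, 3))
print("limiting values: ", round(theta1_lim, 3), round(theta2_lim, 3))
```

With the noisier proxy in the community slot, the family coefficient is the larger of the two, which is the ordering the argument predicts when the community measure carries more measurement error than the family measure.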
In fact, estimated coefficients for school sample means and census measures of household income and parent schooling are larger than the coefficients for the corresponding family measures. Table 6 provides this comparison for random effects estimates. The results were the same using OLS. In order to accept that the results are driven by measurement error, one would have to accept that mean community schooling or spending is a better measure of parent schooling or spending than is the parent's own report.

I now turn to the possibility of omitted variable bias. Columns II and III in Table 4 allow a comparison of school sample dummies for children in models that alternatively omit and include parents' desired schooling, a variable that is usually "unobserved". If parents' desired schooling is a good measure of parent attitudes about schooling, then it is exactly the kind of variable that might be expected to drive the community effects results in a Tiebout world. The inclusion of this variable reduces the size and significance of the sample dummy variables, but F-tests definitively reject the hypothesis that the dummy coefficients are zero in both specifications. The P-value for the model that includes parent desired schooling is 0.007. Thus neither measurement error bias nor omitted variable bias provides a convincing explanation for the large fixed effects.

Fortunately, the Tiebout and community effects models imply opposite relative strengths of the observed effects for groups according to their length of residence in a particular neighborhood (Jencks & Mayer, 1990). In the Tiebout model, people "bring" their "fixed effects" with them to a community. If this is true, then we would expect recent migrants to the community to be carrying the shared unobserved trait. Longer-term residents are less likely to have the trait, since the community may have changed considerably since they moved in, and moving out may be costly.
The community effects model predicts just the opposite: longer-term residents should be more subject to the "community effects", having lived with them that much longer. It is plausible that all of the possible mechanisms for community effects outlined in Section 2 above (institutional, informational, network and peer effects) will be stronger over time.

The test of these hypotheses is straightforward. Fixed effects models are estimated separately for recent migrants and longer-term residents, and the significance of the sample dummy variables is then compared. If the community effects hypothesis holds, then we would expect the sample dummy variables to be more predictive of desired schooling for the longer-term residents than for migrants. If the Tiebout hypothesis holds, we would expect the opposite result.

Unfortunately, data limitations will tend to blur the distinction between recent migrants and longer-term residents. Since length of residence in the current community was not reported, migrant families are identified as such if the child was born outside of the sample city, and longer-term residents are identified as such if the child was born in the sample city. Thus those classified as longer-term residents may also include newcomers to the neighborhood. The bias introduced by this measurement problem will tend to make the longer-term residents and recent migrants more similar and so understate any difference between them. This will be especially true for children, since some children born outside of the sample city may have migrated shortly after birth.

Despite this imprecision, recent migrants and longer-term residents as defined do appear to behave differently. F-tests reject the hypothesis that school sample dummy coefficients are zero at standard confidence levels for native parents and native children, and at the 10% level for migrant children.
But the hypothesis cannot be rejected even at the 10% level for migrant parents; in fact, the P-value rises to 0.75. Note that the group with potentially the most time in the community, native adults, exhibits the largest school sample dummy effects. Migrant children have stronger effects than their parents, perhaps because they have lived a much greater share of their lives in the community. As expected, the distinction between migrant and longer-term resident children is less clear, although significance levels are slightly higher for longer-term resident children. This result also holds for children when parents' desired schooling is controlled: community effects are significant at the 8% level for longer-term residents, and at the 13% level for migrants. Table 5 shows that the significance of the fixed effects rises with potential length of residence in the community.

Table 5
Joint significance of dummy variables for school samples for recent migrants and longer-term residents

                                  Standard            With parents'
                                  specification (a)   desired schooling (b)
Migrant parents
F-statistic                       0.635
Degrees of freedom                8, 65
P-value                           0.746
Migrant children
F-statistic                       1.823               1.638
Degrees of freedom                8, 65               8, 64
P-value                           0.089               0.132
Longer-term resident children
F-statistic                       2.149               1.806
Degrees of freedom                8, 161              8, 160
P-value                           0.034               0.080
Longer-term resident parents
F-statistic                       2.291
Degrees of freedom                8, 161
P-value                           0.024

(a) See columns I and II in Table 4.
(b) See column III in Table 4.

But could this result be caused by the relatively large contribution of migrants from communities that are similar to each other? Consider the coefficient estimates on the community dummy variables in Table 4. School IDs 1, 2, 3 and 7 are not significantly different from the omitted school. If migrants come disproportionately from these schools, then the results in Table 5 may be a coincidence. In fact, though, all schools have migrants.
Moreover, three of the four schools which exhibit the strongest community effects also have the largest migrant contributions. If there were no real difference in the community effect between migrant and longer-term resident groups, then we would expect the community dummy variables to be more significant for the migrant sample, which draws more heavily from the distinct communities. Moreover, similar results hold even when the two communities with the most migrants are omitted: longer-term adult residents display significant community effects at the 10% level and no other groups show significant effects [20].

[20] It is also possible that the distinction between recent migrants and long-term residents (many of whom were early migrants) results from heterogeneity in the more recent migrant pool. Under this possibility, earlier migrants exhibit Tiebout behavior, and more recent migrants, perhaps facing different constraints, do not. Apart from an unknown exogenous change in the ability to express Tiebout behavior, however, there is little reason to expect such heterogeneity: recent and early migrants came in similar proportions from rural and urban source communities in the same areas, and many other relevant characteristics, such as schooling and family size, are controlled.

Another implication of the community effects hypothesis is that community-level variables, and not just community dummies, should have explanatory power. Data constraints often dictate the size and shape of the community from which characteristics are measured. For example, Behrman and Wolfe (1987) use the percentage of school-age children enrolled in school in a municipality as a measure of school availability, and Birdsall (1985) uses average teacher schooling and salary as a measure of school quality in each of 169 urban and rural areas in Brazil. Datcher (1982) and Corcoran, Gordon, Laren and Solon (1992) use zip code zones. These measures are, at best, ad hoc.

An alternative approach uses group means as independent community variables.
Here researchers may have more leeway in constructing groups. Case and Katz (1991), for example, use means for observations in a two-block radius. The main drawback of using sample means is that they may not accurately represent the overall community and, in small samples, may be too sensitive to individual observations [21].

The census tract data I use for community characteristics correspond to the idea of neighborhoods introduced in Section 2 above. Census tracts used by the 1990 Mexican Census (INEGI, 1992a) have populations of about 5000 and correspond to geographically integrated and (from my personal observations) sensibly divided neighborhoods [22].

[21] In Case and Katz (1991), for example, the measures are based on at most 15 observations for each two-square-block area; in Borjas (1992), some of the ethnic groups used have fewer than 15 observations.
[22] For two school samples in Guadalajara (the low–middle and private schools) in which students lived in several dispersed census tracts, I used a weighted average of children's census tract characteristics. In some sense, then, the true comparison is among school communities. Nevertheless, specifications that included the child's census tract, and not the school average, give similar results.

The data provide population tabulations from which I can calculate percentages of workers in each neighborhood earning different multiples of the minimum salary, school enrollment rates for children of ages six to 14, and proportions of adults who completed different schooling levels for each school sample neighborhood. I also use a direct measure of school quality: the number of students per class in each school.

In OLS models for children's desired schooling, three of these community measures increase the variation explained by family and personal characteristics alone by 6% (from $R^2 = 0.36$ to $R^2 = 0.38$). In comparison, the school sample dummies increase explained variation by 9%, to $R^2 = 0.39$. Thus most of the fixed effects can be explained by the school quality measure, the proportion of neighborhood resident workers earning two minimum salaries or less, and the proportion of adults with nine years of schooling or more in each neighborhood. In contrast, these community measures increase variation explained for parents' desired schooling by less than 1%, compared with an 8% increase when school sample dummies are included.

Table 6 shows how community variables perform when entered separately in a model that includes all the available family and personal characteristics used in the fixed effects models in Table 4. They are highly correlated and so lose significance when entered together. In addition to the census and school quality measures, I also use school sample means.

Of particular interest is the high predictive value of average community schooling and community earnings. The former indicator may reflect informational, peer or institutional community effects; the latter is likely associated with institutional effects. For example, communities with relatively low proportions of workers earning two minimum salaries or less are associated with higher levels of desired schooling. The potential importance of this measure may explain why the Tijuana sample has relatively high desired schooling. Although the average weekly spending in Tijuana is only slightly higher than spending in low-income communities elsewhere in the sample, there are also relatively fewer very poor families in Tijuana. If this fact reflects the relatively greater job opportunities in Tijuana, children and parents may perceive higher marginal benefits from schooling through a greater probability of finding work. Another possibility is that if there is greater demand for skilled workers in Tijuana, returns to schooling may be higher [23].

[23] Log wage equations estimated for earners in my sample show that Tijuana has lower rates of return to schooling than

The school quality measure, also an institutional effect, is more predictive of children's than parents' desired schooling. Peer measures of neighborhood school enrollments are not good predictors of desired schooling, especially for children. One possible explanation for the lack of peer activity influence is suggested by the relatively low school enrollment rates in the Tijuana communities (see Table 2). These communities have a high in-flow of new migrants. If migration initially disrupts school enrollment, lower enrollment rates are to be expected at any point in time. It is possible that eventual enrollment rates of migratory families are higher. It is also possible, however, that migration increases desired schooling at the same time as it makes school attainment more difficult. In fact, Tijuana liquidity-constrained schooling gaps are higher than the gaps in the other low-income communities in the sample.

Table 6 is only suggestive of the sources of the community effects, since the community measures are not entered simultaneously. Data that covered a larger number of communities would perhaps allow for a narrower pinpointing of what the community fixed effects are.
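Returning to the split-sample test summarized in Table 5, the mechanics of comparing joint F-statistics for school dummies across two groups can be sketched as follows. This is a minimal illustration on simulated data, not a reproduction of the reported estimates: it assumes a single family covariate, four hypothetical schools, and invented effect sizes, and the function names are made up for the example. The unrestricted model (one intercept per school) is fitted by demeaning within schools, which gives the same residuals as including the dummies directly.

```python
import random


def demean(values, keys):
    """Subtract from each value the mean of its group (groups given by keys)."""
    totals, counts = {}, {}
    for v, k in zip(values, keys):
        totals[k] = totals.get(k, 0.0) + v
        counts[k] = counts.get(k, 0) + 1
    return [v - totals[k] / counts[k] for v, k in zip(values, keys)]


def residual_ss(y, x):
    """Residual sum of squares from regressing demeaned y on demeaned x."""
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return sum((b - slope * a) ** 2 for a, b in zip(x, y))


def joint_f_for_dummies(y, x, school):
    """F-statistic for H0: all school dummy coefficients are zero."""
    n = len(y)
    n_schools = len(set(school))
    pooled = [0] * n
    ssr_restricted = residual_ss(demean(y, pooled), demean(x, pooled))    # common intercept
    ssr_unrestricted = residual_ss(demean(y, school), demean(x, school))  # school intercepts
    q = n_schools - 1                  # number of dummies tested
    df_resid = n - n_schools - 1       # observations minus intercepts minus one slope
    f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df_resid)
    return f_stat, (q, df_resid)


def simulate_sample(school_effects, n_per_school, seed):
    """Hypothetical data: desired schooling = 10 + 0.5 * x + school effect + noise."""
    rng = random.Random(seed)
    y, x, school = [], [], []
    for school_id, effect in enumerate(school_effects):
        for _ in range(n_per_school):
            xi = rng.gauss(0.0, 1.0)   # stand-in for a family covariate
            y.append(10.0 + 0.5 * xi + effect + rng.gauss(0.0, 1.0))
            x.append(xi)
            school.append(school_id)
    return y, x, school


# Assumed scenario: school effects operate on long-term residents only.
f_residents, df_residents = joint_f_for_dummies(*simulate_sample([0.0, 1.0, 2.0, 3.0], 100, seed=1))
f_migrants, df_migrants = joint_f_for_dummies(*simulate_sample([0.0, 0.0, 0.0, 0.0], 100, seed=2))
print("residents: F =", round(f_residents, 2), "df =", df_residents)
print("migrants:  F =", round(f_migrants, 2), "df =", df_migrants)
```

In this constructed scenario the dummies are jointly significant only for the group that is exposed to the school effects, which is the pattern the community effects hypothesis predicts for longer-term residents. Under a Tiebout-style data-generating process the ordering would be reversed.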

6. Conclusions