
have recently begun to explore the moderating effects of genetic background on a more general set of social outcomes. For example, Guo, Roettger, and Cai (2008) study the determination of juvenile delinquency and find significant interactions between genetic markers and factors like eating regular family meals and repeating a grade, while Settle et al. (2010) find a significant interaction between a dopamine receptor gene and social networks in the formation of political ideologies.⁵ However, I have been able to identify only two previous studies that examined the role of gene-environment interactions in determining any measure of academic achievement. First, Conley and Rauscher (2010) use within-twin-pair birth weight differences to study how genetic traits may moderate the relationship between birth weight and several outcomes, including high school grade point average. They find only one significant gene-environment interaction, and its sign is the opposite of what had been suggested by prior research. Although the plausible exogeneity of within-twin-pair birth weight differences makes Conley and Rauscher’s research design attractive, its scope is inherently limited since birth weight is only one environmental factor that may impact educational outcomes. The present study analyzes a distinct and arguably broader environmental condition, economic background, and also expands the set of educational outcomes under consideration. Second, a study by Shanahan et al. (2008) analyzes the determination of educational continuation beyond high school and finds significant interactions between a variant of the dopamine receptor gene DRD2 and environmental factors such as having a parent who belongs to the PTA and how often parents discuss school-related issues with the student. While their findings are suggestive of important interactive effects, the qualitative methodological approach of Shanahan et al. makes their results difficult to interpret, and no attention is given to issues of selection bias. In the present study, I take advantage of family-level clustering in my data to estimate sibling fixed-effects models that identify the critical interaction terms using plausibly exogenous variation in genetic status, and do so within a standard multiple regression methodological approach. Additionally, I analyze distinctly economic environmental factors and a broader set of educational outcomes.
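
As a rough illustration of what a sibling fixed-effects model with a gene-environment interaction involves, the following Python sketch demeans each variable within families and then fits ordinary least squares on the demeaned data. It is a minimal sketch under stated assumptions, not the estimation code behind any result reported in this paper: the function and variable names are hypothetical, the specification has only two regressors (genetic status and its interaction with log household income), and income is assumed to be identical for all siblings in a family, so that its main effect is absorbed by the family fixed effect.

```python
from collections import defaultdict


def demean_within_family(values, family_id):
    """Subtract each family's mean from its members' values."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, f in zip(values, family_id):
        totals[f] += v
        counts[f] += 1
    return [v - totals[f] / counts[f] for v, f in zip(values, family_id)]


def sibling_fe_interaction(y, maoa, log_income, family_id):
    """Return (b_maoa, b_interaction) from a within-family OLS fit.

    Hypothetical two-regressor specification:
        y = family effect + b_maoa * maoa + b_interaction * maoa * log_income
    Demeaning within families removes the family effect, along with any
    regressor that is constant across siblings (assumed here for income).
    """
    interaction = [m * inc for m, inc in zip(maoa, log_income)]
    yd = demean_within_family(y, family_id)
    x1 = demean_within_family(maoa, family_id)
    x2 = demean_within_family(interaction, family_id)

    # Normal equations for two regressors (no intercept after demeaning).
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, yd))
    s2y = sum(b * c for b, c in zip(x2, yd))

    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear within families")
    b_maoa = (s22 * s1y - s12 * s2y) / det
    b_interaction = (s11 * s2y - s12 * s1y) / det
    return b_maoa, b_interaction


# Toy, noise-free data: three sibling pairs with made-up values.
family_id = [1, 1, 2, 2, 3, 3]
log_income = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
maoa = [0, 1, 1, 0, 0, 1]
family_effect = [10.0, 10.0, 12.0, 12.0, 11.0, 11.0]
y = [
    fe + 0.5 * m - 0.2 * m * inc
    for fe, m, inc in zip(family_effect, maoa, log_income)
]
print(sibling_fe_interaction(y, maoa, log_income, family_id))
```

Because the toy outcome is built without noise, the fit recovers the coefficients used to construct it (0.5 and -0.2) up to floating-point error. Only families in which siblings differ in genetic status contribute to identification, which is why the sketch raises an error when the demeaned regressors are collinear.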

III. Data

The data used in this paper are drawn from the National Longitudinal Study of Adolescent Health (Add Health). The Add Health study includes a nationally representative sample of adolescents who were in grades 7–12 during the 1994–95 school year and who have since been interviewed in four waves, with the most recent wave occurring in 2008, when participants were between 24 and 32 years old. The initial Add Health sampling procedure took place in two stages. In the first stage, a stratified random sample of 88 high schools was drawn from the frame of all high schools in the US with more than 30 students. For each high school that did not include a seventh grade, a feeder school that did include a seventh grade and regularly sent graduates to the high school was also selected. In the second stage, a sample of 27,000 adolescents from the selected high schools and their feeder schools was drawn, with oversampling of individuals with certain social and demographic characteristics of interest.⁶ Selected participants completed an extensive in-home questionnaire on a variety of demographic-, academic-, and health-related topics, and a separate survey was also completed by a resident parental figure of each respondent.

5. Studies within economics have incorporated genetic markers as well, but to my knowledge none have investigated their interactive effects. One literature investigates the effects of genetic markers in experimental and behavioral contexts. For example, Carpenter, Garcia, and Lum (2011) and De Neve and Fowler (2010) find that certain genetic markers predict risk propensities, time preferences, and credit card debt. Another literature has used genetic markers as instrumental variables, primarily for mental health status. For examples, see Ding et al. (2009) and Fletcher and Lehrer (2009), as well as the recent minisymposium edited by Mullahy (2011).

The Add Health study also included an overlapping “genetic subsample.” Students who identified themselves as twins in a preliminary questionnaire were selected for this sample with certainty, resulting in the inclusion of over 300 monozygotic and dizygotic twin pairs. Respondents with full siblings were typically not oversampled, but more than 1,000 full sibling pairs entered into the sample, mostly by chance. In the third wave of the survey, all active participants from the subsample of twin and full sibling groups were asked to provide a saliva specimen for genetic testing purposes, and 2,612 out of 3,139 eligible respondents (83 percent) did so. The analysis conducted here uses this subset of original Add Health respondents for whom genetic marker data are available.⁷ The fact that the Add Health sampling procedure was stratified at both the school and household level makes it especially amenable to fixed-effects modeling techniques, and I exploit this unique feature of the data below.

As measures of educational attainment, I use variables indicating whether each individual attended any college and whether they graduated from college,⁸ as well as a continuous educational attainment variable measured in years.⁹ To measure childhood economic status, I use the logarithm of gross household income in 1994, as reported by the respondent’s resident parental figure.¹⁰ Self-reported parental education is also used in some specifications.

6. The oversampled populations included minority ethnic groups and African American students with college-educated parents.

7. Because the siblings of twins were not selected through probability sampling, sampling weights are unavailable for 35.83 percent of the genetic subsample. Given this, I follow the recommendation of Chantala (2001) and conduct my analysis without sampling weights. Of course, the genetic subsample also excludes only children and carries a disproportionate number of twins. As a result of these survey design features, my working sample is not necessarily nationally representative, and this consideration should be kept in mind when interpreting my results.

8. While high school completion is also an important dichotomous measure of educational attainment, it is very common within my sample and there was relatively little variation in high school completion status, especially within families. However, as a robustness check I do confirm that my basic results hold when using high school completion as the dependent variable (see Table 4).

9. This measure is actually semicontinuous because in the final wave of the survey education was reported in detailed categories. These categories were recoded as follows: 8th grade or less was assigned a value of 8; some high school was assigned a value of 10; high school graduate and vocational training were assigned a value of 12; some college was assigned a value of 14; completed college was assigned a value of 16; some graduate school and some postbaccalaureate professional education (for example, law school, medical school, nurse) were assigned a value of 17; completed master’s degree, some graduate training beyond a master’s degree, and completed postbaccalaureate professional education were assigned a value of 18; completed doctoral degree was assigned a value of 20.

10. Eleven respondents came from households that reported zero income. To preserve these observations in the log transformation, 1 was added to the income variable, but the results presented below are not substantively changed if these 11 observations are instead simply deleted.

With respect to genetic markers, I investigate the interactive effects of variation at the gene locus most strongly suggested by prior research: monoamine oxidase A (MAOA), which individuals inherit in varying forms known as alleles. MAOA is a 30 base-pair repeat element, and allelic variation at the MAOA locus takes the form of differing numbers of repeats. Individuals can have 2, 3, 3.5, 4, or 5 repeats, but about 95 percent of individuals have either three or four repeats.¹¹ Those with 3.5 or 4 repeats are believed to be better able to chaperone and break down neurotransmitters such as serotonin, norepinephrine, and dopamine than are individuals with 2, 3, or 5 repeats (Guo, Roettger, and Cai 2008). In turn, the levels of such neurotransmitters have been linked with reduced impulsivity, novelty seeking, anxiety, and aggression, among other social and behavioral traits that could impact educational attainment.

Importantly, MAOA is located on the X chromosome, so that males possess only one MAOA copy, while females possess two MAOA copies (such genes are called “X linked”).¹² Males inherit their single copy of MAOA from their biological mothers, while females inherit a separate MAOA copy from each biological parent. The presence of two MAOA copies in females substantially complicates the classification of their MAOA status. For example, consider a case in which a female has both a 3-repeat and a 4-repeat MAOA allele, a combination that is by no means rare, occurring in over 41 percent of female Add Health respondents. Because the phenotypes associated with MAOA status are generally complex, and no clear dominance relations exist between the different MAOA alleles, it is not at all clear how such a female’s MAOA status should be classified. Unfortunately, the biological literature offers little guidance on this issue since both animal and human studies of MAOA have focused primarily on males (see below). To avoid these complications, I follow the existing literature and focus only on males.¹³

Accordingly, I created an indicator variable for each male respondent equal to one if the individual had 3.5 or 4 repeats as their single MAOA copy, and zero otherwise.¹⁴ For ease of exposition, I will refer to such individuals as having “positive” MAOA status, but no normative assessment of favorable genetic status should be inferred from this choice of language. Indeed, my primary finding is that the association between MAOA status and educational attainment is highly dependent on economic background. Importantly, positive MAOA status is not a rare genetic trait: of the 1,213 males in the Add Health data with valid MAOA test results, 686 of them (approximately 57 percent) have positive MAOA status. Given this relatively uniform distribution, any substantial interactions involving MAOA will have broad implications.

11. There is also a very rare seven-repeat allele at the MAOA locus, but no Add Health respondents were found to possess this variant.

12. This is because males possess one X chromosome and one Y chromosome while females possess two X chromosomes.

13. A preliminary analysis of the female sample (available on request) produced some results similar to those reported below for males, but these results depended strongly on the classification of females’ MAOA status and on other modeling choices. Given the lack of guidance from the scientific literature on these questions, choosing between different classifications and specifications for females is necessarily arbitrary and thus avoided.

14. Results using an alternative MAOA status classification that also includes individuals with five repeats are presented below as a robustness check.

As noted below, there are approximately 20,000 to 25,000 distinct protein-encoding DNA sequences in the human genome, and picking out just one of these may seem rather arbitrary. This is not the case, primarily because animal studies provide scientifically rigorous guidance in the selection of genes to study. The most powerful technique is the use of so-called knockout mice that have had certain genes experimentally blocked by researchers.
Such knockout studies have been conducted with male mice on genes similar to MAOA (Cases et al. 1995; Shih and Thompson 1999), and the results suggest that MAOA is likely among the most important genetic loci for behavioral and social functioning. Additional evidence on the importance of MAOA comes from studies of an extended Dutch family in which males carry no MAOA repeat, creating in essence a human MAOA knockout (Brunner et al. 1993). The men in this family were found to display disproportionate levels of aggression, impulsivity, and criminal behavior.¹⁵ The evidence supporting a major role for MAOA status in determining behavioral traits is sufficient that in a 2009 U.S. murder trial, the defense successfully used MAOA status combined with a history of child abuse as an argument to limit conviction to voluntary manslaughter and avoid the death penalty (Barber 2010). Given this body of evidence, the selection of MAOA for study here is far from arbitrary.

Table 1 presents descriptive statistics of the data. Despite the fact that the genetic subsample was not drawn at random, the descriptive statistics show that it still contains substantial socioeconomic diversity. Approximately 27 percent of the working sample is nonwhite, all levels of educational attainment are reasonably well represented, and the $45,000 standard deviation of household income indicates substantial income dispersion.

15. Note that in both types of studies (those using knockout mice and rare human mutations), the focus has been on males. There is far less evidence linking MAOA status to social and behavioral outcomes in females.

Table 1
Descriptive Statistics

| Variable | Mean | SD |
|---|---|---|
| Positive MAOA Status | 0.57 | 0.50 |
| Household Income (10k 1994 dollars) | 4.50 | 4.50 |
| Attended College | 0.73 | 0.44 |
| Graduated from College | 0.41 | 0.49 |
| Total Years of Education | 13.96 | 2.21 |
| White | 0.73 | 0.44 |
| African American | 0.17 | 0.38 |
| Hispanic | 0.14 | 0.35 |
| Other Race | 0.16 | 0.37 |
| English Spoken at Home | 0.92 | 0.27 |
| Number of Siblings | 3.11 | 1.40 |
| Birth Order | 2.22 | 1.40 |
| Observations | 931 | |

Notes: Sample includes males only. Positive MAOA status refers to those possessing 3.5 or 4 base pair repeats at the MAOA locus. Household income and student education levels were self reported by parents and children, respectively. Racial categories sum to more than 1 because individuals were asked to report all applicable racial groups.
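
The variable definitions described in this section can be summarized in a short Python sketch. It is illustrative only and rests on several assumptions: the dictionary keys paraphrase the education categories rather than reproduce the survey's actual response codes, all function and constant names are hypothetical, the constant 1 is assumed to be added to every household's income before taking logs, and the income units are left unspecified because the text does not state them.

```python
import math

# Years-of-education recode as described in the text.
# Keys are paraphrased category labels, not actual Add Health codes.
EDUCATION_YEARS = {
    "8th grade or less": 8,
    "some high school": 10,
    "high school graduate": 12,
    "vocational training": 12,
    "some college": 14,
    "completed college": 16,
    "some graduate school": 17,
    "some postbaccalaureate professional education": 17,
    "completed master's degree": 18,
    "some graduate training beyond a master's degree": 18,
    "completed postbaccalaureate professional education": 18,
    "completed doctoral degree": 20,
}


def years_of_education(category):
    """Map a reported education category to assigned years of schooling."""
    return EDUCATION_YEARS[category]


def positive_maoa(repeats, include_five=False):
    """Return 1 if a male's single MAOA copy has 3.5 or 4 repeats, else 0.

    With include_five=True, five repeats also count as positive, mirroring
    the alternative classification mentioned as a robustness check.
    """
    positive = {3.5, 4.0}
    if include_five:
        positive.add(5.0)
    return 1 if float(repeats) in positive else 0


def log_household_income(income):
    """Log of household income after adding 1, so zero income stays defined.

    Assumption: 1 is added to every observation, not only to the zeros.
    """
    if income < 0:
        raise ValueError("income must be non-negative")
    return math.log(income + 1.0)


# Made-up example respondent.
print(years_of_education("some college"))
print(positive_maoa(3.5), positive_maoa(3), positive_maoa(5, include_five=True))
print(log_household_income(0))
```

Keeping the robustness variant as a keyword argument rather than a separate function makes it easy to rerun the same specification under both classifications of MAOA status.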

IV. Results