A. Econometric Model

months prior to the tax hike, and a full tax effect for women who conceived after the tax hikes were implemented.[3] For each treatment state, we include data for 56 monthly periods: 24 months prior to any exposure to the tax hike, the eight months of a partial tax effect, and 24 months after the tax hike. For these periods, we also include data for the corresponding set of control states.

Outcomes are measured at the individual level $i$ and data varies across states $s$ and months $m$. The primary outcome of interest is a binary indicator for maternal smoking, denoted as $S$. The basic research question is whether the excise tax hike in a particular state decreased maternal smoking. Births that occurred in a treatment state are indicated by the dummy variable $D$. The equation estimated is of the form

$$S_{ism} = X_{ism}\beta_1 + D_s\,\mathit{PART\ TAX}_{sm}\,\gamma_1 + D_s\,\mathit{FULL\ TAX}_{sm}\,\alpha_1 + \mu_s + \upsilon_m + \varepsilon_{ism} \tag{1}$$

where $X$ is a vector of characteristics that describe the mother and the pregnancy. We include variables on maternal age, race, ethnicity, marital status, and education.[4] We also include variables for the infant’s sex, parity of birth, plurality of birth, and Kessner adequacy index of prenatal care.[5] State and month of conception effects are represented by $\mu$ and $\upsilon$ respectively, and $\varepsilon$ is a random error. The variable $\mathit{PART\ TAX}$ equals one for women who conceived during the eight months before the tax hike, and $\mathit{FULL\ TAX}$ equals one for women who conceived during the 24 months following the tax hike. Since $D \times \mathit{FULL\ TAX}$ equals one only for the treatment state in the post-tax-hike period, the full tax-hike treatment effect is measured by $\alpha_1$. The econometric model described in Equation 1 is a standard difference-in-difference model where states that did not raise their excise tax form a comparison group.
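To make the structure of Equation 1 concrete, the sketch below fits it as a linear probability model by ordinary least squares on simulated data. Everything in it is an illustrative assumption rather than the authors' code or data: the function names `simulate_births` and `estimate_did`, the single control state, the smoking rates, and the effect sizes are all made up, and the covariate vector $X$ is omitted to keep the example short.

```python
import numpy as np

N_MONTHS = 56  # 24 pre-tax, 8 partial-tax, 24 full-tax conception months


def simulate_births(rng, n_per_cell=500, gamma1=-0.02, alpha1=-0.05):
    """Draw made-up birth records for one treatment and one control state."""
    d = np.repeat([0.0, 1.0], N_MONTHS * n_per_cell)             # D_s
    month = np.tile(np.repeat(np.arange(N_MONTHS), n_per_cell), 2)
    part_tax = ((month >= 24) & (month < 32)).astype(float)       # PART TAX_sm
    full_tax = (month >= 32).astype(float)                        # FULL TAX_sm
    # Both states share the same downward time path; only the level differs.
    p_smoke = (0.25 + 0.04 * d - 0.001 * month
               + d * (gamma1 * part_tax + alpha1 * full_tax))
    s = (rng.random(d.size) < p_smoke).astype(float)              # S_ism
    return s, d, month, part_tax, full_tax


def estimate_did(s, d, month, part_tax, full_tax):
    """OLS fit of Equation 1 without covariates X (an assumption of this sketch)."""
    # Month-of-conception effects: one dummy per month, month 0 omitted.
    month_fe = (month[:, None] == np.arange(1, N_MONTHS)).astype(float)
    # Columns: constant, state effect, D*PART TAX, D*FULL TAX, month effects.
    X = np.column_stack([np.ones_like(s), d, d * part_tax, d * full_tax, month_fe])
    coef, *_ = np.linalg.lstsq(X, s, rcond=None)
    return {"gamma1": coef[2], "alpha1": coef[3]}


s, d, month, part_tax, full_tax = simulate_births(np.random.default_rng(0))
print(estimate_did(s, d, month, part_tax, full_tax))
```

Because the month dummies absorb the main effects of PART TAX and FULL TAX, only their interactions with $D$ enter the design matrix, which mirrors the way Equation 1 is written.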

B. Choosing Control Groups

As Meyer (1995) points out, a difference-in-difference model provides consistent estimates of the treatment effect only if, in the absence of the intervention, the time path in the outcome is the same in both treatment and control states. For example, suppose a major tax hike reduces the probability a mother smokes during pregnancy, but smoking rates are falling faster in treatment states than in controls. In this case, the difference-in-difference model will overstate the benefits of a tax hike. We can never guarantee that the time paths of outcomes would be the same in the treatment and control states in the absence of the intervention. However, we increase this likelihood by choosing control states that possess the same month-to-month pattern in smoking as the treatment state before the intervention.

3. Month of conception is calculated from the month of last menses, weeks of gestation, and month of birth.
4. Our control categories are maternal age (18 and under, 19 to 25, 26 to 32, 33 to 40, and older than 40), race (Black, white, and other race), Hispanic ethnicity, and education (less than high school, high school graduate, some college, and college graduate).
5. We control for whether the birth was the mother’s first birth, second, or higher birth and whether the birth was a singleton, twin, or higher multiple birth. The Kessner index is a measure of the adequacy of prenatal care that indicates whether the mother went to an adequate, medium, or inadequate number of prenatal care visits given the gestation length of her pregnancy. It does not account for the quality of care received or any pregnancy risk factors that may be associated with a higher than average number of visits. Due to the limitations of the Kessner index, we checked and found that our results are not sensitive to the inclusion of this variable.

The Journal of Human Resources 378
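The overstatement that arises when smoking is already falling faster in the treatment state can be seen with simple arithmetic on group means. The sketch below is purely illustrative: the smoking rates, the trends, the true effect `TRUE_EFFECT`, and the 32-month gap `GAP` are hypothetical numbers chosen for the example, not estimates from the paper.

```python
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-difference of four group means."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)


def rate(level, monthly_trend, months, tax_effect=0.0):
    """Smoking rate after `months` of a linear trend, plus any tax effect."""
    return level + monthly_trend * months + tax_effect


TRUE_EFFECT = -0.03   # hypothetical true effect of the tax hike
GAP = 32              # hypothetical months between pre and post measurements


def scenario(treat_trend, ctrl_trend):
    """DiD estimate when the two states follow the given monthly trends."""
    return did_estimate(
        rate(0.22, treat_trend, 0), rate(0.22, treat_trend, GAP, TRUE_EFFECT),
        rate(0.18, ctrl_trend, 0), rate(0.18, ctrl_trend, GAP),
    )


parallel = scenario(-0.001, -0.001)    # same time path in both states
diverging = scenario(-0.002, -0.001)   # treatment state already falling faster
print(f"parallel trends:  {parallel:+.3f}")
print(f"diverging trends: {diverging:+.3f}")
```

With a common trend the estimate equals the assumed true effect of -0.030. When the treatment state declines an extra 0.001 per month, the estimate becomes -0.062: the extra pre-existing decline (0.001 per month over 32 months) is wrongly attributed to the tax.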
In selecting controls, we look at the 24 months before the conception of pregnancies that were in progress when the tax was enacted. For Michigan, the tax hike occurred in May 1994, so we look at women who conceived between September 1991 and August 1993 to pick controls.

Our procedure to choose controls is as follows. Each treatment state has a unique set of potential controls that had no nominal change in state cigarette excise tax levels during the 56-month window.[6] Potential controls for each state are listed in Table 2. From this set, we ran regressions including data from only the treatment state and one potential control for the 24-month pretreatment period. The model for each treatment state is of the form:

$$S_{ism} = X_{ism}\beta_2 + \mu_s + \upsilon_m + D_s\lambda_m + \varepsilon_{ism} \tag{2}$$

This equation is similar to Equation 1 above. In these regressions, we add the same set of covariates listed for Equation 1, and $\mu$ and $\upsilon$ are state and month of conception effects respectively. Since data from after the tax-hike treatment period is excluded, the equation does not include the $D \times \mathit{PART\ TAX}$ or $D \times \mathit{FULL\ TAX}$ terms. The key terms in this regression are the coefficients $\lambda_m$, which allow the monthly dummy variables to differ between the state with a tax hike and a potential control. If we cannot reject the hypothesis that $\lambda_1 = \lambda_2 = \dots = \lambda_{23} = 0$, then conditional on differences in the level of use and $X$, the treatment state and the potential control have statistically the same monthly pattern in maternal smoking and we include this state as a control. The states in bold in Table 2 are those where we cannot reject the null that the $\lambda$'s are jointly zero.
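The screening step built on Equation 2 can be sketched as a joint test that the 23 treatment-state-by-month interactions add nothing beyond common month effects. The text does not say which test statistic was used, so the version below is an assumption: a classical F statistic from restricted and unrestricted OLS fits, compared with the 5 percent chi-square critical value for 23 degrees of freedom as a large-sample approximation. The names `simulate_pre` and `joint_lambda_test` are hypothetical, the data are simulated, and the covariates $X$ are left out.

```python
import numpy as np

N_PRE = 24                   # pretreatment conception months
CHI2_23_CRIT_5PCT = 35.172   # 5% critical value, chi-square with 23 df


def simulate_pre(rng, extra_trend, n_per_cell=1000):
    """Made-up pretreatment data: treatment state (1) and one candidate control (0)."""
    state = np.repeat([0.0, 1.0], N_PRE * n_per_cell)
    month = np.tile(np.repeat(np.arange(N_PRE), n_per_cell), 2)
    # extra_trend != 0 gives the treatment state its own monthly pattern.
    p_smoke = 0.25 + 0.04 * state - 0.001 * month + extra_trend * state * month
    smoke = (rng.random(state.size) < p_smoke).astype(float)
    return smoke, state, month


def design(state, month, with_interactions):
    """Constant, state effect, 23 month dummies, optionally D_s x month dummies."""
    month_fe = (month[:, None] == np.arange(1, N_PRE)).astype(float)
    cols = [np.ones(state.size), state, month_fe]
    if with_interactions:
        cols.append(month_fe * state[:, None])   # columns carrying lambda_1..lambda_23
    return np.column_stack(cols)


def ssr(X, y):
    """Sum of squared OLS residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def joint_lambda_test(smoke, state, month):
    """Return (F statistic, keep_as_control) for H0: all lambda_m = 0."""
    q = N_PRE - 1
    X_u = design(state, month, True)
    ssr_r = ssr(design(state, month, False), smoke)
    ssr_u = ssr(X_u, smoke)
    f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / (smoke.size - X_u.shape[1]))
    # Large-sample approximation: q * F is roughly chi-square(q) under H0.
    return f_stat, bool(q * f_stat < CHI2_23_CRIT_5PCT)


rng = np.random.default_rng(0)
print(joint_lambda_test(*simulate_pre(rng, extra_trend=0.0)))
print(joint_lambda_test(*simulate_pre(rng, extra_trend=-0.006)))
```

A candidate is kept when the second element is `True`, that is, when the null of a common monthly pattern cannot be rejected. In an applied setting one would likely add the covariates and use heteroskedasticity-robust or clustered inference, since the outcome is binary.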

IV. Results