The Determinants of Birthweight: Addressing Potential Sample Selection Bias from Babies Who Are
Not Weighed at Birth¹
Heni Wahyuni²
Introduction
The empirical literature on the infant health production function has focused on endogeneity and
sample selection biases caused by unobserved health heterogeneity and the pregnancy-resolution
decision (Liu 1998; Rous, Jewell & Brown 2004). The first bias relates to the endogeneity of
prenatal care, while the second, in existing studies, arises from a woman's decision to abort or
continue her pregnancy. Specifically, unobservable factors that may influence a woman's decision
to proceed with the pregnancy or abort are also likely to influence her use of prenatal care and
birth outcomes, particularly birthweight. Sample selection bias relating to the decision to abort
is unlikely to be a problem in Indonesia, where abortion is socially unacceptable and only
conducted for medical reasons. There is a potential for selection bias, however, due to non-random
missing information on birthweight.
The potential sample selection bias that arises from birthweight being missing for some babies
(those not weighed at birth) is a common issue in developing countries and generally does not
occur in studies of birthweight in developed countries. If the birthweight information in the
sample is not missing at random, however, an analysis of the determinants of birthweight that
ignores unreported birthweight will be biased (Heckman 1979). This represents a possible sample
selection issue, given that the data on a key variable (birthweight) are available only for the
subset of babies who were weighed at birth. This is often referred to as incidental truncation
(Wooldridge 2002, p. 552).
1 This paper was presented at the National Seminar of the 60th Anniversary of the Faculty of
Economics and Business UGM, Balancing Indonesian Economy: Governance and Accountability, Ethics,
and Strategy toward Inclusive Growth, 19 September 2015.
2 Heni Wahyuni is a lecturer and researcher at the Faculty of Economics and Business, Universitas
Gadjah Mada. Email: [email protected]
Relatively few studies, however, have investigated the potential for sample selection bias from
unweighed babies in the relationship between prenatal care and infant health in developing
countries, including Indonesia. Two notable studies have considered this issue (Habibov & Fan
2011; Mwabu 2009). Habibov and Fan (2011) used the 73 percent of all live births with a recorded
birthweight in Azerbaijan to analyze the effect of prenatal care on birthweight. Mwabu (2009)
reports that only 17 percent of babies delivered at home and 75 percent of babies delivered at
modern facilities in Kenya have a reported birthweight. Both studies tested for potential bias
due to unweighed babies and found no evidence of selection bias in their data. One study has
examined the impact of the village midwife program on birthweight in Indonesia (Frankenberg &
Thomas 2001), but it does not take into account the selection problem that arises because some
babies are not weighed at birth.
This study uses data from the Indonesia Family Life Survey (IFLS), specifically IFLS3 and IFLS4.
The focus of the study is to test whether there is a sample selection bias in the determinants of
birthweight. Specifically, birthweight is the outcome of interest, but this outcome is observed
only if the baby was weighed at birth. The IFLS data for live births during 2002-08, inclusive,
indicate that approximately 11 percent of babies were not weighed. It is not appropriate to
eliminate unweighed babies from the sample and analyze the pregnancy outcome (birthweight) only
for the subset of mothers whose babies were weighed, unless birthweight and whether or not the
baby was weighed are independent. Otherwise, the estimates could be biased. Furthermore, the IFLS
data show that babies not weighed at birth in Indonesia are more likely to be born at home or in
a midwife's office with a traditional birth attendant, and to low-income and less educated mothers.
Previous studies, for example Mwabu (2009), use various instruments such as money prices, time
prices, household assets and income, environmental factors (rainfall), and interaction terms
between land and mean long-term rainfall and between cattle and mean long-term rainfall. However,
these are not available in the IFLS data.
A Modeling Framework
Heckman Sample Selection Model
To analyze the potential selection problem from unweighed babies, I use a Heckman selection
model. The model consists of two equations. The selection equation (1) represents whether the
baby was weighed, and the outcome equation (2) relates to birthweight:

$$z_i^* = w_i\gamma + u_i, \qquad z_i = \begin{cases} 1 & \text{if } z_i^* > 0 \\ 0 & \text{if } z_i^* \le 0 \end{cases} \tag{1}$$

$$y_i = \begin{cases} x_i\beta + \varepsilon_i & \text{if } z_i^* > 0 \\ \text{unobserved} & \text{if } z_i^* \le 0 \end{cases} \tag{2}$$

where $z_i^*$ is a latent variable measuring the propensity of a baby to be weighed at birth;
$w_i$ is a vector of factors known to influence a baby's propensity to be weighed at birth that
includes prenatal care usage, as well as the age of the mother when the pregnancy ended, per
capita expenditure, years of schooling, household characteristics (HH index), health condition
(general health and BMI), whether she smoked before or during pregnancy, whether she experienced
any complication during the pregnancy, baby-specific characteristics (singleton or multiple birth,
gender, order of pregnancy, year of birth), and the cost of the delivery; $u_i$ contains any
unmeasured factors in equation (1).
We do not actually observe $z_i^*$. All we observe is a dichotomous variable $z_i$, which is
equal to 1 if the baby was weighed and zero otherwise; there is, however, only a selected
(censored) sample for the birthweight equation, $y_i$. The dependent variable in the selection
equation (1) is a dummy variable indicating whether the baby was weighed or not, and the
dependent variable in the outcome equation (2) is the baby's birthweight (in kilograms). The
independent variables in the selection equation (1) are also included, as independent variables
($x_i$), in the outcome equation (2). In order for the model to be identified, the selection
equation should also have at least one independent variable that is not included in the outcome
equation. Otherwise, the model is identified only by functional form, and the coefficients have
no structural interpretation (see identification section).
The disturbance terms are assumed to be normal with:

$$u_i \sim N(0, 1), \qquad \varepsilon_i \sim N(0, 1)$$

and with correlation $\rho$ between $u_i$ and $\varepsilon_i$. When $\rho \neq 0$, standard OLS
techniques applied to the birthweight equation will yield biased results. If $\rho = 0$, then
there is no selection problem and the standard OLS model is appropriate. It is important to test
whether there is a potential selection problem, that is, whether babies that were not weighed at
birth were born to mothers whose characteristics differ from those of mothers whose babies were
weighed. The null hypothesis of no selection bias is $H_0: \rho = 0$. The Heckman selection model
is used to determine whether there is a selection problem or not. The sample is split between
babies that were or were not weighed directly after delivery.
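The role of the correlation between the two disturbances can be illustrated with a small simulation. The sketch below is not the paper's estimation. It assumes unit-variance errors, a selection index fixed at zero, and hypothetical function names, and it only shows that the mean of the outcome error among selected ("weighed") observations moves away from zero as the correlation grows, which is the source of the OLS bias.

```python
import math
import random


def simulate_selected_error_mean(rho, n=50_000, seed=0):
    """Mean of the outcome error among selected units when corr(u, eps) = rho."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # selection-equation error
        v = rng.gauss(0.0, 1.0)  # independent noise
        eps = rho * u + math.sqrt(1.0 - rho ** 2) * v  # outcome error with corr(u, eps) = rho
        if u > 0:  # illustrative selection rule: the selection index w*gamma is set to 0
            total += eps
            count += 1
    return total / count


def inverse_mills(c):
    """Inverse Mills ratio phi(c) / Phi(c) for the standard normal distribution."""
    pdf = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))
    return pdf / cdf


for rho in (0.0, 0.5, 0.8):
    simulated = simulate_selected_error_mean(rho)
    theoretical = rho * inverse_mills(0.0)  # E[eps | u > 0] under these assumptions
    print(rho, round(simulated, 3), round(theoretical, 3))
```

With zero correlation the selected-sample error mean stays near zero, so dropping unweighed babies would be harmless in this stylized setting. With a nonzero correlation the mean shifts, and a regression on the selected sample absorbs that shift.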
Identification
Identification is very important in a system of equations. As explained previously, most
unweighed babies were delivered at home with midwives or traditional birth attendants. One reason
for mothers to choose to deliver at home is that it may be less costly than doing so in a modern
health facility with professional birth attendants. Any variable that represents the cost of
delivery may therefore serve as an identifier for whether the baby is delivered at home or at a
modern facility. This identifier acts as an exclusion restriction: in this analysis, the cost of
delivery enters the selection equation, which determines whether the baby was weighed or not, and
it is excluded from the birthweight equation, since it is unlikely that any measure of delivery
cost will influence birthweight.
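The reason an excluded variable helps can be seen from the conditional mean of birthweight among weighed babies. The expression below is a sketch of the standard result under bivariate-normal disturbances. The symbols $\sigma_\varepsilon$ (the standard deviation of $\varepsilon_i$) and $\lambda$ (the inverse Mills ratio) are notation introduced here for illustration and do not appear in the original text.

```latex
% Sketch under the standard bivariate-normal assumptions of the selection model.
% \sigma_\varepsilon and \lambda are illustrative notation, not taken from the paper.
\begin{align*}
E[y_i \mid z_i^* > 0]
  &= x_i\beta + E[\varepsilon_i \mid u_i > -w_i\gamma] \\
  &= x_i\beta + \rho\,\sigma_\varepsilon\,\lambda(w_i\gamma),
  \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{align*}
```

If $w_i$ contained exactly the same variables as $x_i$, the correction term would be separated from $x_i\beta$ only through the nonlinearity of $\lambda$. A variable such as delivery cost, which enters $w_i$ but not $x_i$, gives the correction term variation of its own.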
Data
The sample for the empirical analysis is restricted to live births from pregnancies that ended
between 2002 and 2008, inclusive, for ever-married women in IFLS4 (2007-08). A pooled
cross-section approach is used, but it utilizes the panel nature of the IFLS data to provide
information on a range of explanatory variables. After excluding observations with missing
responses,³ the sample consists of 4,436 live births in which the birthweight is observed (the
baby is weighed). Including live births in which the birthweight is not observed (the baby is not
weighed), the number of observations is 5,023.
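As a quick arithmetic check on these counts, the short sketch below derives the number and share of unweighed babies from the two totals quoted above. The variable names are illustrative.

```python
weighed = 4436  # live births with an observed birthweight
total = 5023    # all live births in the estimation sample

not_weighed = total - weighed
share_not_weighed = 100 * not_weighed / total  # in percent

print(not_weighed, round(share_not_weighed, 1))
```

The resulting share is of the same order as the roughly 11 percent of unweighed babies cited in the text.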
Descriptive Statistics
Table 1 presents the summary statistics for the sample used in the empirical analysis, separately
for mothers of babies that were weighed and not weighed. The average number of prenatal care
visits for mothers of weighed babies is 9.19, which is much greater than the 6.13 visits for
mothers of unweighed babies. Similarly, the percentage of mothers who followed the WHO
recommendation for prenatal care visits is much higher for weighed babies, approximately 84.49
percent, compared with only about 54.86 percent for unweighed babies. The household index, as an
indicator of economic background, is higher for mothers of babies that were weighed than for
mothers of babies that were unweighed. The average number of years of schooling was about 12
years, or senior high school level, for mothers whose babies were weighed, but only about 7 years,
or primary school completion, for those whose babies were not weighed. The lower level of maternal
education among mothers of unweighed babies may indicate limited knowledge about health, which
may negatively affect pregnancy outcomes. Moreover, more than 55 percent of mothers whose babies
were weighed lived in urban areas, compared to approximately 17 percent of those with unweighed
babies. The average age at the end of the pregnancy was about the same for both groups: 27 years.
Similarly, there was not much difference in the BMI of mothers or in their general health
condition. There were very low rates of smoking behavior and pregnancy complications for both
categories of mothers.
3 Responses coded as missing are those with an illogical answer, those for which the surveyor
could not meet the respondent (as in the case of BMI, which requires height and weight
measurement), and those for which the respondent refused to answer.
The inclusion of unweighed babies is important because not only is there a significant percentage
(approximately 11 percent) of babies not weighed in the sample, but the observed characteristics
of mothers of babies not weighed are different from those of mothers whose babies were weighed,
which increases the potential for bias. More specifically, these mothers are from a lower
socioeconomic background, compared to those whose babies were weighed, in terms of per capita
expenditure, years of education, owning a television, using electricity, having good drinking
water (the household index), and living in rural areas.
Table 1: Descriptive Statistics of Variables per Category of Birthweight
(in percent, unless otherwise indicated)

                                                                              Birthweight
Variable                                                               Not weighed      Weighed
-----------------------------------------------------------------------------------------------
Total number of prenatal care visits (visits)                                6.13         9.19
WHO recommendation⁴                                                         54.86        84.49
Household index                                                              3.71         5.45
Years of education (years)                                                   7.29        11.99
Age of mother (years)                                                       27.47        27.71
Body mass index (index)                                                     21.45        22.53
Healthy mother (general health condition)                                   87.56        88.89
First birth                                                                 55.54        64.29
Per capita expenditure below 25th percentile of population level            57.92        29.13
Per capita expenditure between 25th and 50th percentile of population level 24.19        27.71
Per capita expenditure above 50th percentile of population level            17.89        43.17
Smoking behavior                                                             1.24         0.68
Male baby                                                                   52.47        50.97
Singleton baby                                                              92.84        94.21
Having pregnancy complication                                               14.31        15.89
Living in urban area                                                        17.38        55.95
Cost of delivery of the baby (rupiahs)                                 126,338.20   892,835.30
Number of observations                                                        587        4,436
-----------------------------------------------------------------------------------------------
Source: IFLS3 (2000) and IFLS4 (2007-08).
4 WHO recommends that the minimum number of prenatal care visits during pregnancy be four, with
at least one visit in the first trimester of pregnancy, at least one in the second trimester, and
at least two in the third trimester (World Health Organization 2005).
The delivery expense is the variable that differentiates mothers whose babies were and were not
weighed at birth. The difference in delivery cost between the two groups of mothers is large:
892,835.30 rupiahs (about A$100) for weighed babies and 126,338.20 rupiahs (about A$14) for
unweighed babies. This is a key variable that is included in the selection equation (weighed or
not weighed) but not in the outcome equation for birthweight (in kilograms). Thus, it is assumed
that the cost of delivery is an instrument in the selection equation that does not influence
birthweight.
Results and Discussion
Estimation results of the Heckman selection model are provided in Table 2. The selection equation
represents whether the baby is weighed or not immediately following delivery. The estimate for
rho indicates a weak correlation between selection and the birth outcome ($\hat{\rho} = -0.0614$).
The negative estimate of rho may appear counterintuitive; however, it is not statistically
significant. The associated Wald test of independence of equations is not statistically
significant (chi2 = 0.63, p-value = 0.4289). This suggests that the Heckman model with selection
is not needed in this case or, in other words, that there is no evidence of a sample selection
bias problem due to unweighed babies.
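The reported test statistic can be roughly cross-checked from the estimate of rho and its standard error in Table 2. The sketch below assumes a simple Wald test on rho itself with one degree of freedom. The paper does not state how the estimation software computed the statistic (a test on a transformed parameter would differ slightly), and the helper name `chi2_sf_1df` is hypothetical.

```python
import math


def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


rho_hat, se_rho = -0.0614, 0.0775  # estimate and standard error of rho from Table 2

z = rho_hat / se_rho  # z-ratio for H0: rho = 0
wald = z * z          # Wald statistic, chi-squared with 1 df under H0
p_value = chi2_sf_1df(wald)

print(round(z, 3), round(wald, 2), round(p_value, 3))
```

Under these assumptions the statistic rounds to the reported 0.63 and the p-value lands close to the reported 0.4289, so the null hypothesis of independent equations is not rejected at conventional levels.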
Although the selection correction is not needed in this case, it is interesting to observe the
results of the selection equation. The estimated coefficient of money spent on delivery, the key
variable, is significant in explaining whether the baby was weighed or not. Birthweights are also
more likely to be recorded for mothers who have more years of schooling, who are from households
with a higher household socioeconomic index, and whose per capita expenditure is between the 25th
and 50th percentiles of the population level. However, as there is no evidence of selection bias,
the analysis can be continued on a single structural equation for birthweight, using Ordinary
Least Squares (OLS) regression.
Table 2: Heckman Selection Model for Weighed and Not Weighed Babies

                                                            Outcome                  Selection
Variable                                                    Coefficient (SE)         Coefficient (SE)
------------------------------------------------------------------------------------------------------
Total number of prenatal care visits                         0.0117 (0.0059)**        0.1290 (0.0135)***
Total number of prenatal care visits squared                -0.0002 (0.0002)         -0.0034 (0.0005)***
Household index                                             -0.0064 (0.0072)          0.1848 (0.0229)***
Years of education                                           0.0017 (0.0022)          0.0497 (0.0075)***
Age of mother less than 25 yrs (baseline)
Age of mother between 25-34 yrs                              0.0452 (0.0198)**        0.1166 (0.0683)*
Age of mother 35 and older                                   0.0832 (0.0304)***       0.1391 (0.0997)
Body mass index                                              0.0435 (0.0126)***       0.0145 (0.0638)
Body mass index squared                                     -0.0006 (0.0002)**        0.0003 (0.0013)
Dummy if body mass index is imputed                         -0.0810 (0.1691)          5.2062 (0.1852)***
Healthy mother (general health status)                      -0.0098 (0.0296)          0.0573 (0.0923)
First birth                                                 -0.0735 (0.0180)***       0.1096 (0.063)*
Per capita expenditure below 25th percentile of
  population level (baseline)
Per capita expenditure between 25th and 50th
  percentile of population level                            -0.0203 (0.0242)          0.1377 (0.0722)*
Per capita expenditure above 50th percentile of
  population level                                          -0.0349 (0.0237)          0.0977 (0.0781)
Smoking behavior                                             0.0931 (0.0948)          0.5679 (0.3416)*
Male baby                                                    0.0838 (0.0169)***       0.0003 (0.0544)
Singleton baby                                               0.2911 (0.0489)***       0.102 (0.1392)
Having pregnancy complication                                0.0365 (0.0250)         -0.0456 (0.0886)
Year of baby's birth                                        -0.0158 (0.0051)***      -0.0009 (0.0181)
Cost of delivery of the baby                                                          0.0000015 (0.0000005)***
Constant                                                     2.2066 (0.1719)***      -2.1115 (0.7648)***
------------------------------------------------------------------------------------------------------
Rho                                                         -0.0614 (0.0775)
Wald test of independent equations (chi2) = 0.63
Probability > chi2 = 0.4289
Number of observations: 4,436 weighed; 587 not weighed
*** Significant at a 1 percent level; ** at a 5 percent level; * at a 10 percent level.
Additional (Sensitivity) Analysis
Variables such as years of education, BMI, general health status, household index, and per capita
expenditure have missing information. Observations with missing information were excluded from
the regression analyses. As a sensitivity analysis, the model was re-estimated using imputed
values for the missing data. The median value for continuous variables and zero values for dummy
variables were used to replace the missing values. Dummy variables indicating that the data were
missing were included as additional explanatory variables. The results of these sensitivity
analyses were qualitatively the same as those reported. The estimate for rho indicates a weak
correlation between selection and the birth outcome ($\hat{\rho} = -0.0718814$). The associated
Wald test of independence of equations is not statistically significant (p-value = 0.3297). This
indicates that the reported outcome equations are robust to missingness and that no evidence of a
sample selection problem has been found.
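The imputation rule described above (median for continuous variables, zero for dummy variables, plus a missing-data indicator) can be sketched as follows. This is a minimal illustration, not the author's code. The function name, the field names, and the toy records are hypothetical and are not IFLS data.

```python
from statistics import median


def impute_with_flags(rows, continuous, dummies):
    """Fill missing values (None) and add a '<name>_missing' indicator for each variable."""
    # Medians are computed from the observed (non-missing) values only.
    medians = {
        name: median(r[name] for r in rows if r[name] is not None)
        for name in continuous
    }
    filled = []
    for r in rows:
        new = dict(r)
        for name in continuous:
            new[name + "_missing"] = int(r[name] is None)
            if r[name] is None:
                new[name] = medians[name]  # continuous variable: replace with the median
        for name in dummies:
            new[name + "_missing"] = int(r[name] is None)
            if r[name] is None:
                new[name] = 0  # dummy variable: replace with zero
        filled.append(new)
    return filled


# Hypothetical toy records; the values are illustrative only.
rows = [
    {"bmi": 21.0, "healthy": 1},
    {"bmi": None, "healthy": 0},
    {"bmi": 24.0, "healthy": None},
]
filled = impute_with_flags(rows, continuous=["bmi"], dummies=["healthy"])
print(filled)
```

Including the indicator columns in the regression lets the imputed observations have their own intercept shift, so the filled-in values do not drive the coefficients on the original variables.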
Chapter Conclusion
This study has analyzed the potential selection bias that arises when some babies are not weighed
immediately after delivery in Indonesia. In other countries, there is a sample selection problem
because some pregnancies do not end in live birth due to abortion decisions (pregnancy-resolution
bias), which are not random. In Indonesia, religious and cultural views indicate that the
pregnancy-resolution decision is less of a concern, but there is a similar sample selection issue
for unweighed babies.
In this study, I use the Heckman selection model to test whether there is a potential selection
bias from babies that are not weighed. The results show no evidence of a selection problem from
some babies not having been weighed in Indonesia. The analyses, therefore, can be continued on
the sub-sample of live-birth babies with reported birthweight.
One limitation has been noted in the analysis. The cost of delivery here is reported by the
women. It is difficult to use the cost of delivery that is measured at the community level
because the data at the community level are based on the sample of communities from IFLS 1993
(IFLS1). However, I use the data from IFLS 2007-08. There are many missing community data for new
respondents (not panel respondents) in IFLS 2007. Therefore, it is difficult to use any variables
from the community module, and so the analysis uses the information on cost of delivery that is
reported by individual mothers.
References
Frankenberg, E. & Thomas, D. 2001, 'Women's health and pregnancy outcomes: do services make a
difference?', Demography, vol. 38, no. 2, pp. 253-65.
Habibov, N.N. & Fan, L. 2011, 'Does prenatal healthcare improve child birthweight outcomes in
Azerbaijan? Results of the national demographic and health survey', Economics and Human Biology,
vol. 9, no. 1, pp. 56-65.
Heckman, J.J. 1979, 'Sample selection bias as a specification error', Econometrica, vol. 47,
no. 1, pp. 153-61.
Liu, G.G. 1998, 'Birth outcomes and the effectiveness of prenatal care', Health Services
Research, vol. 32, no. 6, pp. 805-23.
Mwabu, G. 2009, 'The production of child health in Kenya: a structural model of birth weight',
Journal of African Economies, vol. 18, no. 2, pp. 212-60.
Rous, J.J., Jewell, R.T. & Brown, R.W. 2004, 'The effect of prenatal care on birthweight: a
full-information maximum likelihood approach', Health Economics, vol. 13, no. 3, pp. 251-64.
Wooldridge, J.M. 2002, Econometric analysis of cross section and panel data, The MIT Press,
Cambridge, Massachusetts, London.
World Health Organization 2005, The World Health Report 2005: Make every mother and child count,
World Health Organization, Geneva.
lO
f,~ r -
,~ ,
Faculty of Economics
'
~
U N IV E R S IT A S
G A D JA H
H eni
/
/
/'
./,-;;="7._,<
"'",/
/.,/ .«.•••
. -1 > -c ..~
c..---. (/"c...
ti!;;" -
./
~
v
INNOVAII
INIIRACI
I NSPIR I
1955-2015
I~
th a t
Wahyuni,
ONMLKJIHGFEDCBA
FEB
UGM
M ADA
.- utsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA
T h is is c e r tify
,~
B u s in e s s .
andMLKJIHGFEDCBA
PaninBank
TEl lJl1 1\ l1N61N'
1 f SIOOMUNCUI.
P h .D .
. -.
.
TC:JlAH\
oRNGI N
,
~
S ID O M lIN C lI[
GATRA
~
r ~
utsrqponmlkjihgfedcbaZYXWV
~!I
,;
G :.
L
:..
-.
T h e D e t e r m in a n t s o f B ir t h w e ig h t : A d d r e s s in g P o t e n t ia l
S a m p le S e le c t io n B ia s f r o m B a b ie s W h o A r e N o t W e ig h e d a t
B ir t h !
H eni W ahyuni
1
I n t r o d u c t io n utsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA
The empirical literature reviewed with regard to the infant health production function has
focused on the issues relating to endogeneity and sample selection biases, caused by unobserved
health heterogeneity and the pregnancy-resolution
decision (Liu 1998;
ROllS,
JewellJIHGFEDCBA
& Brown
2004). The first bias relates to endogeneity of prenatal care, while the second, in existing studies,
arises from a given woman's
decision to abort or continue her pregnancy.
Specifically,
unobservable factors that may influence a woman's decision to proceed with the pregnancy or
abort are factors that are also likely to influence her use of prenatal care and birth outcomes,
particularly birthweight. Sample selection bias relating to the decision to abort is unlikely to be a
problem in Indonesia, where abortion is socially unacceptable and only conducted for medical
reasons. There is a potential for selection bias, however, due to non-random missing information
on birthweight.
The potential sample selection bias that arises from birthweight being missing for some babies
(those not weighed at birth) is a common issue in developing countries and generally does not
occur in studies of birthweight in developed countries. If the birth weight information in the
sample is not missing at random, however, the analysis of the determinants of birthweight
(without considering unreported birthweight) will be biased (Heckman 1979). This represents a
possible sample selection issue, given that the data on a key variable (birthweight) are available
This paper has been presented at the National Seminar of the 60th anniversary of the Faculty of Economics and
Business UGM Seminar, Balancing Indonesian Economy: Governance and Accountability, Ethics, and Strategy
toward Inclusive Growth, 19 September 2015.
2 Heni Wahyuni is a lecturer and researcher at Faculty of Economics and Business, Universitas Gadjah Mada. Email: [email protected]
1
only for a subset of the population, who are not weighed at birth. This is often referred to asONMLKJIHGFEDCBA
in c id e n ta l tr u n c a tio n
(Wooldridge 2002, p. 552).
Relatively few studies, however, have investigated the potential of sample selection bias from
unweighed babies in the relationship between prenatal care and infant health in developing
countries, including Indonesia. Among those few studies, two significant studies have considered
this issue (HabibovJIHGFEDCBA
& Fan 2011; Mwabu 2009). Habibov and Fan (2011) used 73 percent of all
live births with birthweight in Azerbaijan to analyze the effect of prenatal care on birthweight.
Mwabu (2009) reports that only 17 percent of babies delivered at home and 75 percent of babies
delivered at modem facilities in Kenya have a reported birthweight. Those two studies tested the
potential bias, due to unweighed babies, and found no evidence of selection bias in their data.
One study has examined the impact of the village midwife program on birthweight in Indonesia
(Frankenberg
&
Thomas 2001), but it does not take into account the selection problem, arising
due to some babies not being weighed at birth.
This study will use the Indonesia Family Life Survey (lFLS) data, which are IFLS3 and IFLS4
data. The focus of the study is to test whether there is a sample selection bias on the determinants
of birthweight. Specifically, birthweight is the outcome of interest, but I will observe this
outcome, conditional upon whether or not the baby has been weighed at birth. The IFLS data for
live-birth babies, born during 2002-08, inclusively, indicates that approximately
II percent of
babies were not weighed. It is not appropriate to eliminate unweighed babies from the sample
and only include an analysis of the pregnancy outcome (birthweight) of the subset of mothers
whose babies were weighed, unless the birth weight and whether or not the baby was weighed are
independent. Otherwise, it could lead to biased estimates. Furthermore, in Indonesia, the IFLS
data shows that babies not weighed at birth are more likely to be born at home or in the office of
midwives with a traditional birth assistant, as well as to low-income and less educated mothers.
Previous studies, for example Mwabu (2009), use various instruments such as money prices,
time prices, household assets and income, environmental factors (rainfall), interaction terms
between land and mean long-term rainfall, and between cattle and mean long-term rainfall.
However, these are not available in the IFLS data.
2
A M o d e lin g F r a m e w o r k
H e c k m a n S a m p le S e le c t io n M o d e l utsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA
In an attempt to analyze the potential
Heckman
selection
model.
selection
The model
problem
consists
from unweighed
of two equations.
babies, I have used a
The selection
equation
(I)
represents whether the baby was weighed and the outcome equation (2) relates to birthweight,
(1)
_ {1ONMLKJIHGFEDCBA
JIHGFEDCBA
00
ifz i
O ·f
Zi -
l
>
*
Zi
-
-
xdJ +
f
>
E iifz i
0
Y i = l-u n o b s e r v e d ifz ;
where
z; is a latent
=
variable
(2)
0
measuring
vector of factors known to influence
the propensity
of a baby to be weighed at birth;
a baby's to be weighed
usage, as well as the age of the mother when pregnancy
schooling,
household
characteristics
(HH index),
whether she smoked before or during pregnancy,
during the pregnancy,
pregnancy,
baby-specific
ended, per capita expenditure,
health condition
(singleton
year of birth), and the cost of the delivery;
Ui
is a
at birth that includes prenatal care
(general
years of
health and 8MI),
and whether she experienced
characteristics
Wi
any complication
or multiple birth, gender, order of
contains
any unmeasured
factors in
equation (I).
We do not actually observe
I if the baby was weighed
sample for the birthweight
z;.
All we observe is a dichotomous
and zero, otherwise;
equation,
equation (1) is a dummy variable,
baby was weighed or not and the dependent
(in kilograms).
included, as independent
identified, the selection
The independent
variables
equation
there is only a selected
(censored)
Yi.
The dependent variable in the selection
birthweight
however,
variable z . , where it is equal to
variable
in the outcome
variables
indicating whether the
equation
in the selection
(2) is the baby's
equation
(1) are also
(xa, in the outcome equation (2). In order for the model to be
should also have at least one independent
variable that is not
3
included in the outcome equation. Otherwise, the model is identified only by functional form,
and the coefficients have no structural interpretation (see identification section).
The disturbance term is assumed to be normal with:
u(-N(O,l)
Er-N(O,l)
WhenONMLKJIHGFEDCBA
p JIHGFEDCBA
"* 0, standard OLS techniques applied to the birthweight equation will yield biased
results. If p
=
0, then there is no selection problem and the standard OLS model is appropriate. It
is important to test if there is a potential selection problem - if babies that were not weighed at
birth to mothers who have different characteristics from those mothers whose babies were
weighed. The null hypothesis of no selection bias is Ho: P
=
O. The Heckman selection model is
used to determine whether there is a selection problem or not. The sample is split between babies
that were or were not weighed directly after delivery.
Identification
Identification is very important in the system of equations. As explained previously, most
unweighed babies were delivered at home with midwives or traditional birth attendants. One
reason for mothers to choose to deliver the baby at home is because it may be less costly than
doing so in a modem health facility with professional birth attendants. Any variable, therefore,
!
that represents the cost of delivery may potentially become an identifier for the selection ~f
whether the baby is delivered at home or at a modem facility. It also can be said that this
1
identifier is an exclusion restriction in the selection equation, which is whether the baby was
weighed or not. In this analysis, the cost of delivery as an identifier will be applied in the
selection equation and it will be excluded from the birthweight equation, since it is unlikely that
any measure of delivery cost will influence birthweight.
4
Data
The sample for the empirical analysis is restricted to live births for pregnancies which ended
between the years 2002 and 2008, inclusively, for ever-married women in IFLS4 (2007-08). A
pooled cross-section approach is used, but it utilizes the panel nature of the IFLS data to provide
information on a range of explanatory variables. After excluding observations with missing
responses;' the sample consists of 4,436 live births in which the birthweight is observed (the
baby is weighed). Including live births in which the birthweight is not observed (the baby is not
weighed), the number of observations is 5,023.MLKJIHGFEDCBA
D e s c r ip t iv e
S t a t is t ic s
Table 1 presents summary statistics for the sample used in the empirical analysis, separately
for mothers of babies that were weighed and not weighed. The average number of prenatal care
visits for mothers of weighed babies is 9.19, much greater than the 6.13 visits for mothers of
unweighed babies. Similarly, the percentage of mothers who followed the WHO recommendation for
prenatal care visits is much higher for weighed babies (84.49 percent) than for unweighed
babies (54.86 percent). The household index, an indicator of economic background, is higher
for mothers of weighed babies than for mothers of unweighed babies. Average schooling was
about 12 years, or senior high school level, for mothers whose babies were weighed, but only
about 7 years, or primary school completion, for those whose babies were not weighed. The lower
level of maternal education among mothers of unweighed babies may indicate limited knowledge
about health, which may negatively affect pregnancy outcomes. Moreover, more than 55 percent of
mothers whose babies were weighed lived in urban areas, compared to approximately 17 percent of
those with unweighed babies. The average age at the end of the pregnancy was about the same for
both groups: 27 years. Similarly, there was little difference between the groups in the BMI of
mothers or in their general health condition. Rates of smoking and of pregnancy complications
were very low for both categories of mothers.
³ Responses coded as missing are defined as responses with an illogical answer, cases in which the surveyor could not
meet the respondent (such as for BMI, which requires height and weight measurement), or cases in which the respondent
refused to answer.
The inclusion of unweighed babies is important for two reasons: a substantial share of babies
in the sample (approximately 11 percent) were not weighed, and the observed characteristics of
their mothers differ from those of mothers whose babies were weighed, which increases the
potential for bias. More specifically, these mothers come from a lower socioeconomic
background than those whose babies were weighed, in terms of per capita expenditure, years of
education, and the household index (owning a television, using electricity, and having good
drinking water), and they are more likely to live in rural areas.
Table 1: Descriptive Statistics and Definitions of Variables by Birthweight Category
(in percent, unless otherwise indicated)

Variable                                                                      Not weighed     Weighed
Total number of prenatal care visits (visits)                                        6.13        9.19
WHO recommendation⁴                                                                 54.86       84.49
Household index                                                                      3.71        5.45
Years of education (years)                                                           7.29       11.99
Age of mother (years)                                                               27.47       27.71
Body mass index (index)                                                             21.45       22.53
Healthy mother (general health condition)                                           87.56       88.89
First birth                                                                         55.54       64.29
Per capita expenditure below 25th percentile of population level                    57.92       29.13
Per capita expenditure between 25th and 50th percentile of population level         24.19       27.71
Per capita expenditure above 50th percentile of population level                    17.89       43.17
Smoking behavior                                                                     1.24        0.68
Male baby                                                                           50.97       52.47
Singleton baby                                                                      92.84       94.21
Having pregnancy complication                                                       14.31       15.89
Living in urban area                                                                17.38       55.95
Cost of delivery of the baby (rupiahs)                                         126,338.20  892,835.30
Number of observations                                                                587       4,436
Source: IFLS3 (2000) and IFLS4 (2007-08).
⁴ WHO recommends a minimum of four prenatal care visits during pregnancy: at least one
visit in the first trimester, at least one in the second trimester, and at least two in the third trimester
(World Health Organization 2005).
Delivery expense is the variable that differentiates mothers whose babies were weighed at
birth from those whose babies were not. The difference in average delivery cost between the
two groups of mothers is large: 892,835.30 rupiahs (about A$100) for weighed babies and
126,338.20 rupiahs (about A$14) for unweighed babies. This key variable is included in the
selection equation (whether the baby is weighed or not weighed) but excluded from the outcome
equation for birthweight (in kilograms). Thus, the cost of delivery is assumed to act as an
instrument in the selection equation that does not influence birthweight.
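The two-equation structure described above can be sketched as follows. The notation is assumed for illustration and is not necessarily the notation used elsewhere in this paper: $w_i^{*}$ is a latent propensity for baby $i$ to be weighed, $c_i$ is the cost of delivery, $z_i$ and $x_i$ are the remaining covariates, and $y_i$ is birthweight in kilograms.

```latex
% A sketch of a standard sample selection model under assumed notation.
\begin{align}
  w_i^{*} &= z_i'\gamma + \delta\, c_i + u_i,
    \qquad w_i = \mathbf{1}\left[\, w_i^{*} > 0 \,\right]
    && \text{(selection: weighed or not)} \\
  y_i &= x_i'\beta + \varepsilon_i,
    \qquad y_i \text{ observed only if } w_i = 1
    && \text{(outcome: birthweight)} \\
  \begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
    &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
      \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^{2} \end{pmatrix} \right)
\end{align}
```

In this sketch, $c_i$ appears only in the selection equation, which is the exclusion restriction. If $\rho = 0$, the two error terms are uncorrelated and estimating the outcome equation on weighed babies alone does not suffer from selection bias.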
Results and Discussion
Estimation results of the Heckman selection model are provided in Table 2. The selection
equation represents whether the baby is weighed or not weighed immediately following delivery.
The estimate for rho indicates a weak correlation between selection and the birth outcome
(ρ = -0.0614). The negative sign of rho may appear counterintuitive; however, the estimate is
not statistically significant. The associated Wald test of independence of equations is also
not statistically significant (chi2 = 0.63, p-value = 0.4289). This suggests that the Heckman
selection model is not needed in this case; in other words, there is no evidence of sample
selection bias due to unweighed babies.
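As a rough consistency check on these figures, the short sketch below recomputes the p-value from the reported chi2 statistic and compares that statistic with the squared ratio of rho to its standard error (0.0775, as reported in Table 2). It assumes the test has one degree of freedom; the function name `chi2_sf_1df` is introduced here for illustration only, and the squared z-ratio is only an approximation to whatever test the estimation software reports.

```python
import math

def chi2_sf_1df(stat: float) -> float:
    """Upper-tail probability P(X > stat) for a chi-squared variable with 1 df."""
    # With 1 degree of freedom, X = Z**2 for a standard normal Z, so
    # P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

# Values reported in the text and in Table 2.
rho_hat = -0.0614
rho_se = 0.0775
wald_reported = 0.63

# Squared z-ratio of rho, used here as an approximation to the Wald statistic.
z = rho_hat / rho_se
wald_approx = z ** 2

# p-value implied by the reported (rounded) chi2 statistic.
p_from_reported = chi2_sf_1df(wald_reported)

print(f"z = {z:.3f}, z squared = {wald_approx:.2f}")
print(f"p-value for chi2 = {wald_reported}: {p_from_reported:.4f}")
```

Small differences from the reported p-value are expected, because the chi2 statistic is rounded to two decimals in the text.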
Although the Heckman model is not required in this case, the results of the selection equation
are still of interest. The estimated coefficient on money spent on delivery, the key variable,
is significant in explaining whether the baby was weighed. Birthweights are also more likely to
be recorded for mothers who have more years of schooling, who are from households with a higher
household socioeconomic index, and whose per capita expenditure lies between the 25th and 50th
percentiles of the population level. However, as there is no evidence of selection bias, the
analysis can proceed with a single structural equation for birthweight, estimated by Ordinary
Least Squares (OLS) regression.
Table 2: Heckman Selection Model for Weighed and Not Weighed Babies

Variable                                                                     Outcome                 Selection
                                                                             Coefficient (SE)        Coefficient (SE)
Total number of prenatal care visits                                         0.0117 (0.0059)**       0.1290 (0.0135)***
Total number of prenatal care visits squared                                 -0.0002 (0.0002)        -0.0034 (0.0005)***
Household index                                                              -0.0064 (0.0072)        0.1848 (0.0229)***
Years of education                                                           0.0017 (0.0022)         0.0497 (0.0075)***
Age of mother less than 25 yrs (baseline)
Age of mother between 25-34 yrs                                              0.0452 (0.0198)**       0.1166 (0.0683)*
Age of mother 35 and older                                                   0.0832 (0.0304)***      0.1391 (0.0997)
Body mass index                                                              0.0435 (0.0126)***      0.0145 (0.0638)
Body mass index squared                                                      -0.0006 (0.0002)**      0.0003 (0.0013)
Dummy if body mass index is imputed                                          -0.0810 (0.1691)        5.2062 (0.1852)***
Healthy mother (general health status)                                       -0.0098 (0.0296)        0.0573 (0.0923)
First birth                                                                  -0.0735 (0.0180)***     0.1096 (0.063)*
Per capita expenditure below 25th percentile of population level (baseline)
Per capita expenditure between 25th and 50th percentile of population level  -0.0203 (0.0242)        0.1377 (0.0722)*
Per capita expenditure above 50th percentile of population level             -0.0349 (0.0237)        0.0977 (0.0781)
Smoking behavior                                                             0.0931 (0.0948)         0.5679 (0.3416)*
Male baby                                                                    0.0838 (0.0169)***      0.0003 (0.0544)
Singleton baby                                                               0.2911 (0.0489)***      0.102 (0.1392)
Having pregnancy complication                                                0.0365 (0.0250)         -0.0456 (0.0886)
Year of baby's birth                                                         -0.0158 (0.0051)***     -0.0009 (0.0181)
Cost of delivery of the baby                                                                         0.0000015 (0.0000005)***
Constant                                                                     2.2066 (0.1719)***      -2.1115 (0.7648)***
Rho                                                                          -0.0614 (0.0775)
Wald test of independent equations (chi2) = 0.63; Probability > chi2 = 0.4289
Number of observations                                                       4436 (weighed)          587 (not weighed)

*** Significant at a 1 percent level; ** at a 5 percent level; * at a 10 percent level.
Additional (Sensitivity) Analysis
Variables, such as years of education, BMI, general health status, household index, and per
capita expenditure have missing information. Observations with missing information were
excluded from the regression analyses. As a sensitivity analysis, the model was re-estimated,
using imputed values for the missing data. The median value for continuous variables and zero
values for dummy variables were applied to replace the missing values. Dummy variables,
indicating that the data were missing, were included as additional explanatory variables. The
results of these sensitivity analyses were qualitatively the same as those reported. The estimate
for rho indicates a weak correlation between selection and birth outcome (ρ = -0.0718814).
The associated Wald test of independence of equations is not statistically significant
(chi2 = 0.95) with p-value = 0.3297. This indicates that the reported outcome equations are
robust to missingness, and no evidence of a sample selection problem has been found.
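The imputation rule described above (median for continuous variables, zero for dummy variables, plus a missing-data indicator) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name `impute_with_flag` and the toy values are hypothetical, and missing entries are represented by `None`.

```python
from statistics import median

def impute_with_flag(values, is_dummy):
    """Return (filled_values, missing_flags) for one variable.

    Missing entries (None) are replaced by 0 for a dummy variable and by the
    median of the observed entries for a continuous variable. The flag is 1
    where the original entry was missing and 0 otherwise. The sketch assumes
    that a continuous variable has at least one observed entry.
    """
    observed = [v for v in values if v is not None]
    fill = 0 if is_dummy else median(observed)
    filled = [fill if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]
    return filled, flags

# Hypothetical toy data, not taken from the IFLS.
years_of_education = [6, None, 12, 9, None]   # continuous variable
healthy_mother = [1, 1, None, 0, 1]           # dummy variable

educ_filled, educ_missing = impute_with_flag(years_of_education, is_dummy=False)
healthy_filled, healthy_missing = impute_with_flag(healthy_mother, is_dummy=True)

print(educ_filled, educ_missing)
print(healthy_filled, healthy_missing)
```

Including the flag as an additional regressor lets the model absorb any systematic difference between observations with imputed and observed values, which is the role of the dummy variables in the sensitivity analysis.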
Chapter Conclusion
This study has analyzed the potential selection problem that arises when some babies are not
weighed immediately after delivery in Indonesia. In other countries, there is a sample selection
bias because some pregnancies do not end in live birth due to abortion decisions (pregnancy-
resolution bias), which are not random. In Indonesia, religious and cultural views indicate that
the pregnancy-resolution decision is less of a concern, but there is a similar sample
selection issue for unweighed babies.
In this study, I use the Heckman selection model to test whether there is a potential selection bias
from babies that are not weighed. The results show that I did not find evidence of a selection
problem from some babies not having been weighed in Indonesia. The analyses, therefore, can
be continued on the sub-sample of live-birth babies with reported birthweight.
One limitation has been noted in the analysis. The cost of delivery here is reported by the
women. It is difficult to use the cost of delivery that is measured at the community level because
the data on the community level is based on the sample of communities from IFLS 1993
(IFLS1). However, I use the data from IFLS 2007-08. There are many missing community
data for new respondents (not panel respondents) in IFLS 2007. Therefore, it is difficult to use any
variables from the community module and so the analysis uses the information of cost of
delivery that is reported by individual mothers.
References
Frankenberg, E. & Thomas, D. 2001, 'Women's health and pregnancy outcomes: do services
make a difference?', Demography, vol. 38, no. 2, pp. 253-65.
Habibov, N.N. & Fan, L. 2011, 'Does prenatal healthcare improve child birthweight outcomes in
Azerbaijan? Results of the national demographic and health survey', Economics and
Human Biology, vol. 9, no. 1, pp. 56-65.
Heckman, J.J. 1979, 'Sample selection bias as a specification error', Econometrica, vol. 47, no.
1, pp. 153-61.
Liu, G.G. 1998, 'Birth outcomes and the effectiveness of prenatal care', Health Services
Research, vol. 32, no. 6, pp. 805-23.
Mwabu, G. 2009, 'The production of child health in Kenya: a structural model of birth weight', Journal
of African Economies, vol. 18, no. 2, pp. 212-60.
Rous, J.J., Jewell, R.T. & Brown, R.W. 2004, 'The effect of prenatal care on birthweight: a
full-information maximum likelihood approach', Health Economics, vol. 13, no. 3, pp. 251-64.
Wooldridge, J.M. 2002, Econometric analysis of cross section and panel data, The MIT Press,
Cambridge, Massachusetts, London.
World Health Organization 2005, The World Health Report 2005: Make every mother and child
count, World Health Organization, Geneva.
Heni Wahyuni, Ph.D., Faculty of Economics and Business, Universitas Gadjah Mada