VI.2  Model development
The exercise proceeds in three steps. First, we identified non-consumption variables common to all surveys, including variables that correlate well with household consumption. The model’s ability to estimate changes in household consumption and poverty depends on changes in the explanatory variables over the same period. If the final model includes only variables that do not change over time, such as type of dwelling and the construction material of walls and floor, it will predict the same poverty rate as in the base year. The model must therefore include variables that change more over time, such as the household head’s employment status and the level of conflict.
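The point about time-invariant variables can be illustrated with a minimal sketch. It assumes a linear model for log consumption; the coefficients, the poverty line, the variable names and the four households are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients for log per capita consumption (illustrative only).
DWELLING_ONLY = {"const": 7.0, "brick_walls": 0.30}
WITH_EMPLOYMENT = {"const": 7.0, "brick_walls": 0.30, "head_employed": 0.25}
POVERTY_LINE = 1300.0  # hypothetical line, in the same units as consumption


def poverty_rate(households, beta, line=POVERTY_LINE):
    """Share of households whose predicted consumption is below the line."""
    poor = 0
    for hh in households:
        log_cons = beta["const"] + sum(
            coef * hh[name] for name, coef in beta.items() if name != "const"
        )
        if math.exp(log_cons) < line:
            poor += 1
    return poor / len(households)


# Base-year survey (hypothetical households).
base = [
    {"brick_walls": 1, "head_employed": 1},
    {"brick_walls": 0, "head_employed": 1},
    {"brick_walls": 0, "head_employed": 0},
    {"brick_walls": 1, "head_employed": 0},
]
# Later survey: dwellings unchanged, but the second head lost employment.
later = [dict(hh) for hh in base]
later[1]["head_employed"] = 0

# A model built only on dwelling characteristics cannot register the change.
print(poverty_rate(base, DWELLING_ONLY), poverty_rate(later, DWELLING_ONLY))
# A model that includes employment status does.
print(poverty_rate(base, WITH_EMPLOYMENT), poverty_rate(later, WITH_EMPLOYMENT))
```

With the dwelling-only coefficients the predicted rate is identical in both surveys, because none of the model's inputs moved. Only the model that includes employment status picks up the deterioration.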
We reviewed each question across the three surveys to ensure that the common variables remained comparable across surveys.⁶⁹ For instance, although the question ‘type of toilet facility used by household’ appears in all three surveys, the list of toilet categories changed in ALCS 2013-14. We therefore selected only those categories within the question that are comparable. Similarly, the labour module covers a different length of time in the different surveys, which alters female labour participation rates across surveys; we therefore restrict labour outcome variables such as employment status and employment type to ‘head of household’ and ‘adult male members between the ages of 25 and 50’, categories not affected by the time range used.
Table VI.1 summarises the variables.
Secondly, we developed a model following the SWIFT approach of Yoshida et al. (2015). The model assumes a linear relationship between household consumption and its correlates, plus a projection error.⁷⁰
The equation representing the consumption model is:
$$\ln y_h = x_h' \beta + u_h \qquad (1)$$
where $\ln y_h$ is the log of per capita consumption of household $h$, $x_h$ is a $k \times 1$ vector of poverty correlates of household $h$, $\beta$ is a $k \times 1$ vector of coefficients on the poverty correlates, $k$ is the number of variables, and $u_h$ is the projection error. The explanatory variables on the right-hand side of the model capture variation in household consumption, thus differentiating poor from non-poor households. We estimate equation (1) on the NRVA 2011-12 survey, which has the consumption data. We then apply the estimated coefficients of the model to the ALCS 2013-14 dataset to predict household consumption and the poverty rate.
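The two-survey logic (estimate on the survey that has consumption data, then apply the coefficients to the survey that lacks it) can be sketched as follows. This is an illustration under stated assumptions, not the SWIFT implementation: the data are synthetic, the coefficients in `TRUE_BETA`, the poverty line and the helper names are hypothetical, and ordinary least squares is solved by hand only to keep the example free of dependencies. The sketch uses point predictions only and deliberately leaves out the projection error.

```python
import math
import random


def design_row(size, head_employed):
    """Regressors: constant, household size, size squared, head employed."""
    return [1.0, float(size), float(size) ** 2, float(head_employed)]


def fit_ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predicted_poverty_rate(X, beta, line):
    """Share of households whose predicted consumption is below the line."""
    log_line = math.log(line)
    poor = sum(1 for row in X
               if sum(b * x for b, x in zip(beta, row)) < log_line)
    return poor / len(X)


rng = random.Random(0)
TRUE_BETA = [8.0, -0.20, 0.01, 0.30]  # hypothetical data-generating values


def simulate_survey(n, employment_share):
    """Synthetic households; stands in for a real survey file."""
    return [design_row(rng.randint(1, 12),
                       1 if rng.random() < employment_share else 0)
            for _ in range(n)]


# "Base" survey: regressors and log consumption are both observed.
base_X = simulate_survey(500, employment_share=0.7)
base_y = [sum(b * x for b, x in zip(TRUE_BETA, row)) + rng.gauss(0.0, 0.3)
          for row in base_X]
beta_hat = fit_ols(base_X, base_y)

# "Target" survey: only the regressors are observed.
target_X = simulate_survey(500, employment_share=0.5)
rate = predicted_poverty_rate(target_X, beta_hat, line=1500.0)
print(beta_hat, rate)
```

Household size enters both directly and squared, so the formula is non-linear in household size while remaining linear in the coefficients, which is why ordinary least squares still applies.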
The SWIFT modelling process includes multiple steps to improve the formula's ability to project household income or expenditure, by adjusting the coefficients β and estimating the distributions of both the coefficients and the projection errors.⁷¹ No formula is perfect, so including the projection error is essential, and estimating its distribution is key for estimating poverty rates and their standard errors.
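Why the projection error matters for the poverty rate can be seen in a small simulation. The sketch below is not the SWIFT procedure: it assumes normally distributed errors with a known standard deviation, ignores uncertainty in the coefficients, and uses hypothetical fitted values, a hypothetical poverty line and hypothetical function names.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def simulate_poverty_rates(fitted_log_cons, sigma, line, draws, rng):
    """Poverty rate in each of `draws` datasets with simulated errors added."""
    log_line = math.log(line)
    rates = []
    for _ in range(draws):
        poor = sum(1 for xb in fitted_log_cons
                   if xb + rng.gauss(0.0, sigma) < log_line)
        rates.append(poor / len(fitted_log_cons))
    return rates


def analytic_poverty_rate(fitted_log_cons, sigma, line):
    """Expected rate under normal errors: mean of Phi((ln z - x'b) / sigma)."""
    std_normal = NormalDist()
    log_line = math.log(line)
    return mean(std_normal.cdf((log_line - xb) / sigma)
                for xb in fitted_log_cons)


rng = random.Random(42)
# Hypothetical fitted values x'beta for 200 households, all above the line.
fitted = [7.4 + 0.005 * i for i in range(200)]
SIGMA = 0.4      # assumed standard deviation of the projection error
LINE = 1500.0    # hypothetical poverty line

# Ignoring the error: every household is predicted to be non-poor.
naive_rate = sum(xb < math.log(LINE) for xb in fitted) / len(fitted)

# Adding simulated errors: some households fall below the line.
rates = simulate_poverty_rates(fitted, SIGMA, LINE, draws=200, rng=rng)
point_estimate = mean(rates)
# Spread across draws; reflects the projection error only, not sampling
# or coefficient uncertainty.
simulation_sd = stdev(rates)
print(naive_rate, point_estimate, simulation_sd)
```

Because every fitted value lies above the line, the rate computed without the error term is zero, while the simulated rate is clearly positive and close to the analytic expectation. Dropping the error term would therefore bias the predicted poverty rate, which is why its distribution has to be estimated.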
⁶⁹ Beegle et al. (2011) show empirical evidence of how households' responses to questions can change with questionnaire design.
⁷⁰ This does not mean that SWIFT cannot use a non-linear model. SWIFT’s formula is linear in the variables created in the dataset, but since some variables can be squares of other variables, the formula can be non-linear in the underlying variables. A typical example is that SWIFT uses both household size and household size squared in a formula.
⁷¹ The approach adopted by the SWIFT team is rather conservative: the team did not adopt some approaches discussed at the frontier of modelling research because it judged the evidence for them not yet strong enough. However, the team has been exploring such techniques and may update the SWIFT modelling process once enough supporting evidence is available.
VI.3  Model selection: cross-validation