
Table IV.2 Interviewed households, by year, and by season (Shamsi calendar)

Season    Year 1392    Year 1393    Total
Winter    5,067        -            5,067
Spring    -            4,449        4,449
Summer    -            4,880        4,880
Autumn    -            6,390        6,390
Total     5,067        15,719       20,786

Non-response within clusters was very limited. Only 845 (4.1 percent) of the households in the visited clusters were not available, refused or were unable to participate. In 841 of these cases, households were replaced by reserve households listed in the cluster reserve list, leaving 4 households (0.02 percent) unaccounted for.

IV.8 Calculation of sampling weights and post-stratification

Sample weights are the scaling factors that are required to inflate the sampled households to the number of households that they represent in the survey. The use of stratification in the ALCS 2014 sample design requires that sample weights are separately calculated for each stratum. Due to imperfections in the survey implementation, the design weights are adjusted in order to achieve optimal representation of the survey results. This section separately addresses the calculation of the sampling weights for the resident population and that for the Kuchi population.

IV.8.1 Resident population

Calculation of the design weight

The first step in calculating the sample weights is calculating the weights that would inflate the sampled households to the number of households in the sampling frame. This calculation follows from the selection probability of the households as defined in the sampling design.

In the two-stage sampling design of ALCS 2013-14, the PSUs were the EAs as defined in the sample frame, made up of the 2009 household listing and the available household listings from the SDS. The selection of PSUs in the first sampling stage was implemented in accordance with:
• stratification by province
• an optimum allocation distribution for provinces, which minimises the standard error
• selection with probability proportional to the number of households (PPS).

The probability of selecting a PSU in stage 1 is

$$p_1 = \frac{c_s \, h_{ps}}{H_{s09}}$$

where $p_1$ is the probability of selecting PSU (or EA) $p$ in stratum $s$, $c_s$ is the number of clusters selected in stratum $s$, $h_{ps}$ is the number of households in EA $p$ from stratum $s$ and $H_{s09}$ is the number of households in stratum $s$ as reported in the sampling frame.

For EAs encompassing two or more villages, a second sampling stage was introduced in order to reduce travel time and costs. The selection of the village to be included was done with probability proportional to the number of households, with

$$p_2 = \frac{m_{vs}}{h_{ps}}$$

where $p_2$ is the probability of selecting one village out of all villages in EA $p$ in stratum $s$, $m_{vs}$ is the number of households in that village and $h_{ps}$ is the number of households in EA $p$ from stratum $s$. For EAs without village segmentation, $m_{vs} = h_{ps}$ and $p_2 = 1$.

The Ultimate Sampling Units in the survey were households. The sampling design specified a fixed number of 10 households per selected EA.
Therefore, the probability of selecting a household in an EA, or in the selected village in the EA, in the third sampling stage is

$$p_3 = \frac{10}{m_{vs}}$$

The overall probability of selecting a household in any stratum is the product of the selection probabilities in each stage:

$$p_{123} = p_1 \, p_2 \, p_3 = \frac{c_s \, h_{ps}}{H_{s09}} \cdot \frac{m_{vs}}{h_{ps}} \cdot \frac{10}{m_{vs}} = \frac{10 \, c_s}{H_{s09}}$$

The design weight for each sampled household is the reciprocal of the selection probability, thus

$${}^{d}w_{hs} = \frac{1}{p_{123}} = \frac{H_{s09}}{10 \, c_s}$$

where ${}^{d}w_{hs}$ is the design weight for households in stratum $s$. The weighted sample total (the sum of the products of sampled households and their respective design weights) is equal to the total population of households in each stratum in the sample frame:

$$\sum h_{ps} \, {}^{d}w_{hs} = H_{s09}$$

Calculation of non-coverage adjustment factors

Two main reasons exist in survey taking for exclusion of households in the collected data:
• Non-response: households not willing to be interviewed or not available for being interviewed
• Non-coverage: households that cannot be reached if areas are inaccessible because of reasons such as the local security situation or road conditions. [66]

Non-response in ALCS is not a major issue: overall non-response was 4 percent. Very few households refused to take part in the survey, and most of the non-response was due to households not being available. As ALCS adopted the strategy of addressing non-response by substituting households from a reserve list, there is no need to adjust for non-response.

Non-coverage, on the other hand, was a more serious problem in the survey, especially because of the security situation in the country. Non-coverage was partly addressed by replacing inaccessible clusters by clusters from a reserve list. Since a number of inaccessible clusters could not be replaced during the fieldwork of ALCS, the sampled households weighted by the design weight, $\sum h_{ps} \, {}^{d}w_{hs}$, do not add up to the total population of households $H_{s09}$ in the sample frame.
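As a check on the design-weight derivation earlier in this subsection, the following minimal sketch multiplies the three stage probabilities with exact fractions and confirms that EA and village sizes cancel out. The stratum and EA figures are hypothetical and chosen only for illustration; the function and variable names are not taken from the ALCS documentation.

```python
from fractions import Fraction

HOUSEHOLDS_PER_CLUSTER = 10  # fixed number of households per selected EA (from the text)


def selection_probability(c_s, h_ps, m_vs, H_s09):
    """Overall household selection probability p_123 = p_1 * p_2 * p_3."""
    p1 = Fraction(c_s * h_ps, H_s09)              # stage 1: EA selected with PPS
    p2 = Fraction(m_vs, h_ps)                     # stage 2: village within the EA
    p3 = Fraction(HOUSEHOLDS_PER_CLUSTER, m_vs)   # stage 3: 10 households
    return p1 * p2 * p3


def design_weight(c_s, H_s09):
    """Design weight: reciprocal of the overall selection probability."""
    return Fraction(H_s09, HOUSEHOLDS_PER_CLUSTER * c_s)


# Hypothetical stratum: 40 selected clusters, 120,000 households in the frame.
c_s, H_s09 = 40, 120_000

# Two hypothetical EAs; the second has no village segmentation (m_vs == h_ps).
p_a = selection_probability(c_s, h_ps=300, m_vs=120, H_s09=H_s09)
p_b = selection_probability(c_s, h_ps=85, m_vs=85, H_s09=H_s09)

w = design_weight(c_s, H_s09)
# Sum of the design weights over all sampled households in the stratum.
weighted_total = HOUSEHOLDS_PER_CLUSTER * c_s * w

print(p_a, p_b, w, weighted_total)
```

Both EAs yield the same selection probability, so within a stratum every sampled household carries the same design weight, and the weighted sample total reproduces the frame total when all sampled clusters are covered.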
To compensate for non-covered households, the design weight was adjusted. To obtain the non-coverage adjustment factor, first the non-coverage rate was calculated. This is the ratio between the number of actually interviewed households and the number of sampled households:

$$nc_s = \frac{{}^{i}h_s}{{}^{s}h_s}$$

where $nc_s$ is the non-coverage rate in stratum $s$, ${}^{i}h_s$ is the number of interviewed households in stratum $s$ and ${}^{s}h_s$ is the number of sampled households in stratum $s$. The adjustment factor for non-coverage in stratum $s$, ${}^{n}w_{hs}$, is the reciprocal of the non-coverage rate $nc_s$:

$${}^{n}w_{hs} = \frac{1}{nc_s}$$

The sample weight that is required to scale up the sampled households to the total population of households in the sample frame, ${}^{dn}w_{hs}$, now becomes the product of the design weight and the non-coverage factor. For each stratum $s$ this is:

$${}^{dn}w_{hs} = {}^{d}w_{hs} \, {}^{n}w_{hs}$$

The newly weighted sample total $\sum h_{ps} \, {}^{dn}w_{hs}$ is again equal to the sample frame population $H_{s09}$.

[66] In addition, some surveys exclude some areas beforehand because the relevance of information from these (e.g. very thinly populated areas) does not compensate for the costs of getting there.

Calculation of post-stratification factors

Additional expansion factors are required to re-scale the number of households in the sample frame to the number in the period in which the survey was conducted. As in the previous survey round, the estimated number of households was derived from the CSO population projections by province, $P_{s14}$. [67] For the settled households, the provincial population was divided by the average household size for that province, which was obtained in the current survey by applying ${}^{dn}w_{hs}$, the combined design weight and non-coverage factor, in order to reduce distortion by sampling and coverage effects. Since the re-scaling of the number of households is done at province level, this normalisation exercise implies post-stratification of the sample.
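The non-coverage adjustment described earlier in this subsection can be illustrated with a small sketch. It assumes a hypothetical stratum in which two of 40 sampled clusters could not be visited or replaced; all figures and names are illustrative and not taken from the survey data.

```python
from fractions import Fraction


def non_coverage_weights(H_s09, clusters_sampled, households_interviewed,
                         households_per_cluster=10):
    """Return (design weight, non-coverage factor, combined weight) for one stratum."""
    sampled = clusters_sampled * households_per_cluster
    d_w = Fraction(H_s09, sampled)                  # design weight
    nc = Fraction(households_interviewed, sampled)  # "non-coverage rate" as defined in the text
    n_w = 1 / nc                                    # non-coverage adjustment factor
    return d_w, n_w, d_w * n_w


# Hypothetical stratum: 400 households sampled, 380 actually interviewed.
H_s09 = 120_000
interviewed = 380
d_w, n_w, dn_w = non_coverage_weights(H_s09, clusters_sampled=40,
                                      households_interviewed=interviewed)

shortfall_total = interviewed * d_w   # design weights alone fall short of the frame
adjusted_total = interviewed * dn_w   # adjusted weights restore the frame total

print(float(d_w), float(n_w), float(dn_w))
print(shortfall_total, adjusted_total)
```

With the design weight alone the interviewed households add up to fewer households than the frame contains; after the adjustment the weighted total again equals the frame total for the stratum.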
The re-scaling factors are calculated as the ratio between the CSO estimate of the number of households in 2014 in a stratum and the number of households in the sampling frame:

$${}^{r}w_{hs} = \frac{H_{s14}}{H_{s09}}$$

and the combined sampling weight becomes

$${}^{dnr}w_{hs} = {}^{dn}w_{hs} \, {}^{r}w_{hs}$$

Seasonal distribution

Because the interview implementation was not entirely uniform across seasons (quarters), uncorrected annual estimates would place relatively larger weights on those seasons which had a large sample (winter and autumn), thereby distorting the representativeness of national results. Because the sample was stratified by season, and imposing the assumption that the level of seasonal, international migration is negligible, the weighted distribution can be smoothed out to ensure that the estimated population size by quarter is the same. This adjustment is implemented as:

$$w_{hsq} = \frac{1}{{}^{dnr}w_{hs}} \cdot \frac{0.25 \, P_{s14}}{{}^{dnr}p_{sq}}$$

where $w_{hsq}$ is the factor that standardises across seasons (quarters) and ${}^{dnr}p_{sq}$ is the sampled population in stratum $s$ and season $q$, weighted by the weights for the sampling design, non-coverage and re-scaling. The denominator gives the total number of sampled, settled individuals in each stratum by quarter. The adjustment term in the numerator gives the population of individuals for each stratum by quarter according to the CSO 2014 population estimate.

The final household sampling weight $hw_{hsq}$ is the product of all weighting factors:

$$hw_{hsq} = {}^{dnr}w_{hs} \, w_{hsq}$$

Individual weights

In order to obtain the expansion factor for individuals the following calculation was made:

$$iw_{hsq} = hw_{hsq} \cdot hs_{hsq}$$

the term $hs_{hsq}$ being the household size of household $h$ in stratum $s$ and quarter $q$.

[67] The 1393 population projections are considered a sufficient approximation for the mid-survey population.

IV.8.2 Kuchi population

The Kuchi sample was designed on the basis of the 2003-04 National Multi-sectoral Assessment of Kuchi (NMAK-2004).
For this separate Kuchi stratum, a community selection was implemented with PPS, followed by a second-stage selection, again with a constant cluster size of ten households. The 66 clusters (660 households) for this stratum were divided between the summer (30 clusters) and winter (36 clusters) periods in 1393 (2014). In the absence of up-to-date information about the actual number of Kuchis, and given the political sensitivity of addressing this issue, the present position taken by CSO is that the Kuchi population is stable at a number close to 1.5 million people. Apart from the sampling frame, the restriction to two seasons and the absence of the need to accommodate population growth, the procedures for the calculation of the sampling weights for the Kuchi stratum are the same as those for the resident population.
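As a recap of how the remaining factors of section IV.8.1 combine for a resident stratum, the sketch below computes the re-scaling factor from a population projection and an average household size, and then the final household and individual weights. It is a sketch under stated assumptions: all input values are hypothetical, the seasonal factor is passed in as a given number rather than computed, and the individual expansion factor is read as household weight times household size.

```python
def rescaling_factor(P_s14, avg_household_size, H_s09):
    """r_w = H_s14 / H_s09, with H_s14 estimated as population / average household size."""
    H_s14 = P_s14 / avg_household_size
    return H_s14 / H_s09


def final_weights(dn_w, r_w, w_season, household_size):
    """Return (household weight hw, individual expansion factor iw)."""
    hw = dn_w * r_w * w_season   # product of all weighting factors
    iw = hw * household_size     # assumption: iw is hw multiplied by household size
    return hw, iw


# Hypothetical resident stratum (not ALCS figures).
r_w = rescaling_factor(P_s14=1_050_000, avg_household_size=7.5, H_s09=120_000)

# Hypothetical household of 8 persons; seasonal factor taken as given (1.0 here).
hw, iw = final_weights(dn_w=320.0, r_w=r_w, w_season=1.0, household_size=8)

print(round(r_w, 4), round(hw, 2), round(iw, 2))
```

Here the projected population implies 140,000 households against 120,000 in the frame, so every weight in the stratum is inflated by the same factor before the seasonal and household-size terms are applied.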

IV.8.3 Weights variables

The values of the final household sample weight hw