Quality assurance: Afghanistan National Risk and Vulnerability Survey (NRVA) 2011-12, Living Conditions Survey report

• The survey was designed to produce information that:
  ▪ is representative at the level of provinces;
  ▪ captures the seasonality of development indicators.
• The questionnaire was designed to capture information relevant to the specific context of Afghanistan.
• NRVA played a key role in defining national definitions of employment, underemployment and unemployment, in agreement with key stakeholders, such as MoEc, MoLSAMD, ILO and the World Bank.
• A strategy was designed and implemented to optimise the probability of implementing fieldwork in remote and high-risk areas, and thus avoid bias in survey results.
• Key stakeholders participated in the NRVA Steering Committee and Technical Advisory Committee to ensure the soundness of the overall project strategy and technical components.

b. Management of completeness
• The NRVA survey cycle is based on a rotation scheme that was agreed upon by the stakeholders in March 2010. This implies that in each survey round a core set of key indicators across development themes is produced, and that at appropriate intervals additional or expanded questionnaire modules are administered to allow more comprehensive information for selected themes in successive survey rounds.
• Special efforts were made to capture information about the nomadic Kuchi population.

c. Management of accuracy
• Questionnaire design included considerations of question justification, wording, sequence of questions and modules, complexity of routing, interview burden, classifications, formatting and layout.
• Questionnaires were tested in a pre-test and in a pilot test in October 2010.
• Field-staff recruitment was based on a review of CVs, a written test and an interview with shortlisted applicants, as well as on a final exam during the field-staff training.
• Training activities and procedures included the following:
  ▪ The field-staff training was centrally organised in Kabul to ensure uniform training by the most highly qualified CSO staff.
  ▪ The field-staff training was conducted over a full three weeks to allow sufficient time for the respective training elements.
  ▪ In addition to the initial central training, two rounds of regional workshops were conducted to discuss lessons learned and provide refresher training.
• Field monitoring and supervision was implemented at several levels:
  ▪ Field supervisors supervised day-to-day procedures and checked completed questionnaires.
  ▪ Provincial Statistical Officers checked completed questionnaires on a sample basis.
  ▪ Regional Statistical Officers supervised general field operations.
  ▪ Key NRVA staff from CSO Headquarters performed monthly field monitoring missions.
• Data-processing activities and procedures included the following:
  ▪ Monthly provincial batches of completed questionnaires were manually checked upon receipt at CSO Headquarters. In case of serious shortcomings, questionnaires were referred back to the field.
  ▪ Data capture in MS Access consisted of double data entry with independent verification. This in principle eliminated any data-typing mistakes.
  ▪ Checks in MS Access were performed to identify and remedy essential data-structure and data-integrity problems.
  ▪ A limited number of consistency and range checks in MS Access were performed before the raw dataset was delivered.
  ▪ Comprehensive data-editing programmes were designed in Stata to perform consistency and range checks.
  ▪ Frequency and cross tabulations were produced in Stata to determine response distributions and to identify any skewed data, missing values, odd results and outliers. Data were corrected as far as circumstantial evidence allowed. Where this was not possible, incorrect values were converted to missing values.
  ▪ NRVA 2011-12 results were triangulated with other data sources, where available, to assess their plausibility.
  ▪ Indicators of sampling and non-sampling errors were produced to assess specific data-quality components (see sections IX.3 and IX.4).

d.
Management of comparability
• Advice was sought from international experts and agencies so as to better harmonise NRVA data collection and analysis with international standards and to keep them up to date with new developments.
• Consultations with national stakeholders were organised to explore comparability between NRVA and other data sources, as well as strategies to improve comparability.
• In each phase of survey implementation, comparability with previous rounds of NRVA was a key consideration.

e. Management of coherence
• In the absence of a formalised procedure, NRVA explored on an ad-hoc basis the consistency of sampling, data collection, methodologies, concepts, definitions, classifications, indicators and other statistics, and dissemination across statistical activities in CSO, and between CSO and other data producers and data users. Where feasible, these were harmonised.

f. Management of timeliness
• Data collection and data processing were done in parallel to minimise the period between the completion of both activities.
• Monitoring procedures were designed and implemented to monitor progress in data collection and data processing.
• The release of quarterly reports with selected key indicators one month after completion of quarterly data collection was planned. Only the release of one mid-term report with selected key indicators was realised, in August 2012.

g. Management of accessibility
• The NRVA 2011-12 report will be made available in Dari, Pashtu and English, so as to broaden NRVA's effective audience.
• The NRVA 2011-12 report will be made available both on the CSO website and in printed form.
• Selected tables at national and provincial level will be made available on the CSO website.
• The NRVA 2011-12 report provides metadata, which support the understanding of the contents and quality of the survey results.
These metadata include, among other things, information about the questionnaires, sampling design, survey procedures, concepts and definitions, the methodologies applied for poverty and food-security assessment and mortality estimation, and quality assurance. • NRVA data will be made available to data users in line with CSO's micro-data access policy. • The name of NRVA will change to ALCS (Afghanistan Living Conditions Survey) to enhance the survey's appeal and recognition.
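The consistency and range checks described under the management of accuracy (section c) were implemented in MS Access and Stata; the report does not reproduce them, but the general edit-check pattern can be sketched as follows. This is a minimal illustration in Python, with hypothetical variable names and limits that are not taken from the actual NRVA codebook:

```python
# Hypothetical sketch of survey edit checks: range checks test that a value
# falls in a plausible interval, consistency checks test relations between
# variables. Field names and thresholds are illustrative only.

def check_record(rec):
    """Return a list of edit-check failures for one household-member record."""
    errors = []
    # Range check: age must fall in a plausible interval.
    if not 0 <= rec["age"] <= 110:
        errors.append("age out of range")
    # Consistency check: married persons should be above a minimum age.
    if rec["marital_status"] == "married" and rec["age"] < 10:
        errors.append("married but implausibly young")
    return errors

records = [
    {"age": 34, "marital_status": "married"},
    {"age": 150, "marital_status": "single"},   # fails the range check
    {"age": 7, "marital_status": "married"},    # fails the consistency check
]
flagged = {i: check_record(r) for i, r in enumerate(records) if check_record(r)}
print(flagged)  # → {1: ['age out of range'], 2: ['married but implausibly young']}
```

As in the NRVA procedure, flagged values would be corrected where circumstantial evidence allows, and otherwise converted to missing values.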

IX.3 Sampling errors

Statistics based on a sample, such as means and percentages, generally differ from statistics based on the entire population, since the sample does not include all units of that population. The sampling error refers to the difference between a sample statistic and the corresponding value for the total population. Usually this error cannot be directly observed or measured, but it can be estimated probabilistically. The sampling error is generally measured in terms of the standard error of a particular statistic, which equals the square root of the variance of that statistic in the sample. The standard error can then be used to calculate the confidence interval within which the true value of the statistic for the entire population can reasonably be assumed to fall: a value of the statistic produced from the sample will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design. Table IX.2 provides an overview of standard errors and confidence intervals for selected indicators. Since the sample design of NRVA 2011-12 is not simple random sampling, but a multi-stage stratified design, the linearisation method is used for the estimation of standard errors.
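The plus-or-minus-two-standard-errors rule described above can be sketched directly. The values below are taken from Table IX.2 (percentage of the population under age 15); the factor 2 follows the text rather than the exact normal quantile 1.96, so the computed bounds differ slightly from the published ones (47.9 and 48.9), presumably because the latter were derived from unrounded standard errors:

```python
def confidence_interval(estimate, standard_error, z=2.0):
    """Approximate 95% confidence interval: estimate ± z × standard error."""
    return estimate - z * standard_error, estimate + z * standard_error

# Percentage of the population under age 15 (Table IX.2): 48.4, SE 0.3.
lower, upper = confidence_interval(48.4, 0.3)
print(round(lower, 1), round(upper, 1))  # → 47.8 49.0
```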
Table IX.2: Sampling errors and confidence intervals for selected indicators

| Statistic | Base population | Value | Standard error | Lower limit | Upper limit |
|---|---|---|---|---|---|
| Percentage under age 15 | Total population | 48.4 | 0.3 | 47.9 | 48.9 |
| Average household size | All households | 7.4 | 4.4 | 7.3 | 7.5 |
| Employment-to-population ratio | Working-age population | 45.7 | 0.3 | 45.1 | 46.4 |
| Percentage not gainfully employed | Labour force | 25.0 | 0.5 | 24.1 | 25.9 |
| Percentage owning irrigated land | All households | 37.9 | 0.8 | 36.5 | 39.4 |
| Percentage owning garden plot | All households | 12.6 | 0.6 | 11.4 | 13.7 |
| Percentage below the poverty line | Total population | 36.5 | 0.8 | 34.8 | 38.1 |
| Percentage food-insecure | Total population | 30.1 | 0.7 | 28.7 | 31.6 |
| Youth literacy rate | Population aged 15-24 | 47.0 | 0.9 | 45.2 | 48.8 |
| Net attendance ratio in primary education | Primary-school-age population | 56.8 | 0.8 | 55.1 | 58.4 |
| Net attendance ratio in secondary education | Secondary-school-age population | 32.7 | 0.9 | 30.9 | 34.4 |
| Percentage receiving skilled ante-natal care (at least 1 visit) | Married women under age 50 with a birth in the five years preceding the survey | 51.2 | 0.9 | 49.5 | 52.9 |
| Percentage of births attended by skilled health personnel | Married women under age 50 with a birth in the five years preceding the survey | 39.9 | 0.9 | 38.1 | 41.6 |
| Percentage using improved drinking water sources | Total population | 45.5 | 1.1 | 43.4 | 47.7 |
| Percentage using improved sanitation | Total population | 8.3 | 0.6 | 7.1 | 9.7 |

IX.4 Non-sampling errors

IX.4.1 Overview of possible non-sampling errors

Aside from the sampling error associated with the process of selecting a sample, a survey is subject to a wide variety of non-sampling errors. These errors may – and unavoidably do – occur in all stages of the survey process. Non-sampling errors are usually classified into two groups: random errors and systematic errors. Random errors are unpredictable errors that are generally cancelled out if a large enough sample is used.
Since NRVA has a large sample size, random errors are a priori not considered an issue of large concern. Systematic errors are those errors that tend to accumulate over the entire sample and may bias the survey results to a considerable extent. Therefore, this category of non-sampling errors is a principal cause for concern. The following overview elaborates the main types of systematic non-sampling errors.

Coverage errors

Coverage errors occur when households are omitted, duplicated or wrongly included in the population or sample. Such errors are caused by defects in the sampling frame, such as inaccuracy, incompleteness, duplication, inadequacy or obsolescence. Coverage errors may also occur in field procedures, for instance when specific households or persons are omitted. The sampling frames used for NRVA 2011-12 included the 2003-05 pre-census household listing and the 2003-04 National Multi-sectoral Assessment of Kuchi (NMAK-2004). Both listings were outdated by the time of fieldwork implementation and it is likely that in the intervening period considerable changes occurred with respect to the number and geographic distribution of households. This will have had a particular effect on newly built-up urban areas and squatter settlements, including areas with a high density of internally displaced persons and returnees. Such areas will have been systematically underrepresented in the sample selection. With regard to Kuchi coverage, besides the observed but unquantified rate of settlement of Kuchi households and natural population growth, changing migration patterns will have caused a population distribution in 2011-12 that differs from the one represented in the NMAK list.

Non-response errors

There are two types of non-response: unit non-response and item non-response.
Unit non-response implies that no information is obtained from a given sample unit, while item non-response refers to a situation where some but not all of the information is collected for the unit. Item non-response occurs when respondents provide incomplete information, because of respondents' refusal or incapacity to answer, or because of omissions by interviewers. Often non-response is not evenly spread across the sample units but is concentrated among sub-groups. As a result, the distribution of the characteristics of sub-groups may deviate from that of the selected sample. Unit non-response in NRVA 2011-12 occurred to the extent that sampled clusters were not visited, or that sampled households in selected clusters were not interviewed. Out of the 2,100 originally scheduled clusters, 150 (7.1 percent) were not visited. For 133 of these non-visited clusters (6.3 percent), replacement clusters were sampled and visited. Although this ensured the approximation of the targeted sample size, it could not avoid the likely introduction of some bias, as the omitted clusters probably have a different profile than the included clusters. In the visited clusters – including replacement clusters – 797 households (3.8 percent of the total) could not be interviewed, mostly because they were not found, or because they refused or were unable to participate. For 779 of these non-response households (3.7 percent of the total), replacement households were sampled and interviewed. Since household non-response is low and it can be expected that the replacement households provide a reasonable representation of the non-response households, this non-response error is considered of minor importance. The overall unit non-response rate – including non-visited clusters and non-interviewed households, without replacement – is 10.9 percent. With regard to item non-response, the close to 800 variables in the NRVA household and Shura questionnaires each reveal different levels of missing values.
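The cluster-level non-response percentages quoted above follow directly from the counts given in the text (the overall 10.9 percent additionally combines household non-response, whose denominator is not quoted here, so it is not reproduced):

```python
# Reproducing the cluster non-response rates from the reported counts.
scheduled_clusters = 2100
non_visited = 150   # clusters that could not be visited
replaced = 133      # non-visited clusters for which replacements were sampled

print(round(100 * non_visited / scheduled_clusters, 1))  # → 7.1
print(round(100 * replaced / scheduled_clusters, 1))     # → 6.3
```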
During the data-processing stages of manual checking, computerised batch editing and final editing, these levels were reduced by edit strategies. For some key variables,² missing values were

² All household identification variables – Cluster code (Q1.1), Residence code (Q1.2), Province code (Q1.3), District code (Q1.4), Nahia code (Q1.5), Control and Enumeration area code (Q1.6) and Village code (Q1.7) – as well as the individual-level variables Relationship to the head of household (Q3.3), Sex (Q3.4), Age (Q3.5) and Marital status (Q3.6).