Small Area Estimation Research Activity in Bogor Agricultural University 1)
Khairil A. Notodiputro, Anang Kurnia
Department of Statistics, Bogor Agricultural University
Jl. Meranti, Wing 22 Level 4, Kampus IPB Darmaga, Bogor
1) Paper presented at the Seminar on Use, Analysis and Application of Small Area Statistics, March 7, 2007, BPS-JICA.
What and why "small area"
Definition: A sub-population is small if the domain-specific sample size is not large enough to support direct estimates of adequate precision.
Small area, small domain, local area
Small geographical area
Domain : age-sex-race, poverty status
Direct estimates: Use area-specific sample data only.
Indirect estimates: Borrow strength from sample observations of related areas through auxiliary data (recent census and current administrative records) to increase effective sample size.
Can we minimize or even eliminate the use of indirect estimates?
Small Area Estimation in Indonesia
Attention to SAE has grown along with increasing demand from government and the private sector for accurate and timely information, not only at the national level (large domains) but also for small domains such as sub-districts.
In Indonesia, developing SAE is important because public policy decision-making is moving from centralization to decentralization, so that local governments can manage their districts, allocate their funds and carry out regional planning well. Decision makers in local government will therefore require statistics for their own districts.
Statistics Indonesia (BPS) regularly conducts surveys such as SUSENAS, but these are designed for national-level estimates.
Small Area Model
There are essentially two types of SAE models:
Basic area level model, which relates small area direct estimators to area-specific auxiliary data
Basic unit level model, in which information is available at the sampling unit level and modeling is based on individual data
Consider the following Fay-Herriot (1979) model for the area level:
$y_i = x_i'\beta + \upsilon_i + e_i$
where $\upsilon_i$ and $e_i$ are independent with $\upsilon_i \sim N(0, A)$ and $e_i \sim N(0, D_i)$ for $i = 1, 2, \ldots, k$.
We assume that $\beta$ and $A$ are unknown but the $D_i$ are known.
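To make the area level model concrete, the following is a minimal simulation sketch of $y_i = x_i'\beta + \upsilon_i + e_i$. The function name `simulate_fay_herriot`, the number of areas, and all numerical values are illustrative assumptions, not taken from the seminar material.

```python
import numpy as np

def simulate_fay_herriot(X, beta, A, D, rng):
    """Draw (theta, y) from y_i = x_i' beta + v_i + e_i (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    D = np.asarray(D, dtype=float)
    k = X.shape[0]
    v = rng.normal(0.0, np.sqrt(A), size=k)  # area random effects, variance A
    e = rng.normal(0.0, np.sqrt(D))          # sampling errors, variance D_i per area
    theta = X @ beta + v                     # small area parameters theta_i
    y = theta + e                            # direct estimates y_i
    return theta, y

# Illustrative values only (assumptions, not from the source)
rng = np.random.default_rng(0)
k = 15
X = np.column_stack([np.ones(k), rng.uniform(0.0, 1.0, size=k)])
beta = np.array([1.0, 2.0])
D = np.linspace(0.3, 1.5, k)                 # known, heterogeneous sampling variances
theta, y = simulate_fay_herriot(X, beta, A=1.0, D=D, rng=rng)
```

Areas with a large $D_i$ have noisy direct estimates, which is where borrowing strength through $x_i'\beta$ helps most.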
Model description:
1. $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ → auxiliary data
2. $\theta_i = x_i'\beta + \upsilon_i$ → the parameter, a function of the auxiliary data and the random effect $\upsilon_i$
3. $y_i = \theta_i + e_i$ → direct estimate with sampling error $e_i$
4. $y_i = x_i'\beta + \upsilon_i + e_i$ → a special case of the GLMM
The best predictor (BP) of $\theta_i = x_i'\beta + \upsilon_i$, if $\beta$ and $A = \sigma_\upsilon^2$ are known, is given by:
$\hat{\theta}_i^{BP} = x_i'\beta + (1 - B_i)(y_i - x_i'\beta)$
where $B_i = \sigma_{e_i}^2 / (\sigma_\upsilon^2 + \sigma_{e_i}^2)$. The best predictor is equivalent to the empirical Bayes approach in the normal case.
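As a small worked illustration, the best predictor under known $\beta$ and $A$ can be computed directly. This sketch assumes the standard normal-theory form $\hat{\theta}_i^{BP} = x_i'\beta + (1 - B_i)(y_i - x_i'\beta)$ with $B_i = D_i/(A + D_i)$; the function name `best_predictor` and the numbers are hypothetical.

```python
import numpy as np

def best_predictor(y, X, beta, A, D):
    """Best predictor of theta_i when beta and A are known (illustrative sketch)."""
    y, X, beta, D = (np.asarray(a, dtype=float) for a in (y, X, beta, D))
    B = D / (A + D)               # shrinkage factor B_i = D_i / (A + D_i)
    synthetic = X @ beta          # regression-synthetic part x_i' beta
    # Shrink the direct estimate toward the synthetic part; more shrinkage when D_i is large
    return synthetic + (1.0 - B) * (y - synthetic)

# Two areas with a common mean of 3 and sampling variances 1 and 3 (made-up numbers)
bp = best_predictor(y=[2.0, 4.0], X=[[1.0], [1.0]], beta=[3.0], A=1.0, D=[1.0, 3.0])
# B = [0.5, 0.75], so bp = [3 + 0.5*(-1), 3 + 0.25*(1)] = [2.5, 3.25]
```

The second area has the larger sampling variance, so its direct estimate is pulled further toward $x_i'\beta$.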
A Brief Review of SAE Techniques
1. Estimators of small area characteristics based on fixed effect models are referred to as the synthetic estimator (Levy and French, 1977), the composite estimator (Schaible et al., 1977), and the prediction estimator (Holt et al., 1979; Sarndal, 1984; Marker, 1999).
2. Mixed models have been used to improve estimation of small area characteristics based on survey sampling or census data by Fay and Herriot (1979), Ghosh and Rao (1994), Rao (1999) and Pfeffermann (1999).
3. In addition to EBLUP, empirical Bayes (EB) and hierarchical Bayes (HB) estimation and inference methods have also been applied to small area estimation.
4. Ghosh and Rao (1994) review the application of these estimation methods.
5. Maiti (1998) used non-informative priors for hyperparameters in HB methods, and You and Rao (2000) used HB methods to estimate small area means under random effect models.
6. A general approach for SAE based on GLM is described in Ghosh et al. (1998) and Malec et al. (1999). Farrell et al. (1997) extended the mixed logistic model, and Moura and Migon (2001) extended it further by introducing a component to account for spatially correlated structure in binary response data.
7. Measures of uncertainty for EBLUP and EB have been developed in recent years. Rao (2003) described the simulation study of Jiang, Lahiri and Wan (2002), who reported results on the relative performance of MSE estimators under a simple model.
8. Authors concerned with measures of uncertainty include Butar and Lahiri (2001, 2003) on bootstrap methods; Wang and Fuller (2003) and Rivest and Vandal (2003) on unknown sampling variances; Rao (2003) and Jiang, Lahiri and Wan (2003) on the jackknife MSE estimator; and Datta, Rao and Smith (2004) on the HB estimator.
Inference of Area Level Model
1. Empirical Best Linear Unbiased Predictor (EBLUP)
→ Estimation of variance components
2. Empirical Bayes (EB)
→ Posterior mean, with the parameters estimated from the data
3. Hierarchical Bayes (HB)
→ Posterior mean, with a prior distribution on the parameters
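The EBLUP idea of plugging an estimated variance component into the predictor can be sketched as follows. This is one possible version under stated assumptions: it estimates $A$ with a Prasad-Rao style method of moments based on OLS residuals, a choice made here for simplicity and not necessarily the estimator used in the research described. The function name `eblup_fay_herriot` and the simulated inputs are hypothetical.

```python
import numpy as np

def eblup_fay_herriot(y, X, D):
    """EBLUP sketch for the area level model with known sampling variances D_i."""
    y, X, D = (np.asarray(a, dtype=float) for a in (y, X, D))
    k, p = X.shape
    # Step 1: moment estimator of A from OLS residuals, truncated at zero
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ X.T @ y)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # leverages x_i'(X'X)^{-1} x_i
    A_hat = max(0.0, (resid @ resid - np.sum(D * (1.0 - h))) / (k - p))
    # Step 2: weighted least squares for beta with weights 1 / (A_hat + D_i)
    w = 1.0 / (A_hat + D)
    beta_hat = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    # Step 3: plug the estimates into the best predictor
    B = D / (A_hat + D)
    fit = X @ beta_hat
    theta_hat = fit + (1.0 - B) * (y - fit)
    return theta_hat, beta_hat, A_hat

# Illustrative data (assumed values): true beta = (1, 2), A = 1
rng = np.random.default_rng(1)
k = 20
X = np.column_stack([np.ones(k), np.linspace(0.0, 1.0, k)])
D = np.linspace(0.5, 2.0, k)
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, np.sqrt(1.0 + D))
theta_hat, beta_hat, A_hat = eblup_fay_herriot(y, X, D)
```

Each EBLUP lies between the direct estimate $y_i$ and the fitted value $x_i'\hat{\beta}$; when the estimate of $A$ is truncated to zero, the predictor collapses to the regression fit.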
One of the recent problems in SAE is measuring uncertainty and estimating the MSE.
Ghosh and Rao (1994), Prasad and Rao (1990), Butar and Lahiri (2001, 2003), Jiang, Lahiri and Wan (2002), Chen and Lahiri (2001, 2005), and Hall and Maiti (2005) have contributed to this problem.
The approximations proposed by some authors can eliminate the underestimation problem, especially for the case $A = D = 1$ and $x_i'\beta = 0$. However, Kurnia and Notodiputro (2006) showed that when the sampling variances $D_i$ are heterogeneous and all model parameters must be estimated, the MSE is still underestimated by a substantial margin, about 13% to 19%.
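To illustrate where the underestimation comes from, the sketch below compares the leading MSE term $g_{1i}(A) = A D_i/(A + D_i)$ with a Monte Carlo MSE of the best predictor when $\beta$ and $A$ are known and $x_i'\beta = 0$. In that idealized setting the two agree; once $\beta$ and $A$ are estimated, additional terms appear and reporting $g_{1i}$ alone understates the MSE. Function names and numerical values are illustrative assumptions, and this is not the simulation design of the cited studies.

```python
import numpy as np

def g1(A, D):
    """Leading MSE term A*D_i/(A+D_i): MSE of the best predictor with known parameters."""
    D = np.asarray(D, dtype=float)
    return A * D / (A + D)

def monte_carlo_mse_bp(A, D, n_rep, rng):
    """Monte Carlo MSE of the best predictor with x_i' beta = 0 and known A."""
    D = np.asarray(D, dtype=float)
    v = rng.normal(0.0, np.sqrt(A), size=(n_rep, D.size))  # theta_i = v_i
    e = rng.normal(0.0, np.sqrt(D), size=(n_rep, D.size))  # sampling errors
    B = D / (A + D)
    err = (1.0 - B) * (v + e) - v      # prediction error (1 - B_i) y_i - theta_i
    return np.mean(err**2, axis=0)

rng = np.random.default_rng(2)
D = np.array([0.5, 1.0, 2.0])          # heterogeneous sampling variances (made up)
mc = monte_carlo_mse_bp(1.0, D, 200_000, rng)
theory = g1(1.0, D)                    # [1/3, 1/2, 2/3]
```

Repeating the experiment with estimated parameters in place of the known ones is one way to see the gap that the corrected MSE estimators try to close.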
The Development of SAE Research in IPB
The Chronology
1. A discussion in the "Forum Masyarakat Statistik" held in Solo on December 3, 2004, identified the need for quality research on small area estimation models for the Indonesian case.
2. In 2003, the SMERU Research Institute developed small area statistics maps for several provinces.
3. Kurnia and Notodiputro (2005) carried out a study of the generalized linear mixed model approach and hierarchical Bayes for SAE applied to BPS data.
Research on Small Area Estimation at IPB has been carried out with support from DGHE:
Developing Small Area Estimation Models for BPS Data
This research is being conducted over three years (2006-2008).
Development Stages of SAE Research at IPB
[Figure: research roadmap for Year I, Year II and Year III. Recoverable elements include literature review, review of the BPS survey design and sampling methodology, data identification, exploration and selection of potential methods, simulation following the BPS sampling, evaluation and comparison of results for accuracy and precision, model improvement, software development, and application to and implementation on BPS data.]
Papers in conference proceedings or journals:
Kurnia, A. and Notodiputro, K.A. 2006. The Jacknife Method in Small Area Estimation. Forum Statistika dan Komputasi, Vol. 11 No. 1, p. 12-15.
Sadik, K. and Notodiputro, K.A. 2006. Small Area Estimation based on Random Walk Models. Forum Statistika dan Komputasi, Vol. 11 No. 1.
Sadik, K. and Notodiputro, K.A. 2006. Small Area Estimation with Time and Area Effects Using Two Stage Estimation. ICoMS-1: Bandung.
Kurnia, A. and Notodiputro, K.A. 2006. EB-EBLUP MSE Estimator on Small Area Estimation with Application to BPS Data. ICoMS-1: Bandung.
Kismiantini, Kurnia, A. and Notodiputro, K.A. 2006. Risk of Dengue Haemorrhagic Fever In Bekasi Municipality With Small Area Approach. ICoMS-1: Bandung.
Indahwati and Notodiputro, K.A. 2006. Effect of Inappropiate Sampling Design on Reliability of Small Area Estimates. ICoMS-1: Bandung.
Sadik, K. and Notodiputro, K.A. P-Spline M-Quantile Approach in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 1, p. 142-147.
Kurnia, A. and Notodiputro, K.A. Effects of Sampling Variance Estimation in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 2.
Handayani, D. and Kurnia, A. 2006. Empirical Bayes Approach to Estimate Finite Population Mean in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 2.
Research plan in the 2nd year:
The research will focus on developing small area estimation methods to increase accuracy.
Simulated data, designed to mimic the BPS sampling, will be generated and utilized.
A survey will be conducted to compare the estimates resulting from the method with the direct estimates.
Thank You