
Small Area Estimation Research Activity in Bogor Agricultural University 1)

Khairil A. Notodiputro
Anang Kurnia

Department of Statistics, Bogor Agricultural University
Jl. Meranti, Wing 22 Level 4, Kampus IPB Darmaga, Bogor

  1) Paper presented at the Seminar on Use, Analysis and Application of Small Area Statistics, March 7, 2007. BPS-JICA

What and why "small area"

  Definition: A sub-population is small if the domain-specific sample size is not large enough to support direct estimates of adequate precision.

  • Small area, small domain, local area

  • Small geographical area

  • Domain: age-sex-race, poverty status
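
To make the definition concrete, the sketch below computes a direct estimate (the sample mean) and its standard error using one domain's sample only. It assumes simple random sampling and ignores any finite population correction; the function name and the sample values are hypothetical and are not taken from the slides.

```python
import math


def direct_estimate(sample):
    """Return (mean, standard error) using only this domain's own sample."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance; requires n >= 2.
    variance = sum((value - mean) ** 2 for value in sample) / (n - 1)
    return mean, math.sqrt(variance / n)


# Hypothetical household expenditures observed in two domains.
large_domain = [10.0, 12.0, 9.0, 11.0, 13.0, 10.0, 12.0, 11.0, 9.0, 13.0]
small_domain = [10.0, 13.0, 9.0]

for name, sample in [("large", large_domain), ("small", small_domain)]:
    mean, se = direct_estimate(sample)
    print(f"{name}: n={len(sample)}, mean={mean:.2f}, se={se:.2f}")
```

With similar spread in both domains, the standard error grows as the domain sample size shrinks, which is what makes a domain "small" in the sense of the definition.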

What and why "small area"

  Direct estimates: Use area-specific sample data only.

  Indirect estimates: Borrow strength from sample observations of related areas through auxiliary data (recent census and current administrative records) to increase effective sample size.

  Can we minimize or even eliminate the use of indirect estimates?

Small Area Estimation in Indonesia

  • Attention to SAE has grown along with increasing demand from government and the private sector for accurate and timely information, not only at the national level (large domains) but also for small domains such as sub-districts.

  • In Indonesia, developing SAE is important because public policy decision-making is moving from centralization to decentralization, so that local governments can manage their districts, allocate their funds, and carry out regional planning well. Decision makers in local government will therefore require statistics for their own districts.

  • Statistics Indonesia (BPS) regularly conducts surveys such as SUSENAS, but these are designed to produce national-level estimates.

Small Area Model

  There are essentially two types of SAE models:

  • Basic area level model, which relates small area direct estimators to area-specific auxiliary data

  • Basic unit level model, where information is available at the sampling unit level and modeling is based on individual data

  Consider the following Fay-Herriot (1979) model for the area level:

  $$y_i = x_i'\beta + \upsilon_i + e_i$$

  where $\upsilon_i$ and $e_i$ are independent with $\upsilon_i \sim N(0, A)$ and $e_i \sim N(0, D_i)$ for $i = 1, 2, \ldots, k$.

  We assume that $\beta$ and $A$ are unknown but the $D_i$ are known.
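
As an illustration of the area level model, the following sketch draws one data set from y_i = x_i'β + υ_i + e_i with υ_i ~ N(0, A) and e_i ~ N(0, D_i). The function name, the covariate values, β, A, and the D_i are all hypothetical choices made for this example.

```python
import random


def simulate_fay_herriot(x, beta, A, D, seed=0):
    """Draw (theta, y) for k areas from the area level model."""
    rng = random.Random(seed)
    theta, y = [], []
    for x_i, D_i in zip(x, D):
        xb_i = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))  # x_i' beta
        v_i = rng.gauss(0.0, A ** 0.5)    # area random effect, variance A
        e_i = rng.gauss(0.0, D_i ** 0.5)  # sampling error, known variance D_i
        theta.append(xb_i + v_i)          # small area parameter theta_i
        y.append(xb_i + v_i + e_i)        # direct estimate y_i
    return theta, y


# Hypothetical setup: k = 5 areas, an intercept plus one auxiliary variable.
x = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9], [1.0, 1.3], [1.0, 1.8]]
D = [0.5, 0.8, 1.0, 1.5, 2.0]
theta, y = simulate_fay_herriot(x, beta=[1.0, 2.0], A=1.0, D=D)
for theta_i, y_i in zip(theta, y):
    print(f"theta_i={theta_i:.3f}  y_i={y_i:.3f}")
```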

Small Area Model

  Model description:

  1. $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ → auxiliary data

  2. $\theta_i = x_i'\beta + \upsilon_i$ → the parameter, a function of the auxiliary data and the random effect $\upsilon_i$

  3. $y_i = \theta_i + e_i$ → direct estimate with sampling error $e_i$

  4. $y_i = x_i'\beta + \upsilon_i + e_i$ → a special case of GLMM

  The best predictor (BP) of $\theta_i = x_i'\beta + \upsilon_i$, if $\beta$ and $A = \sigma^2_\upsilon$ are known, is given by:

  $$\hat\theta_i^{BP} = x_i'\beta + (1 - B_i)(y_i - x_i'\beta)$$

  where $B_i = \sigma^2_{e_i} / (\sigma^2_\upsilon + \sigma^2_{e_i})$. The best predictor is equivalent to the empirical Bayes approach in the normal case.
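
The role of the shrinkage factor B_i can be sketched in a few lines. The function below evaluates the standard best predictor under normality, (1 - B_i) y_i + B_i x_i'β, taking β and A as known and writing D_i for the sampling variance σ²_ei. The function name and the numeric inputs are hypothetical.

```python
def best_predictor(y_i, xb_i, A, D_i):
    """Best predictor of theta_i when beta and A are known.

    y_i  : direct estimate for area i
    xb_i : regression part x_i' beta
    A    : variance of the area random effect
    D_i  : sampling variance of the direct estimate
    """
    B_i = D_i / (A + D_i)  # weight given to the regression part
    return (1.0 - B_i) * y_i + B_i * xb_i


# Same direct estimate and regression part, increasingly noisy direct estimate.
for D_i in (0.1, 1.0, 10.0):
    print(D_i, best_predictor(y_i=4.0, xb_i=2.0, A=1.0, D_i=D_i))
```

As D_i grows relative to A, the predictor moves away from the direct estimate y_i and toward the regression part x_i'β.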

A Brief Review of SAE Techniques

  1. Estimates of small area characteristics based on fixed effect models are referred to as synthetic estimators (Levy and French, 1977), composite estimators (Schaible et al., 1977), and prediction estimators (Holt et al., 1979; Särndal, 1984; Marker, 1999).

  2. Mixed models have been used to improve estimation of small area characteristics based on survey sampling or census data by Fay and Herriot (1979), Ghosh and Rao (1994), Rao (1999) and Pfeffermann (1999).

  3. In addition to EBLUP, empirical Bayes (EB) and hierarchical Bayes (HB) estimation and inference methods have also been applied to small area estimation.

  4. Ghosh and Rao (1994) review the application of these estimation methods.

  5. Maiti (1998) used non-informative priors for hyperparameters in HB methods, and You and Rao (2000) used HB methods to estimate small area means under random effect models.

A Brief Review of SAE Techniques

  6. A general approach for SAE based on GLM is described in Ghosh et al. (1998) and Malec et al. (1999). Farrell et al. (1997) extended the mixed logistic model, and Moura and Migon (2001) extended it further by introducing a component to account for spatially correlated structure in binary response data.

  7. Measures of uncertainty for EBLUP or EB have been developed in recent years. Rao (2003) described the simulation study of Jiang, Lahiri and Wan (2002), who reported on the relative performance of MSE estimators under the simple model.

  8. Authors concerned with measures of uncertainty include Butar and Lahiri (2001, 2003) on bootstrap methods; Wang and Fuller (2003) and Rivest and Vandal (2003) on unknown sampling variances; Rao (2003) and Jiang, Lahiri and Wan (2003) on the jackknife MSE estimator; and Datta, Rao and Smith (2004) on the HB estimator.

Inference of Area Level Model

  1. Empirical Best Linear Unbiased Predictor (EBLUP)

  → Estimation of the variance component

  2. Empirical Bayes (EB)

  → Mean of the posterior distribution, with parameters estimated from empirical data

  3. Hierarchical Bayes (HB)

  → Mean of the posterior distribution, with a prior distribution on the parameters
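
The first approach in the list, EBLUP, can be sketched for the simplest possible case. The example assumes an intercept-only area level model (x_i = 1 for every area) and estimates A with one common moment estimator of the Prasad-Rao type, truncated at zero; the slides do not say which variance estimator was used, so this choice, the function name, and the data are assumptions made for illustration.

```python
def eblup_intercept_only(y, D):
    """EBLUP for the area level model with x_i = 1 (needs k >= 2 areas)."""
    k = len(y)
    y_bar = sum(y) / k
    rss = sum((y_i - y_bar) ** 2 for y_i in y)
    # Moment estimator of A, truncated at zero; for x_i = 1 the leverage is 1/k.
    A_hat = max(0.0, (rss - sum(D) * (1.0 - 1.0 / k)) / (k - 1))
    # Weighted least squares estimate of the common mean.
    weights = [1.0 / (A_hat + D_i) for D_i in D]
    beta_hat = sum(w_i * y_i for w_i, y_i in zip(weights, y)) / sum(weights)
    # Plug the estimates into the best predictor.
    preds = []
    for y_i, D_i in zip(y, D):
        B_i = D_i / (A_hat + D_i)
        preds.append((1.0 - B_i) * y_i + B_i * beta_hat)
    return preds, A_hat, beta_hat


# Hypothetical direct estimates and known sampling variances for 6 areas.
y = [2.1, 3.4, 1.2, 4.8, 2.9, 3.3]
D = [0.4, 0.9, 1.5, 2.0, 0.6, 1.1]
preds, A_hat, beta_hat = eblup_intercept_only(y, D)
print("A_hat =", round(A_hat, 3), "beta_hat =", round(beta_hat, 3))
print([round(p, 3) for p in preds])
```

Each prediction lies between the area's direct estimate and the estimated common mean, and areas with larger D_i are pulled more strongly toward that mean.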

Inference of Area Level Model

  • One recent problem in SAE is the uncertainty of the predictors and the estimation of their MSE.

  • Ghosh and Rao (1994), Prasad and Rao (1990), Butar and Lahiri (2001, 2003), Jiang, Lahiri and Wan (2002), Chen and Lahiri (2001, 2005), and Hall and Maiti (2005) have contributed to this problem.

  • The approximations proposed by some authors can eliminate the underestimation problem, especially for the case $A = D_i = 1$ and $x_i'\beta = 0$. However, Kurnia and Notodiputro (2006) showed that when the $D_i$ (sampling error variances) are heterogeneous and all model parameters must be estimated, the MSE is still underestimated substantially, by about 13% to 19%.
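
Why MSE estimation is delicate can be seen in a small Monte Carlo sketch. It uses the balanced case A = D_i = 1 with zero mean, an intercept-only EBLUP with a truncated moment estimator of A, and compares the naive plug-in MSE, Â D_i / (Â + D_i), with the empirical MSE. This is an illustration under assumed settings (k = 15 areas, 2000 replications, hypothetical function names); it does not reproduce the Kurnia and Notodiputro (2006) study or the corrected estimators cited above.

```python
import random


def eblup_and_A_hat(y, D):
    """Intercept-only EBLUP with a truncated moment estimator of A."""
    k = len(y)
    y_bar = sum(y) / k
    rss = sum((y_i - y_bar) ** 2 for y_i in y)
    A_hat = max(0.0, (rss - sum(D) * (1.0 - 1.0 / k)) / (k - 1))
    w = [1.0 / (A_hat + D_i) for D_i in D]
    beta_hat = sum(w_i * y_i for w_i, y_i in zip(w, y)) / sum(w)
    B = [D_i * w_i for D_i, w_i in zip(D, w)]
    preds = [(1.0 - B_i) * y_i + B_i * beta_hat for B_i, y_i in zip(B, y)]
    return preds, A_hat


def naive_vs_empirical_mse(k=15, A=1.0, D_i=1.0, reps=2000, seed=1):
    """Average naive plug-in MSE and empirical MSE over simulated data sets."""
    rng = random.Random(seed)
    D = [D_i] * k
    naive_total, empirical_total = 0.0, 0.0
    for _ in range(reps):
        theta = [rng.gauss(0.0, A ** 0.5) for _ in range(k)]
        y = [t + rng.gauss(0.0, D_i ** 0.5) for t in theta]
        preds, A_hat = eblup_and_A_hat(y, D)
        # Naive estimator: MSE of the best predictor with A_hat plugged in.
        naive_total += A_hat * D_i / (A_hat + D_i)
        # Empirical MSE: average squared prediction error over the k areas.
        empirical_total += sum((p - t) ** 2 for p, t in zip(preds, theta)) / k
    return naive_total / reps, empirical_total / reps


naive, empirical = naive_vs_empirical_mse()
print(f"naive={naive:.3f}  empirical={empirical:.3f}")
```

The naive figure ignores the extra variability from estimating β and A, which is the gap that the corrected MSE estimators in the literature try to close.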

The Development of SAE Research in IPB

  The Chronology

  1. A discussion in the "Forum Masyarakat Statistik" held in Solo on December 3, 2004, identified the need for quality research on small area estimation models for the Indonesian case.

  2. In 2003, the Smeru Research Institute developed small area statistics maps for several provinces.

  3. Kurnia and Notodiputro (2005) carried out a study of the generalized linear mixed model approach and hierarchical Bayes for SAE, applied to BPS data.

  Research on Small Area Estimation at IPB has been carried out with support from DGHE:

  Developing Small Area Estimation Models for BPS Data

  This research is conducted over three years (2006-2008).

The Development of SAE Research in IPB

Development Stages of SAE Research at IPB

  [Figure: roadmap of the research over Year I, Year II, and Year III. Recoverable stage labels include literature review for method comparison, review of BPS survey and sampling design, identification of BPS data, exploration of variable selection, development of methods to improve accuracy and precision, method evaluation, simulation following BPS sampling, software development and trial, data updating, and application to BPS data.]

The Development of SAE Research in IPB

  Papers in conference proceedings or journals:

  • Kurnia, A. and Notodiputro, K.A. 2006. The Jacknife Method in Small Area Estimation. Forum Statistika dan Komputasi, Vol. 11 No. 1, p. 12-15.

  • Sadik, K. and Notodiputro, K.A. 2006. Small Area Estimation based on Random Walk Models. Forum Statistika dan Komputasi, Vol. 11 No. 1.

  • Sadik, K. and Notodiputro, K.A. 2006. Small Area Estimation with Time and Area Effects Using Two Stage Estimation. ICoMS-1: Bandung.

  • Kurnia, A. and Notodiputro, K.A. 2006. EB-EBLUP MSE Estimator on Small Area Estimation with Application to BPS Data. ICoMS-1: Bandung.

  • Kismiantini, Kurnia, A. and Notodiputro, K.A. 2006. Risk of Dengue Haemorrhagic Fever In Bekasi Municipality With Small Area Approach. ICoMS-1: Bandung.

The Development of SAE Research in IPB

  Papers in conference proceedings or journals (continued):

  • Indahwati and Notodiputro, K.A. 2006. Effect of Inappropiate Sampling Design on Reliability of Small Area Estimates. ICoMS-1: Bandung.

  • Sadik, K. and Notodiputro, K.A. P-Spline M-Quantile Approach in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 1, p. 142-147.

  • Kurnia, A. and Notodiputro, K.A. Effects of Sampling Variance Estimation in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 2.

  • Handayani, D. and Kurnia, A. 2006. Empirical Bayes Approach to Estimate Finite Population Mean in Small Area Estimation. Jurnal Matematika Aplikasi dan Pembelajaran, Vol. 5 No. 2 Jilid 2.

The Development of SAE Research in IPB

  Research plan in the 2nd year:

  • The research will focus on developing small area estimation methods to increase accuracy.

  • Simulated data, designed to mimic the BPS sampling, will be generated and utilized.

  • A survey will be conducted to compare the estimates resulting from the method with the direct estimates.

  Thank You