Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia

DIAN KUSUMANINGRUM

GRADUATE SCHOOL

BOGOR AGRICULTURAL UNIVERSITY

BOGOR

2010



INFORMATION

I hereby declare that the thesis Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia is my own work, carried out with the guidance of my supervisors, and has not been submitted to any other university in any form. Sources of information derived or quoted from published or unpublished work by other authors are cited in the text and listed in the References at the end of this thesis.

Bogor, February 2010

Dian Kusumaningrum NRP G151070091



DIAN KUSUMANINGRUM. Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia. Supervised by ASEP SAEFUDDIN and MUHAMMAD NUR AIDI

Eradicating extreme poverty and developing strategies for decent and productive work for youth are very important issues in Indonesia. Hence, it is important to conduct research that evaluates the state of these issues in Indonesia. Geoinformatics techniques can be explored further to inform decision making and to find better strategic approaches. Satscan is a geoinformatics tool widely used in hotspot detection. In this research the hotspots obtained by Satscan were compared with the hotspots obtained from the Upper Level Set (ULS) scan statistic and with the official food scarcity and poverty map developed by the Food Security Agency. The hotspots related to poverty, unemployment, and food scarcity were mainly in Central Java, East Java, and Yogyakarta. Afterwards, the main factors causing the hotspots were analyzed with an ordinal logistic regression model. Factors related to the hotspots were school facilities, village trade, village industry, village services, slum areas, proportion of families without electricity, and credit facilities.

Keywords: Food scarcity, poverty, unemployment, hotspot detection, ordinal logistic regression


DIAN KUSUMANINGRUM. Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia. Supervised by ASEP SAEFUDDIN and MUHAMMAD NUR AIDI

Eradicating extreme poverty and developing strategies for decent and productive work for youth are very important issues in Indonesia. Hence, it is important to conduct research that evaluates the state of these issues in Indonesia. Geoinformatics techniques can be explored further to inform decision making and to find better strategic approaches. Satscan is a geoinformatics tool widely used in hotspot detection. In this research the hotspots obtained by Satscan were compared with the hotspots obtained from the Upper Level Set (ULS) Scan Statistic and with the official food scarcity and poverty map developed by the Food Security Agency.

Another problem faced by Satscan and ULS is the stability of the hotspot clusters obtained. Changing the maximum cluster size leads to different hotspots. The default maximum-size setting of 50% seldom produces usable, informative results because the reported primary cluster often occupies a large proportion of the study area. It is therefore difficult to determine an optimal setting for the scaling parameter. Thus a process for addressing the sensitivity of scan statistic methods and enhancing the interpretation of their results was carried out (Chen et al. 2008). First, the scan statistic methods were run multiple times, starting from a small maximum size (5%) and systematically increasing to the 50% default value. Second, the results were visualized in a map matrix for side-by-side comparison of different maximum sizes. Third, the reliability of each region in the map was calculated and interpreted. Fourth, core clusters were discriminated from heterogeneous clusters through interpretation of the reliability. Fifth, the interpretation of core clusters was confirmed by comparing the results to other independent techniques and



Poverty Map accomplished by CBS and FSA.

Based on the comparison of ULS and Satscan for poverty and food scarcity in 78 districts in Java, ULS showed more accurate and stable results. Stability can be seen from the average stability score of the clusters. For both poverty and food scarcity, ULS had a higher average stability score than Satscan. Accuracy can be seen from the precision of ULS and Satscan in detecting areas with high percentages of poor people and of food scarcity. These areas are designated first-priority areas by the Food Security Agency and CBS. The accuracy of ULS in detecting areas of high poverty and food scarcity was higher than that of Satscan. Therefore ULS was used to detect hotspots of poverty, unemployment, and food scarcity in all areas of Java.

The hotspots related to poverty, unemployment, and food scarcity were mainly in Central Java, East Java, and Yogyakarta. Cilacap, Demak, Kab. Madiun, Kota Pekalongan, Kulon Progo, Pemalang, and Purworejo were identified as critical areas because they were hotspots of poverty, unemployment, and food scarcity at the same time. Based on the stability analysis, Cilacap was the core cluster of poverty, unemployment, and food scarcity. This means that, across several maximum cluster sizes, Cilacap was consistently identified as a poverty, unemployment, and food scarcity hotspot. Hence, Cilacap should be prioritized. Meanwhile, Kota Batu, Salatiga City, and Serang City were not detected as critical areas.

The main factors that caused the hotspots were analyzed with an Ordinal Logistic Regression Model. Factors related to the hotspots were school facilities, village trade, village industry, village services, slum areas, proportion of families without electricity, and credit facilities. The increase of school facilities, stimulating



attention to credit facilities, economic potential of a village in trade, villages without electricity, and small-scale farm industry. It turned out that an increase in these factors increased the likelihood that a municipality would become a critical area. This study showed that credit facilities, farm industry, and trade in a village gave no indication of improving the welfare of people living in critical areas. Hence, these factors should be revitalized. Areas with a high ratio of families living without electricity were also critical points in solving the problems of poverty, unemployment, and food scarcity. Therefore the government should give more attention to people who live in these areas.

Keywords: Food scarcity, poverty, unemployment, hotspot detection, ordinal logistic regression



DIAN KUSUMANINGRUM. Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia. Supervised by ASEP SAEFUDDIN and MUHAMMAD NUR AIDI

Eradicating poverty and food insecurity and developing strategies to address unemployment through productive work for youth are very important issues in Indonesia. It is therefore important to conduct research that evaluates the state of these problems in Indonesia. Geoinformatics techniques can be used to identify critical areas (hotspots), which are essential for decision making and for determining better strategic efforts. Satscan is a geoinformatics tool widely used for hotspot detection. However, several studies have found shortcomings in the Satscan method when detecting hotspots in irregularly shaped areas. Therefore, the Satscan method is compared with the Upper Level Set (ULS) Scan Statistic and with the Food Insecurity Map and Poverty Map developed by the Food Security Council (Dewan Ketahanan Pangan).

Another problem with Satscan and ULS is the stability of the hotspot clusters obtained from the two methods. Changing the maximum cluster size changes the hotspot clusters obtained. The default maximum cluster size of 50% produces hotspot clusters that are not very informative, because the primary hotspot cluster often occupies most of the study area. As a result, it is often very difficult to determine the optimal maximum cluster size. A process for addressing the sensitivity of scan statistic methods and improving the interpretation of their results has been proposed (Chen et al. 2008). First, the scan statistic method must be run repeatedly, starting from a small maximum size and increasing to



different maximum sizes. Third, the stability value of each region on the map is calculated and interpreted. Fourth, core clusters must be distinguished from heterogeneous clusters through interpretation of the stability values. Fifth, the interpretation of the core clusters must be compared with results obtained from other techniques and through consultation with experts.

Based on the comparison of ULS and Satscan for poverty and food insecurity in 78 districts in Java, ULS tended to give more accurate and stable results. Stability can be seen from the average cluster stability score. For both poverty and food insecurity, ULS had an average cluster stability score that tended to be higher than that of Satscan. Accuracy can be seen from the percentage of correct detections by ULS and Satscan of areas with a high percentage of poverty or food insecurity. These areas are designated first-priority areas by the Food Security Agency (Badan Ketahanan Pangan) and by BPS (the Central Bureau of Statistics, CBS). The accuracy percentage of ULS in detecting poverty and food insecurity was higher than that of Satscan. Therefore, ULS was used to determine hotspots of poverty, food insecurity, and unemployment for all cities and regencies in Java.

Most hotspots of poverty, food insecurity, and unemployment were in Central Java, East Java, and Yogyakarta. Cilacap, Demak, Kab. Madiun, Kota Pekalongan, Kulon Progo, Pemalang, and Purworejo are critical areas because they were detected as hotspots of poverty, food insecurity, and unemployment. Based on the stability analysis, Cilacap is the core area of poverty, unemployment, and food insecurity. This means that, across different maximum cluster sizes, Cilacap was detected as a hotspot of poverty, unemployment, and



The factors that cause hotspots of poverty, food insecurity, and unemployment were analyzed using an Ordinal Logistic Regression Model. The factors related to these hotspots are the ratio of school facilities, the ratio of villages with trade potential, the ratio of villages with industry potential, the ratio of villages with service potential, the ratio of credit facilities, the ratio of small-scale agricultural industry, and the proportion of families without electricity. An increase in the number of school facilities and in the ratio of a village's economic potential in industry and services can reduce the likelihood that an area becomes a critical area. In contrast, an increase in credit facilities, a village's economic potential in trade, villages without PLN electricity, and agricultural industry increases the likelihood that an area becomes a critical area. Credit facilities, economic potential in trade, and agricultural industry in a village have not yet significantly improved the welfare of residents in critical areas. Therefore, credit facilities, industry, and agricultural trade must be revitalized. Areas with a high ratio of families living without PLN electricity also require special attention in efforts to solve the problems of poverty, unemployment, and food scarcity. If the government wants to solve the problems of poverty, unemployment, and food insecurity, these factors need special attention.

Keywords: Food insecurity, poverty, unemployment, hotspot detection, ordinal logistic regression



© Copyright BAU, 2010

Copyright is protected by Indonesian law.

It is prohibited to cite part or all of this thesis without including or mentioning the source. Citation is permitted only for educational purposes, research, writing papers, preparation of reports, or preparation of criticism or review of a problem, and must not harm the reasonable interests of Bogor Agricultural University (BAU).

It is prohibited to announce or reproduce part or all of this thesis in any form without the permission of BAU.


HOTSPOT ANALYSIS ON POVERTY, UNEMPLOYMENT, AND FOOD SECURITY IN JAVA, INDONESIA

DIAN KUSUMANINGRUM

Thesis submitted as a requirement for the Master's Degree in Statistics

GRADUATE SCHOOL

BOGOR AGRICULTURAL UNIVERSITY

2010


Name : Dian Kusumaningrum

NRP : G151070091

Study Program : Statistics

Approved by,

Supervising Commission,

Dr. Ir. Asep Saefuddin, M.Sc (Chairman)

Dr. Ir. Muhammad Nur Aidi, MSi (Member)

Acknowledged by,

Head of Statistics Program

Dr. Ir. Aji Hamim Wigena, M.Sc

Examination Date: 12 February 2010

Graduation Date:



blessing that made this thesis entitled “Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia” completed.

This thesis could not have been completed without the support of many people. Therefore, in this opportunity, the author would like to acknowledge the following people:

1. Dr. Ir. Asep Saefuddin, M.Sc. and Dr. Ir. Muhammad Nur Aidi, MSi, my supervisors, for their valuable advice, theoretical guidance, support, and opinions during this research.

2. Dewan Ketahanan Pangan, Dr. Hari Wijayanto, and Swastika Andi for their valuable contribution in providing data for this research.

3. Prof. GP Patil and associates at Pennsylvania State University for their valuable contribution to this study.

4. The Higher Education Directorate for funding my research through Hibah Penelitian Tim Pascasarjana.

5. PHK-A2, Higher Education Directorate, for their valuable support in funding my studies through the PHK-A2 scholarship.

6. All lecturers at the Department of Statistics for sharing their knowledge and support, and the staff of the Department of Statistics for helping me in my studies.

7. My beloved family and husband for all their patience, prayers, support, love, and advice.

8. All my friends in the Statistics 2007 class, especially my sisters Halimatus Sadiah and Triyani. Thank you for being there to understand the essence of struggle.

To the many others whom I cannot mention one by one here: thank you for everything. Hopefully, this thesis will be useful for researchers and others who need the information in it.



The author was born on June 4th, 1981, as the first of two children of Hermanto and Karyani. The author graduated from SD Polisi IV Bogor in 1993, from SMP Negeri IV Bogor in 1996, and from SMU Negeri IV Semarang in 1999. After a brief period of study at Diponegoro University, Semarang, the author enrolled in the Statistics Department of Bogor Agricultural University in 2000 through the National Selection Test to Enter State Universities (UMPTN) and took social economics as her minor. In 2007 she had the opportunity to continue her studies in Statistics at the Graduate School of Bogor Agricultural University, and she married Tosan Wiar Ramdhani in the same year. The author became involved in social-studies research in 2005, when she had the opportunity to work as a research assistant at UNESCAP-CAPSA, and since 2006 she has been working as a Staff Lecturer at the Department of Statistics, Bogor Agricultural University.


LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES

I. INTRODUCTION
1.1. Background
1.2. Objectives

II. LITERATURE REVIEW
2.1. Poverty
2.2. Unemployment
2.3. Food Scarcity
2.4. Hotspot
2.5. Hotspot Detection Method
2.6. Scan Statistics (Satscan)
2.7. Upper Level Set (ULS)
2.8. Bernoulli Models
2.9. Monte Carlo-Based Hypothesis Testing
2.10. Hotspot Evaluation
2.11. Joint Hotspots
2.12. Ordinal Logistic Regression
2.13. Testing the Model Significance
2.14. Assumption of Logistic Regression
2.15. Model Validation

III. METHODOLOGY
3.1. Source of Data
3.2. Method

IV. RESULTS AND DISCUSSION
4.1. Data Description
4.2. Satscan and ULS Evaluation
4.2.1. Poverty
4.2.2. Food Scarcity
4.3. Joint Poverty, Food Scarcity, and Unemployment Hotspots
4.4. Determining Factors Causing Poverty, Food Insecurity, and Unemployment

V. CONCLUSION AND RECOMMENDATIONS
5.1. Conclusion
5.2. Recommendation


LIST OF TABLES

1. Multicriteria Hotspot Category
2. List of Explanatory Variables per District and Its Sector
3. The Percentage of Poor in Scan Window A and B
4. Performance of Satscan and ULS Food Insecurity Hotspots
5. Joint Hotspots in Java
6. Ordinal Logistic Regression Table

LIST OF FIGURES

1. Scan statistics zonation for circle (left) and space-time cylinders (right)
2. Connectivity of tessellated regions
3. A confidence set of hotspots on the ULS tree
4. A logic process for analytics of scan statistics result
5. Geoinformatics Research on Poverty, Unemployment, and Food Security Road Map
6. Main Source of Fixed Income of Poor Households
7. Share of Food Income 2005
8. Poverty’s ULS and Satscan Performance with a maximum spatial cluster size of 50%, 40%, 30%, 20%, 10%, and 5%
9. Circle Scanning Window and a Comparison of ULS and Satscan Hotspots
10. Food Scarcity’s ULS and Satscan Performance with a maximum spatial cluster size of 50%, 40%, 30%, 20%, 10%, and 5%
11. Multicriteria Hotspots Distribution in Java


LIST OF APPENDICES

2. Municipalities with the Highest Food Insecurity Household Rate in Java
3. Municipalities with High Unemployment Rate in Java
4. Side-by-side Comparison Map of Different Poverty Hotspots Using 50%, 40%, 30%, 20%, 10% and 5% Maximum Cluster Size
5. Satscan and ULS Poverty Hotspots Comparison Using 50%, 40%, 30%, 20%, 10% and 5% Maximum Cluster Size
6. Comparison of ULS and Satscan Poverty Hotspots with Critical Poverty Areas Based on the Food Security Map (Food Security Agency)
7. Side-by-side Comparison Map of Different Food Scarcity Hotspots Using 50%, 40%, 30%, 20%, 10% and 5% Maximum Cluster Size
8. Satscan and ULS Food Scarcity Hotspots Comparison Using 50%, 40%, 30%, 20%, 10% and 5% Maximum Cluster Size
9. Indicator Used for Food Insecurity Atlas
10. Poverty Map of Poverty, Food Insecurity, and Unemployment in Java (2005) Using ULS with a Maximum Cluster Size of 50%
11. The Stability of Poverty, Food Insecurity, and Unemployment Hotspots in Java (2005) Using ULS with a Maximum Cluster Size of 50%, 40%, 30%, 20%, 10% and 5%
12. Multicriteria Poverty, Food Insecurity, and Unemployment in Java (2005) Using ULS with a Maximum Cluster Size of 50%
13. Multicriteria Poverty, Food Security, and Unemployment Map (2005)
14. Distribution of Multicriteria Hotspots Based on Province
15. Variables Used in the Ordinal Logistic Model
16. Correlation of Independent Variables Used in the Ordinal Logistic Model
17. Ordinal Logistic Regression



I. INTRODUCTION

1.1. Background

According to the United Nations (2000), the Millennium Development Goals (MDGs) comprise eight goals that emerged from the 2000 Millennium Summit of world leaders in New York. The MDGs provide a set of time-bound and measurable targets for combating poverty, hunger, illiteracy, disease, discrimination against women, and environmental degradation. Eradicating extreme poverty and hunger is the main goal of the MDGs. Meanwhile, developing and implementing strategies for decent and productive work for youth, in co-operation with developing countries, is incorporated in the eighth goal, which is to develop a global partnership for development. Hence, it is important to conduct research that evaluates progress on these goals in Indonesia. This research focuses on Java Island, which is home to 60% of Indonesia’s total population. Moreover, according to the Central Bureau of Statistics (CBS 2005), at least 50% of the poor are found in Java.

The Semeru Research Institute (2002) pointed out that poverty is a significant and complex problem caused by a combination of cultural, social, political, and economic factors. Hence, poverty reduction strategies and programs require an integrated approach and must be implemented in stages that are both well planned and sustainable. In Indonesia, poverty, unemployment, and food scarcity are the three main problems that the government faces. Strategic efforts to reduce poverty, unemployment, and food scarcity have been made in many regions, though the results have not been satisfactory. Poverty, unemployment, and food scarcity must be observed comprehensively and holistically. Data on poverty, unemployment, and food scarcity are gathered periodically at various levels and regions to monitor the effectiveness of reduction programs. Maps of poverty and unemployment in Indonesia are also available and have been studied from various aspects. Many government institutions and research institutes conduct studies on unemployment, poverty, and food scarcity mapping; however, studies that emphasize spatial and geoinformatics techniques are uncommon.


Nevertheless, these techniques can be explored further to inform decision making and to find quicker and better strategic approaches.

Geographical maps of poverty, unemployment, and food scarcity that use the location (spatial) attribute provide further information. Poverty, unemployment, and food scarcity are three variables that are spatially related among themselves as well as with other variables. Spatial relationships are considered an important aspect in solving problems related to poverty, unemployment, and food security in Indonesia.

Information technology has enabled us to study the condition of poverty and unemployment and to combine it with geographical information. Statistical techniques can be adopted to produce additional information by making use of spatial concepts among variables. In practice, therefore, there is no technological barrier to obtaining more and better knowledge from geographical information and from existing data. Data are transformed into information, which then becomes knowledge. When the data carry geographical information, this process falls within geoinformatics.

Geoinformatics techniques make it possible to:

1. Identify unemployment, poverty, and food scarcity hotspots using the scan statistic method. The focus is on hotspots, that is, clusters characterized by unemployment, poverty, and food scarcity that are considered an outbreak compared with other regions. Regions with high proportions of unemployment, poverty, and food scarcity were not hotspots if their surrounding areas also had high proportions. The result of a statistical test determines whether a cluster region is a hotspot. Information about the existence of hotspots is vital for setting regional priorities in reducing unemployment, poverty, and food scarcity.

2. Identify joint hotspots. After identifying the poverty, unemployment, and food scarcity hotspots, this research went a step further and analyzed the relationship between them. One of the questions raised was: “If an area is a poverty hotspot, is it also an unemployment and food scarcity hotspot?” In other words, is there any relation between poverty, unemployment, and food scarcity hotspots?

3. Identify variables that spatially influence the unemployment, poverty, and food scarcity hotspots. The ordinal logistic regression model was used to find variables that had a significant effect on the hotspots of poverty, unemployment, and food scarcity. Information on these variables is useful for determining the type of poverty and unemployment reduction effort that is in line with the goals.

4. Identify the effectiveness of unemployment, food scarcity, and poverty reduction efforts. Programs and activities for reducing poverty, unemployment, and food scarcity in selected regions will have an impact, especially on nearby regions. How far (in spatial terms) these efforts have affected society must be examined further. Information on this extent can be used to determine the number of locations where these programs can be implemented. This is an advantage because it reduces the risk of overlapping programs in a particular region and prevents certain regions from being neglected.
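The joint-hotspot question in item 2 can be illustrated with a minimal sketch. It assumes that each hotspot map has already been reduced to a set of district names; the function names and the placeholder districts "A" to "D" are hypothetical and do not come from the thesis data.

```python
def hotspot_membership(poverty, unemployment, food_scarcity):
    """Map each district to the set of hotspot maps that flag it."""
    maps = {
        "poverty": poverty,
        "unemployment": unemployment,
        "food_scarcity": food_scarcity,
    }
    membership = {}
    for label, districts in maps.items():
        for district in districts:
            membership.setdefault(district, set()).add(label)
    return membership


def joint_hotspots(membership):
    """Districts flagged by all three maps (candidate critical areas)."""
    return sorted(d for d, labels in membership.items() if len(labels) == 3)


# Hypothetical placeholder districts, for illustration only.
membership = hotspot_membership(
    poverty={"A", "B"},
    unemployment={"A", "C"},
    food_scarcity={"A", "B", "D"},
)
critical = joint_hotspots(membership)
```

In this sketch a district counts as a joint hotspot only when all three maps flag it; districts flagged by one or two maps remain available in `membership` for a finer multicriteria grouping.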

The most recent method used to identify candidate hotspots is Satscan. However, several studies have found that Satscan has limitations; for example, the circular scanning window gives low power for detecting arbitrarily shaped clusters. Hence, the Upper Level Set (ULS) Scan Statistic was used as a comparison for detecting hotspots.

1.2. Objectives

Poverty, unemployment, and food scarcity have been major problems in many developing countries, including Indonesia. Finding the right methods to address these problems is therefore valuable. This research incorporated the spatial attribute in order to produce better outcomes. Using geoinformatics techniques, the research pursued the following objectives:

1. Compare the ULS and Satscan methods in detecting poverty and food scarcity hotspots in order to decide which method is more suitable.

2. Identify the hotspots of unemployment, poverty, and food scarcity using the method selected under the first objective.

3. Develop a method to identify joint hotspots.

4. Identify variables that have a significant influence on the unemployment, poverty, and food scarcity joint hotspots.



II. LITERATURE REVIEW

2.1. Poverty

In general, poverty is the state of being without the necessities of daily living, often associated with need, hardship, and lack of resources across a wide range of circumstances. A person living in poverty is said to be poor or impoverished (The Free Dictionary). NGOs and Government of Indonesia (GoI) institutions have also proposed several definitions of poverty. According to the Central Bureau of Statistics (CBS), a household is categorized as poor if its income per capita is below the poverty line. The poverty line is a measure of the amount of money a government or a society believes is necessary for a person to live at a minimum level of subsistence or standard of living. Meanwhile, the World Bank defines extreme poverty as living on less than US$ 1 per day and moderate poverty as living on less than US$ 2 per day. A pre-prosperous family (Keluarga Pra Sejahtera) is a family that is unable to fulfil a minimum of basic human needs, including spiritual needs, food, clothes, home, education, and health. A prosperous family I (Keluarga Sejahtera I) is a family that is already able to fulfil its basic human needs but unable to fulfil higher human needs (BKKBN 2004).
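The income-based definitions above can be illustrated with a small sketch. The function names and sample figures are hypothetical; the US$ 1 and US$ 2 thresholds are the World Bank values quoted in the text, and the poverty line is passed in as a parameter because the text does not state its value.

```python
def is_poor_cbs(household_income, household_size, poverty_line):
    """CBS-style rule: poor if income per capita is below the poverty line.

    All three arguments must use the same currency and time period.
    """
    income_per_capita = household_income / household_size
    return income_per_capita < poverty_line


def world_bank_category(income_per_day_usd):
    """World Bank categories as quoted in the text (US$ per person per day)."""
    if income_per_day_usd < 1.0:
        return "extreme poverty"
    if income_per_day_usd < 2.0:
        return "moderate poverty"
    return "not poor"


# Hypothetical household: 4 members, total income 400, poverty line 150 per capita.
poor = is_poor_cbs(household_income=400, household_size=4, poverty_line=150)
category = world_bank_category(1.5)
```

The two rules answer different questions: the CBS rule compares a household with a nationally defined line, while the World Bank categories apply fixed international thresholds to a person's daily income.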

2.2. Unemployment

Unemployment occurs when a person is available for work and seeking work but is currently without work. The prevalence of unemployment is usually measured by the proportion of unemployed people, defined as the percentage of those in the labour force who are unemployed. The proportion of unemployed people is also used in economic studies and economic indices (Wikipedia 2009).

In the seven years between 1994 and 2001, CBS changed its definition of open unemployment twice. First, in 1994, CBS removed the qualifying time period for actively seeking work. Prior to 1994, a person was considered to be actively looking for work if s/he had actually looked for work during the week preceding the survey. Since 1994, a person has been considered to be actively looking for work if s/he has looked for work, regardless of when s/he last actively did so, as long as s/he is still waiting for the result of the job search. Second, in 2001, CBS altered the definition of unemployment to include three more groups on top of the traditionally measured unemployed, who are defined as the part of the labour force that is not working and is actively looking for work. The three additional groups of unemployed people are: (i) those who are not working and are not actively looking for work because they do not believe work is available (discouraged workers), (ii) those who already have jobs but have not started working, and (iii) those who are preparing a business. Prior to 2001, these groups would have been considered out of the labour force and hence were not included in the calculation of the proportion of unemployed people. The first group makes up the majority of the additional persons considered unemployed under the new definition of open unemployment (Semeru 2005).
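The effect of the 2001 change on the measured proportion of unemployed people can be sketched as follows. This is an illustration under stated assumptions, not CBS's actual procedure: the `LabourCounts` class, the function name, and the sample counts are hypothetical, and the labour force is taken to be the working population plus whoever counts as unemployed under the chosen definition.

```python
from dataclasses import dataclass


@dataclass
class LabourCounts:
    working: int
    seeking_work: int        # not working and actively looking for work
    discouraged: int         # group (i): believe no work is available
    job_not_started: int     # group (ii): have a job but have not started
    preparing_business: int  # group (iii): preparing a business


def open_unemployment_rate(counts, definition="post2001"):
    """Percentage of the labour force that is unemployed.

    "pre2001" counts only active job seekers; "post2001" also counts
    groups (i) to (iii), which were previously outside the labour force.
    """
    unemployed = counts.seeking_work
    if definition == "post2001":
        unemployed += (counts.discouraged
                       + counts.job_not_started
                       + counts.preparing_business)
    elif definition != "pre2001":
        raise ValueError("definition must be 'pre2001' or 'post2001'")
    labour_force = counts.working + unemployed
    return 100.0 * unemployed / labour_force


# Hypothetical counts, for illustration only.
counts = LabourCounts(working=90, seeking_work=10, discouraged=5,
                      job_not_started=3, preparing_business=2)
rate_old = open_unemployment_rate(counts, "pre2001")
rate_new = open_unemployment_rate(counts, "post2001")
```

Because the added groups enter both the numerator and the denominator, the rate under the newer definition is never lower than under the older one for the same population.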

2.3. Food Scarcity

Hunger is widely understood as an extreme manifestation of prolonged or sudden food insecurity. Food insecurity exists when people are undernourished as a result of unavailability of food, lack of social or economic access to adequate food, or inadequate food consumption and absorption. Food insecurity can be a short-term or long-term phenomenon. Macro-level food security, in terms of self-sufficiency at a national or regional level, does not guarantee food security at the household or individual level. Hence food security is a multi-dimensional issue that needs further in-depth study of a host of parameters (DKP 2005).

Ariani (2006) noted that since 1986 the World Bank has categorized food insecurity into chronic food insecurity and transitory/occasional food insecurity. Chronic food insecurity is a condition of frequent food scarcity over a certain period of time. At the household level, chronic food insecurity indicates that the share of food owned is slightly lower than needed. At the individual level, chronic food insecurity is a situation in which food consumption is lower than biological needs. Transitory/occasional food insecurity occurs when there is a shock or an unexpected event. It is further divided into two specific categories: insecurity caused by repeated or cyclical factors and insecurity caused by temporary or unpredicted factors.

2.4. Hotspot

A hotspot is defined as something unusual: an anomaly, aberration, outbreak, elevated cluster, or critical area (Patil and Taillie 2004). According to Harran et al. (2006), hotspots are locations or regions that have consistently high levels of occurrences (such as the total number of people who are poor, unemployed, or suffering from food scarcity) and may have characteristics unlike those of surrounding areas.

Hotspot clusters were generated by setting the relative risk in some areas to be larger than one (Song and Kulldorff 2003). Furthermore, a poverty hotspot represents an area characterized by certain local characteristics which could also expand and affect neighbouring areas (Betti et al. 2006).

2.5. Hotspot Detection Method

A hotspot detection method has three components: a) identifying hotspot candidates, b) evaluating the statistical significance of the hotspots, and c) estimating the covariance related to the hotspots. In Indonesia, the most recent method used to identify candidate hotspots is the spatial scan statistic. Bungsu (2006) states that the spatial scan statistic suffers from several limitations; for example, the circular scanning window gives low power for detecting arbitrarily shaped clusters. Hence the Upper Level Set (ULS) scan statistic is used as a comparison for detecting arbitrarily shaped hotspots. The likelihood ratio, the relative risk, and hypothesis testing based on Monte Carlo simulation are the techniques used to evaluate a candidate hotspot.

2.6. Scan Statistics (Satscan)

A scan statistic is a statistical method used to detect clusters in a cluster process. The spatial scan statistic is used to determine whether a spatial cluster process contains a localized cluster of points somewhere in a region of interest. The spatial scan statistic deals with the following situation. A region R of Euclidean space is subdivided into cells (denoted by A). Data are available in the form of a count $N_A$ on each cell A. In addition, a size value $P_A$ is associated with each cell. The cell sizes $P_A$ are assumed to be known and fixed, while the cell counts $N_A$ are independent random variables.

The spatial scan statistic seeks to identify clusters of cells that have an elevated response compared with the rest of the region. Elevated response means large values for the rates, $r_A = N_A / P_A$, instead of for the raw counts $N_A$. Cell counts are thus adjusted for cell sizes before comparing cell responses. Kulldorff (1997) presented the following algorithm for a circular window of fixed diameter d on a homogeneous Poisson/Bernoulli (assuming homogeneous variance) process:

1. Pick a grid point. Calculate the distance to the different population points and sort those in increasing order. Store the sorted population points in an array.

2. Repeat step 1 for each grid point.

3. Pick a grid point.

4. Create a circle centered at the grid point and continuously increase the radius. For each population point entering the circle, update the number of cases n and the population $N_W$ inside the circular area W.

5. Repeat steps 3 and 4 for each grid point. Report the largest likelihood based on all (n, $N_W$) pairs as the scan statistic, where the likelihood is calculated according to the likelihood function of the assumed model.

6. Repeat steps 3 to 5 for each Monte Carlo replication.
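The window-growing loop in steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration and not the Satscan implementation: the function name `circular_scan`, the tuple layout of `cells`, and the placeholder score (observed minus expected cases) are assumptions made here. A real scan would plug in the likelihood of the chosen model and repeat the search for every Monte Carlo replication (step 6).

```python
import math


def circular_scan(grid_points, cells, score):
    """Return (best_score, center, radius) over all circular windows.

    cells: list of (x, y, cases, population) tuples (hypothetical layout).
    score: callable score(n, population_in_window) -> float.
    """
    best = (-math.inf, None, None)
    for gx, gy in grid_points:
        # Steps 1-2: sort the population points by distance to the grid point.
        ordered = sorted(cells, key=lambda c: math.hypot(c[0] - gx, c[1] - gy))
        n = 0
        population = 0
        # Step 4: grow the circle one population point at a time.
        for x, y, cases, pop in ordered:
            n += cases
            population += pop
            value = score(n, population)
            if value > best[0]:
                best = (value, (gx, gy), math.hypot(x - gx, y - gy))
    return best


# Toy data: two high-rate cells near the origin, two low-rate cells far away.
cells = [(0, 0, 8, 10), (1, 0, 7, 10), (5, 0, 1, 10), (6, 0, 0, 10)]
total_cases = sum(c[2] for c in cells)
total_population = sum(c[3] for c in cells)


def excess_cases(n, population):
    # Placeholder score (an assumption): observed minus expected cases.
    return n - population * total_cases / total_population


best_score, center, radius = circular_scan(
    [(c[0], c[1]) for c in cells], cells, excess_cases
)
```

On the toy data the best window is centered on the first cell and reaches its nearest neighbour, so it covers the two high-rate cells only.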

The relative risk is a non-negative number representing how much more common a case is in a given location and time period compared to the baseline. Setting a value of one is equivalent to not making any adjustment, a value of less than one adjusts for a lower risk, and a value greater than one adjusts for an increased risk. A cluster with a relative risk (RR) value greater than one is defined as a hotspot candidate. A relative risk of zero is used to adjust for missing data for that particular time and location (Kulldorff 2006). The relative risk is calculated by (Kulldorff 2006)

$$RR = \frac{n_Z}{E(c)}$$

where $n_Z$ is the number of observed cases and $E(c)$ is the expected number of cases in a location, which is calculated by

$$E(c) = p \times \frac{C}{P}$$

where p is the population in the cluster of interest, while C and P are the total number of cases and the total population.
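As a small worked example of the two formulas above, the following Python sketch computes the expected number of cases and the relative risk. The function names and the numbers are illustrative assumptions, not values from this study.

```python
def expected_cases(p, C, P):
    """E(c) = p * C / P: expected cases for a cluster with population p."""
    return p * C / P


def relative_risk(n_z, p, C, P):
    """RR = n_Z / E(c): observed cases divided by expected cases."""
    return n_z / expected_cases(p, C, P)


# Hypothetical cluster: 30 observed cases among 1,000 people, in a region
# with 200 cases among 10,000 people in total.
expected = expected_cases(p=1000, C=200, P=10000)
rr = relative_risk(n_z=30, p=1000, C=200, P=10000)
```

Here the expected count is 20 cases, so the relative risk is 1.5 and the cluster would qualify as a hotspot candidate.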

Available scan statistic software is known to have several limitations. First, circles have been used for the scanning window, resulting in low power for detection of irregularly shaped clusters (Figure 1). Second, the response variable has been defined on the cells of a tessellated geographic region, preventing application to responses defined on a network (stream network, water distribution system, highway system, etc.). Third, response distributions have been taken as discrete (specifically, binomial or Poisson). Finally, the traditional scan statistic gives only a cluster estimate for the hotspot but does not attempt to assess estimation uncertainty (Patil 2006).

Figure 1 Scan statistic zonation for circles (left) and space-time cylinders (right). Left panel: a cholera outbreak along a river flood-plain, where small circles miss much of the outbreak and large circles include many unwanted cells. Right panel: an outbreak expanding in time (axes: space and time), where small cylinders miss much of the outbreak and large cylinders include many unwanted cells.

2.7. Upper Level Set (ULS) Scan Statistics

Patil (2006) presented a new version of the spatial scan statistic designed for detection of hotspots of arbitrary shapes and for data defined either on a tessellation or a network. This version looks for hotspots from among all connected components of upper level sets of the response rate and is therefore called the upper level set (ULS) scan statistic. The method is adaptive with respect to hotspot shape since candidate hotspots have their shapes determined by the data rather than by some a priori prescription like circles or ellipses. This data dependence is taken into account in the Monte Carlo simulations used to determine null distributions for hypothesis testing. The performance of the ULS scanning tool was compared with the spatial scan statistic. The key element here is enumeration of a searchable list of candidate zones Z.

Figure 2 Connectivity for tessellated regions. The collection of shaded cells on the left is connected and, therefore, constitutes a zone. The collection on the right is not connected.

A zone is, first of all, a collection of vertices from the abstract graph. Secondly, those vertices should be connected (Figure 2) because a geographically scattered collection of vertices would not be a reasonable candidate for a “hotspot.” Even with this connectedness limitation, the number of candidate zones is too large for a maximum likelihood search in all but the smallest of graphs. We reduce the list of zones to searchable size in the following way. The response rate at vertex a is $G_a = Y_a / A_a$. These rates determine a function $a \to G_a$ defined over the vertices in the graph. This function has only finitely many values (called levels) and each level g determines an upper level set $U_g$ defined by $U_g = \{a : G_a \geq g\}$. Upper level sets do not have to be connected but each upper level set can be decomposed into the disjoint union of connected components. The list of candidate zones Z for the ULS scan statistic consists of all connected components of all upper level sets. This list of candidate zones is denoted by $\Omega_{ULS}$. The zones in $\Omega_{ULS}$ are certainly plausible as potential hotspots since they are portions of upper level sets. Their number is small enough for practical maximum likelihood search: the size of $\Omega_{ULS}$ does not exceed the number of vertices in the abstract graph (e.g., the number of cells in the tessellation). Finally, $\Omega_{ULS}$ becomes a tree under set inclusion, thus facilitating computer representation. This tree is called the ULS-tree (Figure 3); its nodes are the zones $Z \in \Omega_{ULS}$. Leaf nodes are (typically) singleton vertices at which the response rate is a local maximum; the root node consists of all vertices in the abstract graph.

Finding the connected components of an upper level set is essentially the problem of determining the transitive closure of the adjacency relation defined by the edges of the graph. Several generic algorithms are available in the computer science literature.
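The enumeration of candidate zones described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the ULS software itself: the function name `uls_candidate_zones`, the dictionary-based graph layout, and the toy rates are all hypothetical. For simplicity the components are recomputed at every level with a depth-first search; a union-find structure would avoid the repeated work on large graphs.

```python
def uls_candidate_zones(rates, adjacency):
    """Collect all connected components of all upper level sets.

    rates: dict mapping vertex -> response rate G_a.
    adjacency: dict mapping vertex -> set of neighbouring vertices.
    Returns a set of frozensets, one per candidate zone.
    """
    zones = set()
    for level in sorted(set(rates.values()), reverse=True):
        # Upper level set U_g = {a : G_a >= g}.
        unseen = {a for a, g in rates.items() if g >= level}
        while unseen:
            start = unseen.pop()
            component = {start}
            stack = [start]
            # Depth-first search restricted to the upper level set.
            while stack:
                v = stack.pop()
                for w in adjacency[v]:
                    if w in unseen:
                        unseen.remove(w)
                        component.add(w)
                        stack.append(w)
            zones.add(frozenset(component))
    return zones


# Toy example: five cells in a row, A-B-C-D-E, with two local maxima (A and D).
rates = {"A": 0.9, "B": 0.5, "C": 0.1, "D": 0.7, "E": 0.3}
adjacency = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
zones = uls_candidate_zones(rates, adjacency)
```

In this toy graph the candidate list contains five zones: the two singleton maxima, their two enlargements, and the whole region, which is consistent with the bound that the list size does not exceed the number of vertices.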

Figure 3 A confidence set of hotspots on the ULS tree. The different connected components correspond to different hotspot loci while the nodes within a connected component correspond to different delineations of that hotspot. Labels in the figure: tessellated region R, MLE, junction node, alternative hotspot locus, and alternative hotspot delineation.

2.8. Bernoulli Models

According to Kulldorff (1997), let X denote a spatial cluster process where $X_A$ is the random number of clusters in the set $A \subset G$. As the window moves over the study area it defines a collection $\mathcal{Z}$ of zones $Z \subset G$. Interchangeably, Z will be used to denote both a subset of G and a set of parameters defining the zones.

For the Bernoulli model we consider only measures N such that $N_A$ is an integer for all subsets $A \subset G$. Each unit of measure corresponds to an ‘entity’ or ‘individual’ that can be in either one of two states. Individuals in one of these states are considered as clusters and the locations of those individuals constitute the cluster process. In the model there is exactly one zone $Z \subset G$ such that each individual within that zone has probability $p_1$ of being a cluster, while the probability for individuals outside the zone is $p_2$. The probability for any one individual is independent of all the others. The null hypothesis is $H_0: p_1 = p_2$ and the alternative hypothesis is $H_1: p_1 > p_2$, $Z \in \mathcal{Z}$. Under $H_0$, $X_A \sim Bin(N_A, p_1)$ for all A. Under $H_1$, $X_A \sim Bin(N_A, p_1)$ for all $A \subset Z$, and $X_A \sim Bin(N_A, p_2)$ for all $A \subset Z^c$.

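Let $n_Z$ denote the observed number of cases in zone Z and $n_G$ the total number of cases. $N_Z$ is the total population in zone Z while $N_G$ is the total population. Writing $p_1$ and $p_2$ for the probabilities for individuals inside and outside zone Z, the likelihood function for the Bernoulli model is expressed as

$$L(Z, p_1, p_2) = p_1^{n_Z} (1 - p_1)^{N_Z - n_Z} \, p_2^{n_G - n_Z} (1 - p_2)^{(N_G - N_Z) - (n_G - n_Z)}$$

To detect the zone that is most likely to be a cluster, we find the zone that maximizes the likelihood function. We do this in two steps. First we maximize the likelihood function conditioned on Z.

$$L(Z) = \sup_{p_1 > p_2} L(Z, p_1, p_2) = \begin{cases} \left(\frac{n_Z}{N_Z}\right)^{n_Z} \left(1 - \frac{n_Z}{N_Z}\right)^{N_Z - n_Z} \left(\frac{n_G - n_Z}{N_G - N_Z}\right)^{n_G - n_Z} \left(1 - \frac{n_G - n_Z}{N_G - N_Z}\right)^{(N_G - N_Z) - (n_G - n_Z)} & \text{if } \frac{n_Z}{N_Z} > \frac{n_G - n_Z}{N_G - N_Z} \\ \left(\frac{n_G}{N_G}\right)^{n_G} \left(\frac{N_G - n_G}{N_G}\right)^{N_G - n_G} & \text{otherwise} \end{cases}$$

Next we find the solution $\hat{Z} = \{Z : L(Z) \geq L(Z') \ \forall Z' \in \mathcal{Z}\}$. We are also interested in making statistical inference. Let $L_0 = \sup_{p_1 = p_2} L(Z, p_1, p_2) = \left(\frac{n_G}{N_G}\right)^{n_G} \left(\frac{N_G - n_G}{N_G}\right)^{N_G - n_G}$; the likelihood ratio test statistic ($\lambda$) can then be written as

$$\lambda = \frac{\sup_{Z \in \mathcal{Z},\, p_1 > p_2} L(Z, p_1, p_2)}{\sup_{p_1 = p_2} L(Z, p_1, p_2)} = \frac{L(\hat{Z})}{L_0}$$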

In order to find the value of the test statistic, we need a way to calculate the likelihood ratio as it is maximized over the collection of clusters in the alternative hypothesis. This might seem like a daunting task since the number of clusters could easily be infinite. Two properties allow us to reduce it to a finite problem: the number of observed clusters is always finite, and for a fixed number of clusters the likelihood decreases as the measure of the moving window increases.
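For a single candidate zone, the Bernoulli likelihood ratio can be evaluated on the log scale, which avoids numerical underflow for large populations. The Python sketch below is a minimal illustration under assumptions made here: the function names are hypothetical, the zone is assumed to satisfy 0 < N_Z < N_G, and the numbers are invented for the example.

```python
import math


def _xlogy(x, y):
    # x * ln(y), with the convention 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * math.log(y)


def max_log_likelihood(n, N):
    """Maximized log-likelihood of n cases among N individuals (p = n / N)."""
    return _xlogy(n, n / N) + _xlogy(N - n, (N - n) / N)


def bernoulli_llr(n_z, N_z, n_g, N_g):
    """ln(L(Z) / L_0) for a zone with n_z cases among N_z individuals."""
    inside = n_z / N_z
    outside = (n_g - n_z) / (N_g - N_z)
    if inside <= outside:
        # No elevated rate inside the zone: L(Z) equals L_0.
        return 0.0
    log_l_z = max_log_likelihood(n_z, N_z) + max_log_likelihood(n_g - n_z, N_g - N_z)
    log_l_0 = max_log_likelihood(n_g, N_g)
    return log_l_z - log_l_0


# Hypothetical zone with an elevated rate (30/100 versus 20/900 outside).
llr_high = bernoulli_llr(n_z=30, N_z=100, n_g=50, N_g=1000)
# Hypothetical zone with a lower rate inside than outside.
llr_low = bernoulli_llr(n_z=1, N_z=100, n_g=50, N_g=1000)
```

The scan statistic would be the maximum of this quantity over all candidate zones.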

2.9. Monte Carlo-Based Hypothesis Testing

Simulation is an analytical method meant to imitate a real-life system, especially when other analyses are too mathematically complex or too difficult to reproduce (adithmc02 2008). Monte Carlo simulation can be defined as a method to generate random sample data based on some known distribution for numerical experiments (Teknomo 2008). Once the value of the test statistic has been calculated, it is easy to do the inference. We cannot expect to find the distribution of the test statistic in closed analytical form. Instead we rely on Monte Carlo-based hypothesis testing.

With a Monte Carlo test, the significance of an observed test statistic calculated from a data set is assessed by comparing it with a distribution obtained by generating alternative sets of data from some assumed model. If the assumed model implies that all data orderings are equally likely, then this amounts to a randomization distribution.

Kulldorff (1997) notes that Monte Carlo-based hypothesis testing was proposed by Dwass (1957), who pointed out that the probability of falsely rejecting the null hypothesis is exactly according to the significance level, in spite of the simulation involved. Mantel (1967) proposed its use in terms of spatial cluster processes, while Turnbull et al. (1990) were the first to use it in the context of a multidimensional scan statistic. Monte Carlo hypothesis testing for a scan statistic is a four-step procedure:

1. Calculate the value of the test statistic for the real data.

2. Create a large number of random data sets generated under the null hypothesis.

3. Calculate the value of the test statistic for each of the random replications.

4. Sort the values of the test statistic from the real and random data sets, and note the rank of the one calculated from the real data set. If it is ranked in the highest $\alpha$ percent, then reject the null hypothesis at the $\alpha$ percent significance level.

For example, when we condition on the total number of clusters $n_R$ with 9999 such replications, the test is significant at the 5 percent level if the value of the test statistic for the real data set is among the 500 highest values of the test statistic coming from the replications.

The p-value is obtained through Monte Carlo hypothesis testing (Kulldorff 1997) by comparing the rank of the maximum likelihood from the real data set with the maximum likelihoods from the random data sets. If this rank is R, then p-value = R / (1 + number of simulations).
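The rank-based p-value can be sketched in Python as follows. This is a minimal illustration: the function name `monte_carlo_p_value` and the `simulate_statistic` callback are assumptions, and the uniform random statistic stands in for the scan statistic recomputed on a data set generated under the null hypothesis.

```python
import random


def monte_carlo_p_value(observed, simulate_statistic, replications=999, seed=0):
    """Return R / (1 + replications), where R is the rank of the observed value.

    simulate_statistic: callable taking a random.Random instance and returning
    the test statistic for one data set generated under the null hypothesis.
    """
    rng = random.Random(seed)
    simulated = [simulate_statistic(rng) for _ in range(replications)]
    # Rank 1 means the observed statistic exceeds every simulated one.
    rank = 1 + sum(1 for value in simulated if value >= observed)
    return rank / (1 + replications)


def null_statistic(rng):
    # Stand-in for a scan statistic under H0 (an assumption for illustration).
    return rng.random()


p_extreme = monte_carlo_p_value(2.0, null_statistic)    # larger than any draw
p_ordinary = monte_carlo_p_value(-1.0, null_statistic)  # smaller than any draw
```

With 999 replications the smallest attainable p-value is 1/1000, which is why the number of replications is usually chosen as 999 or 9999.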

2.10. Hotspot Evaluation

According to Chen et al. (2008), Satscan has made the spatial scan statistic widely accessible, substantially impacting numerous domains in which spatial clusters are of interest. However, two issues make using the method and interpreting its results nontrivial: (1) Satscan lacks cartographic support for understanding the clusters in geographic context and (2) results from the method are sensitive to the selection of scaling parameters, but the system provides no direct support for making these choices. Upper Level Set (ULS) scan statistics also suffer from these problems, and each issue is discussed below.

First, scan statistics methods do not provide cartographic support to view the identified clusters, nor a visual interface to explore cluster characteristics. Geographic information about the identified clusters (e.g., the centre location, the cluster radius, data entities included in each cluster) is available only as text. In order to visualize the geographic location and size of the clusters, a user must process the textual output and export it to GIS software (e.g. ArcGIS or MapInfo). This is a time-consuming process and inhibits interactive exploration of multiple parameter configurations for interpretation of the results. Because of this limitation, researchers may choose default parameters or make other arbitrary choices that do not reflect characteristics of the geographic phenomena.

Second, it is difficult to determine an optimal setting for scaling parameters. Confusing and even misleading results are possible if the parameter choices are made arbitrarily. The focus of the research presented here is on the maximum-size parameter. The default maximum-size setting of 50% seldom produces usable, informative results because the reported primary cluster often occupies a large proportion of the study area. The task of determining the most appropriate maximum-size is challenging. Thus a process for addressing the sensitivity issues of scan statistics methods and enhancing the interpretation of scan statistics results was proposed and is presented in Figure 4 (Chen et al. 2008).


Figure 4 A logic process for analytics of scan statistics results. Steps shown: Run Multiple Maps; Visualize and Compare Clusters on Map; Calculate the Reliability of the Maps; Identify Core Clusters; Compare the Results.

First, scan statistics methods should be run multiple times, starting from a small maximum-size and increasing to the 50% default value. Second, the results should be visualized in a map matrix for side-by-side comparison of different maximum-sizes. Third, the reliability of each region in the map should be calculated and interpreted. Fourth, core clusters should be discriminated from heterogeneous clusters through interpretation of the reliability. Fifth, the interpretation of core clusters should be confirmed by comparing the results to other independent techniques and by consultation with domain experts.

When analyzing county-level data, scan statistics methods report many statistically significant clusters that contain a relatively high proportion of low-risk counties, which are described as heterogeneous clusters. However, there are often smaller, homogeneous subsets within heterogeneous clusters that exhibit values high enough to reject the null hypothesis on their own strength; these are described as core clusters. To assist discrimination of stable, core clusters from heterogeneous and/or unstable ones, Chen et al. (2008) developed an advanced reliability visualization method. This method visualizes and calculates the reliability with which a county is reported within a cluster when scan statistics methods are run multiple times with a systematically varying maximum-size parameter. Reliability is estimated by the equation $R_i = C_i / S$, where $R_i$ is the reliability value for location i, S is the total number of scans, and $C_i$ is the number of scans for which location i is within a significant cluster.

The reliability measure has a value range from '0' to '1,' where '0' means that the location is not found in a significant cluster in any of the scans and '1' means that the location is within a significant cluster in all scans. The reliability score measures the stability of clusters reported by multiple scans. Reliability is distinct from the concept of validity, which is a measure of the probability that the cluster represents a true high-risk region. Therefore, the goal of reliability visualization is to identify stable core clusters rather than to evaluate the validity of the core clusters. Since we are applying it to the results of an analysis that measures validity, the end result is to identify the locations that are reliable within a significant high risk cluster.
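The reliability score is simple to compute once the significant clusters from each scan are available. The Python sketch below assumes that each scan is summarized as the set of locations falling inside any significant cluster; the function name, the location labels, and the four example scans are hypothetical.

```python
def reliability(scans, locations):
    """Return R_i = C_i / S for every location.

    scans: list of sets; each set holds the locations that fall inside a
    significant cluster for one maximum-size setting.
    """
    total_scans = len(scans)
    return {
        loc: sum(1 for scan in scans if loc in scan) / total_scans
        for loc in locations
    }


# Four hypothetical scans with an increasing maximum-size parameter.
scans = [{"A"}, {"A", "B"}, {"A", "B", "C"}, {"A", "B"}]
scores = reliability(scans, ["A", "B", "C", "D"])
```

In this example location A is reported in every scan and would be read as part of a stable core cluster, while C appears only once and D never.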

2.11. Joint Hotspots

Most of the time, poverty, unemployment, and food security hotspot detection has been done by considering household income per capita, but we must keep in mind that these are multidimensional problems. Poverty, unemployment, and food security can be seen from monetary and non-monetary dimensions. The monetary dimension is based on income and consumption indicators. Meanwhile, the non-monetary indicators include health, education, social culture, land fertility, etc. Hence, poverty, unemployment, and food security are complex problems that require further research through joint hotspot detection. Hotspots detected using a single criterion (household income per capita) based on CBS methods are suspected to be underestimated. Other criteria are needed, and this is where joint hotspot detection techniques will be used to obtain better results. In this research we will try to find the relationship between poverty, unemployment, and food scarcity hotspots. One of the questions raised is “If an area is a poverty hotspot, would it also be an unemployment and food scarcity hotspot?” Or is there any relation at all between poverty, unemployment, and food scarcity hotspots?


Table 1 Joint Hotspot Category

Area | Poor | Food Scarcity | Unemployment | Response Variable Category Score
A | Yes (3) | Yes (2) | Yes (1) | 6
B | Yes (3) | Yes (2) | No (0) | 5
C | Yes (3) | No (0) | Yes (1) | 4
D | No (0) | Yes (2) | Yes (1) | 3
E | Yes (3) | No (0) | No (0) | 3
F | No (0) | Yes (2) | No (0) | 2
G | No (0) | No (0) | Yes (1) | 1
H | No (0) | No (0) | No (0) | 0

To determine the relative importance of poverty, unemployment, and food security we will use the MDGs criteria: poverty reduction, as target number one, will be given a score of three; reducing the proportion of people who suffer from hunger, as the second target, will be given a score of two; and developing and implementing strategies for decent and productive work for youth in cooperation with developing countries, as the 16th target, will be given a score of one (UN 2000). Based on the sum of these scores we will develop a final category of multicriteria hotspots, shown in Table 1. These categories will be used as the response variable for the ordinal logistic regression model. This model will be used to identify the factors that have a significant influence on poverty, unemployment, and food scarcity hotspots.
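The scoring rule behind Table 1 can be written as a short Python function. The function and constant names are assumptions for illustration; the weights are the ones stated above.

```python
# Weights taken from the MDG-based ranking described in the text.
WEIGHTS = {"poverty": 3, "food_scarcity": 2, "unemployment": 1}


def joint_hotspot_score(poverty, food_scarcity, unemployment):
    """Sum the weights of the hotspot types detected in an area.

    Each argument is True when the area is a hotspot of that type.
    """
    return (
        WEIGHTS["poverty"] * bool(poverty)
        + WEIGHTS["food_scarcity"] * bool(food_scarcity)
        + WEIGHTS["unemployment"] * bool(unemployment)
    )


# Rows A to H of Table 1 as (poor, food scarcity, unemployment) flags.
rows = {
    "A": (True, True, True),
    "B": (True, True, False),
    "C": (True, False, True),
    "D": (False, True, True),
    "E": (True, False, False),
    "F": (False, True, False),
    "G": (False, False, True),
    "H": (False, False, False),
}
scores = {area: joint_hotspot_score(*flags) for area, flags in rows.items()}
```

Note that areas D and E share the score 3, so the eight combinations collapse into seven ordered response categories.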

2.12. Ordinal Logistic Regression

Logistic regression extends categorical data analysis to data sets with a binary response and one or more continuous factors (Freeman 1987). Ordinal logistic regression performs logistic regression on an ordinal response variable. One way to use the category ordering is to form logits of cumulative probabilities for an ordinal response Y with c categories, where x denotes the explanatory variables. The cumulative probability for each category can be formulated as

$$P(Y \leq j \mid x) = F_j(x) = \pi_1(x) + \dots + \pi_j(x)$$

where $\pi_j(x)$ is the response probability of the jth category given the explanatory variables x. Cumulative logits for each category j are defined as

$$L_j(x) = \ln\left(\frac{F_j(x)}{1 - F_j(x)}\right), \quad j = 1, 2, \dots, c - 1$$

A model that simultaneously uses all cumulative logits can be written as

$$L_j(x) = \alpha_j + \beta' x$$

Each cumulative logit has its own intercept. The $\alpha_j$ are increasing in j, since $P(Y \leq j \mid x)$ increases in j when x is fixed, and the logit is an increasing function of this probability (Agresti 2002). $\hat{\alpha}_j$ and $\hat{\beta}$ are the maximum likelihood estimators of $\alpha_j$ and $\beta$. Each element of $\hat{\beta}$ represents the change in the cumulative logit $\hat{L}_j(x)$, for every category j, per unit change in the corresponding explanatory variable when the other explanatory variables are held fixed. In other words, the cumulative odds are multiplied by $\exp(\hat{\beta})$ for each unit change in the explanatory variable x (Agresti 2002).
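To make the definitions of $F_j(x)$ and $L_j(x)$ concrete, the Python sketch below converts a set of category probabilities into cumulative probabilities and cumulative logits. The function names and the three-category probabilities are assumptions chosen for illustration.

```python
import math


def cumulative_probabilities(pi):
    """F_j = pi_1 + ... + pi_j for j = 1, ..., c."""
    total = 0.0
    cumulative = []
    for p in pi:
        total += p
        cumulative.append(total)
    return cumulative


def cumulative_logits(pi):
    """L_j = ln(F_j / (1 - F_j)) for j = 1, ..., c - 1."""
    F = cumulative_probabilities(pi)
    # The last cumulative probability equals one, so it has no logit.
    return [math.log(f / (1.0 - f)) for f in F[:-1]]


# Hypothetical response probabilities for c = 3 ordered categories.
pi = [0.2, 0.3, 0.5]
F = cumulative_probabilities(pi)
logits = cumulative_logits(pi)
```

The logits come out in increasing order, which mirrors the statement that the intercepts $\alpha_j$ are increasing in j.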

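The estimated value of $P(Y \leq j \mid x)$ can be derived with the inverse transformation of the cumulative logit function, where $\hat{\alpha}_j$ and $\hat{\beta}$ are the estimated intercepts and coefficients. The result is shown below.

$$\hat{P}(Y \leq j \mid x) = \frac{\exp(\hat{\alpha}_j + \hat{\beta}' x)}{1 + \exp(\hat{\alpha}_j + \hat{\beta}' x)}, \quad j = 1, 2, \dots, c - 1$$

so that

$$\hat{P}(Y \leq j \mid x) = \frac{1}{1 + \exp(-(\hat{\alpha}_j + \hat{\beta}' x))} = \frac{1}{1 + \exp(-\hat{L}_j(x))}$$

2.13. Testing the Model Significance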

The likelihood ratio test of the overall model is used to assess the parameters $\beta_i$ with the hypotheses:

$H_0: \beta_1 = \dots = \beta_p = 0$

$H_1$: at least one $\beta_i \neq 0$; $i = 1, 2, \dots, p$, where p is the number of explanatory variables.

The likelihood ratio test uses the G statistic, which is $G = -2 \ln(L_0 / L_k)$, where $L_0$ is the likelihood function without variables and $L_k$ is the likelihood function with variables (Hosmer & Lemeshow 2000). If $H_0$ is true, the G statistic will follow a chi-square distribution with p degrees of freedom, and $H_0$ will be rejected if $G > \chi^2_{(p, \alpha)}$ or p-value < $\alpha$.

A Wald test is used to test the statistical significance of each coefficient $\beta_i$ in the model. The hypotheses are

$H_0: \beta_i = 0$

$H_1: \beta_i \neq 0$; $i = 1, \dots, p$, where p is the number of explanatory variables. A Wald test calculates a W statistic, which is formulated as

$$W_{\beta_i} = \frac{\hat{\beta}_i}{\widehat{SE}(\hat{\beta}_i)}$$

Reject the null hypothesis if $|W| > Z_{\alpha/2}$ or p-value < $\alpha$ (Hosmer & Lemeshow 2000).
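Both test statistics are straightforward to compute from fitted quantities. The Python sketch below is illustrative only: the function names and all numeric inputs are assumptions, the G statistic takes log-likelihoods rather than likelihoods, and the two-sided Wald p-value uses the standard normal distribution through `math.erfc`. The chi-square p-value for G is left out because it needs a chi-square distribution function that is not in the standard library.

```python
import math


def g_statistic(log_l0, log_lk):
    """G = -2 ln(L0 / Lk), computed from the two log-likelihoods."""
    return -2.0 * (log_l0 - log_lk)


def wald_test(beta_hat, standard_error):
    """Return (W, two-sided p-value) for H0: beta_i = 0."""
    w = beta_hat / standard_error
    # 2 * (1 - Phi(|w|)) written with the complementary error function.
    p_value = math.erfc(abs(w) / math.sqrt(2.0))
    return w, p_value


# Hypothetical log-likelihoods of the null model and the fitted model.
G = g_statistic(log_l0=-120.0, log_lk=-110.0)
# Hypothetical coefficient estimate and standard error.
W, p_value = wald_test(beta_hat=0.98, standard_error=0.5)
```

In this example W equals 1.96, so the p-value sits almost exactly at the conventional 5 percent threshold.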

2.14. Assumptions of Logistic Regression

Logistic regression is popular in part because it enables the researcher to overcome many of the restrictive assumptions of OLS (Ordinary Least Square) regression:

1. Logistic regression does not assume a linear relationship between the dependent and the independent variables. It can handle nonlinear effects even when exponential and polynomial terms are not explicitly added as additional independents because the logit link function on the left-hand side of the logistic regression equation is non-linear.

2. The dependent variable does not need to be normally distributed (but does assume that its distribution is within the range of the exponential family distributions, such as normal, Poisson, binomial, gamma). Solutions may be more stable if the predictors have a multivariate normal distribution.

3. The dependent variable does not need to be homoscedastic for each level of the independents; that is, there is no homogeneity of variance assumption: variance does not need to be the same within categories.


5. Logistic regression does not require that the independents be interval-scale variables.

However, other assumptions still apply:

1. The data should not contain outliers. As in OLS regression, outliers can affect results significantly. The researcher should analyze standardized residuals for outliers and consider removing them or modelling them separately.

2. There should be no multicollinearity among the explanatory variables: to the extent that one independent is a linear function of another independent, the problem of multicollinearity will occur in logistic regression, as it does in OLS regression. As the correlation among the independents increases, the standard errors of the logit (effect) coefficients become inflated. Multicollinearity does not change the estimates of the coefficients, only their reliability. High standard errors flag possible multicollinearity (www.chass.ncsu.edu).

2.15. Model Validation

Model validation was carried out by using the Correct Classification Rate (CCR). The CCR indicates the percentage of true (suitable) predictions. The CCR can be calculated by using the equation below

$$CCR = \frac{\text{the number of true predictions}}{\text{the number of observations}} \times 100\%$$

The higher the percentage of CCR, the more accurate the model is (Hosmer and Lemeshow 2000).
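A short Python sketch of the CCR calculation is given below. The function name and the example category scores are assumptions for illustration, not results from this study.

```python
def correct_classification_rate(observed, predicted):
    """Percentage of observations whose predicted category is correct."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    correct = sum(1 for o, p in zip(observed, predicted) if o == p)
    return 100.0 * correct / len(observed)


# Hypothetical joint hotspot scores for five municipalities.
observed = [6, 5, 3, 0, 2]
predicted = [6, 4, 3, 0, 1]
ccr = correct_classification_rate(observed, predicted)
```

Three of the five predictions match, giving a CCR of 60%.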



III. METHODOLOGY

3.1. Source of Data

The data used in this research came from the National Social Economic Survey (Susenas 2005), data analyzed by Insan Hitawasana Sejahtera (IHS) on the percentage of job seekers (2005), and Potensi Desa (Podes 2005) conducted by CBS. The data used for hotspot detection consisted of household monthly consumption per capita (used for poverty hotspot detection), proportion of unemployment per municipality (used for unemployment hotspot detection), and household total daily calorie intake (used for food scarcity hotspot detection) for 111 municipalities (78 districts and 33 cities) in Java, Indonesia.

There were several definitions of poverty made by NGOs and GoI institutions. This research used a widely used definition based on the Central Bureau of Statistics (CBS). A household is categorized as poor if it has an income per capita below the poverty line. The poverty line is a measure of the amount of money a government or a society believes is necessary for a person to live at a minimum level of subsistence or standard of living. For food security, Ariani (2006) noted that since 1986 the World Bank has categorized food insecurity into chronic food insecurity and transitory/occasional food insecurity. This research focused on chronic food insecurity, a condition of frequent food scarcity over a certain period of time. On a household level, chronic food insecurity indicates that the share of food owned is slightly lower than needed. A household is considered to be food scarce if the total daily calorie intake is lower than 70% of the minimum calorie needed (±1400 kcal). Meanwhile, unemployment occurs when a person is available to work and seeking work but currently without work. The prevalence of unemployment is usually measured using the proportion of unemployment, which in this study was defined as the percentage of those in the labour force who are unemployed and aged 14-24 years.

The explanatory variables used for ordinal logistic regression were from Podes. The 12 variables used were chosen from 23 candidate variables as those that were not correlated and were assumed to have a significant influence on poverty, unemployment, and food scarcity hotspots. These variables were related to the citizenship and labour, education and health, transportation, communication and information, economy, politics and security, and housing and environment sectors, and also to location. The variables and their specific sectors can be seen in Table 2.

Table 2 List of Explanatory Variables per District and Their Sectors

No | Variable | Sector | Note
1 | Economy potentials of a village: farming, industry, trade, services/others | Economy | Ratio of potential villages/villages in a municipality
2 | The amount of farmers | Citizenship and Labour | Ratio of farmers/villages in a municipality
3 | The amount of farm labourers | Citizenship and Labour | Ratio of farm labourers/villages in a municipality
4 | The average amount of education facilities | Education and Health | Ratio of schools/villages in a municipality
5 | The average distance between the village and the capital state/city | Location | Km
6 | The amount of small and medium scale industries | Economy | Ratio of industry/villages in a municipality
7 | Credit facilities | Economy | Ratio of credit/villages in a municipality
8 | The presence of conflicts within the society | Politics and Security | Ratio of conflicts recorded/villages in a municipality
9 | The average amount of families without electricity (PLN) | Housing and Environment | Ratio of families/villages in a municipality
10 | Province | Location | Dummy variable (Banten, West Java, Jakarta, Central Java, Yogyakarta, and East Java)
11 | Slum areas | Housing and Environment | Ratio of slum areas/villages in a municipality
12 | Indonesian labour force | Citizenship and Labour | Ratio of people/villages in a municipality

3.2. Method

This study was divided into three main phases (Figure 5). The first phase was data preparation and exploration, done by using Microsoft Access and Microsoft Excel 2007. The second phase was evaluating ULS and Satscan performance on poverty and food scarcity hotspots of 78 districts in Java. The evaluation covered the stability and accuracy of these tools in detecting hotspots. The most suitable hotspot detection method was then used to detect poverty, unemployment, and food scarcity hotspots and to develop joint hotspots of 111 municipalities in Java. Afterwards, the research determined the most influential factors causing poverty, unemployment, and food scarcity hotspots by using Ordinal Logistic Regression.

Figure 5 Geoinformatics Research on Poverty, Unemployment, and Food Security Roadmap. Flow shown: data preparation and exploration (Susenas, IHS survey data, Podes); hotspot detection with ULS and Satscan (poverty, unemployment, and food scarcity hotspots); joint hotspots; ordinal logistic regression to determine the factors that influence poverty, unemployment, and food scarcity hotspots; identifying the effectiveness of reduction efforts.

IV. RESULTS AND DISCUSSION

4.1. Data Description

The number of households in SUSENAS 2005 (National Socioeconomic Survey) for Java was 86,708. Based on the poverty line issued by CBS, 16.36% of the households were categorized as poor. A poor household was a household having an income per capita below Rp 150,799 for urban areas and Rp 117,259 for rural areas. In urban areas of Java 14.57% of households were categorized as poor, compared with 21.25% of households in rural areas. The average per capita expenditure of poor households in urban areas of Java was Rp 120,188, while that of poor households in rural areas was Rp 95,329.

There were 5 municipalities that had more than 35% poor households. According to the Food Security Agency (2005), an area with a poverty level exceeding 35% is considered a first priority zone, areas with a poverty level of 25.00-34.99% are considered second priority zones, and areas with a poverty level of 20.00-24.99% are considered third priority zones. The average percentage of poor in Java was 17.88%. Of the 78 regencies and 32 cities studied, Trenggalek and Batang (>40%) had the highest percentage of poor families in Java, while Depok City, East Jakarta, Central Jakarta, West Jakarta, and South Jakarta had the lowest (<1%). Appendix 1 shows the municipalities that had the highest percentage of poor families in Java.

From Figure 6 it can be seen that there were six main sources of income of the poor households. Most of the poor households worked in agriculture and husbandry (46.18%). This indicated that people who worked in agriculture and husbandry were more vulnerable to poverty. Agriculture and husbandry was followed by small scale retail, building construction, public transportation, money transfer, and housekeeping. Other sources of income included industry, education, trade, forestry, and government employment.


Figure 6 Main Source of Fixed Income of Poor Households. Shares shown in the chart: Agriculture and Husbandry, 46.18%; Small Scale Retail Sellers, 12.02%; Building Constructions, 8.10%; Public Transportation, 6.8%; Money Transfer, 4.07%; House Keeping, 3.07%; Others, 19.74%.

It is stated in Ariani (2004) that chronic food insecurity conditions are often associated with poverty issues. As mentioned in Simatupang (1999) and AusAID (2004), chronic food insecurity is increasingly caused by poverty (chronic food insecurity as a food insecurity poverty gap). Meanwhile, according to UNDP China (2001), the causes of food insecurity at the household level are very complex, such as the socio-political situation of agriculture and farmers, lack of productive agricultural land area per capita, low productivity and fertility of the land, climate anomalies, the low application of modern agricultural techniques and its impact on low food production, and the low purchasing power of households as a result of limited off-farm income. Meanwhile, according to Witoro (2003), the main cause of food insecurity in these countries is the weakness in developing access to land to produce food. In other cases, lack of food and poverty can be caused by trade policies (domestic and international) as well as disasters such as war or economic crisis.

Java is very vulnerable to food insecurity. Based on an analysis of Susenas (2005), the average percentage of food insecure households in Java was 67.40%. There were 37 municipalities in Java predicted to have more than 70% food insecure households. Kudus, Batang, Pati, Pemalang, Temanggung, and Jepara had the highest percentage of food insecure families in Java (>80%). Meanwhile, Sukabumi, Ciamis, Cianjur, Indramayu, Pandeglang, and Sumedang had the lowest percentage of food insecure families in Java (<50%). Appendix 2 shows the percentage of food insecure households, the national ranks used by the Food Security Agency, and a rank based on quartiles.

Based on a benchmark study conducted by the Food Security Agency (2005), Java had 7 food insecure municipalities: Bondowoso, Probolinggo, Situbondo, Jember, Brebes, Serang, and Lebak. The same study classified 33 municipalities in Java as second priority food insecure municipalities. To decide whether an area is food insecure, the Food Security Agency used a composite of ten indicators covering food availability, food and livelihood access, and health and nutrition. The priorities assigned by the Food Security Agency differ from the priorities obtained by applying a quartile method to the calorie intake data. Municipalities placed in the first priority by the composite indicator had a relatively lower percentage of food insecure households than municipalities placed in the second and third priority. Hence the food insecurity map should be used cautiously. The reason is that the causes of food insecurity vary among areas, so the interpretation of food security also differs from one place to another. One area could be in Priority 1 mainly because it had very low food and livelihood access, low female literacy, high infant mortality, low life expectancy, and poor health infrastructure, while another area was food insecure mainly because of infrastructure deficiencies and a very low level of self sufficiency in cereal production. The list of indicators used in the Food Security Map can be seen in Appendix 9.

The Food Security Composite Indicator (FSCI) was calculated by using Principal Component Analysis (PCA) to derive the weights of the indicators that make up the final FSCI score of each municipality. The 10 indicators were first converted to z-scores, a standardization prerequisite for the analysis. PCA was used to assign a weight to each of the 10 indicators, not to eliminate any of them, because all indicators were considered important. Five principal components were extracted, and the final FSCI was obtained by multiplying the PCA weights by the corresponding indicator z-scores, which resulted in:



FSCI = 0.955 Availability + 0.858 Road + 0.635 Poor People + 0.653 Electricity + 0.862 Female Literacy + 0.977 Inverse Life Expectancy + 0.792 Nutrition Status Under Five + 0.979 IMR + 0.840 Clean Water + 0.657 Health Centre

Further research should analyze the effectiveness of these indicators in order to enhance the accuracy of the composite indicator. The food insecurity map based on these indicators can be seen in Appendix 10.

In 2005, the average share of income used for purchasing food was smaller in urban areas than in rural areas for all households, whereas the average amount of money spent on food was higher in urban areas than in rural areas. The average expenditure on food of poor households was Rp 71,000, which was around 67.15% of total income. The urban poor spent 64.79% (Rp 78,000) of their income on food, while the rural poor spent 69.24% (Rp 66,000). According to Engel's law, an increasing proportion of food expenditure indicates declining welfare: under a budget constraint, a larger share of household spending on food leaves a smaller share of the budget for non-food needs such as clothing, housing, education, and health (Ariani 2004).

[Figure 7: bar chart of food expenditure share (axis 0% to 80%) for not poor and poor households in all, urban, and rural areas]

Figure 7 Share of Java Food Expenditure 2005

The results of this research were also similar to those of Ariani (2004), who studied the trend of food-insecure households in 1999-2000 using a threshold of less than 1680 kcal/capita/day (80% of 2100 kcal/capita/day). The analysis of the 2005 Susenas data also showed that: (1) the proportion of food-vulnerable households in Java was larger than in Outer Java, (2) the proportion of food-vulnerable households in urban areas was lower than in rural areas, (3) the higher the average household income, the lower the level of household food insecurity, and (4) the proportion of food-insecure households with livelihoods in the agricultural sector was lower than in non-agricultural sectors.
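As a minimal sketch of the FSCI calculation described earlier in this section, the Python below standardizes each indicator to z-scores and applies the weights reported in the FSCI formula. It does not rerun the PCA. The dictionary keys, the function name, and the use of the sample standard deviation are assumptions made for illustration, and the indicator matrix is randomly generated rather than taken from the Food Security Agency data.

```python
import numpy as np

# Weights copied from the FSCI formula in the text; key names are illustrative.
FSCI_WEIGHTS = {
    "availability": 0.955,
    "road": 0.858,
    "poor_people": 0.635,
    "electricity": 0.653,
    "female_literacy": 0.862,
    "inverse_life_expectancy": 0.977,
    "nutrition_status_under_five": 0.792,
    "imr": 0.979,
    "clean_water": 0.840,
    "health_centre": 0.657,
}


def fsci_scores(indicators: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Weighted sum of column-wise z-scores, one score per row (municipality)."""
    mean = indicators.mean(axis=0)
    std = indicators.std(axis=0, ddof=1)  # sample standard deviation (assumption)
    z_scores = (indicators - mean) / std
    return z_scores @ weights


# Synthetic data: 6 hypothetical municipalities by 10 indicators.
rng = np.random.default_rng(0)
indicators = rng.normal(size=(6, len(FSCI_WEIGHTS)))
weights = np.array(list(FSCI_WEIGHTS.values()))
scores = fsci_scores(indicators, weights)
print(scores.round(3))
```

Because every z-score column has mean zero, the scores are centred on zero, so a municipality's sign and rank carry the information rather than the raw value.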

Meanwhile, the percentage of job-seekers in Indonesia was quite high. Based on research conducted by Survey IHS (2005), the average percentage of unemployment in urban and rural areas in Java was 38.6% and 33.8%, respectively. Overall, the percentage of unemployment in Java was around 35%, so from a population of 122,406,000 people in Java an estimated 42,842,100 were unemployed. Appendix 3 shows the municipalities that had the highest unemployment percentage in Java. Sukabumi City, Pandeglang, Banjar, Kerawang, and Lebak had the highest proportion of unemployment (>50%) in Java, while Pekalongan, Semarang, Jepara, Wonosobo, Temanggung, and Pacitan had the lowest (<20%).

4.2. Satscan and ULS Evaluation

We have noted above that Satscan has made the spatial scan statistic widely accessible, substantially impacting numerous domains in which spatial clusters are of interest. However, Satscan is known to have limitations. The circular scanning window has low power for detecting irregularly shaped clusters (Patil 2006). Therefore, Patil (2006) described a new version of the spatial scan statistic designed for detecting arbitrarily shaped hotspots and for data defined either on a tessellation or on a network. This version looks for hotspots among all connected components of upper level sets of the response rate and is therefore called the upper level set (ULS) scan statistic. The method is adaptive with respect to hotspot shape, since candidate hotspots have their shapes determined by the data rather than by some a priori prescription like circles or



consultation with domain experts. In this study of poverty and food scarcity, the results of Satscan and ULS were compared with the Food Security Map and the Poverty Map produced by CBS and FSA.

Based on the comparison of ULS and Satscan for the poverty and food scarcity cases in 78 districts in Java, ULS gave more accurate and more stable results. Stability was measured by the average stability score of the clusters: for both poverty and food scarcity, ULS had a higher average stability score than Satscan. Accuracy was measured by how precisely ULS and Satscan detected areas with high percentages of poverty and food scarcity, which are the first priority areas of the Food Security Agency and CBS. ULS detected these areas with a higher percentage of accuracy than Satscan. Therefore ULS was used to detect hotspots of poverty, unemployment, and food scarcity in all areas of Java.
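The upper level set idea behind ULS can be illustrated with a short Python sketch. It only enumerates candidate zones, that is, the connected components of the areas whose rate is at or above each observed level. The likelihood-ratio scoring and significance testing of the scan statistic are not shown. The area names, rates, and adjacency below are hypothetical, and the function names are illustrative rather than part of any ULS software.

```python
from typing import Dict, FrozenSet, List, Set


def connected_components(nodes: Set[str],
                         adjacency: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Connected components of the subgraph induced by `nodes`."""
    remaining = set(nodes)
    components = []
    while remaining:
        start = remaining.pop()
        stack, component = [start], {start}
        while stack:
            current = stack.pop()
            for neighbour in adjacency.get(current, set()):
                if neighbour in remaining:
                    remaining.discard(neighbour)
                    component.add(neighbour)
                    stack.append(neighbour)
        components.append(frozenset(component))
    return components


def uls_candidate_zones(rates: Dict[str, float],
                        adjacency: Dict[str, Set[str]]) -> Set[FrozenSet[str]]:
    """All connected components of the upper level sets {area: rate >= level}."""
    zones: Set[FrozenSet[str]] = set()
    for level in sorted(set(rates.values()), reverse=True):
        upper_set = {area for area, rate in rates.items() if rate >= level}
        zones.update(connected_components(upper_set, adjacency))
    return zones


# Hypothetical chain of four neighbouring areas: A - B - C - D.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
rates = {"A": 0.40, "B": 0.10, "C": 0.35, "D": 0.30}
zones = uls_candidate_zones(rates, adjacency)
print(sorted(sorted(zone) for zone in zones))
```

In this toy example the candidate zone made of C and D follows the data rather than a circle, which is the adaptivity to hotspot shape that motivates ULS.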

The hotspots of poverty, unemployment, and food scarcity were mainly in Central Java, East Java, and Yogyakarta. Cilacap, Demak, Kab. Madiun, Kota Pekalongan, Kulon Progo, Pemalang, and Purworejo were identified as critical areas because they were hotspots of poverty, unemployment, and food scarcity at the same time. Based on the stability analysis, Cilacap was the core cluster of poverty, unemployment, and food scarcity. This means that Cilacap was identified as a poverty, unemployment, and food scarcity hotspot under several different maximum cluster sizes. Hence, Cilacap should be given priority. Meanwhile, Kota Batu, Salatiga City, and Serang City were not detected as critical areas.

The main factors behind the hotspots were analyzed with an Ordinal Logistic Regression Model. Factors related to the hotspots were school facilities, village trade, village industry, village services, slum areas, the proportion of families without electricity, and credit facilities. An increase in school facilities and in the economic potential of a village in industry and services decreased the probability of an area becoming a critical area. The government should give more attention to credit facilities, the economic potential of a village in trade, villages without electricity, and small scale farm industry, because an increase in these factors increased the probability of a municipality becoming a critical area. This study indicated that credit facilities, farm industry, and trade in a village had not yet improved the welfare of people living in critical areas. Hence, these factors should be revitalized. Areas with a high ratio of families living without electricity were also critical points in solving the problems of poverty, unemployment, and food scarcity. Therefore the government should give more attention to people who live in these areas.

Keywords: Food scarcity, poverty, unemployment, hotspot detection, ordinal logistic regression



SUMMARY

DIAN KUSUMANINGRUM. Hotspot Analysis on Poverty, Unemployment, and Food Security in Java, Indonesia. Supervised by ASEP SAEFUDDIN and MUHAMMAD NUR AIDI

Eradicating poverty and food scarcity and developing strategies to address unemployment through productive work for youth are very important issues in Indonesia. It is therefore important to conduct research that evaluates these problems in Indonesia. Geoinformatics techniques can be used to identify critical areas (hotspots), which are very important for decision making and for determining better strategic efforts. Satscan is a geoinformatics tool widely used for hotspot detection. However, several studies have found that the Satscan method has shortcomings when detecting hotspots in an area with an irregular shape. Therefore the Satscan method is compared with the Upper Level Set (ULS) Scan Statistic and with the Food Scarcity Map and Poverty Map developed by the Food Security Council.

Another problem with Satscan and ULS is the stability of the hotspot clusters obtained from the two methods. Changing the maximum cluster size changes the hotspot clusters obtained. A maximum cluster size of 50% (the default) produces less informative hotspot clusters, because the main hotspot cluster often occupies most of the study area. As a result, it is often very difficult to determine the optimal maximum cluster size. A process for handling the sensitivity issues of scan statistic methods and improving the interpretation of their results has been proposed (Chen et al. 2008). First, the scan statistic method should be run repeatedly, starting from a small maximum cluster size and increasing to 50% (the default). Second, the results should be displayed in a matrix of maps to compare the hotspots obtained with different maximum cluster sizes. Third, the stability value of each area on the map should be calculated and interpreted. Fourth, core clusters should be distinguished from heterogeneous clusters through interpretation of the stability values. Fifth, the interpretation of the core clusters should be compared with results obtained from other techniques and with consultation with experts.
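The third and fourth steps can be sketched in Python under one assumption: that the stability value of an area is the share of scan runs, across the maximum cluster sizes tried, in which the area was reported inside a hotspot. The summary does not give the exact formula, so this definition, the function name, and the example runs are all hypothetical.

```python
from typing import Dict, Iterable, Set


def stability_scores(runs: Dict[float, Set[str]],
                     areas: Iterable[str]) -> Dict[str, float]:
    """Share of scan runs in which each area was reported inside a hotspot."""
    n_runs = len(runs)
    if n_runs == 0:
        raise ValueError("at least one scan run is required")
    return {
        area: sum(area in hotspot for hotspot in runs.values()) / n_runs
        for area in areas
    }


# Hypothetical results: maximum cluster size -> areas flagged as hotspot.
runs = {
    0.10: {"A"},
    0.20: {"A", "B"},
    0.30: {"A", "B"},
    0.40: {"A", "B", "C"},
    0.50: {"A", "B", "C"},
}
stability = stability_scores(runs, areas=["A", "B", "C"])
# Areas flagged in every run are treated here as core areas (assumption).
core_areas = [area for area, value in stability.items() if value == 1.0]
print(stability, core_areas)
```

An area that is flagged at every maximum cluster size is a candidate core area, while areas that appear only at large sizes are more likely to belong to heterogeneous clusters.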

Based on the comparison of ULS and Satscan for the poverty and food scarcity cases in 78 regencies in Java, ULS tended to give more accurate and more stable results. Stability can be seen from the average cluster stability score. For the poverty and food scarcity cases, ULS had an average cluster stability score that tended to be higher than that of Satscan. Accuracy can be seen from the percentage of correct detections by ULS and Satscan of areas with a high percentage of poverty or food scarcity. These areas are designated as first priority areas by the Food Security Agency and CBS. The accuracy percentage of ULS in detecting poverty and food scarcity was higher than that of Satscan. Therefore ULS was used to determine the hotspots of poverty, food scarcity, and unemployment for all cities and regencies in Java.

The majority of poverty, food scarcity, and unemployment hotspots were in Central Java, East Java, and Yogyakarta. Cilacap, Demak, Kab. Madiun, Kota Pekalongan, Kulon Progo, Pemalang, and Purworejo were critical areas because they were detected as hotspots of poverty, food scarcity, and unemployment. Based on the stability analysis, Cilacap was the core area of poverty, unemployment, and food scarcity. This means that Cilacap was detected as a hotspot of poverty, unemployment, and food scarcity under various maximum cluster sizes, so Cilacap should be prioritized. Meanwhile Kota Batu, Kota Salatiga, and Kota Serang were not critical areas.

The factors behind the poverty, food scarcity, and unemployment hotspots were analyzed with an Ordinal Logistic Regression Model. The factors related to these hotspots were the ratio of school facilities, the ratio of villages with trade potential, the ratio of villages with industry potential, the ratio of villages with service potential, the ratio of credit facilities, the ratio of small scale agricultural industry, and the proportion of families without electricity. An increase in the number of school facilities and in the ratio of a village's economic potential in industry and services can reduce the probability of an area becoming a critical area. In contrast, an increase in credit facilities, a village's economic potential in trade, villages without PLN electricity, and agricultural industry increases the probability of an area becoming a critical area. Credit facilities, economic potential in trade, and agricultural industry in a village have not yet significantly improved the welfare of the population in critical areas. Therefore credit facilities, agricultural industry, and agricultural trade should be revitalized. Areas with a high ratio of families living without PLN electricity also need special attention in efforts to solve the problems of poverty, unemployment, and food scarcity. If the government wants to solve the problems of poverty, unemployment, and food scarcity, these factors need special attention.
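How an ordinal logistic regression turns factor values into category probabilities can be sketched as follows. The Python below assumes the common proportional-odds (cumulative logit) form, P(Y <= j | x) = 1 / (1 + exp(-(theta_j - x·beta))). The summary does not state which parameterization or response coding was used, so the coefficients, cutpoints, factor values, and the four ordered categories here are hypothetical and are not the fitted model.

```python
import numpy as np


def ordinal_logit_probs(x: np.ndarray, beta: np.ndarray,
                        cutpoints: np.ndarray) -> np.ndarray:
    """Category probabilities under a proportional-odds model.

    `cutpoints` must be increasing; K cutpoints give K + 1 ordered categories.
    """
    eta = float(x @ beta)
    cumulative = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))  # P(Y <= j)
    cumulative = np.append(cumulative, 1.0)                # P(Y <= last) = 1
    return np.diff(cumulative, prepend=0.0)                # P(Y = j)


# Hypothetical coefficients: school facilities ratio (negative effect) and
# proportion of families without electricity (positive effect).
beta = np.array([-0.8, 1.2])
cutpoints = np.array([-1.0, 0.5, 2.0])

x_low = np.array([1.0, 0.1])   # few families without electricity
x_high = np.array([1.0, 0.6])  # many families without electricity
p_low = ordinal_logit_probs(x_low, beta, cutpoints)
p_high = ordinal_logit_probs(x_high, beta, cutpoints)
print(p_low.round(3), p_high.round(3))
```

With this parameterization a positive coefficient shifts probability toward the higher (more critical) categories and a negative coefficient shifts it toward the lower ones, which matches the direction of the effects described above.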

Keywords: Food scarcity, poverty, unemployment, hotspot detection, ordinal logistic regression



© Copyright BAU, 2010

Copyright is protected by Indonesian law

It is prohibited to cite part or all of this thesis without including or mentioning the source. Citation is permitted only for educational purposes, research, writing papers, preparation of reports, or preparation of criticism or review of a problem, and citation must not harm the reasonable interests of Bogor Agricultural University (BAU).

It is prohibited to announce or reproduce part or all of this thesis in any form without the permission of BAU.