Chapter 5 GENERAL DISCUSSION
Ranking, hotspot detection, and modeling are important techniques in almost all fields of study. They support decision makers in business, education, ecology, and socio-economics, and especially in government, where they can increase the transparency of decision making. Every country has policies to manage its various affairs. Because resources are limited, making the right decision is both important and urgent, and these techniques are needed to support it in every area. It is hoped that this dissertation can contribute ideas to the government and its ministries in decision making related to poverty reduction.
The focus of this dissertation is modeling with the Nested Generalized Linear Model (NGLM) and the Nested Generalized Linear Mixed Model (NGLMM) as an expansion of the model of Zhang and Lin (2008), a GLMM that detects hotspots through parameter estimates of spatial association in a non-nested study area using a count response variable. The models in this study are a GLM and a GLMM with the hotspot detection result as an explanatory variable, applied to a nested area with a multinomial ordinal response variable. Before modeling, a ranking method and two hotspot detection methods were studied. The ranking method is the subject of Chapter 2, the hotspot detection methods were studied in Chapter 3, and the model was developed and implemented in Chapter 4.
The ORDIT (Ordering Dually in Triangle) ranking method was studied and
implemented on poverty data in Chapter 2. This method was developed to rank many individuals on the basis of many indicators, which is not an easy task. Chapter 2 explained how to do so using mathematical concepts such as order theory, duality, and partially ordered sets (posets). Because of data limitations, the method was used to order sub-districts by poverty level based on only two indicators: surkin (SKTM), or poverty letters (PL), and askeskin (asuransi kesehatan untuk orang miskin), or health insurance for the poor (HIP). The unit of observation was the sub-district (kecamatan). In this study, 1679 sub-districts on Java Island were ordered by poverty using these two indicators, HIP and PL.
The ordering study continued by grouping the ranking result into three parts according to rank order. The three poverty levels of sub-districts are worst, moderate, and mild, and every sub-district receives a grade of 1 (worst), 2 (moderate), or 3 (mild). This grouped ranking was retained and used as the response variable for modeling in Chapter 4.
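The grouping step can be illustrated with a minimal Python sketch. It assumes the three parts are of near-equal size, which the text does not state, and the function name `grade_by_rank` and the sub-district labels are hypothetical.

```python
def grade_by_rank(ranked_ids, n_groups=3):
    """Assign grades 1..n_groups to ids listed from most to least severe.

    Assumption: the groups are near-equal in size (the source only says
    the ranking was split into three parts by rank order).
    """
    n = len(ranked_ids)
    grades = {}
    for position, unit in enumerate(ranked_ids):
        # Integer arithmetic cuts the ordered list into n_groups parts.
        grades[unit] = position * n_groups // n + 1
    return grades


# Hypothetical sub-district labels, ordered from worst to mildest.
example = grade_by_rank(["A", "B", "C", "D", "E", "F"])

# With 1679 ranked units, as in this study, count the units per grade.
sizes = {1: 0, 2: 0, 3: 0}
for grade in grade_by_rank(list(range(1679))).values():
    sizes[grade] += 1
```

Under this equal-split assumption, grade 1 marks the worst third and grade 3 the mildest third of the ranked sub-districts.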
Furthermore, two hotspot detection methods were studied in Chapter 3. The two methods were compared by simulation based on disease case data, with the number of cases assumed to follow a Poisson distribution. Under this assumption, 8 data sets were built and each was simulated 10,000 times to obtain the performance of the methods on 14 criteria. The mean and standard deviation of each criterion were computed for each data set and then summarized, analyzed, and compared. On the basis of this comparison, ULS hotspot detection is judged to perform better than the circle-based scan statistic.
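The simulation design above, in which Poisson counts are generated repeatedly and each criterion is summarized by its mean and standard deviation, can be sketched in Python as follows. This is only an illustration: the detection rule is a deliberately simple placeholder (the cell with the largest count) and is neither ULS nor the circle-based scan statistic, and the cell means, the hotspot set, and the function names are assumptions.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_criterion(means, hotspot, n_rep, seed=0):
    """Mean and SD over replications of a placeholder detection criterion.

    Criterion (hypothetical): 1 if the cell with the largest simulated
    count belongs to the true hotspot, otherwise 0.
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n_rep):
        counts = [poisson(rng, lam) for lam in means]
        top_cell = max(range(len(counts)), key=counts.__getitem__)
        hits.append(1 if top_cell in hotspot else 0)
    return statistics.mean(hits), statistics.stdev(hits)


# Hypothetical study area: five cells, cell 3 has an elevated case rate.
mean_hit, sd_hit = simulate_criterion([2, 2, 2, 8, 2], hotspot={3}, n_rep=1000)
```

In a comparison of the kind described in Chapter 3, the placeholder rule would be replaced by each detection method in turn, and the resulting means and standard deviations would be compared criterion by criterion.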
The research continued with the detection of bad nutrition hotspots in 8 randomly chosen districts. The result is a hotspot status for every sub-district in these 8 districts: 0 means the sub-district is not in a hotspot area and 1 means it is. This status was included in the models as a dichotomous explanatory variable to answer the question: does the bad nutrition hotspot significantly explain poverty level in the Nested GLM and the Nested GLMM?
Modeling in Chapter 4 started with data preparation, as follows.
Three districts from West Java, 2 from Central Java, and 3 from East Java were chosen randomly for model implementation. These 8 districts, also used in Chapter 3, are Kuningan, Karawang, Majalengka, Cilacap, Boyolali, Ngawi, Blitar, and Jember. The poverty level obtained in Chapter 2 was used as the ordinal response, while the hotspot status obtained in Chapter 3 was used as an explanatory variable. The other explanatory variables were the number of farmer families, schools, and health personnel. The choice of these variables was supported by the Policy Review Report of Health Manpower Planning, Bappenas (2005), which stated that these variables are related to poverty. To simplify interpretation, the values of the explanatory variables were divided into three categories, i.e. low, moderate, and mild, in line with the Bappenas document on manpower and the research on GIS education by Hidayat (2004).
The model of Zhang and Lin was modified in two ways: (1) the model was extended to nested data (districts nested within provinces), under the assumption that the correlation between sub-districts within a district is higher than the correlation between sub-districts in different districts; and (2) an ordinal scale was used for the response variable. Modeling was carried out for the Nested GLM and the Nested GLMM. In the Nested GLM, the Generalized Estimating Equation (GEE) method was used to estimate the model parameters in order to handle clustered and correlated data, while in the Nested GLMM, pseudo-likelihood was used. In the Nested GLMM, district was a random effect.
Several working correlation matrices (WCM) can be used with the GEE method. Three types, i.e. exchangeable, unstructured, and independent, were studied to find out which WCM would give the best results. The poverty data were expected to have an unstructured correlation pattern between sub-districts within a district, but the modeling results showed that the independent structure was the most appropriate, as it gave the smallest ratio of robust to model-based standard errors. Therefore, the data are believed to have an independent correlation structure.
The results of the methods mentioned above are as follows. ORDIT ranking
gave the result that 6 of the 10 most severe sub-districts were in Jember district, while 5 of the least severe sub-districts were in Probolinggo and 3 were in Surabaya. Based on the ranking results, it could be concluded overall that the order of provinces from the least to the most severe level of poverty was West Java, Central Java, and East Java.
The combined results of the ranking method and the hotspot detection method are
given in Table 22. It shows that 19 sub-districts in East Java are in a bad nutrition hotspot and are also categorized at the worst level of poverty, the largest count in any cell of the table. Furthermore, the Nested GLM result with model-based standard errors and the unstructured WCM supports this finding: the bad nutrition hotspot contributes significantly to poverty level in East Java (p-value = 0.018). This finding links the results of the ranking method, the hotspot detection method, and the modeling for nested areas.
Table 22 Sub-districts in the bad nutrition hotspot area, by poverty level
Province        Worst   Moderate   Mild   Total sub-districts in the hotspot
West Java           4          4     13   21
Central Java        4          7      9   20
East Java          19          5      1   25
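The counts in Table 22 can be checked with a short Python sketch. It assumes the cell layout in which each province row lists the worst, moderate, and mild counts followed by the row total, a reading that is consistent with the printed totals of 21, 20, and 25. The variable names are illustrative.

```python
# Table 22 counts: sub-districts in the bad nutrition hotspot, by poverty level.
table_22 = {
    "West Java": {"worst": 4, "moderate": 4, "mild": 13},
    "Central Java": {"worst": 4, "moderate": 7, "mild": 9},
    "East Java": {"worst": 19, "moderate": 5, "mild": 1},
}

# Row totals should reproduce the "total" column of the table.
row_totals = {prov: sum(cells.values()) for prov, cells in table_22.items()}

# Share of hotspot sub-districts at the worst poverty level, per province.
worst_share = {
    prov: cells["worst"] / row_totals[prov] for prov, cells in table_22.items()
}

# Largest single cell in the table, as (province, level, count).
largest_cell = max(
    (
        (prov, level, count)
        for prov, cells in table_22.items()
        for level, count in cells.items()
    ),
    key=lambda item: item[2],
)
```

The largest cell is East Java at the worst level, where 19 of the 25 hotspot sub-districts (76%) fall, compared with 4 of 21 in West Java and 4 of 20 in Central Java.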
According to the Nested GLM, there is an indication that the number of farmer families and the bad nutrition hotspot are significant in determining the poverty level of sub-districts in Central Java, while the number of schools is significant in West Java. In contrast, according to the Nested GLMM, there is an indication that the number of schools, the number of health personnel, and the bad nutrition hotspot are significant in determining the poverty level of sub-districts only in Central Java. Under both models, no explanatory variable is significant in East Java.
To gain deeper insight into the data, the Nested GLM results showed that the independent correlation structure was the most plausible for the data: the average of the ratio SE_R/SE_M (robust to model-based standard error) was the smallest and the percentage of true classification was the largest compared with the exchangeable and unstructured WCM.
Furthermore, the true classification rate of the Nested GLMM is higher than that of the Nested GLM: 75.8% for the Nested GLMM under all types of covariance matrix, versus 61% for the Nested GLM with the independent correlation matrix.
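The standard-error ratio criterion used to compare working correlation matrices can be sketched in Python as below. The function names are hypothetical, and the standard errors are placeholder values chosen only to make the arithmetic visible. They are not estimates from this dissertation.

```python
def mean_se_ratio(robust_se, model_se):
    """Average of SE_R / SE_M over the model parameters."""
    ratios = [r / m for r, m in zip(robust_se, model_se)]
    return sum(ratios) / len(ratios)


def pick_wcm(se_by_wcm):
    """Return the WCM with the smallest average SE_R / SE_M, plus all averages."""
    averages = {
        name: mean_se_ratio(robust, model)
        for name, (robust, model) in se_by_wcm.items()
    }
    return min(averages, key=averages.get), averages


# Placeholder (robust SE, model-based SE) pairs, NOT results from the study.
illustrative = {
    "independent": ([0.50, 0.40], [0.50, 0.40]),
    "exchangeable": ([0.60, 0.44], [0.50, 0.40]),
    "unstructured": ([0.75, 0.50], [0.50, 0.40]),
}
best, averages = pick_wcm(illustrative)
```

With these placeholder inputs the independent structure has the smallest average ratio, which mirrors the selection rule described in the text without reproducing its numbers.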
Based on all these results, the objectives of this research are answered as follows. The first objective, to determine the poverty level of sub-districts on Java Island, was addressed by the results in Chapter 2. Five of the most severe sub-districts were in Jember, and 5 and 3 of the least severe sub-districts were in Probolinggo city and Surabaya, respectively. The second objective was answered in Chapter 3, where ULS was found to be the better hotspot detection method. The ULS method was also applied in that chapter, identifying the sub-districts in the bad nutrition hotspot area. The third objective, building a model for nested correlated data with a multinomial ordinal response, was completed in section 4.3. The parameter estimation techniques and their implementation were presented in section 4.2 and subsection 4.3.2, respectively. The fourth objective was answered in section 4.4, where the influence of the working correlation matrix on the model parameter estimates was shown in Figures 25 to 31 and Table 19 with their explanations. These figures and the table showed that the data most probably follow the independent correlation structure. Finally, the last objective, the differences in parameter estimates between the Nested GLM and the Nested GLMM, was also described in that section. Based on the Nested GLM, there is an indication that the number of farmer families and the bad nutrition hotspot are significant in determining the poverty level of sub-districts in Central Java, while the number of schools is significant in West Java. According to the Nested GLMM, there is an indication that the number of schools, the number of health personnel, and the bad nutrition hotspot are significant in determining the poverty level of sub-districts only in Central Java. Under both models, no explanatory variable is significant in East Java. The random effect in the Nested GLMM accounts for the correlation in the data and yields smaller standard errors.
In line with these results, the Nested GLMM is the better choice for the poverty data on Java Island. This model gives the highest percentage of true classification and is consistent with the way the data were obtained: districts were chosen randomly and are therefore treated as a random effect in the model, in accordance with the statistical theory of randomness in modeling.
Chapter 6 CONCLUSION AND RECOMMENDATION