The better of the two methods, as determined by their comparison, will
be used to detect bad nutrition cases in some districts. The application of this method is discussed in Sections 3.5 and 3.6. Furthermore,
the result of this hotspot detection will be used as an explanatory variable in the modeling in Chapter 4.
3.2 Theoretical Study
This section describes the theory of hotspot detection. The explanation is divided into four parts: the concept of hotspot detection and the comparison between methods (Patil et al. 2006),
the circle-based scan statistic (Kulldorff 1997), and the ULS scan statistic (Patil and
Taillie 2004).
3.2.1 The Concept of Hotspot Detection
The concept of hotspot detection described in this section is based on the Poisson model over a study area or region R, developed by Patil et al. (2006).
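Poisson Model over Region R
Let the study area or region R consist of m cells, and let $Y_a \sim \mathrm{Poisson}(\lambda_a N_a)$ be the number of observed cases in cell a, where the $Y_a$ are independent. The notation in the model is defined as follows: $\lambda_a$ is the intensity of cell a, $Y_a$ is the case count (number of cases) in cell a, and $N_a$ is the population size of cell a, where $a = 1, 2, \ldots, m$ and m is the number of cells in the study area. Furthermore, $\lambda = \sum_{a=1}^{m} \lambda_a N_a / N$ is the average intensity, $Y = \sum_{a=1}^{m} Y_a$ is the total case count, and $N = \sum_{a=1}^{m} N_a$ is the total population size. Y has a Poisson distribution, $Y \sim \mathrm{Poisson}(\lambda N)$, where $\lambda N = \sum_{a=1}^{m} \lambda_a N_a$. Consequently, $E(Y_a) = \lambda_a N_a$ and $E(Y) = \lambda N$ (Appendix 3).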
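The quantities of the Poisson model can be illustrated with a minimal sketch. It assumes that the average intensity is the population-weighted mean of the cell intensities, $\lambda = \sum_a \lambda_a N_a / N$, so that the expected total count equals $\lambda N$. The function name and the four-cell data are hypothetical and serve only as an illustration.

```python
def average_intensity(intensities, populations):
    """Population-weighted average intensity: sum(lambda_a * N_a) / N."""
    total_population = sum(populations)
    expected_total = sum(l * n for l, n in zip(intensities, populations))
    return expected_total / total_population


# Hypothetical study area with m = 4 cells (values invented for illustration).
intensities = [0.02, 0.02, 0.05, 0.01]  # lambda_a, intensity of cell a
populations = [1000, 2000, 500, 1500]   # N_a, population size of cell a

# Expected case count per cell under the Poisson model: E(Y_a) = lambda_a * N_a.
expected_cases = [l * n for l, n in zip(intensities, populations)]

lam = average_intensity(intensities, populations)
total_population = sum(populations)

# The expected total count E(Y) can be computed either way.
expected_total_from_cells = sum(expected_cases)
expected_total_from_average = lam * total_population
```

Both ways of computing the expected total agree, which is the property that the relative risks in the remainder of this section rely on.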
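It can be shown that $(Y_1, \ldots, Y_m \mid Y = y) \sim \mathrm{Multinomial}(y;\, r_1 p_1, \ldots, r_m p_m)$ (Appendix 4), where $r_a = \lambda_a / \lambda$ is the relative risk of cell a, or $RR_a$, and $p_a = N_a / N$ is the population fraction of cell a. This constitutes the multinomial characterization of the Poisson process.
Scan setup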
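The rejection region of the null hypothesis is determined as follows. The hypotheses for a hotspot candidate are:
$H_0$: No cluster/hotspot, that is, $\lambda_a = \lambda$, i.e. $RR_a = r_a = 1$, for $a \in M$.
$H_A$: There exists a cluster/hotspot z for which the following conditions are fulfilled:
(1) $\lambda_a = \lambda_z$, $a \in z$, where z is a zone or hotspot candidate. The intensity in cell a, where $a \in z$, equals the intensity in the zone.
(2) $\lambda_a = \lambda_{\bar z}$, $a \in \bar z$, where $\bar z = R - z$ is the non-zone area. The intensity in cell a, where $a \in \bar z$, equals the intensity in the non-zone area.
(3) $\lambda_z > \lambda_{\bar z}$. The intensity in the zone is greater than the intensity in the non-zone area. Here $N_z = \sum_{a \in z} N_a$ and $N_{\bar z} = \sum_{a \in \bar z} N_a$ denote the population sizes of the zone and the non-zone area.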
Figure 12 Study area with zone and non zone areas
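Condition (3) becomes $\lambda_z > \lambda > \lambda_{\bar z}$, i.e. $r_z > 1 > r_{\bar z}$. Also note that $r_a = \lambda_a / \lambda$ and $\lambda_a = \lambda_z$ for $a \in z$, therefore $r_a = r_z = \lambda_z / \lambda$ for $a \in z$ and $r_a = r_{\bar z} = \lambda_{\bar z} / \lambda$ for $a \in \bar z$, where $\lambda_a$ is the intensity in cell a and $\lambda_z$ is the intensity in the zone z. $H_A$ can alternatively be stated in terms of relative risks:
$H_A$: There exists a cluster/hotspot z, i.e.,
(1) $r_a = r_z > 1$, $a \in z$, and
(2) $r_a = r_{\bar z} < 1$, $a \in \bar z$.
This means that the relative risk of cell a ($a \in z$) equals the relative risk inside the zone, $r_z$, and the relative risk of cell a ($a \in \bar z$) equals the relative risk outside the zone, $r_{\bar z}$. The relative risk inside the zone is greater than 1, while the relative risk outside the zone is smaller than 1.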
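The link between the intensity form and the relative-risk form of the alternative hypothesis can be checked numerically. The sketch below assumes a two-level intensity surface (one intensity inside the zone, one outside) and that the average intensity is the population-weighted mean of the two. The function name, the zone, and the numbers are hypothetical.

```python
def zone_relative_risks(populations, zone, lam_zone, lam_outside):
    """Return (lam, r_zone, r_outside) for a two-level intensity surface.

    populations -- list of N_a
    zone        -- set of cell indices forming the hotspot candidate z
    lam_zone    -- common intensity of the cells inside z
    lam_outside -- common intensity of the cells outside z
    """
    n_total = sum(populations)
    n_zone = sum(populations[a] for a in zone)
    n_outside = n_total - n_zone
    # Average intensity as the population-weighted mean of the two levels.
    lam = (lam_zone * n_zone + lam_outside * n_outside) / n_total
    return lam, lam_zone / lam, lam_outside / lam


populations = [1000, 2000, 500, 1500]  # hypothetical N_a
zone = {2, 3}                          # hypothetical hotspot candidate (0-based cells)
lam, r_zone, r_outside = zone_relative_risks(populations, zone, 0.04, 0.01)
```

With an intensity of 0.04 inside the zone and 0.01 outside, the average intensity lies between the two, so the relative risk is above 1 inside the zone and below 1 outside it.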
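Conditional simulation and conditional test
Data simulation should satisfy the following conditions and test, where $p_a^{0}$ and $p_a^{A}$ denote the probability that a case belongs to cell a under $H_0$ and under $H_A$.
$H_0$: $r_a = 1$, i.e. $p_a^{0} = p_a$, $a \in M$. The relative risk equals 1, so there is no hotspot.
$H_A$:
(1) $r_a = r_z$, i.e. $p_a^{A} = r_z p_a$, with $r_z > 1$, $a \in z$. The relative risk in the zone is greater than 1.
(2) $r_a = r_{\bar z}$, i.e. $p_a^{A} = r_{\bar z} p_a$, with $r_{\bar z} < 1$, $a \in \bar z$. The relative risk in the non-zone area is smaller than 1.
(3) $\lambda_z > \lambda_{\bar z}$, i.e. $r_z > r_{\bar z}$. The relative risk in the zone is greater than the relative risk in the non-zone area.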
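The cell probabilities under both hypotheses, and one conditional draw of the case counts given the total, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this study: the cell probabilities are taken to be $p_a$ under the null hypothesis and relative risk times $p_a$ under the alternative, the relative risk outside the zone is derived from the requirement that the probabilities sum to 1, and the multinomial draw is emulated with Python's `random.choices`. All function names and data are hypothetical.

```python
import random
from collections import Counter


def cell_probabilities(populations, zone=None, r_zone=1.0):
    """Cell probabilities: p_a under H0, relative risk times p_a under HA."""
    n_total = sum(populations)
    fractions = [n / n_total for n in populations]  # p_a = N_a / N
    if not zone:
        return fractions  # H0: every relative risk equals 1
    p_zone = sum(fractions[a] for a in zone)
    if p_zone >= 1.0 or r_zone * p_zone >= 1.0:
        raise ValueError("zone and r_zone must leave positive probability outside the zone")
    # The probabilities must sum to 1, which fixes the relative risk outside the zone.
    r_outside = (1.0 - r_zone * p_zone) / (1.0 - p_zone)
    return [(r_zone if a in zone else r_outside) * p for a, p in enumerate(fractions)]


def simulate_counts(total_cases, probabilities, rng):
    """Distribute total_cases over the cells (one multinomial draw)."""
    cells = range(len(probabilities))
    draws = rng.choices(cells, weights=probabilities, k=total_cases)
    tally = Counter(draws)
    return [tally[a] for a in cells]


populations = [1000, 2000, 500, 1500]  # hypothetical N_a
zone = {2, 3}                          # hypothetical hotspot candidate
rng = random.Random(0)                 # fixed seed for reproducibility

p_null = cell_probabilities(populations)
p_alt = cell_probabilities(populations, zone, r_zone=1.5)
counts_null = simulate_counts(100, p_null, rng)
counts_alt = simulate_counts(100, p_alt, rng)
```

Each simulated data set keeps the total number of cases fixed at 100, which is what makes the simulation conditional.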
Maximum Likelihood Estimation
The maximum likelihood estimates of the parameters of the multinomial distribution are as follows: $\hat{\lambda}_a = y_a / N_a$ (Appendix 5) for cell a.
The following estimates are obtained through a derivation similar to that of Appendix 5, for the zone, the non-zone area, and the study area:
$\hat{\lambda}_z = y_z / N_z$, $\hat{\lambda}_{\bar z} = y_{\bar z} / N_{\bar z}$, and $\hat{\lambda} = y / N = \sum_a y_a / \sum_a N_a$, respectively. Furthermore,
$\hat{r}_a = \hat{\lambda}_a / \hat{\lambda}$, $\hat{r}_z = \hat{\lambda}_z / \hat{\lambda}$, and $\hat{r}_{\bar z} = \hat{\lambda}_{\bar z} / \hat{\lambda}$
are the relative risks described before.
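As an illustration, the estimates can be computed directly from observed counts. The sketch assumes that each estimated intensity is the ratio of cases to population over the corresponding area (zone, non-zone area, or whole study area) and that each estimated relative risk is the ratio of an area's estimated intensity to the overall one. The function name and the data are hypothetical.

```python
def zone_estimates(cases, populations, zone):
    """Estimated intensities and relative risks for a zone and its complement."""
    y, n = sum(cases), sum(populations)
    y_zone = sum(cases[a] for a in zone)
    n_zone = sum(populations[a] for a in zone)

    lam_hat = y / n                                  # whole study area
    lam_zone_hat = y_zone / n_zone                   # inside the zone
    lam_outside_hat = (y - y_zone) / (n - n_zone)    # outside the zone

    return {
        "lam": lam_hat,
        "lam_zone": lam_zone_hat,
        "lam_outside": lam_outside_hat,
        "r_zone": lam_zone_hat / lam_hat,
        "r_outside": lam_outside_hat / lam_hat,
    }


cases = [18, 35, 30, 17]               # hypothetical y_a
populations = [1000, 2000, 500, 1500]  # hypothetical N_a
estimates = zone_estimates(cases, populations, zone={2})
```

In this made-up example the single-cell zone has 30 cases among 500 people against 100 cases among 5000 overall, so its estimated relative risk is about 3.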
Conditional Simulation
Every simulated data set satisfies the following conditions:
1. Conditional simulation of the data set uses $\hat{r}_a p_a = y_a / y$, where $y_a$ is the number of cases in cell a, given the total number of cases y in study area R. $\hat{r}_a p_a$ is the probability that a case belongs to cell a, as the product of relative risk and population fraction.
2. Conditional simulation of the data set under $H_0$ uses $p_a^{0} = p_a$, the probability that a case belongs to cell a under $H_0$.
3. Conditional simulation of the data set under $H_A$ uses
$p_a^{A} = r_z p_a$ inside the zone, and
$p_a^{A} = r_{\bar z} p_a$ outside the zone.
(Patil et al. 2006)
3.2.2 Hypothesis Testing for Comparison between SS and ULS
The comparison between the circle-based scan statistic (SS) and the upper level set (ULS) scan statistic hotspot detection methods is conducted with the following tests.
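Hypothesis test of the ULS scan setup
$H_0$: No cluster/hotspot: $\lambda_a = \lambda$, i.e. $RR_a = r_a = 1$, for $a \in M$.
$H_A$: There exists a cluster/hotspot z, i.e.,
(1) $\lambda_a = \lambda_z$, $a \in z$, where $\lambda_z = \sum_{a \in z} \lambda_a N_a / N_z$ and $N_z = \sum_{a \in z} N_a$,
(2) $\lambda_a = \lambda_{\bar z}$, $a \in \bar z = R - z$,
(3) $\lambda_z / \lambda_{\bar z} > 1$.
Note: $\lambda_z / \lambda_{\bar z} = r_z / r_{\bar z} > 1$, and hence $r_z > r_{\bar z}$.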
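Hypothesis test of the circle-based scan setup
$H_0$: No cluster/hotspot: $\lambda_a = \lambda$, i.e. $RR_a = r_a = 1$, for $a \in M$.
$H_A^{IA}$: $r_z > 1$ and $r_{\bar z} < 1$, where IA indicates that the intensity inside the zone, $\lambda_z$, is compared with the average $\lambda$.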
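The following shows that $H_A$ implies $H_A^{IA}$:
$H_A$: $\lambda_z > \lambda_{\bar z}$, but $\lambda = (N_z \lambda_z + N_{\bar z} \lambda_{\bar z}) / N$, i.e. $\lambda = p_z \lambda_z + p_{\bar z} \lambda_{\bar z}$, where $p_z = N_z / N$ and $p_{\bar z} = N_{\bar z} / N$. Because $p_z + p_{\bar z} = 1$ and $p_z, p_{\bar z} > 0$, the average $\lambda$ lies strictly between $\lambda_{\bar z}$ and $\lambda_z$, i.e. $r_z = \lambda_z / \lambda > 1$ and $r_{\bar z} = \lambda_{\bar z} / \lambda < 1$.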
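The following shows to what extent $H_A^{IA}$ implies $H_A$:
$H_A^{IA}$ implies the component of $H_A$ involving RR, but not the component of inside and outside homogeneity. That is, $r_z > 1$ and $r_{\bar z} < 1$ give $r_z > r_{\bar z}$, but they do not give $\lambda_a = \lambda_z$ for all $a \in z$ and $\lambda_a = \lambda_{\bar z}$ for all $a \in \bar z$.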
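A small numerical counterexample may make the distinction concrete. The sketch below assumes pooled (population-weighted) intensities inside and outside the zone and checks two things separately: whether the relative risk is above 1 inside and below 1 outside, and whether the cell intensities are homogeneous within each area. The function name and the data are hypothetical.

```python
def zone_summary(intensities, populations, zone):
    """Pooled relative risks and a homogeneity flag for a zone and its complement."""
    cells = range(len(populations))
    n_total = sum(populations)
    lam = sum(intensities[a] * populations[a] for a in cells) / n_total
    inside = [a for a in cells if a in zone]
    outside = [a for a in cells if a not in zone]

    def pooled(group):
        # Population-weighted intensity of a group of cells.
        cases = sum(intensities[a] * populations[a] for a in group)
        return cases / sum(populations[a] for a in group)

    def homogeneous(group):
        # True when every cell in the group has the same intensity.
        return len({intensities[a] for a in group}) == 1

    return {
        "r_zone": pooled(inside) / lam,
        "r_outside": pooled(outside) / lam,
        "homogeneous": homogeneous(inside) and homogeneous(outside),
    }


# Hypothetical data: the two zone cells have different intensities.
intensities = [0.05, 0.03, 0.01, 0.01]
populations = [1000, 1000, 1000, 1000]
summary = zone_summary(intensities, populations, zone={0, 1})
```

Here the relative-risk condition holds (about 1.6 inside the zone and 0.4 outside), yet the two zone cells have different intensities, so the homogeneity conditions fail.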
3.2.3 Circle-based Scan Statistic (SS) Hotspot Detection