categories, which are caused by repeated or cyclical factors and temporary or unpredicted factors.
2.4. Hotspot
A hotspot is defined as something unusual: an anomaly, aberration, outbreak, elevated cluster, or critical area (Patil and Taillie 2004). According to
Harran et al. (2006), hotspots are locations or regions that have consistently high levels of occurrence, such as the total number of poor, unemployed, or people
who suffer from food scarcity, and that may have characteristics unlike those of surrounding areas.
Hotspot clusters are generated by setting the relative risk in some areas to be larger than one (Song and Kulldorff 2003). Furthermore, a poverty hotspot
represents an area with certain local characteristics, which could also expand and affect neighbouring areas (Betti et al. 2006).
2.5. Hotspot Detection Method
A hotspot detection method contains three components: (a) identifying hotspot candidates, (b) evaluating the statistical significance of each candidate hotspot, and
(c) estimating the covariance associated with the hotspot. In Indonesia, the most recent method used to identify candidate hotspots is the spatial scan statistic.
Bungsu (2006) states that the spatial scan statistic suffers from several limitations; for example, the circular scanning window
gives low power for detecting arbitrarily shaped clusters. Hence the Upper Level Set (ULS) scan statistic will be used as a comparison to detect arbitrarily shaped hotspots.
The likelihood ratio, relative risk, and hypothesis testing based on Monte Carlo simulation are the techniques used to evaluate a candidate hotspot.
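The Monte Carlo hypothesis test mentioned above can be illustrated with a minimal Python sketch. It assumes the usual rank-based p-value, in which the observed statistic is compared with statistics simulated under the null hypothesis. The function name `monte_carlo_p_value`, the callback `simulate_stat`, and the default of 999 replications are illustrative choices, not taken from the text.

```python
import random


def monte_carlo_p_value(observed_stat, simulate_stat, replications=999, seed=0):
    """Rank-based Monte Carlo p-value (illustrative sketch).

    observed_stat : test statistic computed on the real data.
    simulate_stat : hypothetical callback taking a random.Random instance and
                    returning the statistic for one data set simulated under
                    the null hypothesis of no hotspot.
    """
    rng = random.Random(seed)
    # Count simulated statistics at least as extreme as the observed one.
    exceed = sum(
        1 for _ in range(replications) if simulate_stat(rng) >= observed_stat
    )
    # The observed data set counts as one more replication.
    return (exceed + 1) / (replications + 1)
```

With 999 replications the smallest attainable p-value is 0.001, which is why replication counts of the form 10^k - 1 are convenient.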
2.6. Scan Statistics (SaTScan)
The scan statistic is a statistical method used to detect clusters in a point process. The spatial scan statistic is used to determine whether a spatial point process
contains a localized cluster of points somewhere in a region of interest. The spatial scan statistic deals with the following situation. A region $R$ of Euclidean
space is subdivided into cells, denoted by $A$. Data are available in the form of a count $N_A$ on each cell $A$. In addition, a size value $P_A$
is associated with each cell. The cell sizes $P_A$ are assumed to be known and fixed, while the cell counts $N_A$ are
independent random variables.
The spatial scan statistic seeks to identify clusters of cells that have an elevated response compared with the rest of the region. Elevated response means
large values of the rates, $r_A = N_A / P_A$, rather than of the raw counts $N_A$. Cell counts are thus adjusted for cell sizes before cell responses are compared. Kulldorff (1997)
presented the following algorithm for a circular window of fixed diameter $d$ on a homogeneous Poisson or Bernoulli process (assuming homogeneous variance):
1. Pick a grid point. Calculate the distance to the different population points and sort them in increasing order. Store the sorted population points
in an array.
2. Repeat step 1 for each grid point.
3. Pick a grid point.
4. Create a circle centered at the grid point and continuously increase the
radius. For each population point entering the circle, update the number of cases $n$ and the population $N_W$
inside the circular area $W$.
5. Repeat steps 3 and 4 for each grid point. Report the largest likelihood based
on all $(n, N_W)$ pairs as the scan statistic, where the likelihood is calculated according to the likelihood equation.
6. Repeat steps 3 to 5 for each Monte Carlo replication.

The relative risk is a non-negative number, representing how much more
common a case is in the location and time period compared to the baseline. Setting a value of one is equivalent to making no adjustment, a value of
less than one adjusts for lower risk, and a value greater than one adjusts for increased risk. A cluster with a relative risk (RR) greater than one is
defined as a hotspot candidate. A relative risk of zero is used to adjust for missing data for that particular time and location (Kulldorff 2006). The relative
risk is calculated as (Kulldorff 2006)

$$RR = \frac{n_z}{E[c]}$$

where $n_z$ is the number of observed cases and $E[c]$ is the expected number of cases in a location, which is calculated by

$$E[c] = p \times \frac{C}{P}$$

where $p$ is the population in the cluster of interest, while $C$ and $P$ are the total number of cases and the total population, respectively.
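The relative risk formula and the circular scanning steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SaTScan implementation. The text does not reproduce the likelihood equation, so the sketch assumes the Poisson log likelihood ratio of Kulldorff (1997). The function names, the tuple layout `(x, y, cases, population)`, and the `max_fraction` cap on window population are hypothetical choices made for the example. Points at equal distance enter the window one at a time, which is a simplification.

```python
import math


def expected_cases(p, C, P):
    """Expected cases in a zone with population p: E[c] = p * C / P."""
    return p * C / P


def relative_risk(n_z, p, C, P):
    """Relative risk RR = n_z / E[c] for a zone with n_z observed cases."""
    return n_z / expected_cases(p, C, P)


def poisson_llr(n, e, C):
    """Poisson log likelihood ratio (assumed form, after Kulldorff 1997).

    Returns 0 unless the window holds more cases than expected (n > e).
    """
    if n <= e:
        return 0.0
    inside = n * math.log(n / e)
    outside = (C - n) * math.log((C - n) / (C - e)) if n < C else 0.0
    return inside + outside


def circular_scan(points, grid, max_fraction=0.5):
    """Scan circular windows centered at each grid point.

    points : list of (x, y, cases, population) tuples (assumed layout).
    grid   : list of (x, y) window centers.
    Returns (largest log likelihood ratio, description of that window).
    """
    C = sum(pt[2] for pt in points)  # total number of cases
    P = sum(pt[3] for pt in points)  # total population
    best_llr, best_window = 0.0, None
    for gx, gy in grid:
        # Steps 1-2: sort population points by distance to the grid point.
        ordered = sorted(points, key=lambda pt: math.hypot(pt[0] - gx, pt[1] - gy))
        n = pop = 0
        # Step 4: grow the circle one population point at a time.
        for x, y, cases, population in ordered:
            n += cases
            pop += population
            if pop > max_fraction * P:  # hypothetical cap on window size
                break
            llr = poisson_llr(n, expected_cases(pop, C, P), C)
            if llr > best_llr:
                best_llr = llr
                best_window = {
                    "center": (gx, gy),
                    "radius": math.hypot(x - gx, y - gy),
                    "cases": n,
                    "population": pop,
                    "rr": relative_risk(n, pop, C, P),
                }
    return best_llr, best_window
```

To assess significance, the same scan would be rerun on data sets simulated under the null hypothesis (step 6), and the observed maximum compared with the simulated maxima.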
Available scan statistic software is known to have several limitations. First, circles are used for the scanning window, resulting in low power for
detection of irregularly shaped clusters (Figure 1). Second, the response variable is defined on the cells of a tessellated geographic region, preventing
application to responses defined on a network (stream network, water distribution system, highway system, etc.). Third, response distributions are taken to be
discrete (specifically, binomial or Poisson). Finally, the traditional scan statistic gives only a cluster estimate for the hotspot and does not attempt to assess
estimation uncertainty (Patil 2006).
Figure 1. Scan statistic zonation for circles (left) and space-time cylinders (right).
2.7. Upper Level Set (ULS) Scan Statistics