The Concept of Hotspot Detection

The better of the two methods, as determined by the comparison, will be used to detect bad nutrition cases in some districts. The application of this method is discussed in Sections 3.5 and 3.6. Furthermore, the result of this hotspot detection will be used as an explanatory variable in the modeling in Chapter 4.

3.2 Theoretical Study

This section describes the theory of hotspot detection. The explanation is divided into four parts: the concept of hotspot detection and the comparison between methods (Patil et al. 2006), the circle-based scan statistic (Kulldorff 1997), and the ULS scan statistic (Patil and Taillie 2004).

3.2.1 The Concept of Hotspot Detection

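The concept of hotspot detection described in this section is based on the Poisson model over a study area or region $R$, developed by Patil et al. (2006).

Poisson Model over Region R

Let the study area or region $R$ consist of $m$ cells, and let $Y_a$, the number of observed cases in cell $a$, be independent across cells. The notation in the model is defined as follows: $\lambda_a$ is the intensity of cell $a$, $Y_a$ is the case count (number of cases) in cell $a$, and $N_a$ is the population size of cell $a$, where $a = 1, \dots, m$ and $m$ is the number of cells in the study area. Furthermore, $\bar{\lambda}$ is the average intensity, $Y = \sum_{a=1}^{m} Y_a$ is the total case count, and $N = \sum_{a=1}^{m} N_a$ is the total population size. $Y_a$ has the distribution

$$Y_a \sim \mathrm{Poisson}(\lambda_a N_a), \quad a = 1, \dots, m,$$

where

$$\bar{\lambda} = \frac{\sum_{a=1}^{m} \lambda_a N_a}{N}.$$

Consequently, $Y \sim \mathrm{Poisson}(\bar{\lambda} N)$, and, conditional on $Y = y$,

$$(Y_1, \dots, Y_m \mid Y = y) \sim \mathrm{Multinomial}(y;\, p_1, \dots, p_m), \quad p_a = \frac{\lambda_a N_a}{\bar{\lambda} N}$$

(Appendix 3). It can be shown (Appendix 4) that

$$p_a = r_a \rho_a,$$

where $r_a = \lambda_a / \bar{\lambda}$ is the relative risk of cell $a$ (RR$_a$) and $\rho_a = N_a / N$ is the population fraction of cell $a$. This constitutes the multinomial characterization of the Poisson process.

Scan setup

The rejection region of the null hypothesis is determined as follows. The hypotheses for a hotspot candidate are:

$H_0$: No cluster/hotspot, meaning $\lambda_a = \bar{\lambda}$, i.e. RR$_a$ $= r_a = 1$, for all $a \in M$ (the set of all $m$ cells).

$H_A$: There exists a cluster/hotspot $z$ for which the following conditions are fulfilled:

1. $\lambda_a = \lambda_z$, $a \in z$, where $z$ is a zone or hotspot candidate. The intensity in every cell $a \in z$ equals the intensity in the zone area.
2. $\lambda_a = \lambda_{\bar{z}}$, $a \in \bar{z}$, where $\bar{z} = R - z$ is the non-zone. The intensity in every cell $a \in \bar{z}$ equals the intensity in the non-zone area.
3. $\lambda_z > \lambda_{\bar{z}}$, where $\lambda_z = \sum_{a \in z} \lambda_a N_a / N_z$, with $N_z = \sum_{a \in z} N_a$ and $N_{\bar{z}} = N - N_z$. The intensity in the zone area is greater than the intensity in the non-zone area.

Figure 12 Study area with zone and non-zone areas

Condition 3 becomes $\lambda_z / \lambda_{\bar{z}} > 1$, i.e. $r_z / r_{\bar{z}} > 1$. Also note that $\bar{\lambda} N = \lambda_z N_z + \lambda_{\bar{z}} N_{\bar{z}}$ and $N = N_z + N_{\bar{z}}$, so $\bar{\lambda}$ is a population-weighted average of $\lambda_z$ and $\lambda_{\bar{z}}$; therefore $\lambda_z > \bar{\lambda} > \lambda_{\bar{z}}$, $r_z = \lambda_z / \bar{\lambda} > 1$, and $r_{\bar{z}} = \lambda_{\bar{z}} / \bar{\lambda} < 1$, where $\lambda_a$ is the intensity in cell $a$ and $\lambda_z$ is the intensity in the zone $z$.

$H_A$ can alternatively be stated in terms of relative risks:

$H_A$: There exists a cluster/hotspot $z$ such that

1. $r_a = r_z > 1$, $a \in z$, and
2. $r_a = r_{\bar{z}} < 1$, $a \in \bar{z}$.

That is, the relative risk of each cell $a \in z$ equals the relative risk inside the zone area, $r_z$, and the relative risk of each cell $a \in \bar{z}$ equals the relative risk outside the zone area, $r_{\bar{z}}$. The relative risk inside the zone area is greater than 1, while the relative risk outside the zone area is smaller than 1.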
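The multinomial characterization and the relative-risk form of the alternative hypothesis can be checked numerically. The following is a minimal sketch, not part of the original study: the six cells, their population sizes, the two intensity levels, and the variable names (`lam`, `rho`, `in_zone`) are all hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical example: six cells, the first two forming the zone z.
N = np.array([1000, 1500, 2000, 2500, 1200, 1800], dtype=float)  # population sizes N_a
in_zone = np.array([True, True, False, False, False, False])

# Homogeneous intensities inside and outside the zone (assumed values).
lam = np.where(in_zone, 0.03, 0.01)

lam_bar = np.sum(lam * N) / N.sum()  # population-weighted average intensity
r = lam / lam_bar                    # relative risk of each cell
rho = N / N.sum()                    # population fraction of each cell
p = r * rho                          # multinomial cell probabilities

r_z = r[in_zone][0]                  # relative risk inside the zone
r_zbar = r[~in_zone][0]              # relative risk outside the zone

print("average intensity:", lam_bar)
print("r_z:", r_z, "r_zbar:", r_zbar)
print("cell probabilities sum to:", p.sum())
```

With these assumed values the cell probabilities sum to one, and the relative risk is above 1 inside the zone and below 1 outside it, as the relative-risk statement of the alternative hypothesis requires.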
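Conditional simulation and conditional test

Simulated data should satisfy the following conditions and test.

$H_0$: $r_a = 1$: $p_a = \rho_a$ for all $a \in M$ (the relative risk equals 1; there is no hotspot).

$H_A$:

1. $\lambda_a = \lambda_z$: $p_a^{A} = r_z \rho_a$, $r_z > 1$, $a \in z$ (the relative risk in the zone area is greater than 1).
2. $\lambda_a = \lambda_{\bar{z}}$: $p_a^{A} = r_{\bar{z}} \rho_a$, $r_{\bar{z}} < 1$, $a \in \bar{z}$ (the relative risk in the non-zone area is smaller than 1).
3. $\lambda_z > \lambda_{\bar{z}}$, i.e. $r_z > r_{\bar{z}}$ (the relative risk in the zone area is greater than the relative risk in the non-zone area).

Maximum Likelihood Estimation

The maximum likelihood estimates of the parameters of the multinomial distribution are as follows:

$$\hat{\lambda}_a = \frac{y_a}{N_a} = \bar{y}_a$$

(Appendix 5), the estimated intensity in cell $a$. The following estimates are obtained through derivations similar to that in Appendix 5. They are the estimated intensities in the zone area, in the non-zone area, and in the study area:

$$\hat{\lambda}_z = \frac{y_z}{N_z} = \bar{y}_z, \quad \hat{\lambda}_{\bar{z}} = \frac{y_{\bar{z}}}{N_{\bar{z}}} = \bar{y}_{\bar{z}}, \quad \text{and} \quad \hat{\bar{\lambda}} = \frac{\sum_a \hat{\lambda}_a N_a}{N} = \frac{\sum_a y_a}{N} = \frac{y}{N} = \bar{y},$$

respectively, where $y_z = \sum_{a \in z} y_a$ and $y_{\bar{z}} = y - y_z$. Furthermore,

$$\hat{r}_a = \frac{\hat{\lambda}_a}{\hat{\bar{\lambda}}} = \frac{\bar{y}_a}{\bar{y}}, \quad \hat{r}_z = \frac{\hat{\lambda}_z}{\hat{\bar{\lambda}}} = \frac{\bar{y}_z}{\bar{y}}, \quad \text{and} \quad \hat{r}_{\bar{z}} = \frac{\hat{\lambda}_{\bar{z}}}{\hat{\bar{\lambda}}} = \frac{\bar{y}_{\bar{z}}}{\bar{y}}$$

are the estimates of the relative risks described before.

Conditional Simulation

Every simulated data set satisfies the following conditions:

1. Conditional simulation of the data set: $(Y_1, \dots, Y_m \mid Y = y) \sim \mathrm{Multinomial}(y;\, \hat{p}_1, \dots, \hat{p}_m)$ with $\hat{p}_a = \hat{r}_a \rho_a = y_a / y$. Here $Y_a$ is the number of cases in cell $a$, given $y$, the total number of cases in the study area $R$, and $\hat{p}_a$ is the probability that a given case occurs in cell $a$, expressed as the product of relative risk and population fraction.
2. Conditional simulation of the data set under $H_0$: $p_a = \rho_a$, the probability that a given case occurs in cell $a$ under $H_0$.
3. Conditional simulation of the data set under $H_A$: $p_a^{A} = r_z \rho_a$ inside the zone area ($a \in z$) and $p_a^{A} = r_{\bar{z}} \rho_a$ outside the zone area ($a \in \bar{z}$) (Patil et al. 2006).

3.2.2 Hypothesis testing for comparison between SS and ULS

The circle-based scan statistic (SS) and the upper level set (ULS) scan statistic hotspot detection methods are compared using the following tests.

Hypothesis test of ULS scan setup

$H_0$: No cluster/hotspot: $\lambda_a = \bar{\lambda}$, i.e. RR$_a$ $= r_a = 1$, for all $a \in M$.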
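$H_A$: There exists a cluster/hotspot $z$, i.e.

1. $\lambda_a = \lambda_z$, $a \in z$, where $\lambda_z = \sum_{a \in z} \lambda_a N_a / \sum_{a \in z} N_a$,
2. $\lambda_a = \lambda_{\bar{z}}$, $a \in \bar{z} = R - z$,
3. $\lambda_z / \lambda_{\bar{z}} > 1$.

Note: $\lambda_z / \lambda_{\bar{z}} = r_z / r_{\bar{z}} > 1$, and hence $r_z > r_{\bar{z}}$.

Hypothesis test of circular-based scan setup

$H_0$: No cluster/hotspot: $\lambda_a = \bar{\lambda}$, i.e. RR$_a$ $= r_a = 1$, for all $a \in M$.

$H_A^{IA}$: $r_z > 1$, i.e. $\lambda_z > \bar{\lambda}$, where IA indicates that the intensity inside $z$ is compared with the average $\bar{\lambda}$.

The following shows that $H_A$ implies $H_A^{IA}$. Under $H_A$, $\lambda_z > \lambda_{\bar{z}}$, but

$$\bar{\lambda} = \frac{N_z \lambda_z + N_{\bar{z}} \lambda_{\bar{z}}}{N},$$

i.e. $\lambda_{\bar{z}} < \bar{\lambda} < \lambda_z$, because

$$\lambda_z - \bar{\lambda} = \frac{N_{\bar{z}}}{N} (\lambda_z - \lambda_{\bar{z}}) > 0 \quad \text{and} \quad \bar{\lambda} - \lambda_{\bar{z}} = \frac{N_z}{N} (\lambda_z - \lambda_{\bar{z}}) > 0,$$

i.e. $r_z > 1$ and $r_{\bar{z}} < 1$.

Conversely, $H_A^{IA}$ implies the component of $H_A$ that involves the relative risks, but not the component concerning homogeneity inside and outside the zone. Dividing the weighted-average identity by $\bar{\lambda}$ gives $\frac{N_z}{N} r_z + \frac{N_{\bar{z}}}{N} r_{\bar{z}} = 1$, so $r_z > 1$ forces $r_{\bar{z}} < 1$ and therefore $r_z > r_{\bar{z}}$.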
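The maximum likelihood estimates and the conditional simulation described in Section 3.2.1 underlie both hypothesis tests, so a small numerical sketch may help. It is illustrative only: the case counts, population sizes, candidate zone, number of replicates, and all variable names are hypothetical, and plugging the estimated relative risks into the simulation under the alternative is an assumption of this sketch rather than something the text specifies.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is reproducible

# Hypothetical data: observed case counts and population sizes for six cells.
y = np.array([30, 45, 20, 25, 12, 18])
N = np.array([1000, 1500, 2000, 2500, 1200, 1800], dtype=float)
in_zone = np.array([True, True, False, False, False, False])  # candidate zone z

y_tot, N_tot = y.sum(), N.sum()
rho = N / N_tot                       # population fractions

# Maximum likelihood estimates: cases divided by population.
lam_bar_hat = y_tot / N_tot
lam_z_hat = y[in_zone].sum() / N[in_zone].sum()
lam_zbar_hat = y[~in_zone].sum() / N[~in_zone].sum()
r_z_hat = lam_z_hat / lam_bar_hat
r_zbar_hat = lam_zbar_hat / lam_bar_hat

# Cell probabilities for the conditional (multinomial) simulation.
p_h0 = rho                                            # no hotspot: relative risk 1 everywhere
p_ha = np.where(in_zone, r_z_hat, r_zbar_hat) * rho   # assumed plug-in under the alternative

# Each replicate redistributes the fixed total y_tot over the cells.
n_rep = 1000
sims_h0 = rng.multinomial(y_tot, p_h0, size=n_rep)
sims_ha = rng.multinomial(y_tot, p_ha, size=n_rep)

print("r_z_hat:", r_z_hat, "r_zbar_hat:", r_zbar_hat)
print("mean simulated zone cases under H0:", sims_h0[:, in_zone].sum(axis=1).mean())
print("mean simulated zone cases under HA:", sims_ha[:, in_zone].sum(axis=1).mean())
```

Because the total number of cases is held fixed in every replicate, the simulated data sets differ only in how the cases are distributed over the cells, which is the quantity the hotspot hypotheses concern.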

3.2.3 Circle-based Scan Statistic SS Hotspot Detection