
a. Habitat suitability is the capacity of an area to support a species, as indicated by the availability of the resources and environmental conditions necessary for relatively successful survival and reproduction.
b. Habitat factor is the spatial representation of the resources and environmental conditions that the species needs for its survival.
c. Estimated land is the spatial representation of an area whose suitability value is to be estimated. It is a collection of small arbitrary land units, each of which has a suitability score. Like the observation unit data below, it is represented by a vector-based grid or by similar adjacent isometric cells (Hirzel, 2001).
d. Observation unit is an area where ecogeographic variables were measured. It is represented by a uniform polygon such as a rectangle or a circle. Observation and estimated land features are together categorized as evaluated features.
e. Ecogeographic variable is a spatial property of a unit of area, based on the arrangement of the corresponding habitat factor.
f. Species distribution is the collection of positions of a species in a certain space, related to its survival.

4.1.2.1. Data Preparation Utility

All processes built into the system rely heavily on the basic geometric functions of MapObjects. Although the process design is therefore narrow in application, being specific to MapObjects, it is described here because documentation is important for future development.

4.1.2.1.1. Ecogeographical Data Generation

According to the definition of an ecogeographical variable, generating one means measuring the arrangement structure of the corresponding habitat factor. There are three basic types of spatial feature: point, line, and polygon. The analysis of the arrangement of spatial features in an area therefore considers all three.

Two types of spatial analysis were developed for SUITSTAT: content analysis and proximity analysis. Content analysis determines the structure of a certain feature within the area. Proximity analysis obtains the shortest distance to a certain feature, which describes the contiguity relation of an area to its surroundings. Both analyses were further developed to obtain the attribute information of the features that satisfy the analysis. The analyses available in SUITSTAT are listed in Table 4.

Table 4. Several Types of Analysis of Spatial Features

Feature Type   Analysis Type        Detail Analysis
Point          Content analysis     The existence of points
Point          Content analysis     The number of points
Point          Content analysis     The aggregation level of points
Point          Content analysis     The attribute value of points
Point          Proximity analysis   The shortest distance to a point
Point          Proximity analysis   The attribute value of the nearest point
Line           Content analysis     The length of line features
Line           Content analysis     The number of segments
Line           Content analysis     The attribute value of line features
Line           Proximity analysis   The shortest distance to a line
Line           Proximity analysis   The attribute value of the nearest line
Polygon        Content analysis     The area of polygon features
Polygon        Content analysis     The number of polygons
Polygon        Content analysis     The attribute value of polygon features
Polygon        Proximity analysis   The shortest distance to a polygon
Polygon        Proximity analysis   The attribute value of the nearest polygon

Basically, every type of content analysis uses the same procedure. Specifically, content analysis finds the existence, the number, the size (area for polygons, length for lines), and the attribute values of the considered features that lie inside or belong to the evaluated feature. The algorithm of content analysis is illustrated in Figure 8.
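The content analysis of polygon area can be sketched in a few lines. The following is a minimal illustration in Python, not the MapObjects implementation: it assumes that the evaluated shape and the habitat-factor polygons are all axis-aligned rectangles, and the names `Rect`, `overlap_area`, and `content_analysis` are hypothetical. A shape that lies fully inside the evaluated shape contributes its whole area, while a shape that crosses it contributes only the intersected part.

```python
from typing import List, Tuple

# Hypothetical representation: axis-aligned rectangle (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (0.0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def content_analysis(evaluated: Rect, features: List[Rect]) -> dict:
    """Existence, count, and summed area of features inside or crossing `evaluated`."""
    areas = [overlap_area(evaluated, f) for f in features]
    found = [a for a in areas if a > 0]
    return {"exists": bool(found), "count": len(found), "area": sum(found)}


cell = (0.0, 0.0, 10.0, 10.0)
habitat = [
    (1.0, 1.0, 3.0, 3.0),      # fully inside the cell: contributes 2 x 2
    (8.0, 8.0, 12.0, 12.0),    # crossing the cell border: contributes 2 x 2
    (20.0, 20.0, 25.0, 25.0),  # outside the cell: ignored
]
result = content_analysis(cell, habitat)
print(result)
```

Clipping every shape against the evaluated shape handles the contained and the crossing cases with one operation. A real implementation would need general polygon intersection instead of the rectangle shortcut assumed here.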
The algorithms were developed using SearchShape, SearchByDistance, and other geometric operation methods, which are built-in methods of the shape and layer objects of MapObjects. Further description of these methods is available in Appendices 3, 4, and 5.

[Figure 8. Flowchart of Content Analysis. Steps shown: search for shapes contained by the evaluated shape (ES) on the variable layer; search for shapes crossing ES on the variable layer; intersect ES with the found shapes; sum the area of the found shapes.]

In contrast to content analysis, neighbor or proximity analysis uses distance functions such as SearchByDistance and DistanceTo. The search begins at an initial distance from the cell, and the search distance on the searched layer is increased gradually. When records containing shapes are found, the process of determining the shortest distance among those shapes begins. Figure 9 shows the algorithm of proximity analysis.

[Figure 9. Flowchart of Proximity Analysis. Elements shown: the inputs are the cell grid, the searched layer, the initial distance d, and the distance tolerance; records in the searched layer are searched within distance d of the cell; for each record found, the distance from its shape to the cell is compared with the current shortest distance dshort, which is updated when the new distance is shorter; when no record is found, d is increased and the search is repeated; the result returned is dshort.]

Among the types of content analysis, the analysis of vector-based point aggregation is the exception, because it uses a different algorithm. The method adopts the aggregation index (AI) of He et al. (2000), which was developed for raster data. The basic idea of AI is to relate the number of edges shared among the cells of class i to how aggregated the class appears over the area. The more clumped the cells, the more edges they share. According to He et al. (2000), the maximum aggregation level is reached when the class clumps into one patch that has the largest $e_{i,i}$; the patch does not have to be a square. Formally, AI is defined as the ratio of the shared edges between the patch's cells, $e_{i,i}$, to the maximum possible shared edges, $\mathrm{max\_e}_{i,i}$. The AI of class i is given by (He et al., 2000):

\[ AI_i = \frac{e_{i,i}}{\mathrm{max\_e}_{i,i}} \]

Given that class i is composed of $A_i$ cells and that $n$ is the side of the largest integer square not larger than $A_i$, the difference between $A_i$ and that square is $m = A_i - n^2$. The maximum number of shared edges for $A_i$ then takes one of three forms (He et al., 2000):

\[ \mathrm{max\_e}_{i,i} = 2n(n-1), \quad \text{when } m = 0, \text{ or} \tag{4.2} \]
\[ \mathrm{max\_e}_{i,i} = 2n(n-1) + 2m - 1, \quad \text{when } m \le n, \text{ or} \tag{4.3} \]
\[ \mathrm{max\_e}_{i,i} = 2n(n-1) + 2m - 2, \quad \text{when } m > n. \tag{4.4} \]

The method must be adjusted to measure vector-based point aggregation, since AI was originally developed for raster data. The adjustment mainly concerns the measurement of shared edges, because counting shared edges requires a cell size. For vector data, a virtual cell size is determined from the shortest distance among the points, $d_{min}$. Each point is assumed to lie at the center of an imaginary square grid cell, as illustrated in Figure 10. This cell has at most four eligible adjacent cells, i.e. adjacent cells that share an edge with it and contain a point (the white cells, marked with capital Roman numerals). It does not share an edge with the ineligible adjacent cells (the shaded cells).
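As a quick check of the three forms of the maximum shared edges from He et al. (2000), the short worked example below evaluates them by hand. The cell counts 9, 11, and 14 and the observed value $e_{i,i} = 10$ are illustrative assumptions, not data from this study.

```latex
% Worked check of the maximum shared edges; all input values are illustrative.
\begin{align*}
A_i = 9:  &\quad n = 3,\ m = 0,     & \mathrm{max\_e}_{i,i} &= 2 \cdot 3 \cdot 2 = 12 \\
A_i = 11: &\quad n = 3,\ m = 2 < n, & \mathrm{max\_e}_{i,i} &= 12 + 2 \cdot 2 - 1 = 15 \\
A_i = 14: &\quad n = 3,\ m = 5 > n, & \mathrm{max\_e}_{i,i} &= 12 + 2 \cdot 5 - 2 = 20
\end{align*}
% With 11 cells and 10 observed shared edges:
\[ AI_i = \frac{10}{15} \approx 0.67 \]
```

The first case can be verified by counting directly: a 3 x 3 block of cells has 6 horizontal and 6 vertical shared edges. For vector points, the observed $e_{i,i}$ still has to be counted on the virtual grid.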
Hence, the main problem in measuring shared edges is how to identify whether a point lies inside an eligible (white) virtual grid cell or an ineligible one. The solution is to define the domains of the eligible and ineligible cells. Defining these domains is simple, since all points lie in the same coordinate system.

[Figure 10. The Illustration of Neighboring Grid of a Point. Labels shown: the center point P(x, y), the dimensions $d_{min}$ and $0.5\,d_{min}$, and the eligible cells I, II, III, and IV.]

Any point $Z(x_z, y_z)$ has a horizontal and a vertical distance to the center point $O(x_o, y_o)$, defined as $d_x = |x_o - x_z|$ and $d_y = |y_o - y_z|$, respectively. Every point placed around the center point (Figure 10) has distances $d_x \le 1.5\,d_{min}$ and $d_y \le 1.5\,d_{min}$, where $d_{min}$ is the shortest distance among all the points. Points beyond these distances are ignored. Any point in the center cell has $d_x \le 0.5\,d_{min}$ and $d_y \le 0.5\,d_{min}$, whereas every point in a neighboring cell has $d_x > 0.5\,d_{min}$ or $d_y > 0.5\,d_{min}$. Hence, points placed in the eligible cells of quadrants I and III have distances:

\[ D_x = \{ d_x \mid 0 \le d_x \le 0.5\,d_{min} \} \text{ and } D_y = \{ d_y \mid 0.5\,d_{min} < d_y \le 1.5\,d_{min} \} \tag{4.5} \]

or, written out:

\[ D_x = \{ d_x \mid 0 \le |x_o - x_z| \le 0.5\,d_{min} \} \text{ and } D_y = \{ d_y \mid 0.5\,d_{min} < |y_o - y_z| \le 1.5\,d_{min} \} \tag{4.6} \]

Correspondingly, the distances of points placed in quadrants II and IV satisfy:

\[ D_x = \{ d_x \mid 0.5\,d_{min} < d_x \le 1.5\,d_{min} \} \text{ and } D_y = \{ d_y \mid 0 \le d_y \le 0.5\,d_{min} \} \tag{4.7} \]

or, written out:

\[ D_x = \{ d_x \mid 0.5\,d_{min} < |x_o - x_z| \le 1.5\,d_{min} \} \text{ and } D_y = \{ d_y \mid 0 \le |y_o - y_z| \le 0.5\,d_{min} \} \tag{4.8} \]

Points that lie beyond these distances are ignored. To prevent double counting, a point is removed from the point collection after it has been measured. The procedure for measuring point aggregation is described in Figure 11.

[Figure 11. Aggregation Analysis Method. Notes: further description is available in the text. Steps shown: get the number of points $A_i$; compute $n = \mathrm{Int}(\sqrt{A_i})$ and $m = A_i - n^2$; compute Max_e$_{i,i}$ from the case that applies to $m$; set the point collection; for each point P in the collection, get $d_x$ and $d_y$ to every other point Z and add 1 to Shared_edge when Z satisfies $D_x$ and $D_y$; remove P from the point collection; when the collection is empty, compute $AI_i$ = Shared_edge / Max_e$_{i,i}$.]
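The point aggregation procedure can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the SUITSTAT code: the function names are hypothetical, the virtual grid is taken to be aligned with the coordinate axes, the boundary between the second and third forms of the maximum shared edges follows He et al. (2000) ($m \le n$ versus $m > n$), and returning 0 for fewer than two points is a choice made here because $d_{min}$ is then undefined.

```python
import math
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, float]


def max_shared_edges(a: int) -> int:
    """Maximum shared edges for `a` cells, following He et al. (2000)."""
    n = math.isqrt(a)  # side of the largest square with n * n <= a
    m = a - n * n
    if m == 0:
        return 2 * n * (n - 1)
    if m <= n:
        return 2 * n * (n - 1) + 2 * m - 1
    return 2 * n * (n - 1) + 2 * m - 2


def is_eligible(p: Point, z: Point, d_min: float) -> bool:
    """True if z lies in one of the four edge-sharing virtual cells around p."""
    dx, dy = abs(p[0] - z[0]), abs(p[1] - z[1])
    half, limit = 0.5 * d_min, 1.5 * d_min
    vertical = dx <= half and half < dy <= limit    # quadrants I and III
    horizontal = half < dx <= limit and dy <= half  # quadrants II and IV
    return vertical or horizontal


def point_aggregation_index(points: List[Point]) -> float:
    """Aggregation index of a point set on a virtual grid of cell size d_min."""
    if len(points) < 2:
        return 0.0  # assumption: no pair exists, so d_min is undefined
    d_min = min(math.dist(p, q) for p, q in combinations(points, 2))
    max_e = max_shared_edges(len(points))
    collection = list(points)
    shared = 0
    while collection:
        p = collection.pop()  # removing P prevents double counting
        shared += sum(is_eligible(p, z, d_min) for z in collection)
    return shared / max_e


square = [(0, 0), (1, 0), (0, 1), (1, 1)]
scattered = [(0, 0), (1, 0), (5, 5), (9, 9)]
print(point_aggregation_index(square), point_aggregation_index(scattered))
```

Four points on a unit square share four edges out of a possible four, so the index is 1.0. In the scattered set only one pair is adjacent on the virtual grid, which gives 1/4 = 0.25.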