Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression


LAND USE CHANGE MODELING

IN SIAK DISTRICT, RIAU PROVINCE, INDONESIA

USING MULTINOMIAL LOGISTIC REGRESSION

CHANDRA IRAWADI WIJAYA

GRADUATE SCHOOL

BOGOR AGRICULTURAL UNIVERSITY BOGOR



STATEMENT

I, Chandra Irawadi Wijaya, hereby state that this thesis entitled:

Land Use Change Modeling in Siak District, Riau Province, Indonesia

Using Multinomial Logistic Regression

is the result of my own work during the period March 2009 to March 2011 and has not been published before. The contents of the thesis have been examined by the advising committee and an external examiner.

Bogor, April 2011

Chandra Irawadi Wijaya G051070051



ABSTRACT

CHANDRA IRAWADI WIJAYA. Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression. Under the supervision of HARTRISARI HARDJOMIDJOJO and LILIK BUDI PRASETYO.

Siak District was established in 1999 from some parts of Bengkalis District. The development carried out since then has altered land use, converting land from one type of use to others. In this study, land use change modeling was developed for Siak District to facilitate the understanding of the land use change process and its relevant factors in the research site after the establishment of the district was completed. The modeling was conducted in order to analyze the land use change during 2002 – 2005 and 2005 – 2008, to develop the land use change scheme of Siak District, to identify the driving factors of land use change and develop the land use change model of Siak District, and to examine the performance of the Multinomial Logistic Regression (MLR) model in modeling land use change.

During 2002 – 2008, Siak District was dominated by Forest land, Cropland, and Grassland, whereas Wetlands, Settlements, and Other lands occupied smaller areas. Based on the transition probability matrices, Forest land, Cropland, Grassland, Wetlands, and Settlements tended to remain stable, and only Other lands changed dynamically. However, the probabilities of each land use transforming into other land uses were also considerable. In this research, the land use change model was developed under two scenarios: (1) using all significant variables determined by the MLR model analysis and (2) using observed variables determined by observation of existing conditions in the field. Both scenarios indicated that the final models of land use change in Siak District developed using the MLR model were good models that could explain most of the variability of land use change in the research site. However, the spatial model validations conducted for both scenarios indicated that the final models could not fit the actual spatial data layers to the actual land use change of 2005 – 2008.

Several research findings can be concluded from the model validations. The land use change model developed using the MLR model is a generalized logistic regression model. The MLR model forces every land use transition to be driven by all significant parameters, while in reality each land use transition probably has a unique combination of parameters that drives it.

Keywords: land use change; driving factors; schemes; modeling; multinomial logistic regression; Siak District, Indonesia.



ABSTRAK

CHANDRA IRAWADI WIJAYA. Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression. Supervised by HARTRISARI HARDJOMIDJOJO and LILIK BUDI PRASETYO.

Siak District was formed in 1999 from several sub-districts of Bengkalis District. The development carried out since then has converted land from one type of use to another. In this research, land use change modeling was carried out in Siak District in order to understand the land use change process that occurred after the district was formed. The modeling aimed to analyze the land use change that occurred during 2002 – 2005 and 2005 – 2008, to build the land use change scheme, to identify the factors that influence land use change and build the land use change model, and to test the performance of the Multinomial Logistic Regression (MLR) model in modeling land use change.

During 2002 – 2008, Siak District was dominated by Forest land, Cropland, and Grassland, while Wetlands, Settlements, and Other lands occupied smaller areas. Based on the transition probability matrices, Forest land, Cropland, Grassland, Wetlands, and Settlements tended to remain stable, and only Other lands changed dynamically. Nevertheless, the probability of each land use changing into other land uses was also fairly significant. In this research, the land use change model was built under two scenarios: (1) using all the significant variables resulting from the MLR model analysis and (2) using variables resulting from field observation. Both scenarios showed that the land use change models that were built are good models. The models were able to explain most of the variation in land use change at the research site. However, the model validations, which were carried out spatially, showed that the models could not match the actual spatial data to the land use change that occurred in 2005 – 2008.

Several research findings can be concluded from the model validations. The land use change model that was built is a generalization of logistic regression. The MLR model forces every land use transition to be influenced by all the significant variables, whereas in reality each land use transition probably has its own combination of parameters that influence that land use change.



SUMMARY

CHANDRA IRAWADI WIJAYA. Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression. Under the supervision of HARTRISARI HARDJOMIDJOJO and LILIK BUDI PRASETYO.

Siak District is a new district, established in 1999 from some parts of Bengkalis District, that has been developing its region in order to support people's activities and to reach the same level as other districts. The development conducted so far has altered land use, converting land from one type of use to others. In this study, land use change modeling was developed for Siak District to facilitate the understanding of the land use change process and its relevant factors in the research site. The objectives of the research are (1) to analyze the land use change during 2002 – 2005 and 2005 – 2008, (2) to develop the land use change schemes of Siak District, (3) to identify the driving factors of land use change and develop the land use change model of Siak District using the Multinomial Logistic Regression (MLR) model, and (4) to examine the performance of the MLR model in modeling land use change.

The research location is Siak District, which is located in Riau Province, Indonesia. Geographically, Siak is bounded by latitudes 0°21’19.50” – 1°14’43.87” North and longitudes 100°54’46.31” – 102°58’27.34” East and lies at 0 – 110 m above sea level. Siak is located in the Siak Watershed, with the Siak River as the main river. The landscape of Siak is mostly wetlands, and only a small part on the west side is hilly. Based on the spatial analysis, Siak District has an area of about 868,117.82 ha, of which about 59% is allocated for crop and timber plantation and 9% for production forest. Furthermore, the Siak Government has also allocated land for other uses such as agriculture. There are two types of agriculture area in Siak: wetland agriculture and dry land agriculture. The main crop plantation commodities are rubber, oil palm, coconut, and coffee, which are managed by private companies and communities. Siak also has large oil and gas resources, operated by an international company, which contribute to economic growth there (Siak District Government 2008).

To facilitate the understanding of the land use change process and its relevant factors, land use change modeling in Siak District was conducted in four main activities: (1) field data collection, (2) land use classification, (3) land use change detection, and (4) land use change modeling. Field data collection was done in order to collect the primary and secondary data used in the research, while land use classification was done in order to derive the land use categories of Siak District in 2002, 2005, and 2008. Land use change detection aimed to identify the transformations of land uses during 2002 – 2005 and 2005 – 2008 in Siak District. Furthermore, the land use change modeling aimed to determine the significant variables of land use change and to find a good model which can represent the land use change in Siak District.

Based on the result of land use classification, Siak District was dominated by Forest land, Cropland, and Grassland during 2002 – 2008. In 2002, Forest land occupied 46.8% of the total area of Siak District, Cropland 31.3%, and Grassland about 16.2%. Forest land decreased dramatically, occupying only 36.9% of Siak District in 2005 and then dropping to 27.2% in 2008, while Cropland and Grassland increased gradually. By 2008, Cropland exceeded Forest land, occupying 43% of Siak District. Settlements occupied only 7,909 ha in 2002, but increased almost twofold to 14,054 ha in 2005 and reached 19,340.58 ha, or 2.2% of Siak District, in 2008. During 2002 – 2008, Other lands were the land use category that changed most dynamically, and Wetlands were assumed to be in stable condition.
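The area shares quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the figures stated in this summary (the total district area and the Settlements areas); the function and variable names are illustrative and do not come from the thesis.

```python
# Figures quoted in the summary, in hectares.
TOTAL_AREA_HA = 868_117.82
settlements_ha = {2002: 7_909.0, 2005: 14_054.0, 2008: 19_340.58}


def share_percent(area_ha: float, total_ha: float = TOTAL_AREA_HA) -> float:
    """Return the share of the district covered by a category, in percent."""
    return 100.0 * area_ha / total_ha


for year, area in settlements_ha.items():
    print(f"{year}: {area:,.2f} ha = {share_percent(area):.1f}% of the district")

# Growth factor of Settlements between 2002 and 2005 ("almost twofold").
growth_2002_2005 = settlements_ha[2005] / settlements_ha[2002]
print(f"2002 -> 2005 growth factor: {growth_2002_2005:.2f}")
```

The 2008 share computed this way agrees with the 2.2% reported in the text, and the 2002 – 2005 growth factor is somewhat below 2.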

The land use change schemes for 2002 – 2005 and 2005 – 2008 show that all land use categories tended, with high probabilities, not to transform into other land uses (stable condition). The 2002 – 2005 scheme also shows that the three dominant land use categories in Siak District (Forest land, Cropland, and Grassland) transformed into each other, forming a triangle of major land use transitions with reciprocal transitions. However, during 2005 – 2008 the transformations from Forest land to Cropland and Grassland (deforestation) occurred as one-way transitions, and reforestation did not count as a major land use transition.
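A transition probability matrix of the kind used for these schemes is commonly obtained by dividing each row of a transition area matrix by its row total. The sketch below illustrates that calculation under stated assumptions: the areas are hypothetical placeholders, not the thesis data, and the 0.05 threshold used to flag "major" transitions is an assumption made only for this illustration.

```python
def transition_probabilities(area):
    """Row-normalise a transition area matrix.

    area[src][dst] is the area that moved from category src (start year)
    to category dst (end year). Each row of the result sums to 1.
    """
    probs = {}
    for src, row in area.items():
        total = sum(row.values())
        probs[src] = {dst: (a / total if total else 0.0) for dst, a in row.items()}
    return probs


def major_transitions(probs, threshold):
    """List off-diagonal transitions whose probability is at least threshold."""
    return sorted(
        (src, dst, p)
        for src, row in probs.items()
        for dst, p in row.items()
        if src != dst and p >= threshold
    )


# Hypothetical areas in hectares (NOT the thesis data).
area_ha = {
    "Forest land": {"Forest land": 800.0, "Cropland": 120.0, "Grassland": 80.0},
    "Cropland": {"Forest land": 10.0, "Cropland": 450.0, "Grassland": 40.0},
    "Grassland": {"Forest land": 5.0, "Cropland": 95.0, "Grassland": 150.0},
}

probs = transition_probabilities(area_ha)
for src, dst, p in major_transitions(probs, threshold=0.05):  # assumed threshold
    print(f"{src} -> {dst}: {p:.2f}")
```

In this made-up example the diagonal entries are the largest in every row (the stable condition), and the transitions out of Forest land exceed the threshold while those back into Forest land do not, which mirrors the one-way pattern described for 2005 – 2008.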

In this research, the land use change model was developed under two scenarios: (1) using all significant variables determined by the MLR model analysis and (2) using observed variables determined by observation of existing conditions in the field. In the 1st scenario, the likelihood ratio tests for each independent variable showed that 24 of the 28 variables were significant variables of land use change in Siak District. Natural environment contributed 6 variables, human environment contributed 15 variables, and policy contributed 3 variables to the final model. The variables not included in the model were altitude, slope, distance from health service, and the area of sub-district. The two tests conducted for the final model, the likelihood ratio test and the pseudo R-squared, indicated that the final model of land use change in Siak District developed using the MLR model is a good model that could explain most of the variability of land use change in the research site. However, the spatial model validation for the 1st scenario indicated that the final model could not fit the actual spatial data layers to the actual land use change of 2005 – 2008.
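For reference, both tests mentioned above are functions of the log-likelihoods of an intercept-only model and the fitted model. The sketch below shows the likelihood ratio statistic and three commonly reported pseudo R-squared measures. The log-likelihood values and sample size are hypothetical, and which pseudo R-squared measures the thesis reports is not stated in this summary, so the selection here is an assumption.

```python
import math


def likelihood_ratio_statistic(ll_null: float, ll_model: float) -> float:
    """-2 * (LL_null - LL_model); compared against a chi-square distribution
    whose degrees of freedom equal the number of added parameters."""
    return -2.0 * (ll_null - ll_model)


def mcfadden_r2(ll_null: float, ll_model: float) -> float:
    """McFadden pseudo R-squared: 1 - LL_model / LL_null."""
    return 1.0 - ll_model / ll_null


def cox_snell_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Cox and Snell pseudo R-squared: 1 - (L_null / L_model) ** (2 / n)."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Cox and Snell value rescaled so that its maximum is 1."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


# Hypothetical values for illustration only (not taken from the thesis).
ll_null, ll_model, n = -1000.0, -400.0, 1000

lr = likelihood_ratio_statistic(ll_null, ll_model)
print(f"Likelihood ratio statistic: {lr:.1f}")
print(f"McFadden R2:   {mcfadden_r2(ll_null, ll_model):.3f}")
print(f"Cox-Snell R2:  {cox_snell_r2(ll_null, ll_model, n):.3f}")
print(f"Nagelkerke R2: {nagelkerke_r2(ll_null, ll_model, n):.3f}")
```

A large likelihood ratio statistic and a high pseudo R-squared describe how well the model fits the sampled points statistically; neither says anything about how well the resulting probability maps match observed change spatially, which is the distinction this summary draws.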

In the 2nd scenario, the observed variables included in the MLR model analysis were the existence of crop and timber plantations, the existence of the road network, and the spatial plans at national, province, and district level. The likelihood ratio tests for each observed variable showed that all observed variables may be considered significant variables of land use change in Siak District and would be included in the final model. The two tests conducted for the final model, the likelihood ratio test for the final model and the pseudo R-squared, indicated that the final model of land use change in Siak District is a good model that can explain most of the variability of land use change in the research site. However, as in the 1st scenario, the statistical properties produced from model validation showed that the final model could not fit the actual spatial data layers to the actual land use change of 2005 – 2008.

The two scenarios run with the statistical MLR model indicate that the land use change models developed are adequate models that can explain much of the variability of the land use transitions. However, the spatial model validations indicate that the final models could not fit the actual spatial data layers to the actual land use change of 2005 – 2008. This result may be caused by the nature of the MLR model as a generalized logistic regression model, which forces every land use transition to be driven by all the significant parameters that have been determined, while in the real world each land use transition probably has a unique combination of parameters that drives it.
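The point about shared parameters can be seen in the general form of a multinomial logistic model, where the same predictor vector enters the linear score of every non-reference outcome. The sketch below is a generic illustration of that form, not the thesis implementation: the outcome labels, the choice of reference category, the two predictors, and all coefficient values are hypothetical.

```python
import math


def mlr_probabilities(x, coefficients, reference):
    """Outcome probabilities of a multinomial logistic model.

    x            -- list of predictor values for one location
    coefficients -- {outcome: (intercept, [beta_1, ..., beta_k])} for every
                    non-reference outcome; each list has one beta per predictor
    reference    -- label of the reference outcome (its score is fixed at 0)
    """
    scores = {reference: 0.0}
    for outcome, (intercept, betas) in coefficients.items():
        if len(betas) != len(x):
            raise ValueError(f"{outcome}: expected {len(x)} coefficients, got {len(betas)}")
        # Every outcome uses the SAME predictors x, only the betas differ.
        scores[outcome] = intercept + sum(b * v for b, v in zip(betas, x))
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {outcome: math.exp(s - top) for outcome, s in scores.items()}
    total = sum(weights.values())
    return {outcome: w / total for outcome, w in weights.items()}


# Hypothetical predictors: [distance to road in km, inside plantation area (0/1)].
x = [2.0, 1.0]
# Hypothetical coefficients (NOT estimates from the thesis).
coefficients = {
    "Forest land -> Cropland": (-1.0, [-0.4, 1.5]),
    "Forest land -> Grassland": (-2.0, [-0.2, 0.3]),
}
probs = mlr_probabilities(x, coefficients, reference="Forest land -> Forest land")
for outcome, p in probs.items():
    print(f"{outcome}: {p:.3f}")
```

Because every outcome carries a coefficient for every predictor, a variable that matters for only one transition still appears in the equations of all the others.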

Considering the research findings on the performance of the MLR model, binary logistic regression is recommended for future research in order to develop an adequate land use change model that is good both statistically and spatially. Binary logistic regression would find the best-fit model for each land use transition individually, by considering the unique combination of parameters involved in each transition's model. Binary logistic regression may then produce conditional probability maps of land use transitions that cover the whole or most of the research area, increase the probability values for each projected land use transition, and narrow the range and distribution of the probability values of each projected land use transition compared to its actual land use transition.
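The recommended alternative can be pictured as one binary logistic model per land use transition, each with its own subset of predictors. The sketch below only evaluates such models for a single location; it does not fit them. The variable names, the subsets chosen for each transition, and all coefficient values are hypothetical and are used purely for illustration.

```python
import math


def logistic_probability(values, intercept, betas):
    """Probability from a binary logistic model.

    values -- {variable name: value} for one location (may hold extra variables)
    betas  -- {variable name: coefficient}; only these variables enter the model
    """
    z = intercept + sum(beta * values[name] for name, beta in betas.items())
    return 1.0 / (1.0 + math.exp(-z))


# One model per transition, each with its OWN predictor subset
# (hypothetical names and coefficients, not results from the thesis).
models = {
    "Forest land -> Cropland": (
        -1.0,
        {"distance_to_road_km": -0.4, "inside_plantation_area": 1.5},
    ),
    "Forest land -> Grassland": (
        -2.0,
        {"distance_to_road_km": -0.2},
    ),
}

# Predictor values for one hypothetical location.
cell = {"distance_to_road_km": 2.0, "inside_plantation_area": 1.0, "slope_percent": 3.0}

probabilities = {
    transition: logistic_probability(cell, intercept, betas)
    for transition, (intercept, betas) in models.items()
}
for transition, p in probabilities.items():
    print(f"{transition}: {p:.3f}")
```

Unlike the multinomial case, these probabilities come from separate models and are not constrained to sum to 1 across transitions, so they would need to be compared or normalised before being combined into a single projected land use map.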

Another issue considered in this research is that the effects of the predictor variables/parameters of the MLR model on the dependent variable can only be interpreted as direct effects when the other predictor variables are held constant. Millington et al. (2007) proposed that the Hierarchical Partitioning (HP) method might be chosen to address that issue. HP may reveal the effects of the predictor variables/parameters on land use change, both independently and in conjunction with all other variables.

Based on the land use change schemes of 2002 – 2008, the situation that Siak District should highlight and consider is the increasing probability of deforestation as one of the major land use transitions during 2002 – 2008. This situation was worsening, since reforestation was not carried out significantly and did not appear as a major land use transition in the land use change scheme 2005 – 2008. If this situation continues, the Forest land in Siak District, the majority of which is peatland forest, may continue to decline and could eventually be exhausted and replaced by Cropland, the fastest-growing land use category during 2002 – 2008. Synchronized spatial plans among the different administrative levels (national, province, and district) may prevent undesirable land use change and support sustainable natural resources management in Siak District.



Copyright © 2011, Bogor Agricultural University. Copyright is protected by law.

1. It is prohibited to cite all or part of this thesis without referring to and mentioning the source:

a. Citation is only permitted for the sake of education, research, scientific writing, report writing, critical writing, or reviewing a scientific problem.

b. Citation must not harm the name and honor of Bogor Agricultural University.

2. It is prohibited to republish and reproduce all or part of this thesis without written permission from Bogor Agricultural University.



LAND USE CHANGE MODELING

IN SIAK DISTRICT, RIAU PROVINCE, INDONESIA

USING MULTINOMIAL LOGISTIC REGRESSION

CHANDRA IRAWADI WIJAYA

A thesis submitted for the Degree of Master of Science in the Information Technology for Natural Resources Management Study Program

GRADUATE SCHOOL

BOGOR AGRICULTURAL UNIVERSITY BOGOR



External Examiner: Dr. Suria D. Tarigan



Research Title : Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression

Name : Chandra Irawadi Wijaya

Student ID : G051070051

Study Program : Master of Science in Information Technology for Natural Resource Management

Approved by,

Advisory Board

Dr. Ir. Hartrisari Hardjomidjojo, DEA (Supervisor)

Prof. Dr. Ir. Lilik Budi Prasetyo, M.Sc (Co-Supervisor)

Endorsed by,

Program Coordinator: Dr. Ir. Hartrisari Hardjomidjojo, DEA

Dean of Graduate School: Dr. Ir. Dahrul Syah, M.Agr.Sc.

Date of Examination:

Date of Graduation:



ACKNOWLEDGEMENT

First of all, I would like to express my gratitude for the blessings and mercy of God Almighty, who gave me strength while I was studying at Bogor Agricultural University until this thesis could be completed.

I would like to express my great appreciation to Dr. Ir. Hartrisari Hardjomidjojo, DEA and Dr. Ir. Lilik Budi Prasetyo, M.Sc as my supervisors, and to Dr. Suria D. Tarigan as the external examiner, for their guidance, especially during the development of this thesis research. Thank you to all lecturers of the Master of Science in Information Technology for Natural Resources Management program, Bogor Agricultural University (MIT-IPB), for sharing their valuable knowledge during the courses, and to the staff members of MIT-IPB, who always helped to create a conducive environment on the MIT-IPB campus. I would like to express my high appreciation to Dr. Yuji Murayama, my supervisor at the University of Tsukuba, for his guidance during the exchange research student program; to the SATO Foundation for the scholarship during my stay in Japan; and to all members of the Spatial Information Sciences Laboratory, University of Tsukuba, for sharing knowledge, friendship, and togetherness. Thank you to Tropenbos International Indonesia Programme and to the Siak District Government and its staff, who supported the fieldwork in Siak District and shared data and information.

I would like to express my deep appreciation to all my family members, in particular my beloved father (Widji Soekirno), my beloved mother (Sunarni), and my gorgeous sisters (Mbak Ellyn and Mbak Ine), for their patience, continuous support, and encouragement. Thank you to all my family in Bandung, Semarang, Solo, and Banjarnegara for their kind motivation.

I would like to thank my classmates in the MIT-IPB class of 2007 and all my friends at MIT-IPB for their pleasant friendship, hard work, and cooperation. Thank you to all my friends in Bogor, Tsukuba, and Tokyo, and also to my colleagues at ICRAF and CIFOR, for your friendship and motivation. Last but not least, thank you to all my colleagues who could not be mentioned here for your kind support.



CURRICULUM VITAE

Chandra Irawadi Wijaya was born in Bogor, West Java, Indonesia, on September 18th, 1982. He graduated from Bogor Agricultural University, Faculty of Forestry, Department of Forest Resources Conservation, in 2005. He enrolled as a private student in the Master of Science in Information Technology for Natural Resources Management program, Bogor Agricultural University, in 2007, and was also an exchange research student at the University of Tsukuba, Japan, from September 2009 to August 2010. He completed his master's study at Bogor Agricultural University in 2011 with a final thesis titled “Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression”.



TABLE OF CONTENTS

Table of Contents

List of Tables

List of Figures

List of Appendices

I. INTRODUCTION

1.1. Background

1.2. Objectives

1.3. Output

II. LITERATURE REVIEW

2.1. Representing Land Use Change

2.1.1. Land Use Categories

2.1.2. Land Use Change Detection

2.2. Remotely Sensed Image Classification

2.2.1. Remote Sensing

2.2.2. Image Pre-processing

2.2.3. Image Processing

2.3. Land Use Change Modeling

2.4. Multinomial Logistic Regression Model

2.4.1. Fitting The Multinomial Logistic Regression Model

2.4.2. Significance Tests for The Coefficients

2.5. Spatial Logistic Regression

2.6. Accuracy Assessment

III. METHODS

3.1. Time and Research Location

3.2. Data Source

3.4. Methods

3.4.1. Field Data Collection

3.4.2. Land Use Classification

3.4.3. Land Use Change Detection

3.4.4. Land Use Change Modeling

3.5. Assumption of Research Study

IV. RESULT AND DISCUSSION

4.1. Land Use Classification

4.1.1. Image Pre-processing

4.1.2. Image Processing

4.1.3. Land Use Categories of Siak District

4.2. Land Use Change Detection

4.2.1. Land Use Change 2002 – 2005

4.2.2. Land Use Change 2005 – 2008

4.3. Land Use Change Representation

4.3.1. Land Use Transition Probability

4.3.2. Land Use Change Schemes

4.4. Land Use Change Model and Significant Variables

4.4.1. Data Preparation

4.4.2. MLR Model using All Significant Variables

4.4.3. MLR Model using Observed Variables

4.5. Research Findings on Land Use Change Modeling using MLR Model

V. CONCLUSION

5.1. Conclusion

5.2. Recommendation



LIST OF TABLES

Table 1. Top-level Land Use Categories

Table 2. Stepwise Method for Building MLR Model

Table 3. Organization of the data

Table 4. Accuracy Assessment Requirement

Table 5. Adequate MLR Model Requirement

Table 6. Accuracy Assessment Report

Table 7. The Area of Land Use Categories in Siak District

Table 8. Coding Procedure of Land Use Transition Matrix

Table 9. Transition Area Matrix 2002 – 2005

Table 10. Transition Area Matrix 2005 – 2008

Table 11. Transition Probability Matrix 2002 – 2005

Table 12. Transition Probability Matrix 2005 – 2008

Table 13. Data Layers Used in This Research

Table 14. Number of Sampling Points for Each Land Use Transition

Table 15. Likelihood Ratio Tests for All Significant Variables

Table 16. Likelihood Ratio Test of the Final Model (1st Scenario)

Table 17. Pseudo R-Square Statistics (1st Scenario)

Table 18. MLR Model Equation

Table 19. The Parameter Estimates for Spatial Plan at different levels

Table 20. Likelihood Ratio Tests for Observed Variables

Table 21. Likelihood Ratio Test of the Final Model (2nd Scenario)



LIST OF FIGURES

Figure 1. Transfer characteristic of a radiation detector

Figure 2. The process and the elements involved in GRID’s logistic regression

Figure 3. The research location: Siak District, Riau Province

Figure 4. General flow chart of the research

Figure 5. The flow chart of field data collection

Figure 6. The flow chart of land use classification process

Figure 7. Illustration of Error Matrix and Mathematical Expression for Accuracy Assessment

Figure 8. The flow chart of land use change detection

Figure 9. The flow chart of land use change modeling: A. Data Preparation

Figure 10. The flow chart of land use change modeling: B. MLR Modeling

Figure 11. SLC-OFF Gap Filling

Figure 12. Geometric Correction

Figure 13. LANDSAT 7 ETM+ of Siak District

Figure 14. Land Use Classification Process

Figure 15. Post-Classification Process

Figure 16. Land Use Maps of Siak District

Figure 17. Land Use Change Map 2002 – 2005 (Snapshot)

Figure 18. The Graph of Land Use Transitions 2002 – 2005

Figure 19. Land Use Change Map 2005 – 2008 (Snapshot)

Figure 20. The Graph of Land Use Transitions 2005 – 2008

Figure 21. Land Use Change Scheme 2002 – 2005 in Siak District

Figure 22. Land Use Change Scheme 2005 – 2008 in Siak District

Figure 23. Crop Plantation (Cropland Category) in Siak District

Figure 24. Data Layers of Independent Variables: Natural Environment Theme

Figure 25. Data Layers of Independent Variables: Human Environment Theme

Figure 26. Data Layers of Independent Variables: Policy Theme

Figure 27. Spatial Distribution of Sampling Points

Figure 28. Conditional Probability Maps of Land Use Transitions (1st Scenario)

Figure 29. The Aggregation of Conditional Probability Maps of Land Use Transitions

Figure 30. The Statistical Properties of MLR Conditional Probability Values in Actual Land Use Change 2005 – 2008 (1st Scenario)

Figure 31. The Distribution of MLR Conditional Probability Values in Actual Land Use Change 2005 – 2008 (1st Scenario)

Figure 32. Development of new crop plantation and settlements stimulated by existing Cropland

Figure 33. The establishment of new road in Siak District threatens Forest land on the side of road

Figure 34. Shrub land (was Forest land) on the side of road allocated for new settlement area

Figure 35. Major land use transitions vs. the distance from road

Figure 36. Conditional Probability Maps of Land Use Transitions (2nd Scenario)

Figure 37. The Statistical Properties of MLR Conditional Probability Values in Actual Land Use Change 2005 – 2008 (2nd Scenario)

Figure 38. The Distribution of MLR Conditional Probability Values in Actual Land Use Change 2005 – 2008



LIST OF APPENDICES

Appendix 1. Land Use Categories

Appendix 2. Population Density and Spatial Plans Category

Appendix 3. Coefficients of the Significant Variables (β)

Appendix 4. Data Layers/Parameters 2005 (xi) for Model Validation

Appendix 5. The SPSS Outputs of MLR Model Analysis (1st Scenario)

Appendix 6. Coefficients of the Observed Variables (β)

Appendix 7. The SPSS Outputs of MLR Model Analysis (2nd Scenario)



I. INTRODUCTION

1.1 Background

Land use is clear evidence of human interaction with natural resources. It shows how humans manage and act on their natural resources. Good land use management may bring mutual benefits for both humans and natural resources, but on the other hand the interactions between humans and natural resources may also have undesirable consequences when the land is managed inappropriately. People who manage the land wisely consider its carrying capacity and do not push the land beyond it. Here, integrated and comprehensive land use planning plays a significant role in considering and measuring the carrying capacity of the land.

People's need for land may alter land uses, converting land from one type of use to others to accommodate human activities. Along with population growth, the need for land for settlement and crop plantation may displace several types of land use, such as forest land and agricultural land. In some cases, settlement and crop plantation development also alters other lands (for example wetlands and protected lands) which are not suitable to be developed for settlement and crop plantation. Besides population growth and demographic characteristics, infrastructure development (e.g. roads and market places) may also drive land use change in some areas. Biophysical characteristics of land, such as soil type, elevation, and slope, are also considered relevant factors that may influence people's decisions on how to use the land.

Conversion of land may impact soil, water, and climate, which are directly related to environmental issues. Land preparation for timber and crop plantation has caused soil degradation in many places in tropical areas. It also disturbs the water balance of the watershed, so that the rainy season can bring floods, whereas drought occurs in the dry season. Changes in land use also change carbon stocks in the pools, and the accumulation of carbon in the atmosphere causes global warming and, furthermore, climate change (Watson et al. 2000). Therefore, land use change is also closely related to climate change, and the relationship between them is interdependent: changes in land use may impact the climate, and conversely climatic change will also influence land use change in the future (Koomen et al. 2007).

Spatial plans at the national and local (province and district) levels act as guidelines for stakeholders in managing the land under their authority. Spatial plans arrange the use of land and the human activities that may be carried out in an area, which usually requires land conversion to match the use of land to human needs. The development of spatial plans should consider the future consequences of their implementation, so that the planned changes in land use bring prosperity for humans and also sustainability for natural resources. For the development of spatial plans, the stakeholders involved normally make use of models that simulate possible spatial developments. Such models can support the analysis of the causes and consequences of land use change, facilitate the understanding of the processes at hand, and help produce figures of possible future land use patterns (Verburg et al. 2004; Koomen et al. 2007).

Siak District was established in 1999 from some parts of Bengkalis District. In the last 10 years, Siak District as a new district has been developing its region in order to support people's activities and to reach the same level as other districts. The development conducted so far has altered land uses, converting land from one type of use to others. In this study, land use change modeling is developed for Siak District to facilitate the understanding of the land use change process and its relevant factors in the research site. Future land use might also be predicted by considering the existing conditions of the relevant factors which drive land use change. Furthermore, the predicted future land use data can be used as a basis for assessing the future consequences of land use change related to environmental issues, such as climate change, disaster risk management, and food security.



1.2 Objectives

The objectives of the research of Land Use Change Modeling in Siak District are:

1. to analyze the land use change during 2002 – 2005 and 2005 – 2008,

2. to develop the land use change schemes of Siak District,

3. to identify the driving factors of land use change and develop the land use change model of Siak District using Multinomial Logistic Regression (MLR) model, and

4. to examine the performance of MLR model in modeling the land use change.

1.3 Outputs

The expected outputs of this research are:

1. Land use change maps of Siak District during year 2002 – 2005 and 2005 – 2008,

2. The land use change schemes of Siak District,

3. Identified significant variables (driving factors) and the final model of land use change of Siak District, and

4. Research findings on MLR model performance in modeling the land use change.



II. LITERATURE REVIEW

2.1 Representing Land Use Change

2.1.1 Land Use Categories

Land use is clear evidence of human interaction with natural resources: it shows how people manage and act upon those resources. The use of land is normally reflected in its outward appearance (land cover), but this relation is more complex than is initially apparent. Land can simultaneously be used for different functions (e.g., agriculture and recreation) or locally have different main functions related to the same cover (e.g., nature reserve and wood production) (Koomen et al. 2007). Land cover refers to the physical appearance of the earth's surface, while land use relates to human activities in a certain land area (Lillesand and Kiefer 1993).

The Intergovernmental Panel on Climate Change (IPCC) has recognized that the names of land categories are a mixture of land cover (e.g., Forest land, Grassland, Wetlands) and land use (e.g., Cropland, Settlements) classes. For convenience, they are here referred to as land use categories. The Good Practice Guidance for Land Use, Land-Use Change and Forestry published by the IPCC defines the general characteristics of good practice approaches for representing land use categories (IPCC 2003):

1. The approaches should be adequate, i.e., capable of representing carbon stock changes and greenhouse gases emissions and removals and the relations between these and land use and land-use changes.

2. They should be consistent, i.e., capable of representing management and land-use change consistently over time, without being unduly affected either by artificial discontinuities in time series data or by effects due to interference of sampling data with rotational or cyclical patterns of land use (e.g., the harvest-regrowth cycle in forestry, or managed cycles of tillage intensity in cropland).

3. The approaches should be complete, which means that all land area within a country should be included, with increases in some areas balanced by decreases in others where this occurs in reality, and should recognize subsets of land used for estimation and reporting according to definitions agreed in the Marrakesh Accords for Parties to the Kyoto Protocol.

4. The approaches should be transparent, i.e., data sources, definitions, methodologies and assumptions should be clearly described.

Considering the general characteristics above, the IPCC has defined six land use categories that can be applied in most countries and that accommodate differences in national classification systems. The top-level land use categories are shown in Table 1 below.

Table 1. Top-level Land Use Categories

No. | Land Use Category | Description

1 | Forest land | This category includes all land with woody vegetation consistent with thresholds used to define forest land in the national GHG inventory, sub-divided into managed and unmanaged, and also by ecosystem type as specified in the IPCC Guidelines. It also includes systems with vegetation that currently fall below, but are expected to exceed, the threshold of the forest land category.

2 | Cropland | Includes arable and tillage land, and agro-forestry systems where vegetation falls below the thresholds used for the forest land category, consistent with the selection of national definitions.

3 | Grassland | Includes rangelands and pasture land that is not considered as cropland. It also includes systems with vegetation that fall below the thresholds used in the forest land category and are not expected to exceed, without human intervention, the threshold used in the forest land category. The category also includes all grassland from wild lands to recreational areas as well as agricultural and silvi-pastural systems.

4 | Wetlands | Includes land that is covered or saturated by water for all or part of the year (e.g., peatland) and that does not fall into the forest land, cropland, grassland or settlements categories. It includes reservoirs as a managed sub-division and natural rivers and lakes as unmanaged sub-divisions.

5 | Settlements | Includes all developed land, including transportation infrastructure and human settlements of any size, unless they are already included under other categories. This should be consistent with the selection of national definitions.

6 | Other lands | Includes bare soil, rock, ice, and all unmanaged land areas that do not fall into any of the other five categories. It allows the total of identified land areas to match the national area, where data are available.



2.1.2 Land Use Change Detection

Land use change is an important research area in global environmental change research and attracts broad attention, since it can produce significant ecological impacts on the environment (Chen et al. 2003; Fang et al. 2006). Change detection is the process of identifying differences in the state of an object or phenomenon by observing it at different times. Identifying the differences in land use between dates may be very useful to policy makers seeking to maintain and improve ecosystems (Singh 1989; Fang et al. 2006).
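One simple way to summarize the differences between two dates is to cross-tabulate two classified maps cell by cell and normalize each row into transition probabilities. The sketch below is only an illustration under stated assumptions: the maps are co-registered and flattened into lists of class labels, and the function name, the six-cell example, and the class codes (F, C) are hypothetical rather than taken from the thesis data.

```python
from collections import Counter

def transition_matrix(map_t1, map_t2, classes):
    """Cross-tabulate two co-registered class maps given as flat lists of labels."""
    if len(map_t1) != len(map_t2):
        raise ValueError("both maps must cover the same cells")
    counts = Counter(zip(map_t1, map_t2))
    matrix = {}
    for a in classes:
        row_total = sum(counts[(a, b)] for b in classes)
        # Row-normalise so that each row holds P(class at time 2 | class at time 1).
        matrix[a] = {b: (counts[(a, b)] / row_total if row_total else 0.0)
                     for b in classes}
    return matrix

# Hypothetical six-cell maps (F = Forest land, C = Cropland).
lu_2002 = ["F", "F", "F", "F", "C", "C"]
lu_2005 = ["F", "F", "F", "C", "C", "C"]
transitions = transition_matrix(lu_2002, lu_2005, ["F", "C"])
print(transitions["F"]["C"])  # share of Forest land cells that became Cropland
```

A row whose diagonal entry is close to 1 corresponds to a land use that is stable between the two dates.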

Land use change can be detected with remote sensing technology by extracting changes in the radiance values of multi-temporal satellite images. The basic premise in using remote sensing data for change detection is that changes in the object of interest will result in changes in radiance values or local texture that are separable from changes caused by other factors, such as differences in atmospheric conditions, illumination and viewing angle, soil moisture, etc. It may further be necessary to require that changes of interest be separable from expected or uninteresting events, such as seasonal, weather, tidal or diurnal effects (Deer 1995).

The U.S. Geological Survey's (USGS) Urban Dynamics Research (UDR) program treats the geographic understanding of land use change in urban areas as a key aspect of its work. The analysis requires understanding a region's land use history. Population data, timelines of historical events, and related information are all used to explain the mapped changes. Population data are correlated with the temporal database so that human movement can be tracked and factored into these interpretations. Population increases suggest economic growth and the availability of jobs in an area, and population declines suggest a decline in livability or economic issues that cause people to leave a region. Timelines of past events and other historical compilations aid in identifying the issues that affected the development of the region. In addition to gathering statistical and historical information, scientists must have a physiographic understanding of the place and its greater region. Topographic features, climate, and adequate supplies of water and other natural resources can limit or encourage growth and change (USGS 1999).



2.2 Remotely Sensed Image Classification

2.2.1 Remote Sensing

Remote sensing is the science and art of obtaining information about an object, area, or phenomenon through the analysis of data acquired by a device that is not in contact with the object, area, or phenomenon under investigation (Lillesand and Kiefer 1993). Earth observation by remote sensing is the interpretation and understanding of measurements made by airborne or satellite-borne instruments of electromagnetic radiation that is reflected from or emitted by objects on the Earth’s land, ocean, or ice surfaces or within the atmosphere, and the establishment of relationships between these measurements and the nature and distribution of phenomena on the Earth’s surface or within the atmosphere (Mather 2004).

The characteristics of imaging remote sensing instruments operating in the visible and infrared spectral region can be summarized in terms of their spatial, spectral and radiometric resolutions. The spatial resolution of an imaging system is not an easy concept to define, because it can be measured in a number of different ways, depending on the user’s purpose. Most sensors operating in the visible and infrared bands collect multispectral or multi-band images, which are sets of individual images that are separately recorded in discrete spectral bands, and the term spectral resolution refers to the width of these spectral bands measured in micrometers (μm) or nanometers (nm). Radiometric resolution or radiometric sensitivity refers to the number of digital quantization levels used to express the data collected by the sensor. In general, the greater the number of quantization levels, the greater the detail in the information collected by the sensor (Mather 2004).
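As a small illustration of radiometric resolution, the number of quantization levels can be related to the bit depth of the stored data. This is a sketch under the common assumption that pixel values are stored as unsigned integers of a fixed number of bits; the function name and the bit depths used are illustrative and do not describe a specific sensor used in this research.

```python
def quantization_levels(bits):
    """Number of distinct digital values available at a given bit depth."""
    return 2 ** bits

# Hypothetical bit depths for comparison.
for bits in (6, 8, 11):
    print(bits, "bits ->", quantization_levels(bits), "levels")
```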

An important principle underlying the use of remotely-sensed data is that different objects on the Earth’s surface and in the atmosphere reflect, absorb, transmit or emit electromagnetic energy in different proportions, and that such differences allow these components to be identified (Mather 2004). The great advantage of having remotely-sensed data available digitally is that it can be processed by computer, either for machine-assisted information extraction or for enhancement of its visual qualities in order to make it more interpretable by a human analyst (Richards and Jia 2006).

2.2.2 Image Pre-processing

Image pre-processing is concerned with correcting errors that occur in remotely sensed images. There are two common types of image pre-processing techniques, namely radiometric correction and geometric correction. Radiometric correction aims to remove errors in remotely-sensed data that result either from the presence of the atmosphere as a transmission medium through which radiation must travel from its source to the sensor, or from instrumentation effects (Richards and Jia 2006). Atmospheric correction may be a necessary pre-processing step when the analyst wants to compute a ratio of the values in two bands of a multispectral image, to relate upwelling radiance from a surface to some property of that surface in terms of a physically based model, or to compare results or ground measurements made at one time with results obtained at a later time (Mather 2004). Radiometric errors within a band and between bands may also be due to the design and operation of the sensor system, although these are normally ignored in comparison with band errors from atmospheric effects. An ideal radiation detector should have a linear transfer characteristic (radiation in, signal out), as shown in Figure 1.a, so that the signal increases and decreases in proportion to the detected radiation level (Richards and Jia 2006).


Figure 1. Transfer characteristic of a radiation detector: a. Ideal transfer characteristic, b. Hypothetical mismatches in detector characteristics in the same
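The linear transfer characteristic can be sketched numerically. The example below assumes that each detector is described by a gain and an offset relating the recorded digital number (DN) to radiance, and it shows how the signal of a mismatched detector could be re-expressed on the scale of a reference detector. The function names and all gain and offset values are hypothetical and are not calibration constants of any real sensor.

```python
def dn_to_radiance(dn, gain, offset):
    """Linear transfer characteristic: radiance = gain * DN + offset."""
    return gain * dn + offset

def match_detector(dn, gain, offset, ref_gain, ref_offset):
    """Re-express a detector's DN as if it had been recorded by the reference detector."""
    radiance = dn_to_radiance(dn, gain, offset)
    # Invert the reference detector's linear characteristic.
    return (radiance - ref_offset) / ref_gain

# Hypothetical detector (gain 0.5, offset 1.0) and reference detector (gain 1.0, offset 0.0).
print(dn_to_radiance(100, 0.5, 1.0))
print(match_detector(100, 0.5, 1.0, 1.0, 0.0))
```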



The transformation of a remotely sensed image so that it has the scale and projection properties of a map is called geometric correction or geo-referencing. A related technique, called registration, is the fitting of the coordinate system of one image to that of a second image of the same area. Accurate image registration is needed if a time sequence of images is used to detect changes in, for example, the land cover of an area (Mather 2004).
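Registration can be illustrated with an affine transformation estimated from control points that are identified in both images. The sketch below assumes exactly three non-collinear control points, so the six affine coefficients are determined exactly by Cramer's rule; practical work normally uses more points and a least-squares fit. All function names and coordinates are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        solution.append(det3(mc) / d)
    return solution

def fit_affine(src, dst):
    """Coefficients of x' = a*x + b*y + c and y' = d*x + e*y + f from 3 point pairs."""
    m = [[x, y, 1.0] for x, y in src]
    return solve3(m, [p[0] for p in dst]), solve3(m, [p[1] for p in dst])

def apply_affine(coeffs, x, y):
    (a, b, c), (d, e, f) = coeffs
    return a * x + b * y + c, d * x + e * y + f

# Hypothetical control points: image coordinates and their reference coordinates.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(10.0, 20.0), (12.0, 20.0), (10.0, 23.0)]
coeffs = fit_affine(src, dst)
print(apply_affine(coeffs, 1.0, 1.0))
```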

2.2.3 Image Processing

Image data available in digital form are quantized both spatially and radiometrically. Two approaches are usually used to extract information from digital image data: quantitative analysis, also called classification, and photo-interpretation, sometimes called visual image interpretation. Photo-interpretation is aided substantially if a degree of digital image processing is applied to the image data beforehand, while quantitative analysis depends for its success on information provided at key stages by an analyst (Richards and Jia 2006).

Photo-interpretation involves direct human interaction and therefore relies on high-level decisions. It is good for spatial assessment but poor in quantitative accuracy. Area estimation by photo-interpretation, for instance, involves planimetric measurement of regions identified visually, in which boundary definition errors will prejudice area accuracy. By contrast, quantitative analysis, which requires little human interaction, has poor spatial ability but high quantitative accuracy. Its high accuracy comes from the ability of a computer, if required, to process every pixel in a given image and to take account of the full range of spectral, spatial and radiometric detail present. Its poor spatial properties come from the relative difficulty with which decisions about shape, size, orientation and texture can be made using standard sequential computing techniques (Richards and Jia 2006).

In computer-based quantitative analysis, the attributes of each pixel (such as the spectral bands available) are examined in order to give the pixel a label which identifies it as belonging to a particular class of pixels of interest to the user (Richards and Jia 2006). The process of classification consists of two stages: recognition of categories of real-world objects and labeling of the classified entities (normally pixels). In the context of remote sensing of the land surface these categories could include, for example, woodlands, water bodies, grassland and other land cover types, depending on the geographical scale and nature of the study. In digital image classification the labels are numerical, so that a group of pixels that are recognized as belonging to the class ‘water’ may be given the label ‘1’, ‘woodland’ may be labeled ‘2’, and so on (Mather 2004).

Several methods are used in image classification, but they can generally be categorized as unsupervised or supervised classification. Unsupervised classification is an analytical procedure based on clustering algorithms. Clustering partitions the image data in multispectral space into a number of spectral classes and then labels all pixels of interest as belonging to one of those spectral classes. After this segmentation of the multispectral space, the analyst associates the resulting clusters with ground cover types (Richards and Jia 2006).
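The clustering step can be illustrated with a minimal k-means routine. This is a sketch under stated assumptions, not the algorithm used in this research: pixels are given as tuples of band values, the cluster centres are seeded with the first k pixels, and the pixel values are hypothetical.

```python
def kmeans(pixels, k, iterations=20):
    """Cluster pixels (tuples of band values) into k spectral classes."""
    centres = [list(p) for p in pixels[:k]]  # assumption: seed with the first k pixels
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: give each pixel the label of its nearest centre.
        for i, p in enumerate(pixels):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            labels[i] = dists.index(min(dists))
        # Update step: move each centre to the mean of its member pixels.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centres[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centres

# Hypothetical two-band pixel values forming a dark and a bright group.
pixels = [(10, 12), (80, 85), (11, 13), (82, 88), (9, 11), (79, 84)]
labels, centres = kmeans(pixels, k=2)
print(labels)
```

The numeric labels returned here are spectral classes only; naming them as ground cover types remains the analyst's task.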

Supervised classification methods are based on external knowledge of the area shown in the image. Unlike some of the unsupervised methods, supervised methods require some input from the user before the chosen algorithm is applied. This input may be derived from fieldwork, air photo analysis, reports, or from the study of appropriate maps of the area of interest. In the main, supervised methods are implemented by using either statistical or neural algorithms. Statistical algorithms use parameters derived from sample data in the form of training classes, such as the minimum and maximum values on the features, or the mean values of the individual clusters, or the mean and variance–covariance matrices for each of the classes. Neural methods do not rely on statistical information derived from the sample data but are trained on the sample data directly. This is an important characteristic of neural methods of pattern recognition, for these methods make no assumptions concerning the frequency distribution of the data. In contrast, statistical methods such as the maximum likelihood procedure are based on the assumption that the frequency distribution for each class is multivariate normal in form. Thus, statistical methods are said to be parametric (because they use statistical parameters derived from training data) whereas neural methods are non-parametric (Mather 2004).
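The maximum likelihood procedure can be sketched for the two-band case, where the inverse of the variance-covariance matrix has a simple closed form. The example assumes equal prior probabilities for all classes; the function names, class names and training pixel values are hypothetical.

```python
import math

def class_stats(samples):
    """Mean vector and 2x2 variance-covariance matrix of two-band training pixels."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), (c11, c12, c22)

def log_density(pixel, mean, cov):
    """Log of the bivariate normal density at the pixel."""
    c11, c12, c22 = cov
    det = c11 * c22 - c12 * c12
    d1, d2 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (c22 * d1 * d1 - 2.0 * c12 * d1 * d2 + c11 * d2 * d2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * maha

def classify(pixel, training):
    """Assign the class whose fitted normal density is highest at the pixel."""
    stats = {name: class_stats(samples) for name, samples in training.items()}
    return max(stats, key=lambda name: log_density(pixel, *stats[name]))

# Hypothetical training classes with two spectral bands each.
training = {
    "water": [(10, 5), (12, 6), (11, 4), (9, 7)],
    "forest": [(40, 60), (42, 63), (38, 58), (41, 65)],
}
print(classify((11, 6), training), classify((39, 60), training))
```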

2.3 Land Use Change Modeling

A model is a simplification of a real-world system, while a system is a mechanism in which various components interact in such a way as to perform a function in the real world (Handoko 2005; Shenk and Franklin (eds) 2001). The purpose of using a model is to make the behavior of the system easier to understand by simplifying its processes. There are three objectives for constructing a model, namely: (1) to understand the process, (2) to make predictions, and (3) to support management (Handoko 2005).

Models can be categorized into three classes: theoretical, empirical (statistical), and decision-theoretical. Theoretical models are developed to suggest mechanisms and thus lead to predictions even before data are collected. They are used to investigate the system responses and trajectories that are possible under specific hypotheses. These uses do not include comparison of model predictions with data or observation. Statistical models, by contrast, are used to make inferences from data. Statistical models may also be used to test hypotheses, which may require the complementary skills of theoreticians and empiricists. Decision-theoretical models can be used to indicate which decisions are likely to meet management objectives in the face of uncertainty and system dynamics. In decision-theoretical models, scientific models are used to project the consequences of hypotheses about how a system behaves in order to derive wise, or even optimal, management actions. Models are used to project the system’s responses to the various management actions that could be employed, in order to assist in deciding which action is most appropriate (Shenk and Franklin (eds) 2001).

Models can be used to address specific issues in natural resource management, and this has led to the development of various disciplines. Population viability analysis and wildlife resource selection are examples of such modeling disciplines (Shenk and Franklin (eds) 2001). The strength of a modeling technique lies in its ability to model many variables, some of which may be on different measurement scales (Hosmer and Lemeshow 2000). Modeling land use change is another modeling discipline that has attracted many scientists around the world who study the causal relationship between land management and changes in land use. It studies changes in land use as a consequence of the land management that people carry out to fulfill their needs. A land use change model, once developed, may facilitate the understanding of the process of land use change and its driving factors (Verburg et al. 2004; Koomen et al. 2007).

2.4 Multinomial Logistic Regression Model

Land use change and its driving factors can be represented as binary, continuous, or categorical variables. There are several ways to model binary, continuous and categorical variables, and the most important model for categorical response data is the logistic regression model (Agresti 2002). The dependent variable of logistic regression can be binary or categorical, whereas its independent variables can be a mixture of continuous and categorical variables (Xie et al. 2005). The logistic regression model is used increasingly in a wide variety of applications, and it is also commonly used in modeling land use change (Agresti 2002; Verburg et al. 2004; Fang et al. 2006; Dewantara 2006; Koomen et al. 2007).

The goal of an analysis using the logistic regression method is the same as that of any model-building technique in statistics: to find the best fitting and most parsimonious, yet biologically reasonable model to describe the relationship between an outcome (dependent or response) variable and a set of independent (predictor or explanatory) variables. The outcome variable in logistic regression is binary or dichotomous. However, the model can be easily modified to handle the case where the outcome variable is nominal with more than two levels (Hosmer and Lemeshow 2000).

Modeling land use change may involve many factors or variables in the model, which will be referred to as the multivariable case. For this case, McFadden (1974) in Hosmer and Lemeshow (2000) proposed a modification of the logistic regression model and called it a discrete choice model. As a result, the model is frequently referred to as the discrete choice model in the business and econometric literature. It is called the multinomial, polychotomous or polytomous logistic regression model in the health and life sciences (Hosmer and Lemeshow 2000). The term multinomial is used in this research.

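The Multinomial Logistic Regression (MLR) model can accommodate an outcome variable with any number of levels. However, the details are most easily illustrated with three categories, coded 0, 1 and 2, with Y = 0 as the reference category. To develop the model, assume we have p covariates and a constant term, denoted by the vector x of length p + 1, where x0 = 1. The two logit functions for this model are

$$g_1(x) = \ln\left[\frac{P(Y = 1 \mid x)}{P(Y = 0 \mid x)}\right] = \beta_{10} + \beta_{11}x_1 + \beta_{12}x_2 + \dots + \beta_{1p}x_p = x'\beta_1$$

$$g_2(x) = \ln\left[\frac{P(Y = 2 \mid x)}{P(Y = 0 \mid x)}\right] = \beta_{20} + \beta_{21}x_1 + \beta_{22}x_2 + \dots + \beta_{2p}x_p = x'\beta_2$$

Equation 1. MLR Model Equation: Logit Functions

It follows that the conditional probabilities of each outcome category given the covariate vector are

$$P(Y = 0 \mid x) = \frac{1}{1 + e^{g_1(x)} + e^{g_2(x)}}$$

$$P(Y = 1 \mid x) = \frac{e^{g_1(x)}}{1 + e^{g_1(x)} + e^{g_2(x)}}$$

$$P(Y = 2 \mid x) = \frac{e^{g_2(x)}}{1 + e^{g_1(x)} + e^{g_2(x)}}$$

Equation 2. MLR Model Equation: Conditional Probability of Each Outcome Category

Following the convention for the binary model, let πj(x) = P(Y = j|x) for j = 0, 1, 2. Each probability is a function of the vector of 2(p + 1) parameters β' = (β'1, β'2).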
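The three conditional probabilities can be computed directly from the two logit functions. The sketch below assumes the standard baseline-category form with category 0 as the reference; the function name and the coefficient values are hypothetical and are not estimates from this research.

```python
import math

def mlr_probabilities(x, beta1, beta2):
    """P(Y = j | x) for j = 0, 1, 2; x includes the constant term x0 = 1."""
    g1 = sum(b * v for b, v in zip(beta1, x))  # logit of category 1 versus category 0
    g2 = sum(b * v for b, v in zip(beta2, x))  # logit of category 2 versus category 0
    denom = 1.0 + math.exp(g1) + math.exp(g2)
    return [1.0 / denom, math.exp(g1) / denom, math.exp(g2) / denom]

# Hypothetical coefficients for a constant term and p = 2 covariates.
x = [1.0, 0.5, 2.0]
beta1 = [0.2, -1.0, 0.3]
beta2 = [-0.5, 0.8, 0.1]
pi = mlr_probabilities(x, beta1, beta2)
print(pi, sum(pi))
```

When all coefficients are zero, both logits are zero and each category receives a probability of one third.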



2.4.1 Fitting The Multinomial Logistic Regression Model

The most popular method for parameter estimation is maximum likelihood estimation. The method of maximum likelihood produces values for the unknown parameters which maximize the probability of obtaining the observed set of data. In order to apply the maximum likelihood method, the likelihood function should be constructed. The likelihood function expresses the probability of the observed data as a function of the unknown parameters (Hosmer and Lemeshow 2000). The likelihood function for a sample of n independent observations in MLR is

$$l(\beta) = \prod_{i=1}^{n}\left[\pi_0(x_i)^{y_{0i}}\,\pi_1(x_i)^{y_{1i}}\,\pi_2(x_i)^{y_{2i}}\right]$$

where y_{ji} = 1 if observation i falls in outcome category j and y_{ji} = 0 otherwise, so that y_{0i} + y_{1i} + y_{2i} = 1 for every observation.

The principle of maximum likelihood states that we use as our estimate of β the value which maximizes the likelihood function. However, it is easier mathematically to work with the log of the likelihood function. The log-likelihood, L(β), is defined as

$$L(\beta) = \sum_{i=1}^{n}\left[y_{1i}\,g_1(x_i) + y_{2i}\,g_2(x_i) - \ln\left(1 + e^{g_1(x_i)} + e^{g_2(x_i)}\right)\right]$$

where g_1(x_i) and g_2(x_i) are the two logit functions evaluated at x_i.

The likelihood equations are found by taking the first partial derivatives of L(β) with respect to each of the 2(p + 1) unknown parameters. The general form of these equations is:

$$\frac{\partial L(\beta)}{\partial \beta_{jk}} = \sum_{i=1}^{n} x_{ki}\left(y_{ji} - \pi_{ji}\right)$$

for j = 1, 2 and k = 0, 1, 2, ..., p, with π_{ji} = π_j(x_i) and x0i = 1 for each subject.

The maximum likelihood estimator, $\hat{\beta}$, is obtained by setting these equations equal to zero and solving for β. The equations have no closed-form solution, so the estimates are obtained by iterative computation (Hosmer and Lemeshow 2000).
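The iterative computation can be illustrated with plain gradient ascent on the log-likelihood, using the gradient form in which each coefficient is updated by the sum of x_ki (y_ji − π_ji). This is only a sketch: statistical packages typically use Newton-type iterations, and the learning rate, number of steps and the nine-observation data set below are hypothetical.

```python
import math

def probs(x, betas):
    """P(Y = j | x) for j = 0..J with category 0 as the reference."""
    g = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(v) for v in g)
    return [1.0 / denom] + [math.exp(v) / denom for v in g]

def log_likelihood(X, y, betas):
    return sum(math.log(probs(x, betas)[yi]) for x, yi in zip(X, y))

def fit_mlr(X, y, n_cat=3, rate=0.05, steps=2000):
    """Gradient ascent on the log-likelihood (illustrative, not Newton-Raphson)."""
    p1 = len(X[0])
    betas = [[0.0] * p1 for _ in range(n_cat - 1)]
    for _ in range(steps):
        grads = [[0.0] * p1 for _ in range(n_cat - 1)]
        for x, yi in zip(X, y):
            pi = probs(x, betas)
            for j in range(1, n_cat):
                resid = (1.0 if yi == j else 0.0) - pi[j]  # y_ji - pi_ji
                for k in range(p1):
                    grads[j - 1][k] += x[k] * resid
        for j in range(n_cat - 1):
            for k in range(p1):
                betas[j][k] += rate * grads[j][k] / len(X)
    return betas

# Hypothetical data: constant term plus one covariate, three outcome categories.
X = [[1.0, float(v)] for v in range(9)]
y = [0, 0, 1, 0, 1, 2, 1, 2, 2]
betas = fit_mlr(X, y)
print(log_likelihood(X, y, [[0.0, 0.0], [0.0, 0.0]]), log_likelihood(X, y, betas))
```

The second printed value should be larger than the first, since the fitted coefficients raise the log-likelihood above that of the all-zero starting point.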

The r-squared statistic, which measures the proportion of variability in the dependent variable that is explained by a linear regression model, cannot be computed for multinomial logistic regression models. Pseudo r-squared statistics are designed to have properties similar to those of the true r-squared statistic. Larger pseudo r-squared values indicate that more of the variation is explained by the model, to a maximum of 1. There are three pseudo r-squared statistics: Cox and Snell, Nagelkerke, and McFadden. The following equations are the formulas for the pseudo r-squared statistics in the logistic regression model (Tabachnick and Fidell 2006).

Cox and Snell:

$$R^2_{CS} = 1 - \exp\left[-\frac{2}{n}\left(LL(B) - LL(0)\right)\right]$$

Nagelkerke:

$$R^2_{N} = \frac{R^2_{CS}}{R^2_{MAX}}, \qquad R^2_{MAX} = 1 - \exp\left[\frac{2}{n}\,LL(0)\right]$$

McFadden:

$$R^2_{McF} = 1 - \frac{LL(B)}{LL(0)}$$

where LL(B) is the log-likelihood of the final model, LL(0) is the log-likelihood of the constant-only model, and n is the sample size.

Equation 3. Pseudo R-Squared
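The three statistics can be computed from the two log-likelihoods and the sample size. The sketch below follows the commonly published forms of the Cox and Snell, Nagelkerke and McFadden statistics; the function name and the input values are hypothetical.

```python
import math

def pseudo_r_squared(ll_final, ll_const, n):
    """Cox and Snell, Nagelkerke and McFadden pseudo r-squared statistics."""
    cox_snell = 1.0 - math.exp(-(2.0 / n) * (ll_final - ll_const))
    max_cox_snell = 1.0 - math.exp((2.0 / n) * ll_const)  # upper bound of Cox and Snell
    nagelkerke = cox_snell / max_cox_snell
    mcfadden = 1.0 - ll_final / ll_const
    return cox_snell, nagelkerke, mcfadden

# Hypothetical log-likelihoods: LL(B) = -50, LL(0) = -100, with n = 100 observations.
cs, nk, mcf = pseudo_r_squared(-50.0, -100.0, 100)
print(round(cs, 4), round(nk, 4), round(mcf, 4))
```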

2.4.2 Significance Tests for The Coefficients

The MLR analysis can be carried out with several methods for selecting the important variables for building a model. A popular choice is the stepwise method, a procedure for the selection or deletion of variables from a model based on a statistical algorithm that checks for the importance of variables, and either includes or excludes them on the basis of a fixed decision rule. The importance of a variable is defined in terms of a measure of the statistical significance of the coefficient for the variable (Hosmer and Lemeshow 2000). Four variants of the stepwise method are provided in the SPSS software, as shown in Table 2.

The SPSS Logistic Regression procedure offers forward or backward statistical regression, either of which can be based on the likelihood ratio or the Wald statistic, with user-specified tail probabilities (Tabachnick and Fidell 2006). The likelihood ratio test, or G-test, is used to test the joint significance of the parameters involved in the model. The likelihood ratio statistic is

$$G = -2\left[LL(0) - LL(B)\right]$$

where LL(0) is the log-likelihood of the constant-only model and LL(B) is the log-likelihood of the fitted model, with the hypotheses of the test:

H0: β1 = β2 = β3 = … = βp = 0

H1: at least one βi is not equal to zero.

Furthermore, the Wald test is used to test the significance of each parameter βj individually, where j = 1, 2, 3, ..., p. The Wald statistic is

$$W_j = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$$

where $\hat{\beta}_j$ is the estimated coefficient and $SE(\hat{\beta}_j)$ is its standard error. Under the null hypothesis that βj = 0, the Wald statistic follows the standard normal distribution (Hosmer and Lemeshow 2000). Hauck and Donner (1977) in Hosmer and Lemeshow (2000) examined the performance of the Wald test and found that it behaved in an aberrant manner, often failing to reject the null hypothesis when the coefficient was significant. They recommended that the likelihood ratio test be used.
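Both test statistics are simple to compute once the log-likelihoods, coefficients and standard errors are available. The sketch below assumes the usual forms G = −2[LL(0) − LL(B)] and W = coefficient / standard error, and obtains the two-sided p-value of the Wald statistic from the standard normal distribution; the function names and input values are hypothetical.

```python
import math

def likelihood_ratio_g(ll_const, ll_final):
    """G statistic comparing the constant-only model with the fitted model."""
    return -2.0 * (ll_const - ll_final)

def wald_test(beta, se):
    """Wald statistic and its two-sided p-value under the standard normal distribution."""
    w = beta / se
    p_value = math.erfc(abs(w) / math.sqrt(2.0))  # P(|Z| > |w|) for standard normal Z
    return w, p_value

# Hypothetical values: LL(0) = -100, LL(B) = -90, coefficient 1.96 with standard error 1.0.
g = likelihood_ratio_g(-100.0, -90.0)
w, p = wald_test(1.96, 1.0)
print(g, w, round(p, 4))
```

The G statistic would then be compared with a chi-square distribution whose degrees of freedom equal the number of parameters being tested.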

Table 2. Stepwise Method for Building MLR Model

Stepwise Terms | Description

Forward entry | This method begins with no stepwise terms in the model. At each step, the most significant term is added to the model until none of the stepwise terms left out of the model would have a statistically significant contribution if added to the model.

Backward elimination | This method begins by entering all terms specified on the stepwise list into the model. At each step, the least significant stepwise term is removed from the model until all of the remaining stepwise terms have a statistically significant contribution to the model.

Forward stepwise | This method begins with the model that would be selected by the forward entry method. From there, the algorithm alternates between backward elimination on the stepwise terms in the model and forward entry on the terms left out of the model. This continues until no terms meet the entry or removal criteria.

Backward stepwise | This method begins with the model that would be selected by the backward elimination method. From there, the algorithm alternates between forward entry on the terms left out of the model and backward elimination on the stepwise terms in the model. This continues until no terms meet the entry or removal criteria.



2.5 Spatial Logistic Regression

Regression can be considered as a process of extracting the coefficients of empirical relationships from observations. Commonly used regression approaches include linear regression, log-linear regression and logistic regression. The dependent variable of logistic regression can be binary or categorical. The independent variables of logistic regression can be a mixture of continuous and categorical variables. A normality assumption is not needed for logistic regression. Hence, logistic regression is advantageous compared to linear regression and log-linear regression (Xie et al. 2005).

Logistic regression is an approach to extracting the coefficients of explanatory factors from observations of land use conversion. It is suitable because urbanization does not usually follow a normal distribution and its influential factors are usually a mixture of continuous and categorical variables. The spatial heterogeneity of spatial data should be considered when employing logistic regression to model land conversion. Spatial dependence and spatial sampling also have to be considered in logistic regression in order to reduce spatial auto-correlation. Otherwise, unreliable or inefficient parameter estimates and false conclusions regarding hypothesis tests will result. There are two fundamental approaches to dealing with spatial dependence: building a more complex model that incorporates an autoregressive structure, and designing a spatial sampling scheme that expands the distance interval between sampled sites. Spatial sampling leads to a smaller sample size, which loses certain information and conflicts with the large-sample requirement behind the asymptotic normality of maximum likelihood estimates, upon which logistic regression is based. Nevertheless, it is a sensible approach to reducing spatial auto-correlation, and a reasonable design of the spatial sampling scheme can strike a balance between the two sides (Xie et al. 2005).
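A spatial sampling scheme that enlarges the distance between sampled sites can be sketched as systematic sampling of grid cells at a fixed interval. This is a minimal illustration; the function name, grid size and interval are hypothetical, and the choice of interval in practice depends on the range of the spatial auto-correlation in the data.

```python
def systematic_sample(n_rows, n_cols, interval, start=0):
    """Row and column indices of cells sampled every `interval` cells in both directions."""
    return [(r, c)
            for r in range(start, n_rows, interval)
            for c in range(start, n_cols, interval)]

# Hypothetical 10 x 10 grid sampled every 5 cells.
sites = systematic_sample(10, 10, 5)
print(sites)
```

A larger interval lowers the dependence between neighbouring samples but also reduces the sample size, which is the trade-off described above.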

Data layer preparation is the most fundamental step in applying logistic regression in a spatial manner, and it is a time-consuming trial-and-error process. The dependent and independent variables used in modeling land use change have different data types (binary, continuous, or categorical) and spatial resolutions. Converting non-spatial data into spatial data is another part of data layer preparation in land use change modeling. In order to apply the logistic regression model in a spatial manner, Friesen and Lydon (1999) proposed the use of ARC/INFO’s GRID for data layer preparation. GRID’s logistic regression command allows the generation of probability surfaces for each of the land use types.

Figure 2. The process and the elements involved in GRID’s logistic regression (Friesen and Lydon 1999)

In logistic regression model, the presence or absence of the outcome variable, i.e., land use type, is predicted on the basis of the explanatory variables. Given input grids of these variables and a set of sample points indicating where a particular land use type is and is not found, an output grid can be generated in which each cell contains the probability value that the land use type will be found at that location (Friesen and Lydon 1999).
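The generation of a probability surface can be sketched as applying fitted logistic regression coefficients to each cell of the input grids. The example below is a plain Python illustration of the idea, not the ARC/INFO GRID command itself; the function name, the single 2 x 2 input grid, and the coefficient and intercept values are hypothetical.

```python
import math

def probability_surface(grids, coeffs, intercept):
    """Per-cell probability of a land use type from explanatory-variable grids."""
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    surface = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Linear predictor for this cell, then the logistic transformation.
            g = intercept + sum(b * grid[r][c] for b, grid in zip(coeffs, grids))
            row.append(1.0 / (1.0 + math.exp(-g)))
        surface.append(row)
    return surface

# Hypothetical grid of one explanatory variable (e.g., distance to road in km).
distance = [[0.0, 1.0], [2.0, 3.0]]
surface = probability_surface([distance], coeffs=[-1.0], intercept=1.0)
print(surface)
```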

2.6 Accuracy Assessment

Accuracy assessment determines the quality of a map created from remotely sensed data. Accuracy assessment can be qualitative or quantitative, expensive or inexpensive, quick or time consuming, well designed and efficient or haphazard. The purpose of quantitative accuracy assessment is the identification and measurement of map errors. Quantitative accuracy assessment involves the comparison of a site on a map against reference information for the same site (Congalton and Green 1999).

There are two types of map accuracy assessment: positional and thematic. Positional accuracy deals with the accuracy of the location of map features, and measures how far a spatial feature on a map is from its true or reference location on the ground (Bolstad 2005 in Congalton and Green 2009). Thematic accuracy deals with the labels or attributes of the features of a map, and measures whether the mapped feature labels are different from the true feature label (Congalton and Green 2009).

Accuracy assessments include three fundamental steps (Congalton and Green 2009): (1) designing the sample, (2) collecting data for each sample, and (3) analyzing the results. Each step must be rigorously planned and implemented. First, the accuracy assessment sampling procedures are designed, and the sample areas on the map are selected. We use sampling because time and funding limitations preclude the assessment of every spatial unit on the map. Next, information is collected from both the map and the reference data for each sample site. Thus, two types of information are collected from each sample:

• Reference accuracy assessment sample data: The position or class label of the accuracy assessment site, which is derived from data collected that are assumed to be correct.

• Map accuracy assessment sample data: The position or class label of the accuracy assessment site, which is derived from the map or image being assessed.

Third, the map and reference information are compared, and results of the comparison are analyzed for statistical significance and for reasonableness. In summary, effective accuracy assessment requires (1) design and implementation of unbiased sampling procedures, (2) consistent and accurate collection of sample data, and (3) rigorous comparative analysis of the sample map and reference data.
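The comparison step can be sketched as follows, assuming thematic (class label) accuracy and a small hypothetical set of sample sites. The passage above does not name specific statistics; overall accuracy and the kappa coefficient are used here only as two common summaries of an error matrix.

```python
from collections import Counter

def error_matrix(map_labels, reference_labels):
    """Count (map class, reference class) pairs over all sample sites."""
    return Counter(zip(map_labels, reference_labels))

def overall_accuracy(matrix):
    """Share of sample sites whose map label equals the reference label."""
    total = sum(matrix.values())
    correct = sum(n for (m, r), n in matrix.items() if m == r)
    return correct / total

def kappa(matrix):
    """Agreement between map and reference, corrected for chance agreement."""
    total = sum(matrix.values())
    observed = overall_accuracy(matrix)
    map_totals, ref_totals = Counter(), Counter()
    for (m, r), n in matrix.items():
        map_totals[m] += n
        ref_totals[r] += n
    expected = sum(map_totals[c] * ref_totals[c] for c in map_totals) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical labels for eight sample sites (not data from this study).
map_labels = ["Forest", "Forest", "Cropland", "Cropland",
              "Grassland", "Forest", "Cropland", "Grassland"]
reference_labels = ["Forest", "Forest", "Cropland", "Forest",
                    "Grassland", "Forest", "Cropland", "Cropland"]

matrix = error_matrix(map_labels, reference_labels)
accuracy = overall_accuracy(matrix)
kappa_value = kappa(matrix)
print(accuracy, round(kappa_value, 4))
```

In this toy sample six of the eight sites agree, so the overall accuracy is 0.75, while kappa is lower because part of that agreement would be expected by chance.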



III. METHODS

3.1 Time and Research Location

The research began with problem identification, a literature search, and the development of the research methodology, which together formed the research proposal. It continued with data collection, processing, and analysis, followed by the results and discussion and the conclusion. The research was conducted over 17 months, from March 2009 to July 2010. The research proposal was written and the data were collected during March – August 2009 at Bogor Agricultural University and in Siak District (Indonesia). Data processing and analysis, the writing of the results and discussion, and the conclusion were completed at the University of Tsukuba (Japan) during September 2009 – July 2010, under an exchange research student program. The detailed schedule is shown in Appendix 6.

Figure 3. The research location: Siak District, Riau Province

The research location is Siak District, which is located in Riau Province. Siak District was established in 1999 as an enlargement from parts of Bengkalis District. In this research, the land use change study in Siak District starts in 2002, three years after the district enlargement was completed administratively. The year 2002 was taken as the starting time because 1999 – 2002 was a transition period in which some activities in Siak District related to spatial planning and land management still followed the policies and land management plans of Bengkalis District as the parent district. Since 2002, Siak District has managed its territory independently.

Geographically, Siak is situated at 0°21’19.50” - 1°14’43.87” north latitude and 100°54’46.31” - 102°58’27.34” east longitude, at 0 - 110 m above sea level. Siak borders Bengkalis District to the north, Kampar District and Pekanbaru Municipality to the south, Rokan Hulu District to the west, and Bengkalis and Pelalawan Districts to the east. Siak is located in the Siak Watershed, with the Siak River as the main river. The landscape of Siak is mostly wetland, and only a small part in the west is hilly. The majority of Forest land in Siak District is Peatland Forest.

Based on the spatial analysis, Siak District has an area of about 868,117.82 ha, of which about 59% is allocated for crop and timber plantations and 9% for production forest. The Siak Government has also allocated land for other uses such as agriculture. There are two types of agricultural area in Siak: wetland agriculture and dry land agriculture. The main crop plantation commodities are rubber, oil palm, coconut, and coffee, which are managed by private companies and communities. Siak also has large oil and gas resources, managed by an international company, which contribute to local economic growth (Siak District Government 2008).

3.2 Data Source

The data required in this research were divided into two categories: spatial data and non-spatial data. Spatial data can be distinguished into two types: raster and vector data. The raster data used in this research were LANDSAT 7 ETM+ (LANDSAT) images from 2002, 2005, and 2008. The LANDSAT images were interpreted to produce land use category data for 2002, 2005, and 2008, which were used both as the object of the study (land use change) and as relevant factors of land use change. The other raster data used were the SRTM-DEM 90m of Siak District, which was processed to produce altitude and slope data. Vector data consisted of administration boundaries, land use/land cover data from previous years, infrastructure development such as roads and market places, the river map, spatial planning data, and other relevant vector data which might be useful for modeling land use change in Siak District.

Non-spatial data in this research served as supporting data and were considered as relevant factors that might drive land use change in Siak District. The non-spatial data used were demography data, such as population and population density, which might represent the socio-economic condition of the people living in the research area. The organization of the data is listed in Table 3.

Table 3. Organization of the data

No. | Data Category | Data Type | Data | Data Source
1 | Spatial Data | Raster | LANDSAT 7 ETM+ images year 2002, 2005, 2008 | http://glovis.usgs.gov/
  |  | Raster | SRTM-DEM 90m of Siak District | http://srtm.csi.cgiar.org/
  |  | Raster | Scanned map related with land cover/land use, spatial planning, infrastructure development, etc | Ministry of Forestry, Ministry of Public Work, Local Government, NGOs
  |  | Vector | Administration boundary | Ministry of Forestry, Statistics Indonesia, Local Government
  |  | Vector | Land use/land cover map of Siak District | Ministry of Forestry, NGOs
  |  | Vector | Spatial planning map of Riau Province and Siak District | Ministry of Forestry, Ministry of Public Work, Local Government
  |  | Vector | Road and River Network | Ministry of Forestry, Local Government, NGOs
  |  | Vector | Infrastructure development, etc | Ministry of Public Work, Local Government
2 | Non Spatial Data | Socio-economic | Demography data | Statistics Indonesia, Local Government

3.3 Hardware and Software

The following software and hardware were used in this research:

1. Software:

• ArcGIS 9.3, used for spatial data analysis and spatial model simulation

• Hawth's Analysis Tools extension, an additional tool for ArcGIS 9.3

• ERDAS Imagine 9.3, used for image processing

• SPSS version 19.0, used for statistical processing

2. Hardware:

• Notebook Apple MacBook (MB062ZP/A).



3.4 Methods

Land use change modeling was developed for Siak District to facilitate the understanding of the process of land use change and its relevant factors in the research site. The modeling was conducted in four main activities:

1. Field data collection,

2. Land use classification,

3. Land use change detection, and

4. Land use change modeling.

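Figure 4. The general flow chart of land use change modeling in Siak District: (1) field data collection, producing samples of land use/land cover; (2) land use classification of LANDSAT 7 ETM+ images (time series) into land use maps (time series); (3) land use change detection, producing land use change maps; (4) land use change modeling from the land use change maps and the relevant factors of land use change, producing the land use change model, the land use change driving factors, and the result of model validation.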



Field data collection was intended to collect the LANDSAT images of 2002, 2005, and 2008, the relevant driving factors of land use change, and sample data of land use/land cover in Siak District. The sample data comprised primary data, collected in the field using GPS, and secondary land use/land cover data originating from other institutions. These sample data were used as reference data in the land use classification activity. Non-spatial data were also collected during field data collection.

Land use classification was done in order to produce land use maps of Siak District. The land use maps were then processed in the land use change detection activity to produce land use change maps and the transition and probability matrices of land use change. The land use change maps were used as the dependent variable in logistic regression modeling to determine the land use change driving factors and to produce the land use change model. The general flow chart of land use change modeling in Siak District is shown in Figure 4.
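As an illustration of the change detection output, the sketch below derives a transition matrix and a transition probability matrix from two co-registered land use maps. The tiny grids and class codes (F for Forest land, C for Cropland, G for Grassland) are hypothetical, and the row-normalization shown is one common way to define transition probabilities, assumed here rather than taken from the thesis workflow.

```python
from collections import Counter

def transition_counts(map_t1, map_t2):
    """Count cells for every (from class, to class) pair of two co-registered maps."""
    counts = Counter()
    for row1, row2 in zip(map_t1, map_t2):
        counts.update(zip(row1, row2))
    return counts

def transition_probabilities(counts):
    """Divide each count by its 'from' class total, so each 'from' class sums to 1."""
    row_totals = Counter()
    for (src, _dst), n in counts.items():
        row_totals[src] += n
    return {(src, dst): n / row_totals[src] for (src, dst), n in counts.items()}

# Hypothetical 2 x 3 land use maps for two dates (not data from this study).
lu_2002 = [["F", "F", "C"],
           ["F", "C", "G"]]
lu_2005 = [["F", "C", "C"],
           ["F", "C", "C"]]

counts = transition_counts(lu_2002, lu_2005)
probabilities = transition_probabilities(counts)

for (src, dst), p in sorted(probabilities.items()):
    print(f"{src} -> {dst}: {p:.3f}")
```

The diagonal entries (for example F to F) give the probability that a class stays unchanged, which is how a "stable condition" would show up in such a matrix.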

3.4.1 Field Data Collection

Samples of land use/land cover in Siak District were collected in the field by using the Global Positioning System (GPS) to record their geographical coordinates. Field data collection used simple random sampling, in accordance with the defined land use/land cover targets, so that the samples may represent all land uses in Siak District. This activity also collected secondary data, such as LANDSAT images and land use/land cover data from previous years in the form of maps or coordinate points, which were used as references in the land use classification of 2002, 2005, and 2008.

In this field data collection, non-spatial data in the form of demography data were also collected, directly through interviews and indirectly through secondary data from related institutions. These data may represent the socio-economic condition of the people living in the research area and can be considered as relevant factors that drive land use change in Siak District.



Figure 5. The flow chart of field data collection

3.4.2 Land Use Classification

Time-series LANDSAT images from 2002, 2005, and 2008 were used in this research to generate time-series land use information for Siak District. The land use categories were generated by an image classification process applied to each LANDSAT image. To obtain a good classification result, the work was divided into two stages: image pre-processing and image processing.

Image pre-processing consists of processes that prepare image data for subsequent analysis by correcting or compensating for systematic errors. LANDSAT images delivered to the user may contain errors caused by atmospheric or sensor conditions at the time the data were captured from the earth surface. The Landsat 7 scan-line corrector (SLC), a mechanism designed to correct the undersampling of the primary scan mirror, failed on May 31, 2003. With the SLC now permanently turned off (SLC-OFF), the ETM+ is losing approximately 22% of the data due to the increased scan gap (Scaramuzza et al. 2004). Thus, in this research the LANDSAT images might need to be subjected to several corrections, such as SLC-OFF Gap Filling and geometric correction. The SLC-OFF Gap Filling process was done to each LANDSAT 7 ETM+ SLC-OFF



REFERENCES

Agresti, A. 2002. Categorical Data Analysis. Second Edition. New Jersey: John Wiley and Sons, Inc.

CCRS-NRC (Canada Centre for Remote Sensing - Natural Resources Canada). 2005. Glossary of Remote Sensing Terms. 588 Booth Street Ottawa, Ontario, K1A 0Y7. http://www.ccrs.nrcan.gc.ca/glossary/index_e.php?id=2965

Chen, J., Gong, P., He, C., Pu, R., and Shi, P. 2003. Land-Cover Change Detection Using Improved Change-Vector Analysis. Photogrammetric Engineering & Remote Sensing Vol. 69, No. 4, April 2003, pp. 369–379. American Society for Photogrammetry and Remote Sensing.

Congalton, R. G. and Green, K. 1999. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. CRC Press Inc.

Congalton, R. G. and Green, K. 2009. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. Second Edition. CRC Press Inc.

CSC-NOAA (Coastal Services Center - National Oceanic and Atmospheric Administration). 2010. Land Cover Analysis. NOAA Coastal Services Center, 2234 South Hobson Avenue Charleston, SC 29405-2413. http://www.csc.noaa.gov/crs/lca/faq_tech.html#q4.

Deer, P. 1995. Digital Change Detection Techniques in Remote Sensing. DSTO Electronics and Surveillance Research Laboratory. Commonwealth of Australia.

Dewantara, B. H. 2006. Performance of Logistic Regression Model and Spatial Method. Case: Predicting of Deforestation in Cikepuh Wildlife Reserve and Cibanteng Natural Reserve. Master Thesis. Bogor: Bogor Agriculture University.

Fang, S., Gertner, G. Z., and Anderson, A. B. 2006. Prediction of Multinomial Probability of Land Use Change using a bisection decomposition and logistic regression. Landscape Ecol (2007) 22: 419-430. Springer.

Friesen, P. and Lydon, B. 1999. Modeling Land Use Change: Using GRID to Develop Scenarios for Colorado Springs’ Comprehensive Plan. 1999 ESRI User Conference Proceedings. California: Environmental Systems Research Institute, Inc.

Handoko, I. 2005. Quantitative Modeling of System Dynamics for Natural Resource Management. Bogor: SEAMEO BIOTROP.

Hosmer, D. W. and Lemeshow, S. 2000. Applied Logistic Regression. Second Edition. New York: John Wiley & Sons, Inc.

IPCC (Intergovernmental Panel on Climatic Change). 2000. Land use, land-use change, and forestry. Watson, R.T., Noble I. R., Bolin, B., Ravindranath, N.H., Verardo, D. J., and Dokken D. J. (eds). A special report of the Intergovernmental panel on climatic change. Cambridge University Press, Cambridge.

Koomen, E., Rietveld, P., and De Nijs, T. 2007. Modelling land-use change for spatial planning support. Ann Reg Sci (2008) 42:1–10. Published online: 6 September 2007. Springer-Verlag.

Leica Geosystem. 2005. ERDAS Field Guide. Leica Geosystems Geospatial Imaging, LLC. Norcross, Georgia, USA.

Lillesand, T. M. and Kiefer, R. W. 1993. Remote Sensing and Image Interpretation. Second Edition. New York: John Wiley & Sons.

Mather, P. M. 2004. Computer Processing of Remotely-Sensed Images: An Introduction. Third Edition. Nottingham: John Wiley.

Millington, J. D. A., Perry, George L. W., and Romero-Calcerrada, R. 2007. Regression Techniques for Examining Land Use/Cover Change: A Case Study of Mediterranean Landscape. Ecosystem (2007) 10: 562-578. Springer Science+Business Media, LLC.

Richards, J. A. and Jia, X. 2006. Remote Sensing Digital Image Analysis: An Introduction. Fourth Edition. Springer.

Scaramuzza, P., Micijevic, E., and Chander, G. 2004. SLC Gap-Fill Methodology. http://landsat.usgs.gov/documents/SLC_Gap_Fill_Methodology.pdf.

Shenk, T. M. and Franklin, A. B. 2001. Modeling in Natural Resource Management: Development, Interpretation, and Application. Washington: Island Press.


Siak District Government. 2008. Geografi Kabupaten Siak. Rencana Pembangunan Jangka Menengah Kabupaten Siak 2006 – 2011. (http://siakkab.go.id/index.php?categoryid=44).

Singh, A., 1989. Digital change detection techniques using remotely sensed data. International Journal of Remote Sensing, 10:989-1003.

Tabachnick, Barbara, G. and Fidell, L. S. 2007. Using Multivariate Statistics. Fifth Edition. Pearson Education, Inc.

Teknomo, Kardi. 2006. K-Nearest Neighbors Tutorial. http://people.revoledu.com/kardi/tutorial/KNN/

USGS (U. S. Geological Survey). 1999. Analyzing Land Use Change In Urban Environments. USGS Fact Sheet 188-99. USGS’s Urban Dynamic Research (UDR) Program.

Verburg, P. H., Schot, P. P., Dijst, M. J., and Veldkamp, A. 2004. Land use change modelling: current practice and research priorities. GeoJournal 61:309-324.

Xie, C., Huang, B., Claramunt, C., and Chandramouli, M. 2005. Spatial Logistic Regression and GIS to Model Rural-Urban Land Conversion. PROCESSUS Second International Colloquium on the Behavioural Foundations of Integrated Land-use and Transportation Models: Frameworks, Models and Applications. Canada: University of Toronto.



SUMMARY

CHANDRA IRAWADI WIJAYA. Land Use Change Modeling in Siak District, Riau Province, Indonesia Using Multinomial Logistic Regression. Under the supervision of HARTRISARI HARDJOMIDJOJO and LILIK BUDI PRASETYO.

Siak District is a new district, established in 1999 as an enlargement from parts of Bengkalis District. It has been developing its region in order to support the activities of its people and to reach the same level as other districts. The development conducted so far has altered land uses, involving land conversion from one type of use to others. In this study, land use change modeling was developed for Siak District to facilitate the understanding of the process of land use change and its relevant factors in the research site. The objectives of the research are (1) to analyze the land use change during 2002 – 2005 and 2005 – 2008, (2) to develop the land use change schemes of Siak District, (3) to identify the driving factors of land use change and develop the land use change model of Siak District using the Multinomial Logistic Regression (MLR) model, and (4) to examine the performance of the MLR model in modeling land use change.

The research location is Siak District, which is located in Riau Province, Indonesia. Geographically, Siak is bounded by latitudes 0°21’19.50” - 1°14’43.87” North and longitudes 100°54’46.31” - 102°58’27.34” East, at 0 - 110 m above sea level. Siak is located in the Siak Watershed, with the Siak River as the main river. The landscape of Siak is mostly wetland, and only a small part in the west is hilly. Based on the spatial analysis, Siak District has an area of about 868,117.82 ha, of which about 59% is allocated for crop and timber plantations and 9% for production forest. The Siak Government has also allocated land for other uses such as agriculture. There are two types of agricultural area in Siak: wetland agriculture and dry land agriculture. The main crop plantation commodities are rubber, oil palm, coconut, and coffee, which are managed by private companies and communities. Siak also has large oil and gas resources, managed by an international company, which contribute to local economic growth (Siak District Government 2008).

Land use change modeling in Siak District was conducted in four main activities: (1) field data collection, (2) land use classification, (3) land use change detection, and (4) land use change modeling. Field data collection gathered the primary and secondary data used in the research, while land use classification derived the land use categories of Siak District in 2002, 2005, and 2008. Land use change detection aimed to identify the transformations of land uses during 2002 – 2005 and 2005 – 2008 in Siak District. The land use change modeling aimed to determine the significant variables of land use change and to find a good model that can represent land use change in Siak District.

Based on the result of land use classification, during 2002 – 2008 Siak District was dominated by Forest land, Cropland, and Grassland. In 2002, Forest land occupied 46.8% of the total area of Siak District, Cropland 31.3%, and Grassland about 16.2%. Forest land decreased dramatically, to 36.9% of Siak District in 2005 and then to 27.2% in 2008, while Cropland and Grassland increased gradually. By 2008, Cropland exceeded Forest land, occupying 43% of Siak District. Settlements occupied only 7,909 ha in 2002, almost doubled to 14,054 ha in 2005, and reached 19,340.58 ha, or 2.2% of Siak District, in 2008. During 2002 – 2008, Other lands was the land use category that changed dynamically, while Wetlands were assumed to be in stable condition.

The land use change schemes for 2002 – 2005 and 2005 – 2008 show that all land use categories tended, with high probabilities, not to transform into other land uses (stable condition). The 2002 – 2005 scheme also shows that the three dominant land use categories in Siak District, Forest land, Cropland, and Grassland, transformed into one another, forming a triangle of major land use transitions with reciprocal transitions. During 2005 – 2008, however, the transformations from Forest land to Cropland and Grassland (deforestation) were one-way transitions, and reforestation did not count as a major land use transition.

In this research, the land use change model was developed in two scenarios: (1) using all significant variables determined by the MLR model analysis and (2) using observed variables determined by observation of the existing condition in the field. In the 1st scenario, the likelihood ratio tests for each independent variable show that 24 of the 28 variables were significant variables of land use change in Siak District. Natural environment contributed 6 variables, human environment contributed 15 variables, and policy contributed 3 variables to the final model. The variables not included in the model were altitude, slope, distance from health service, and the area of sub district. The two tests conducted for the final model, the likelihood ratio test and the pseudo R-squared, indicate that the final model of land use change in Siak District developed using MLR is a good model that could explain most of the variability of land use change in the research site. However, the spatial validation of the 1st scenario indicates that the final model, applied to the actual spatial data layers, could not reproduce the actual condition of land use change 2005 – 2008.
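The two model-level tests mentioned above can be sketched from log-likelihoods alone. The following minimal example assumes a hypothetical set of predicted class probabilities (not estimates from this study) and compares them with an intercept-only model; the three pseudo R-squared variants shown (McFadden, Cox and Snell, Nagelkerke) are common choices, and which one the thesis reports is not stated in this section.

```python
import math

def log_likelihood(probabilities, outcomes):
    """Sum of the log predicted probability of the observed class of each case."""
    return sum(math.log(p[y]) for p, y in zip(probabilities, outcomes))

def null_log_likelihood(outcomes):
    """Log-likelihood of the intercept-only model, which predicts class proportions."""
    n = len(outcomes)
    counts = {}
    for y in outcomes:
        counts[y] = counts.get(y, 0) + 1
    return sum(c * math.log(c / n) for c in counts.values())

def fit_statistics(probabilities, outcomes):
    """Likelihood ratio chi-square and three pseudo R-squared measures."""
    n = len(outcomes)
    ll0 = null_log_likelihood(outcomes)
    ll1 = log_likelihood(probabilities, outcomes)
    cox_snell = 1 - math.exp(2 * (ll0 - ll1) / n)
    return {
        "lr_chi_square": -2 * (ll0 - ll1),
        "mcfadden": 1 - ll1 / ll0,
        "cox_snell": cox_snell,
        "nagelkerke": cox_snell / (1 - math.exp(2 * ll0 / n)),
    }

# Hypothetical outcomes and predicted probabilities for four sample cells.
outcomes = ["stable", "stable", "to_cropland", "to_cropland"]
probabilities = [
    {"stable": 0.8, "to_cropland": 0.2},
    {"stable": 0.8, "to_cropland": 0.2},
    {"stable": 0.2, "to_cropland": 0.8},
    {"stable": 0.2, "to_cropland": 0.8},
]

stats = fit_statistics(probabilities, outcomes)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

In a real analysis the probabilities would come from a model fitted by maximum likelihood, and the chi-square statistic would be compared against a chi-square distribution with degrees of freedom equal to the number of added parameters. Note that these statistics measure fit to the sample cases only, which is consistent with a model scoring well statistically while still failing spatial validation.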

In the 2nd scenario, the observed variables included in the MLR model analysis were the existence of crop and timber plantations, the existence of the road network, and the spatial plans at national, province, and district level. The likelihood ratio tests for each observed variable show that all observed variables may be considered significant variables of land use change in Siak District, and all were included in the final model. The two tests conducted for the final model, the likelihood ratio test and the pseudo R-squared, indicate that the final model of land use change in Siak District is a good model that can explain most of the variability of land use change in the research site. However, as in the 1st scenario, the statistical properties produced from model validation show that the final model, applied to the actual spatial data layers, could not reproduce the actual condition of land use change 2005 – 2008.

The two scenarios run with the statistical MLR model indicate that the land use change models developed are adequate models that can explain much of the variability of the land use transitions. However, the spatial model validations indicate that the final models, applied to the actual spatial data layers, could not reproduce the actual condition of land use change 2005 – 2008. This result may be caused by the nature of MLR as a generalized logistic regression model, which forces every land use transition to be driven by all of the significant parameters that have been determined, while in the real world each land use transition probably has its own unique combination of driving parameters.

Considering these findings on the performance of the MLR model, binary logistic regression is recommended for future research in order to develop an adequate land use change model that performs well both statistically and spatially. Binary logistic regression would find the best-fit model for each land use transition individually, using the unique combination of parameters involved in that transition. Binary logistic regression may then produce conditional probability maps of land use transitions that cover the whole or most of the research area, increase the probability values for each projected land use transition, and narrow the range and distribution of the probability values of each projected land use transition compared to its actual land use transition.
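The contrast between the two model forms can be written out as follows. This is the standard textbook formulation, given here as an illustration; the symbols (K outcome categories with category K as reference, predictors x_1 to x_p, coefficients beta and gamma, transition indicator T_k, and predictor subset S_k) are notation assumed for this sketch and do not appear in the original text.

```latex
% Multinomial logistic regression: K outcome categories (land use transitions),
% category K as the reference, the same p predictors x_1..x_p in every equation.
\[
\ln\frac{P(Y = k \mid \mathbf{x})}{P(Y = K \mid \mathbf{x})}
  = \beta_{k0} + \sum_{j=1}^{p} \beta_{kj}\, x_j ,
  \qquad k = 1, \dots, K-1
\]
\[
P(Y = k \mid \mathbf{x})
  = \frac{\exp\bigl(\beta_{k0} + \sum_{j=1}^{p} \beta_{kj} x_j\bigr)}
         {1 + \sum_{m=1}^{K-1} \exp\bigl(\beta_{m0} + \sum_{j=1}^{p} \beta_{mj} x_j\bigr)}
\]
% Binary logistic regression fitted separately for transition k,
% with its own predictor subset S_k.
\[
\ln\frac{P(T_k = 1 \mid \mathbf{x})}{1 - P(T_k = 1 \mid \mathbf{x})}
  = \gamma_{k0} + \sum_{j \in S_k} \gamma_{kj}\, x_j
\]
```

In the multinomial form every transition equation shares the full predictor set, whereas the separate binary models allow the subset S_k to differ from one transition to another.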

Another issue considered in this research is that the effects of the predictor variables/parameters of the MLR model on the dependent variable can only be interpreted as direct effects when the other predictor variables are held constant. The Hierarchical Partitioning (HP) method discussed by Millington et al. (2007) might be chosen to address this issue. HP may reveal the effects of the predictor variables/parameters on land use change, both independently and in conjunction with all other variables.

Based on the land use change scheme 2002 – 2008, the situation that should be highlighted and considered by Siak District is the increasing probability of deforestation as one of the major land use transitions during 2002 – 2008. The situation worsened because reforestation was not carried out significantly and did not appear as a major land use transition in the land use change scheme 2005 – 2008. If this continues, the Forest land in Siak District, the majority of which is Peatland Forest, may continue to decline, may eventually be exhausted, and may be replaced by Cropland, the land use category that increased most during 2002 – 2008. Synchronized spatial plans among the different administrative levels (national, province, and district) may prevent undesirable land use change and support sustainable natural resources management in Siak District.