Image Processing Land Use Classification
The supervised classification was done using the K-nearest neighbor method, which considers the Euclidean distance in n-dimensional space between the target and the elements in the training data. The K-nearest neighbor method is simple and intelligible: it classifies the image by examining the square root of the sum of squared differences between the coordinates of a pair of objects. For the supervised classification done in this research, the spectral property values of the training data (minimum, maximum, average, and standard deviation of each band) were calculated, and these properties were then used to classify each object into a known feature type using K-nearest neighbor. Figure 14 illustrates the processes explained above.
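The classification step described above can be sketched as follows. This is only a minimal illustration, not the software used in this research: the function names (`region_features`, `euclidean`, `knn_classify`), the two-band sample values, the use of the population standard deviation, and k = 3 are all assumptions made for the example.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def region_features(region):
    """Summarize one region as [min, max, mean, std] for every band.

    `region` is a list of pixels; each pixel is a sequence of band values.
    The population standard deviation is an assumption of this sketch.
    """
    features = []
    for band in zip(*region):  # iterate band by band
        features += [min(band), max(band), mean(band), pstdev(band)]
    return features


def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(target, training, k=3):
    """Label `target` by majority vote among its k nearest training samples.

    `training` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training, key=lambda s: euclidean(target, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-band regions (values invented for illustration only).
training = [
    (region_features([(30, 80), (32, 84), (31, 82)]), "Forest land"),
    (region_features([(28, 78), (33, 86), (30, 81)]), "Forest land"),
    (region_features([(90, 40), (95, 42), (92, 44)]), "Settlements"),
    (region_features([(88, 38), (94, 45), (91, 41)]), "Settlements"),
]
unknown = region_features([(29, 79), (31, 83), (33, 85)])
predicted = knn_classify(unknown, training, k=3)
print(predicted)
```

Each region is reduced to four statistics per band, so a two-band region becomes an 8-dimensional point, and the distance is measured in that feature space rather than between individual pixels.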
Figure 14. Land Use Classification Process. (A) LANDSAT image, (B) Result of Training Data Generation, (C) Training Data Selection, and (D) Result of Supervised Classification

The basic difference between traditional digital image interpretation and the land use classification used in this research, which combines image segmentation with K-nearest neighbor, is that traditional image interpretation works on pixel-based classification, whereas the land use classification applied in this research works on region-based classification. Pixel-based classification calculates the spectral properties of each pixel in the training data to classify the entire image, and produces a class for each single pixel. Region-based classification calculates the spectral properties of each region of pixels in the training data to classify the entire image, and produces a class for each region of pixels generated by the image segmentation process. The choice between pixel-based and region-based classification depends on the objective and scale of the research. This research focuses on land use at the landscape level, and for this reason region-based classification was chosen.

The land use classification produced in this research may
contain cloud and shadow cover as well as some misinterpreted land uses. In this situation, post-classification should be carried out to replace the cloud and shadow cover with the appropriate land use and to correct the misinterpreted land uses. In this research, post-classification was done by manual correction, replacing the misinterpreted land uses with the appropriate ones. Post-classification must be done carefully in order to produce land use maps with good accuracy. Land use history data, GPS data, other LANDSAT images with a clear view of specific areas, and other related data were used as references to correct the misinterpretation problems that usually occur in land use classification. The post-classification process can be seen in Figure 15.
To evaluate the quality of the land use classification result, an accuracy assessment was conducted by comparing the classification result with the reference data available for this research. The accuracy assessment consisted of an error matrix, an accuracy report, and a Kappa analysis. It started with the creation of random points to be used as sampling points for comparing the classified image classes to the reference data. The sampling points were created with an equalized random distribution, so that each class had an equal number of random points. The equalized random distribution was chosen on the assumption that every class should be treated under equal conditions.
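The equalized random distribution described above can be sketched as drawing the same number of random pixel locations from every class of the classified image. The function name `equalized_random_points`, the tiny class map, and the fixed seed are assumptions made for this illustration; they do not describe the tool actually used in this research.

```python
import random
from collections import defaultdict


def equalized_random_points(class_map, n_per_class, seed=0):
    """Pick the same number of random pixel locations from every class.

    `class_map` is a 2-D list of class codes (the classified image).
    Returns {class_code: [(row, col), ...]}.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    locations = defaultdict(list)
    for r, row in enumerate(class_map):
        for c, code in enumerate(row):
            locations[code].append((r, c))
    points = {}
    for code, cells in locations.items():
        if len(cells) < n_per_class:
            raise ValueError(f"class {code} has only {len(cells)} pixels")
        points[code] = rng.sample(cells, n_per_class)  # no repeated locations
    return points


# Hypothetical 4 x 6 classified image with three class codes.
class_map = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 1, 2, 2],
    [0, 0, 1, 2, 2, 2],
]
samples = equalized_random_points(class_map, n_per_class=3)
total = sum(len(p) for p in samples.values())
print(total)
```

Because every class contributes the same number of points, small classes are not under-represented in the error matrix, which is the stated reason for choosing this design.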
Figure 15. Post-Classification Process. (A) Before post-classification and (B) After post-classification

Fifty random sampling points were created for each class, giving 350 random points in total (6 land use categories + 1 no-data class). The classified image and the reference data were compared at each sampling point. The error matrix was created by comparing the reference points to the classified points in a c × c matrix, where c is the number of classes including class 0, and the accuracy report calculates accuracy percentages from the results of the error matrix. The producer’s accuracy measures how accurately the analyst classified the image data by category (columns), while the user’s accuracy measures how well the classification performed in the field by category (rows) (CSC-NOAA 2010). The Overall Accuracy expresses the percentage of correctly classified pixels (CCRS-NRC 2005), whereas the Kappa Statistics incorporates the off-diagonal observations of the rows and columns as well as the diagonal to give a more robust assessment of accuracy than overall accuracy measures (CSC-NOAA 2010).
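The measures described above can be illustrated with a small sketch. It assumes that rows of the error matrix hold the classified classes and columns hold the reference classes, and it uses the commonly used form of the Kappa coefficient, (observed agreement - chance agreement) / (1 - chance agreement). The function names and the ten sample points are hypothetical, and the exact formula applied by the software used in this research is not stated in the text.

```python
def error_matrix(classified, reference, classes):
    """Build a c x c matrix: rows = classified class, columns = reference class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for cls, ref in zip(classified, reference):
        matrix[index[cls]][index[ref]] += 1
    return matrix


def accuracy_report(matrix):
    """Overall accuracy, Kappa, and per-class producer's and user's accuracy."""
    n = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(len(matrix))]
    row_totals = [sum(row) for row in matrix]        # classified totals
    col_totals = [sum(col) for col in zip(*matrix)]  # reference totals
    overall = sum(diag) / n
    # Agreement expected by chance, from the row and column marginals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return {
        "overall": overall,
        "kappa": (overall - chance) / (1 - chance),
        "producers": [d / c if c else None for d, c in zip(diag, col_totals)],
        "users": [d / r if r else None for d, r in zip(diag, row_totals)],
    }


# Hypothetical sampling points (invented for illustration only).
classes = ["Forest land", "Cropland"]
classified = ["Forest land"] * 6 + ["Cropland"] * 4
reference = ["Forest land"] * 4 + ["Cropland"] * 6
matrix = error_matrix(classified, reference, classes)
report = accuracy_report(matrix)
print(matrix)
print(report)
```

In this toy case the overall accuracy is 0.8 while Kappa is lower, because Kappa discounts the agreement that the row and column totals would produce by chance.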
Tables 6a and 6b present the results of the accuracy assessment for the 2002, 2005, and 2008 land use maps of Siak District. Table 6a shows the Overall Accuracy and Kappa Statistics of the land use maps, which represent the overall classification accuracy. Table 6b shows the producer’s and user’s accuracy for each land use category in 2002, 2005, and 2008, which represent the accuracy of individual land use categories. Based on Table 6a and Table 6b, the land use maps of Siak District met the accuracy requirements set for this research (see Table 4 in Chapter 3): both producer’s and user’s accuracy must be higher than 70% for each land use category, the overall classification accuracy must exceed 80%, and the Kappa coefficient must exceed 0.8.
Table 6a. Accuracy Assessment Report: Overall Classification Accuracy and Kappa Statistics for Land Use Maps of Siak District

Accuracy Assessment                   2002     2005     2008
Overall Classification Accuracy (%)   88.86    92.00    89.71
Overall Kappa Statistics              0.87     0.91     0.88
Table 6b. Accuracy Assessment Report: Producer’s and User’s Accuracy of Land Use Maps of Siak District

Land Use Category    2002                        2005                        2008
                     Producer’s    User’s        Producer’s    User’s        Producer’s    User’s
                     Accuracy (%)  Accuracy (%)  Accuracy (%)  Accuracy (%)  Accuracy (%)  Accuracy (%)
Forest land          94.00         94.00         92.31         96.00         96.08         98.00
Cropland             70.69         82.00         91.49         86.00         84.31         86.00
Grassland            70.59         72.00         79.25         84.00         72.55         74.00
Wetlands             100.00        98.00         100.00        96.00         100.00        96.00
Settlements          97.78         88.00         100.00        86.00         93.62         88.00
Other lands          95.65         88.00         87.27         96.00         82.69         86.00
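As a quick arithmetic check, the values of Tables 6a and 6b can be compared against the accuracy requirements stated in the text (class accuracies above 70%, overall accuracy above 80%, Kappa above 0.8). The numbers below are transcribed from the two tables; the variable and function names are hypothetical and serve only this illustration.

```python
# Overall accuracy (%) and Kappa per year, transcribed from Table 6a.
overall = {2002: (88.86, 0.87), 2005: (92.00, 0.91), 2008: (89.71, 0.88)}

# (producer's, user's) accuracy in percent, transcribed from Table 6b.
per_class = {
    2002: {"Forest land": (94.00, 94.00), "Cropland": (70.69, 82.00),
           "Grassland": (70.59, 72.00), "Wetlands": (100.00, 98.00),
           "Settlements": (97.78, 88.00), "Other lands": (95.65, 88.00)},
    2005: {"Forest land": (92.31, 96.00), "Cropland": (91.49, 86.00),
           "Grassland": (79.25, 84.00), "Wetlands": (100.00, 96.00),
           "Settlements": (100.00, 86.00), "Other lands": (87.27, 96.00)},
    2008: {"Forest land": (96.08, 98.00), "Cropland": (84.31, 86.00),
           "Grassland": (72.55, 74.00), "Wetlands": (100.00, 96.00),
           "Settlements": (93.62, 88.00), "Other lands": (82.69, 86.00)},
}


def meets_requirements(year):
    """True if every threshold stated in the text is exceeded for `year`."""
    accuracy, kappa = overall[year]
    lowest = min(min(pair) for pair in per_class[year].values())
    return lowest > 70.0 and accuracy > 80.0 and kappa > 0.8


def weakest_category(year):
    """Category whose lower accuracy (producer's or user's) is smallest."""
    return min(per_class[year], key=lambda name: min(per_class[year][name]))


results = {year: meets_requirements(year) for year in overall}
weakest = {year: weakest_category(year) for year in overall}
print(results)
print(weakest)
```

The check only restates what the tables already show; it does not recompute the accuracies from the underlying sampling points, which are not given here.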
The 2005 land use map of Siak District had the best accuracy compared to the 2002 and 2008 land use maps, with an overall classification accuracy above 90% and a Kappa coefficient above 0.9. This was because the 2005 LANDSAT image of Siak District was in better condition than the 2002 and 2008 LANDSAT images (see Figure 13), so that land use misinterpretation could be avoided and a more accurate land use map could be produced.
Comparing the accuracies of all land use categories during 2002 – 2008, Grassland had poorer accuracies than the other land use categories. This might be caused by the characteristics of Grassland, which was scattered and grouped in smaller areas than other land uses. Furthermore, because region-based classification was applied in this research and Grassland (i.e. paddy field and agriculture area) occupied large areas in only a few locations, the accuracies for Grassland were not as good as those of the other land use categories. In general, however, the land use categories classified in this research for each year had good producer’s and user’s accuracies.