An image analysis problem with contextual information can be reduced to finding a suitable configuration of
F, i.e. the labelling f with maximum a posteriori (MAP) probability given the observed data.
The joint probability can be computed using the equivalence between an MRF and the so-called Gibbs random field
(GRF); for more details see Li, 2009. A GRF is a set of random variables with a Gibbs distribution, which takes the following
form:
$$P(f) = Z^{-1} e^{-U(f)/T} \qquad (8)$$
$$Z = \sum_{f \in F} e^{-U(f)/T} \qquad (9)$$
where $Z$ is a normalizing factor, $T$ is the temperature and $U(f)$ is the energy function. The energy function is defined as:
$$U(f) = \sum_{c \in C} V_c(f) \qquad (10)$$
where the summation is over the set of all possible cliques, $C$. Note that a clique is a subset of sites in which every pair of distinct sites are neighbours. $V_c(f)$ is called the clique potential and depends on the local configuration on the clique $c$. Given a labelling $f_i$ and an observed multispectral feature vector $x_i$ ($x_i \in R^d$), the posterior probability is:
$$P(f_i \mid x_i) = \frac{P(x_i \mid f_i)\, P(f_i)}{P(x_i)} \qquad (11)$$
where $P(f_i)$ is the prior probability, $P(x_i \mid f_i)$ is the conditional p.d.f. of the observation $x_i$, i.e. the likelihood function, and $P(x_i)$, called the density of $x_i$, is a constant independent of the labelling $f_i$. Thus:
$$P(f_i \mid x_i) \propto P(x_i \mid f_i)\, P(f_i) \qquad (12)$$
Assume that the label field $f$ is an MRF; then
the posterior $P(f_i \mid x_i)$ is also an MRF (for more details see Geman and Geman, 1984).
The equivalence of MRF and GRF implies that:
$$U(f_i \mid x_i) = U(x_i \mid f_i) + U(f_i) \qquad (13)$$
where
$U(f_i \mid x_i)$ is the posterior energy for a pixel, $U(x_i \mid f_i)$ denotes the conditional energy and $U(f_i)$, which can also be denoted as $U(f_i \mid f_{N_i})$, is the prior energy function for the neighbourhood system.
The posterior energy for the entire image is defined as:
$$U(f \mid x) = \sum_{i=1}^{n} U(f_i \mid x_i) \qquad (14)$$
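Equations 8 to 10 can be checked numerically on a toy problem. The following Python sketch is illustrative only: it assumes a 1-D chain of three sites, two labels, and a pair-clique potential that charges a hypothetical penalty `beta` whenever two neighbouring labels differ. None of these choices come from the paper; they only make the normalizing factor $Z$ small enough to enumerate.

```python
import itertools
import math


def energy(f, beta=1.0):
    """Toy energy U(f) as in Eq. 10: sum of pair-clique potentials on a chain.

    Assumed potential: V_c = beta if the two neighbouring labels differ, else 0.
    """
    return sum(beta for a, b in zip(f, f[1:]) if a != b)


def gibbs_distribution(n_sites, labels, T=1.0):
    """Enumerate every configuration and return {f: P(f)} following Eqs. 8 and 9."""
    configs = list(itertools.product(labels, repeat=n_sites))
    weights = [math.exp(-energy(f) / T) for f in configs]
    Z = sum(weights)  # normalizing factor, Eq. 9
    return {f: w / Z for f, w in zip(configs, weights)}


P = gibbs_distribution(n_sites=3, labels=(0, 1))
# The most probable configuration is also a minimum-energy configuration.
f_map = max(P, key=P.get)
```

Exhaustive enumeration is only feasible for a handful of sites; for a real image the number of configurations grows exponentially, which is why an optimization method is needed to search for the minimum-energy labelling.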
2.3 SVM-MRF-based Proposed Method for Classification of Remotely Sensed Data
This research adopted the Radial Basis Function (RBF) as the kernel function for the SVM, as it has proven effective in the
classification of remotely sensed images. To integrate MRF-based contextual information into the
non-contextual SVM, the posterior energy is derived from Equation 13.
According to Equation 4, the SVM assigns class labels to pixels based on the decision function $f(x)$. To use the SVM model as the conditional probability function (likelihood function) $P(x_i \mid f_i)$ for deriving the respective conditional energy, it must produce class probabilities instead of class labels.
This is done by implementing Platt's posterior probability theory for SVM. For more details see Lin, Lin and Weng, 2007.
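As a rough illustration of how decision values can be turned into class probabilities, the Python sketch below fits Platt's sigmoid $P(y=1 \mid f) = 1 / (1 + \exp(Af + B))$ to a few made-up binary decision values. It is a simplification under stated assumptions: the data are hypothetical, only the two-class case is shown, and plain gradient descent is used for brevity, whereas Lin, Lin and Weng describe a more robust Newton-type solver.

```python
import math


def platt_probability(decision_value, A, B):
    """Platt's sigmoid P(y=1 | f) = 1 / (1 + exp(A*f + B)), computed stably."""
    z = A * decision_value + B
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_platt(decision_values, labels, lr=0.05, n_iter=2000):
    """Fit A and B by gradient descent on the cross-entropy loss.

    Labels are 0/1. Platt's smoothed targets are used instead of hard labels
    so that the fit stays finite on separable data.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]

    A, B = 0.0, 0.0
    for _ in range(n_iter):
        grad_A = grad_B = 0.0
        for f, t in zip(decision_values, targets):
            p = platt_probability(f, A, B)
            # d(loss)/dz = t - p, with z = A*f + B
            grad_A += (t - p) * f
            grad_B += (t - p)
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B


# Hypothetical SVM decision values and their true binary labels.
scores = [-2.0, -1.5, -0.5, 0.3, 1.0, 2.5]
labels = [0, 0, 0, 1, 1, 1]
A, B = fit_platt(scores, labels)
```

With the fitted `A` and `B`, `platt_probability` maps any new decision value to a probability in (0, 1), which is the quantity the conditional energy is derived from.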
To control the contribution of the prior and conditional energy functions in Equation 13, an additional parameter λ is defined:
$$U(f_i \mid x_i) = (1 - \lambda)\, U(x_i \mid f_i) + \lambda\, U(f_i) \qquad (15)$$
where $0 \le \lambda \le 1$. If λ=0, the prior model (contextual information) is completely ignored. Conversely, if λ=1, only the prior model is considered and non-contextual information is ignored. Since this study concentrates on integrating contextual characteristics into the non-contextual SVM classifier through the MRF model, the value of λ should lie in the range $0 < \lambda < 1$. According to Equation 8, to maximize $P(f)$, the energy function has to be minimized. Thus the problem of finding the optimal labelling turns into an energy function minimization problem.
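The effect of λ in Equation 15 can be seen on a single pixel. The Python sketch below is hypothetical: it assumes the conditional energy is the negative log of the class probability and uses a Potts-style prior that charges `beta` for every neighbour carrying a different label. Both forms, the class names and the probabilities are illustrative assumptions, not values taken from this study.

```python
import math


def conditional_energy(prob, eps=1e-12):
    """Assumed conditional energy: negative log of the class probability."""
    return -math.log(max(prob, eps))


def prior_energy(label, neighbour_labels, beta=1.0):
    """Assumed Potts-style prior: beta per neighbour whose label differs."""
    return beta * sum(1 for nb in neighbour_labels if nb != label)


def posterior_energy(label, class_probs, neighbour_labels, lam):
    """Weighted posterior energy of one pixel, following Eq. 15."""
    return ((1.0 - lam) * conditional_energy(class_probs[label])
            + lam * prior_energy(label, neighbour_labels))


def best_label(class_probs, neighbour_labels, lam):
    """Label with the lowest posterior energy for this pixel."""
    return min(class_probs,
               key=lambda c: posterior_energy(c, class_probs, neighbour_labels, lam))


probs = {"water": 0.55, "urban": 0.45}             # hypothetical SVM class probabilities
neighbours = ["urban", "urban", "urban", "water"]  # labels of the 4 neighbours

spectral_only = best_label(probs, neighbours, lam=0.0)  # -> "water"
with_context = best_label(probs, neighbours, lam=0.5)   # -> "urban"
```

With λ=0 the pixel keeps the spectrally most likely class, while an intermediate λ lets the three disagreeing neighbours override a weak spectral preference.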
The problem of finding the optimal labelling is difficult if the energy function is not convex. Simulated annealing (SA) is a stochastic global optimization method that is widely used in image analysis, so it was adopted in this research. SA tries to minimize the energy function iteratively. In the $k$-th iteration of SA a new solution, $s_{k+1}$, is generated from the solution of the previous iteration, $s_k$. Let $\Delta E = E(s_{k+1}) - E(s_k)$ be the difference between the energies of the solutions of these two consecutive iterations. If $\Delta E < 0$ then $s_{k+1}$ is accepted as the new solution, since it improves the objective function; otherwise, $s_{k+1}$ is accepted with probability $\exp(-\Delta E / T_k)$, where $T_k$ is the temperature at iteration $k$. Next, the current temperature is decreased and the process continues until some stopping
criterion holds, e.g. after 3 successive “temperature stages” with no acceptance. A “temperature stage” occurs when $12n$ perturbations are accepted or $100n$ perturbations are attempted, where $n$ is the number of variables.
SMPR 2013, 5 – 8 October 2013, Tehran, Iran
This contribution has been peer-reviewed. The peer-review was conducted on the basis of the abstract.
The temperature updating rule plays a critical role in the convergence of the method to the global optimum. A traditional choice is the geometric rule, that is,
$$T_{k+1} = \alpha T_k,$$
where $\alpha$ is a constant. Another updating rule, known as adaptive annealing, may provide better solutions, so it was adopted in this research. It is defined as $T_{k+1} = D_k T_k$, where:
$$D_k = \min\left(D_0, \frac{E_k^{\min}}{\bar{E}_k}\right) \qquad (16)$$
with $D_0 = 0.5$ to $0.9$, $E_k^{\min}$: the minimal accepted energy during iteration $k$, and $\bar{E}_k$: the average energy accepted during iteration $k$. Note that at high temperature, since $E_k^{\min} / \bar{E}_k$ is small, the temperature is lowered quickly. For more details see Li, 2009 and Pétrowski and Taillard, 2006.
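The annealing procedure described above can be sketched in a few lines of Python. This is not the authors' R implementation: the one-variable energy function, the Gaussian perturbation, the starting temperature `t0`, the stage cap `max_stages` and the temperature floor are all assumptions added so that the example is self-contained and terminates. The adaptive factor also assumes strictly positive energies, otherwise the ratio in Equation 16 is not meaningful.

```python
import math
import random


def adaptive_factor(accepted_energies, d0=0.9):
    """D_k = min(D_0, E_min / E_avg) over the energies accepted in one stage (Eq. 16)."""
    e_min = min(accepted_energies)
    e_avg = sum(accepted_energies) / len(accepted_energies)
    return min(d0, e_min / e_avg)


def simulated_annealing(energy, perturb, s0, n_vars, t0=10.0, d0=0.9,
                        max_stages=1000, seed=0):
    rng = random.Random(seed)
    s, e = s0, energy(s0)
    best_s, best_e = s, e
    t = t0
    idle_stages = 0
    stage = 0
    # Stop after 3 successive temperature stages with no acceptance.
    while idle_stages < 3 and stage < max_stages:
        stage += 1
        accepted = []
        attempts = 0
        # One temperature stage: 12n acceptances or 100n attempts.
        while len(accepted) < 12 * n_vars and attempts < 100 * n_vars:
            attempts += 1
            cand = perturb(s, rng)
            cand_e = energy(cand)
            delta = cand_e - e
            # Accept improvements, and worse moves with probability exp(-dE/T).
            if delta < 0 or rng.random() < math.exp(-delta / t):
                s, e = cand, cand_e
                accepted.append(e)
                if e < best_e:
                    best_s, best_e = s, e
        if accepted:
            idle_stages = 0
            factor = adaptive_factor(accepted, d0)
        else:
            idle_stages += 1
            factor = d0  # assumed fallback when nothing was accepted
        t = max(t * factor, 1e-12)  # floor is a numerical safeguard only
    return best_s, best_e


def toy_energy(x):
    """Hypothetical positive energy with its minimum (value 1) at x = 3."""
    return 1.0 + (x - 3.0) ** 2


def toy_perturb(x, rng):
    """Hypothetical move: add Gaussian noise to the current solution."""
    return x + rng.gauss(0.0, 0.5)


best_s, best_e = simulated_annealing(toy_energy, toy_perturb, s0=0.0, n_vars=1)
```

For the classification problem, the solution would be the label image, a perturbation would change the label of one pixel, and the energy would be the posterior energy of Equation 14.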
The proposed model was implemented in the R programming language.
2.4 Study Area