presented the fast cross correlation technique, and applied box filtering to measure stereo matching.
Stereo matching methods are generally categorized into two classes: local and global. Local methods operate on areas or
windows and are fast and computationally efficient (Cai, 2006; 2010; Cox, 1996). Global methods, on the other hand,
minimize a specific energy function and are computationally expensive (Boykov, 2001, 2004). In area-based methods, a
small window produces a noisier disparity map. Increasing the window size reduces the noise, but the computational
complexity grows with the window size. For good 3D reconstruction, the surface should be continuous and fully textured.
A small window may not capture enough intensity variation, while a larger window introduces errors at occlusions and
disparity discontinuities.
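The window-size trade-off described above can be illustrated with a minimal local matcher. The sketch below is not the method used in this paper. It is a plain winner-takes-all block matcher using the sum of absolute differences, written under the assumption that a left-image pixel at (row, col) corresponds to the right-image pixel at (row, col - d). The names `sad_window`, `block_match`, `max_disp` and `half` are hypothetical and chosen only for illustration.

```python
def sad_window(left, right, row, col, d, half):
    """Sum of absolute differences over a (2*half+1) x (2*half+1) window.

    Assumes left[row][col] corresponds to right[row][col - d].
    """
    total = 0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            total += abs(left[r][c] - right[r][c - d])
    return total


def block_match(left, right, max_disp, half):
    """Winner-takes-all disparity for every pixel whose window fits in both images.

    Pixels too close to the border keep the default disparity 0.
    """
    rows, cols = len(left), len(left[0])
    disp = [[0] * cols for _ in range(rows)]
    for row in range(half, rows - half):
        for col in range(half + max_disp, cols - half):
            costs = [sad_window(left, right, row, col, d, half)
                     for d in range(max_disp + 1)]
            # The disparity with the lowest window cost wins.
            disp[row][col] = costs.index(min(costs))
    return disp
```

A larger `half` averages over more pixels, which suppresses noise but costs more operations per pixel and blurs disparity edges, in line with the trade-off stated in the text.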
Area-based methods measure the similarity between two blocks, using different types of windows, to compute a disparity
map from stereo images. The best match between the two images depends on the cost (similarity) function, and an
efficiently designed cost function yields fast and robust stereo matching. Global optimization algorithms such as
Graph-Cut and Belief Propagation sometimes require extra parameters and are computationally more expensive (Boykov,
2001, 2004). Because of their longer running time, these algorithms are not suitable for real-time processing, but they
can be used for non-real-time processing where higher accuracy is required. The Graph-Cut algorithm is more accurate than
Belief Propagation and dynamic programming algorithms (Sun, 2006), which makes it a suitable candidate for estimating
disparity or depth maps in stereo matching. Graph-Cut has led to new energy minimization algorithms and provides a good
framework for stereo matching problems. Boykov and Kolmogorov presented graph-cut based energy minimization algorithms
that are 2 to 5 times faster than traditional push-relabel approaches (Scharstein, 2002). Graph-Cut energy minimization
with the Potts model is used in segmentation, stereo, object recognition, shape reconstruction and augmented reality.
Boykov proposed two effective algorithms, expansion-move and swap-move, which are based on pixel labelling for large
pixel sets. Stereo matching is a multi-labelling problem in which the labels are disparities.
3. METHODOLOGY
3.1 Proposed framework
The Pleiades constellation comprises two satellites, Pleiades-1A and Pleiades-1B, launched on 16 December 2011 and
1 December 2012, respectively. Pleiades-1A/1B can acquire stereo imagery in a single pass, with only a few seconds
between the images. It provides colour stereo pairs with a 20 km swath width and 70 cm resolution, at base-to-height
ratios from 0.15 to 2 (Lebègue, 2012). Both satellites are placed on the same sun-synchronous orbit at 694 km. They
acquire panchromatic stereo images at 50 cm resolution and multispectral images at 200 cm resolution, and also deliver a
bundle product (50 cm black and white and 200 cm multispectral) (Lebègue, 2012). Pleiades offers high resolution, low
weight and low cost for acquiring images of small areas. Our study area covers 10x10 km in Sabah, East Malaysia, which is
why we chose a small, high-resolution satellite sensor such as Pleiades. The satellite supports various acquisition
plans, such as monoscopic coverage of up to 100x100 km or instantaneous stereoscopic coverage of up to 60x60 km.
Stereoscopic coverage is achieved in a single flyby of the area, which allows a homogeneous product to be collected
quickly. A classical forward and backward looking stereo pair provides the highest accuracy, but this combination is
limited to areas with moderate terrain. A nadir and forward/backward looking stereo pair can be used in most kinds of
terrain. Depth was estimated on selected image patches using the proposed dynamic programming and Graph-Cut algorithms.
The acquired data are first preprocessed and cropped. The two stereo images are then used to calculate disparity maps,
from which depth is estimated. The depth maps are then compared with previously recorded satellite data to find the
areas where vegetation encroaches on power transmission poles. Figure 1 shows how our framework obtains the desired
information using disparity maps and depth estimation.
Figure 1. Proposed framework for monitoring of vegetation near
power poles
3.2 Disparity Map Generation
Depth information is computed from a pair of stereo images by calculating the pixel-wise distance between the location of
a feature in one image and its corresponding location in the second image, which yields a disparity map. The disparity
map in turn gives a depth map, because pixels with larger disparities are closer to the camera and those with smaller
disparities are farther from the camera.
Figure 2. Stereo camera model (Boyer, 1988).
[Figure 1 flowchart: Satellite image acquisition → Pre-processing, cropping → Stereo matching / disparity map (dynamic programming and Graph-Cut) → Depth estimation from disparity map → Monitoring of vegetation near power poles]
This contribution has been peer-reviewed. doi:10.5194/isprsarchives-XL-7-W3-489-2015
In Figure 2, we have left and right camera images, where the left image has its center at 0 and the right image has its
center at 0′. From these we can calculate the 3D point at coordinates (X0, Y0, Z0). The following relations are obtained
from the diagram (Boyer, 1988). Solving equation 1 and equation 2 gives the value of Z0, which depends on the denominator
term, called the disparity.
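$$\frac{x_L}{x} = \frac{y_L}{y} = \frac{\lambda}{Z - \lambda} \tag{1}$$
$$\frac{x_R + \Delta x}{x + \Delta x} = \frac{y_R}{y} = \frac{\lambda}{Z - \lambda} \tag{2}$$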
Solving equation 1 and equation 2, we obtain equation 3.
$$Z = \lambda + \frac{\lambda \, \Delta x}{x_R - x_L + \Delta x} \tag{3}$$
The distance in pixels between corresponding points in the first and second image of the stereo pair is used to estimate
depth, and the resulting map is called a disparity map. Pixels with smaller disparity are far from the camera and pixels
with larger disparity are near the camera. In other words, depth is inversely proportional to disparity, as shown in
equation 3. We apply Graph-Cut and dynamic programming algorithms for stereo matching to Pleiades satellite stereo
images.
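The inverse relation between depth and disparity can be checked numerically. The sketch below assumes that equation 3 has the form Z = λ + λΔx / (x_R − x_L + Δx), and it reads λ as the focal length and Δx as the offset between the two cameras. Both readings are assumptions, and the function name `depth_from_disparity` is hypothetical.

```python
def depth_from_disparity(x_left, x_right, focal, baseline):
    """Depth Z from the assumed form of equation 3.

    focal    : lambda in the text (assumed to be the focal length)
    baseline : Delta x in the text (assumed to be the camera offset)
    """
    denom = x_right - x_left + baseline  # the disparity term in the denominator
    if denom == 0:
        raise ValueError("zero disparity term: depth is undefined")
    return focal + focal * baseline / denom


# A larger disparity term gives a smaller depth (nearer point).
near = depth_from_disparity(x_left=0.0, x_right=2.0, focal=1.0, baseline=2.0)
far = depth_from_disparity(x_left=0.0, x_right=0.0, focal=1.0, baseline=2.0)
```

All quantities must be expressed in the same units, for example by converting pixel coordinates to metric image coordinates before calling the function.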
3.3 Graph-Cut algorithms for stereo matching
Stereo matching is a classical vision problem to which graph-based energy minimization has been successfully applied.
Three basic graph-based methods are used to solve stereo correspondence problems: pixel labelling with the Potts model,
stereo matching with occlusion handling, and multi-camera scene reconstruction. The multi-camera scene reconstruction
method is used for more than three stereo cameras. We are interested in stereo matching that handles occlusion and also
detects objects in textureless regions, since the satellite stereo images we use have low texture in some regions. Our
work is closest to the Graph-Cut formulation introduced by Kolmogorov and Zabih. They treat the two images of the stereo
pair symmetrically and assign binary labels to pixel pairs, one pixel from each image, instead of assigning labels to
individual pixels. If a pixel pair corresponds, it is assigned the label '1' in the final disparity map; otherwise it is
assigned the label '0'. The resulting disparity map imposes the uniqueness constraint. Boykov introduced similar work
based on energy minimization using an expansion-move algorithm, which minimizes the energy function iteratively. At each
iteration, it transforms the minimization into a minimum-cut problem on a graph and solves it by cutting the graph. The
algorithm runs until convergence, and the result is a strong local minimum of the energy function. The graph-cut stereo
correspondence algorithms discussed here provide the basis from which newer algorithms have emerged. The expansion-move
algorithm [12] has the following characteristics.
• A large number of pixels can change their labels simultaneously.
• Finding an optimal move is computationally intensive.
• It completes in less than one minute, whereas other energy minimization algorithms such as simulated annealing and
iterated conditional modes took 19 hours in early implementations.
• It finds a local minimum of the energy with respect to small "one-pixel" moves.
• Initialization is important in practice. Theoretically, the solution reaches the global minimum.
Kolmogorov and Zabih introduced an energy function comprising three terms: a data term, an occlusion term and a
smoothness term that penalizes neighbouring pixel pairs for having different labels. Based on the energy function of
Kolmogorov and Zabih, the total energy of a configuration f can be defined as
$$E(f) = E_{data}(f) + E_{occ}(f) + E_{smooth}(f) + E_{unique}(f) \tag{4}$$
We define these energy terms one by one as follows.
$E_{data}(f)$ defines the matching cost of corresponding pixels. This cost can be calculated using one of four matching
cost functions:
• Sum of absolute differences (SAD)
• Sum of squared differences (SSD)
• Normalized cross-correlation (NCC)
• Zero-mean normalized cross-correlation (ZNCC)
Kolmogorov and Zabih used the squared difference of intensity values. We use the sum of absolute differences, which is
simple and computationally cheap. The data cost function is given below.
$$E_{data}(f) = \sum_{(p,q) \in B(f)} \left| I_{LeftIntensity}(p) - I_{RightIntensity}(q) \right|^{a} \tag{5}$$
where a is 1 for SAD and 2 for SSD.
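As a small illustration of the data term, the sketch below sums the absolute intensity difference, raised to the power a, over a list of matched pixel pairs. It assumes that the images are nested lists of grey values and that each active assignment is a pair of (row, col) positions, one in the left image and one in the right image. The function name `data_cost` and the argument layout are hypothetical.

```python
def data_cost(left, right, assignments, a=1):
    """Data energy over active assignments.

    assignments : iterable of ((row_p, col_p), (row_q, col_q)) pairs,
                  p in the left image and q in the right image.
    a           : 1 gives SAD, 2 gives SSD.
    """
    total = 0
    for (pr, pc), (qr, qc) in assignments:
        total += abs(left[pr][pc] - right[qr][qc]) ** a
    return total


left = [[10, 20, 30]]
right = [[12, 20, 27]]
pairs = [((0, 0), (0, 0)), ((0, 2), (0, 2))]
sad = data_cost(left, right, pairs, a=1)  # |10-12| + |30-27|
ssd = data_cost(left, right, pairs, a=2)  # 2**2 + 3**2
```

SSD penalizes large differences more strongly than SAD, while SAD avoids the squaring operation and is less sensitive to outliers.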
$E_{occ}(f)$ adds a constant value to the total energy for each occluded pixel in the stereo pair:
$$E_{occ}(f) = \sum_{p \in P} K_p \cdot F\left( \left| U_p(f) \right| = 0 \right) \tag{6}$$
where $F(\cdot)$ evaluates to 1 if its argument is true and to 0 otherwise.
$E_{smooth}(f)$ imposes a penalty when neighbouring pixels have different disparities and can be defined as
$$E_{smooth}(f) = \sum_{\{b_1, b_2\} \in N_1} U_{b_1, b_2} \cdot F\left( f(b_1) \neq f(b_2) \right) \tag{7}$$
The smoothness term is zero if the assignments $b_1$ and $b_2$ have the same disparity in the neighbourhood system $N_1$
(4-neighbours in the input images); otherwise it imposes a penalty for the differing disparities of the neighbouring
pixels.
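The effect of the smoothness term can be sketched on a dense disparity map. The code below is a simplification: it works on per-pixel disparities rather than on assignments, uses a 4-neighbourhood, and applies one constant penalty in place of the weights U. The function name `smoothness_energy` and the `penalty` parameter are hypothetical.

```python
def smoothness_energy(disp, penalty=1):
    """Potts-style smoothness cost on a disparity map.

    Each pair of 4-connected neighbours with different disparities
    contributes `penalty`; equal neighbours contribute nothing.
    """
    rows, cols = len(disp), len(disp[0])
    energy = 0
    for r in range(rows):
        for c in range(cols):
            # Right neighbour (each horizontal pair is counted once).
            if c + 1 < cols and disp[r][c] != disp[r][c + 1]:
                energy += penalty
            # Lower neighbour (each vertical pair is counted once).
            if r + 1 < rows and disp[r][c] != disp[r + 1][c]:
                energy += penalty
    return energy


step_map = [[1, 1, 2],
            [1, 1, 2]]
flat_map = [[3, 3], [3, 3]]
```

For `step_map`, only the two horizontal pairs that cross the disparity step are penalized, so a piecewise-constant map with few edges has low smoothness energy.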
$E_{unique}(f)$ confines the possible solutions of the optimisation problem to unique solutions. If a pixel has more than
one corresponding pixel in the other image of the stereo pair, an infinite penalty is assigned; otherwise the term is
zero. This can be defined as
$$E_{unique}(f) = \sum_{p \in P} F\left( \left| N_p(f) \right| > 1 \right) \cdot \infty \tag{8}$$
We introduce an ordering term into the above total energy function for stereo matching.
$E_{order}(f)$ can be written as
$$E_{order}(f) = \sum_{\{b_1, b_2\} \in N_2} F\left( f(b_1) = f(b_2) = 1 \right) \cdot \infty \tag{9}$$
where $N_2$ is a neighbourhood system in which $b_1 = (p, q)$ and $b_2 = (p', q')$ are neighbouring pixel pairs for which
$p_x > p'_x$ and $q_x < q'_x$ holds, so that two such assignments cannot both be active without violating the ordering
constraint. The final energy function can be written as
$$E(f) = E_{data}(f) + E_{occ}(f) + E_{smooth}(f) + E_{unique}(f) + E_{order}(f) \tag{10}$$
Minimizing this energy function with the Graph-Cut algorithm gives a general solution to the correspondence problem
between stereo images.
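To make the five terms concrete, the sketch below evaluates the total energy of a candidate matching on a single scanline. It only evaluates the energy and does not perform the Graph-Cut minimization. It assumes that an active assignment is a pair (p, q) of column indices with disparity p − q, that occlusion is counted in both images with one constant penalty, and that smoothness is simplified to adjacent matched left pixels with different disparities. The function name `total_energy` and its parameters are hypothetical.

```python
import math
from collections import Counter


def total_energy(left, right, active, a=1, occ_penalty=1.0, smooth_penalty=1.0):
    """Energy of a set of active assignments (p, q) on one scanline."""
    # Data term: intensity difference of each matched pair (a=1 SAD, a=2 SSD).
    e_data = sum(abs(left[p] - right[q]) ** a for p, q in active)

    # How many active assignments touch each pixel.
    left_hits = Counter(p for p, _ in active)
    right_hits = Counter(q for _, q in active)

    # Occlusion term: constant penalty for every unmatched pixel.
    occluded = (sum(1 for p in range(len(left)) if left_hits[p] == 0)
                + sum(1 for q in range(len(right)) if right_hits[q] == 0))
    e_occ = occ_penalty * occluded

    # Uniqueness term: infinite if any pixel is matched more than once.
    unique_ok = all(n <= 1 for n in left_hits.values()) and \
        all(n <= 1 for n in right_hits.values())
    e_unique = 0.0 if unique_ok else math.inf

    # Ordering term: infinite if two active assignments cross each other.
    crossing = any(p1 > p2 and q1 < q2
                   for p1, q1 in active for p2, q2 in active)
    e_order = math.inf if crossing else 0.0

    # Simplified smoothness: adjacent matched left pixels with different disparity.
    disparity = {p: p - q for p, q in active}
    jumps = sum(1 for p in disparity
                if p + 1 in disparity and disparity[p] != disparity[p + 1])
    e_smooth = smooth_penalty * jumps

    return e_data + e_occ + e_smooth + e_unique + e_order


left_line = [5, 7, 9, 4]
right_line = [7, 9, 4, 1]
good = total_energy(left_line, right_line, [(1, 0), (2, 1), (3, 2)], occ_penalty=2.0)
crossed = total_energy(left_line, right_line, [(2, 0), (1, 1)])
```

In this example the consistent matching pays only for the two unmatched border pixels, while the crossed matching receives an infinite energy from the ordering term and can therefore never be selected by a minimizer.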
3.4 Dynamic programming