cv08 part17 reconstruction3
Perceptual and Sensory Augmented Computing, Computer Vision WS 08/09
Computer Vision – Lecture 17
Structure-from-Motion
21.01.2009
Bastian Leibe
RWTH Aachen
http://www.umic.rwth-aachen.de/multimedia
[email protected]
(2)
Course Outline
• Image Processing Basics
• Segmentation & Grouping
• Object Recognition
• Local Features & Matching
• Object Categorization
• 3D Reconstruction
  - Epipolar Geometry and Stereo Basics
  - Camera calibration & Uncalibrated Reconstruction
  - Structure-from-Motion
• Motion and Tracking
(3)
Recap: A General Point
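• Equations of the form
$$A x = 0$$
• How do we solve them? (always!)
  - Apply SVD:
$$A = U D V^T = U \begin{bmatrix} d_{11} & & \\ & \ddots & \\ & & d_{NN} \end{bmatrix} \begin{bmatrix} v_{11} & \cdots & v_{1N} \\ \vdots & \ddots & \vdots \\ v_{N1} & \cdots & v_{NN} \end{bmatrix}^T$$
    The $d_{ii}$ are the singular values; the columns of $V$ are the (right) singular vectors.
  - Singular values of $A$ = square roots of the eigenvalues of $A^T A$.
  - The solution of $Ax = 0$ is the nullspace vector of $A$.
  - This corresponds to the singular vector of $A$ that belongs to the smallest singular value.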
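The recipe above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture; the function name `solve_homogeneous` and the toy matrix are assumptions made for the example.

```python
import numpy as np

def solve_homogeneous(A):
    """Return the unit vector x that minimizes ||A x|| (least-squares solution of A x = 0)."""
    # np.linalg.svd returns V^T with singular values sorted in decreasing order,
    # so the last row of V^T belongs to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]

# Toy example (assumed data): a rank-2 matrix whose null space is spanned by (1, 1, 1).
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, -1.0]])
x = solve_homogeneous(A)

# Singular values versus eigenvalues of A^T A (sorted in decreasing order).
singular_values = np.linalg.svd(A, compute_uv=False)
eigenvalues = np.linalg.eigvalsh(A.T @ A)[::-1]
```

The returned vector is only defined up to sign and has unit length, which is the usual normalization for homogeneous systems.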
(4)
Properties of SVD
• Frobenius norm
  - Generalization of the Euclidean norm to matrices
$$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2} = \sqrt{\sum_{i=1}^{\min(m,n)} \sigma_i^2}$$
• Partial reconstruction property of SVD
  - Let $\sigma_i$, $i = 1, \dots, N$ be the singular values of $A$.
  - Let $A_p = U_p D_p V_p^T$ be the reconstruction of $A$ when we set $\sigma_{p+1}, \dots, \sigma_N$ to zero.
  - Then $A_p = U_p D_p V_p^T$ is the best rank-p approximation of $A$ in the sense of the Frobenius norm (i.e. the best least-squares approximation).
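Both properties can be checked numerically. The following is a small sketch on random data; the helper name `best_rank_p` and the matrix size are assumptions chosen for illustration.

```python
import numpy as np

def best_rank_p(A, p):
    """Rank-p reconstruction of A from its p largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_p = U[:, :p] @ np.diag(s[:p]) @ Vt[:p]
    return A_p, s

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # assumed example matrix
A_2, s = best_rank_p(A, 2)

# Frobenius norm of A equals the root of the sum of squared singular values.
frob = np.linalg.norm(A, "fro")
# The approximation error is carried by the discarded singular values.
err = np.linalg.norm(A - A_2, "fro")
```

Here `err` equals the root of the sum of the squared discarded singular values, which is the smallest error any rank-2 matrix can reach.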
(5)
Recap: Camera Parameters
• Intrinsic parameters
  - Principal point coordinates
  - Focal length
  - Pixel magnification factors
  - Skew (non-rectangular pixels)
  - Radial distortion
$$K = \begin{bmatrix} m_x & & \\ & m_y & \\ & & 1 \end{bmatrix} \begin{bmatrix} f & s & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 \\ & \alpha_y & y_0 \\ & & 1 \end{bmatrix}$$
• Extrinsic parameters
  - Rotation R
  - Translation t
    (both relative to world coordinate system)
• Camera projection matrix
  - General pinhole camera: 9 DoF
  - CCD camera with non-square pixels: 10 DoF
  - General camera: 11 DoF
(6)
Recap: Calibrating a Camera
Goal
• Compute intrinsic and extrinsic parameters using observed camera data.
Main idea
• Place “calibration object” with known geometry in the scene
• Get correspondences
• Solve for mapping from scene to image: estimate $P = P_{int} P_{ext}$
Slide credit: Kristen Grauman
[Figure: known 3D points $X_i$, their image points $x_i$, unknown projection P]
(7)
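Recap: Camera Calibration (DLT Algorithm)
$$\begin{bmatrix} 0^T & X_1^T & -y_1 X_1^T \\ X_1^T & 0^T & -x_1 X_1^T \\ \vdots & \vdots & \vdots \\ 0^T & X_n^T & -y_n X_n^T \\ X_n^T & 0^T & -x_n X_n^T \end{bmatrix} \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix} = 0 \qquad\Longleftrightarrow\qquad A p = 0$$
Here $P_1, P_2, P_3$ are the rows of P written as column vectors, and $(x_i, y_i)$ is the image point of the 3D point $X_i$.
• P has 11 degrees of freedom.
• Two linearly independent equations per independent 2D/3D correspondence.
• Solve with SVD (similar to homography estimation)
  - Solution corresponds to smallest singular vector.
• 5 ½ correspondences needed for a minimal solution.
Slide adapted from Svetlana Lazebnik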
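A minimal NumPy sketch of the DLT steps above, assuming noise-free correspondences and no coordinate normalization (which a practical implementation would typically add). The function name `calibrate_dlt` and the synthetic camera are assumptions made for the example.

```python
import numpy as np

def calibrate_dlt(X, x):
    """Estimate the 3x4 projection matrix P (up to scale) from n >= 6 correspondences.

    X: (n, 3) world points, x: (n, 2) image points.
    """
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])           # homogeneous 3D points
    zero = np.zeros(4)
    rows = []
    for Xi, (u, v) in zip(Xh, x):
        rows.append(np.hstack([zero, Xi, -v * Xi]))  # [0^T, X^T, -y X^T]
        rows.append(np.hstack([Xi, zero, -u * Xi]))  # [X^T, 0^T, -x X^T]
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                    # p = (P_1; P_2; P_3) stacked row-wise

# Synthetic check with an assumed camera and random non-coplanar points.
rng = np.random.default_rng(1)
P_true = np.array([[2.0, 0.0, 0.5, 0.3],
                   [0.0, 2.0, 0.3, -0.2],
                   [0.0, 0.0, 1.0, 4.0]])
X = rng.uniform(-1.0, 1.0, (8, 3))
xh = (P_true @ np.hstack([X, np.ones((8, 1))]).T).T
x = xh[:, :2] / xh[:, 2:]
P = calibrate_dlt(X, x)
P_rescaled = P * (P_true[2, 3] / P[2, 3])          # fix the free scale for comparison
```

Because P is only defined up to scale, the result is rescaled before it is compared with the camera that generated the data.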
(8)
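Recap: Triangulation – Linear Algebraic Approach
$$\lambda_1 x_1 = P_1 X, \quad \lambda_2 x_2 = P_2 X \qquad\Rightarrow\qquad x_1 \times P_1 X = 0, \quad x_2 \times P_2 X = 0 \qquad\Rightarrow\qquad [x_{1\times}] P_1 X = 0, \quad [x_{2\times}] P_2 X = 0$$
• Two independent equations each in terms of three unknown entries of X.
• Stack equations and solve with SVD.
• This approach nicely generalizes to multiple cameras.
Slide credit: Svetlana Lazebnik
[Figure: camera centers O1, O2, image points x1, x2, viewing rays R1, R2, unknown 3D point X]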
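The stacked system can be sketched as follows. This is an illustrative implementation under the assumption of homogeneous image points with a non-zero last coordinate; `skew` and `triangulate` are names chosen for the example.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v_x], so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def triangulate(points, cameras):
    """Linear triangulation from any number of views.

    points: homogeneous image points (3,), cameras: matching 3x4 projection matrices.
    """
    # Each view contributes the first two rows of [x_x] P, which are the
    # two independent equations when the last coordinate of x is non-zero.
    A = np.vstack([skew(x)[:2] @ P for x, P in zip(points, cameras)])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two assumed cameras and a known point for a round-trip check.
a = 0.2
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, np.array([[-1.0], [0.0], [0.1]])])
X_true = np.array([0.2, -0.1, 5.0])
x1 = P1 @ np.append(X_true, 1.0)
x2 = P2 @ np.append(X_true, 1.0)
X_est = triangulate([x1, x2], [P1, P2])
```

Adding a third camera only appends two more rows to the stacked matrix, which is why the approach generalizes to multiple views.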
(9)
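Recap: Epipolar Geometry – Calibrated Case
Camera matrix of the first camera: $[I \mid 0]$
  $X = (u, v, w, 1)^T$, $x = (u, v, w)^T$
Camera matrix of the second camera: $[R^T \mid -R^T t]$
  Vector $x'$ in the second coord. system has coordinates $R x'$ in the first one.
The vectors $x$, $t$, and $R x'$ are coplanar.
[Figure: 3D point X with image points x and x’; the two cameras are related by rotation R and translation t]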
(10)
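Recap: Epipolar Geometry – Calibrated Case
$$x \cdot [t \times (R x')] = 0 \qquad\Rightarrow\qquad x^T E x' = 0 \quad\text{with}\quad E = [t_\times] R$$
Essential Matrix (Longuet-Higgins, 1981)
Slide credit: Svetlana Lazebnik
[Figure: 3D point X with image points x and x’]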
(11)
Recap: Epipolar Geometry – Uncalibrated Case
• The calibration matrices $K$ and $K'$ of the two cameras are unknown
• We can write the epipolar constraint in terms of unknown normalized coordinates:
$$\hat{x}^T E \hat{x}' = 0$$
Slide credit: Svetlana Lazebnik
[Figure: 3D point X with image points x and x’]
(12)
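Recap: Epipolar Geometry – Uncalibrated Case
$$\hat{x}^T E \hat{x}' = 0 \qquad\Rightarrow\qquad x^T F x' = 0 \quad\text{with}\quad F = K^{-T} E K'^{-1}$$
$$x = K \hat{x}, \qquad x' = K' \hat{x}'$$
Fundamental Matrix (Faugeras and Luong, 1992)
Slide credit: Svetlana Lazebnik
[Figure: 3D point X with image points x and x’]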
(13)
Recap: The Eight-Point Algorithm
$x = (u, v, 1)^T$, $x' = (u', v', 1)^T$
Minimize:
$$\sum_{i=1}^{N} \left( x_i^T F x_i' \right)^2$$
under the constraint $|F|^2 = 1$
• Problem: poor numerical conditioning
(14)
Recap: Normalized Eight-Point Algorithm
1. Center the image data at the origin, and scale it so the mean squared distance between the origin and the data points is 2 pixels.
2. Use the eight-point algorithm to compute F from the normalized points.
3. Enforce the rank-2 constraint using SVD:
$$F = U D V^T = U \begin{bmatrix} d_{11} & & \\ & d_{22} & \\ & & d_{33} \end{bmatrix} \begin{bmatrix} v_{11} & \cdots & v_{13} \\ \vdots & \ddots & \vdots \\ v_{31} & \cdots & v_{33} \end{bmatrix}^T$$
   Set $d_{33}$ to zero and reconstruct F.
4. Transform fundamental matrix back to original units: if T and T’ are the normalizing transformations in the two images, then the fundamental matrix in original coordinates is $T^T F T'$.
[Hartley, 1995]
Slide credit: Svetlana Lazebnik
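The four steps can be sketched in NumPy as follows. The code uses the convention of these slides, $x^T F x' = 0$; the function names are assumptions for the example, and no outlier handling is included.

```python
import numpy as np

def normalizing_transform(pts):
    """3x3 similarity that centers pts (n, 2) and makes the mean squared distance to the origin 2."""
    c = pts.mean(axis=0)
    mean_sq_dist = ((pts - c) ** 2).sum(axis=1).mean()
    s = np.sqrt(2.0 / mean_sq_dist)
    return np.array([[s, 0.0, -s * c[0]],
                     [0.0, s, -s * c[1]],
                     [0.0, 0.0, 1.0]])

def eight_point_normalized(x, xp):
    """Estimate F with x^T F x' = 0 from n >= 8 matches x, xp of shape (n, 2)."""
    n = len(x)
    T, Tp = normalizing_transform(x), normalizing_transform(xp)
    xh = (T @ np.column_stack([x, np.ones(n)]).T).T      # step 1, first image
    xph = (Tp @ np.column_stack([xp, np.ones(n)]).T).T   # step 1, second image
    # Step 2: each match gives one row; entry (i, j) multiplies F[i, j].
    A = np.einsum("ni,nj->nij", xh, xph).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Step 3: enforce rank 2 by zeroing the smallest singular value.
    U, d, Vt = np.linalg.svd(F)
    d[2] = 0.0
    F = U @ np.diag(d) @ Vt
    # Step 4: back to original units.
    return T.T @ F @ Tp
```

With noisy matches the result is usually treated as an initial estimate for a nonlinear refinement, which matches the comparison on the following slide.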
(15)
Recap: Comparison of Estimation Algorithms
              8-point       Normalized 8-point   Nonlinear least squares
Av. Dist. 1   2.33 pixels   0.92 pixel           0.86 pixel
Av. Dist. 2   2.18 pixels   0.85 pixel           0.80 pixel
(16)
Recap: Epipolar Transfer
• Assume the epipolar geometry is known
• Given projections of the same point in two images, how can we compute the projection of that point in a third image?
$$l_{31} = F_{13}^T x_1, \qquad l_{32} = F_{23}^T x_2$$
[Figure: image points $x_1$, $x_2$, $x_3$ and the epipolar lines $l_{31}$, $l_{32}$ in the third image]
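The transferred point is the intersection of the two epipolar lines, and in homogeneous coordinates two lines intersect at their cross product. A minimal sketch, assuming fundamental matrices in the convention $x_1^T F_{13} x_3 = 0$ and $x_2^T F_{23} x_3 = 0$ (the function name is an assumption):

```python
import numpy as np

def epipolar_transfer(x1, x2, F13, F23):
    """Predict the homogeneous image point x3 from x1, x2 and the two fundamental matrices."""
    l31 = F13.T @ x1          # epipolar line of x1 in image 3
    l32 = F23.T @ x2          # epipolar line of x2 in image 3
    x3 = np.cross(l31, l32)   # intersection of the two lines
    return x3 / x3[2]
```

The cross product becomes numerically unstable when the two epipolar lines are nearly parallel or coincide, for example for points close to the plane through the three camera centers.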
(17)
Recap: Active Stereo with Structured Light
• Optical triangulation
  - Project a single stripe of laser light
  - Scan it across the surface of the object
  - This is a very precise version of structured light scanning
Digital Michelangelo Project
http://graphics.stanford.edu/projects/mich/
(18)
Topics of This Lecture
• Structure from Motion (SfM)
  - Motivation
  - Ambiguity
• Affine SfM
  - Affine cameras
  - Affine factorization
  - Euclidean upgrade
  - Dealing with missing data
• Projective SfM
  - Two-camera case
  - Projective factorization
  - Bundle adjustment
  - Practical considerations
• Applications
(19)
Structure from Motion
• Given: m images of n fixed 3D points
  $x_{ij} = P_i X_j$, $i = 1, \dots, m$, $j = 1, \dots, n$
• Problem: estimate m projection matrices $P_i$ and n 3D points $X_j$ from the mn correspondences $x_{ij}$
[Figure: 3D point $X_j$ projected to $x_{1j}$, $x_{2j}$, $x_{3j}$ by the cameras $P_1$, $P_2$, $P_3$]
(20)
What Can We Use This For?
• E.g. movie special effects
[Video]
(21)
Structure from Motion Ambiguity
• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same:
$$x = P X = \left( \frac{1}{k} P \right) (k X)$$
It is impossible to recover the absolute scale of the scene!
(22)
Structure from Motion Ambiguity
• If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same.
• More generally: if we transform the scene using a transformation Q and apply the inverse transformation to the camera matrices, then the images do not change
$$x = P X = (P Q^{-1})(Q X)$$
Slide credit: Svetlana Lazebnik
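The ambiguity is easy to confirm numerically. The following sketch uses a random camera, point, and 4x4 matrix Q, all assumed values for illustration (a random Q is invertible with probability one).

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 4))              # some camera matrix
X = np.append(rng.standard_normal(3), 1.0)   # homogeneous 3D point
Q = rng.standard_normal((4, 4))              # generic projective transformation

x_original = P @ X
x_transformed = (P @ np.linalg.inv(Q)) @ (Q @ X)

# Special case from the first bullet: uniform scaling by k.
k = 3.0
x_scaled = (P / k) @ (k * X)
```

All three image points agree, so the images alone cannot distinguish the original scene from the transformed one.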
(23)
Reconstruction Ambiguity: Similarity
$$x = P X = (P Q_S^{-1})(Q_S X)$$
(24)
Reconstruction Ambiguity: Affine
$$x = P X = (P Q_A^{-1})(Q_A X)$$
Slide credit: Svetlana Lazebnik
(25)
Reconstruction Ambiguity: Projective
$$x = P X = (P Q_P^{-1})(Q_P X)$$
(26)
Projective Ambiguity
(27)
From Projective to Affine
(28)
From Affine to Similarity
(29)
Hierarchy of 3D Transformations
Projective   15 dof   $\begin{bmatrix} A & t \\ v^T & v \end{bmatrix}$    Preserves intersection and tangency
Affine       12 dof   $\begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$    Preserves parallelism, volume ratios
Similarity    7 dof   $\begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix}$   Preserves angles, ratios of length
Euclidean     6 dof   $\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}$    Preserves angles, lengths
• With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction.
• Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean.
(31)
Structure from Motion
• Let’s start with affine cameras (the math is easier)
[Figure: affine camera, center of projection at infinity]
(32)
Orthographic Projection
• Special case of perspective projection
  - Distance from center of projection to image plane is infinite
  - Projection matrix:
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Slide credit: Steve Seitz
(33)
Affine Cameras
[Figure panels: Orthographic Projection, Parallel Projection]
(34)
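Affine Cameras
• A general affine camera combines the effects of an affine transformation of the 3D space, orthographic projection, and an affine transformation of the image:
$$P = [3 \times 3\ \text{affine}] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} [4 \times 4\ \text{affine}] = \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix}$$
• Affine projection is a linear mapping + translation in inhomogeneous coordinates
$$x = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = A X + b$$
  b is the projection of the world origin.
[Figure: 3D point X, its image x, and the image axes $a_1$, $a_2$]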
(35)
Affine Structure from Motion
• Given: m images of n fixed 3D points:
  $x_{ij} = A_i X_j + b_i$, $i = 1, \dots, m$, $j = 1, \dots, n$
• Problem: use the mn correspondences $x_{ij}$ to estimate m projection matrices $A_i$ and translation vectors $b_i$, and n points $X_j$
• The reconstruction is defined up to an arbitrary affine transformation Q (12 degrees of freedom):
$$\begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} \rightarrow \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} Q^{-1}, \qquad \begin{pmatrix} X \\ 1 \end{pmatrix} \rightarrow Q \begin{pmatrix} X \\ 1 \end{pmatrix}$$
• We have 2mn knowns and 8m + 3n unknowns (minus 12 dof for affine ambiguity).
  - Thus, we must have 2mn >= 8m + 3n – 12.
  - For two views, we need four point correspondences.
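As a quick check of the counting argument, the inequality can be evaluated for two views (a worked example added for illustration):

```latex
m = 2:\quad 2 \cdot 2 \cdot n \;\ge\; 8 \cdot 2 + 3n - 12
\;\Longleftrightarrow\; 4n \;\ge\; 3n + 4
\;\Longleftrightarrow\; n \;\ge\; 4
```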
(36)
Affine Structure from Motion
• Centering: subtract the centroid of the image points
$$\hat{x}_{ij} = x_{ij} - \frac{1}{n} \sum_{k=1}^{n} x_{ik} = A_i X_j + b_i - \frac{1}{n} \sum_{k=1}^{n} (A_i X_k + b_i) = A_i \left( X_j - \frac{1}{n} \sum_{k=1}^{n} X_k \right) = A_i \hat{X}_j$$
• For simplicity, assume that the origin of the world coordinate system is at the centroid of the 3D points.
• After centering, each normalized point $\hat{x}_{ij}$ is related to the 3D point $X_j$ by
$$\hat{x}_{ij} = A_i X_j$$
(37)
Affine Structure from Motion
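• Let’s create a 2m × n data (measurement) matrix:
$$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix}$$
  Rows: cameras (2m). Columns: points (n).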
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography:
A factorization method. IJCV, 9(2):137-154, November 1992.
(38)
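Affine Structure from Motion
• Let’s create a 2m × n data (measurement) matrix:
$$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}$$
  The first factor M stacks the cameras (2m × 3); the second factor S holds the points (3 × n).
• The measurement matrix D = MS must have rank 3!
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography:
A factorization method. IJCV, 9(2):137-154, November 1992.
Slide credit: Svetlana Lazebnik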
(39)
Factorizing the Measurement Matrix
(40)
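Factorizing the Measurement Matrix
• Singular value decomposition of D:
$$D = U W V^T$$
  For noise-free data only the first three singular values are non-zero, because D has rank 3.
Slide credit: Martial Hebert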
(43)
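Factorizing the Measurement Matrix
• Obtaining a factorization from SVD:
$$D \approx U_3 W_3 V_3^T = M S, \qquad M = U_3 W_3^{1/2}, \quad S = W_3^{1/2} V_3^T$$
This decomposition minimizes $\|D - MS\|_F^2$.
Slide credit: Martial Hebert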
(44)
Affine Ambiguity
• The decomposition is not unique. We get the same D by using any invertible 3×3 matrix C and applying the transformations $M \rightarrow MC$, $S \rightarrow C^{-1} S$.
• That is because the reconstruction is so far only determined up to an affine transformation and we have not enforced any Euclidean constraints (like forcing the image axes to be perpendicular, for example). We need a Euclidean upgrade.
(45)
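Estimating the Euclidean Upgrade
• Orthographic assumption: image axes are perpendicular and scale is 1.
  $a_1 \cdot a_2 = 0$, $|a_1|^2 = |a_2|^2 = 1$
• This can be converted into a system of 3m equations:
$$\begin{cases} \hat{a}_{i1} \cdot \hat{a}_{i2} = 0 \\ |\hat{a}_{i1}| = 1 \\ |\hat{a}_{i2}| = 1 \end{cases} \quad\Longleftrightarrow\quad \begin{cases} a_{i1}^T C C^T a_{i2} = 0 \\ a_{i1}^T C C^T a_{i1} = 1 \\ a_{i2}^T C C^T a_{i2} = 1 \end{cases}, \qquad i = 1, \dots, m$$
  Here $a_{i1}^T$, $a_{i2}^T$ are the two rows of $A_i$, and $\hat{a}_{ik} = C^T a_{ik}$ are the corresponding rows after the update $M \rightarrow MC$.
[Figure: 3D point X, its image x, and the image axes $a_1$, $a_2$]
Slide adapted from S. Lazebnik, M. Hebert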
(46)
Estimating the Euclidean Upgrade
• This can be converted into a system of 3m equations:
$$a_{i1}^T C C^T a_{i2} = 0, \qquad a_{i1}^T C C^T a_{i1} = 1, \qquad a_{i2}^T C C^T a_{i2} = 1, \qquad i = 1, \dots, m$$
• Let
$$A_i = \begin{bmatrix} a_{i1}^T \\ a_{i2}^T \end{bmatrix}, \qquad i = 1, \dots, m$$
• Then this translates to 3m equations in L:
$$A_i L A_i^T = I, \qquad i = 1, \dots, m$$
  - Solve for L
  - Recover C from L by Cholesky decomposition: $L = C C^T$
  - Update M and S: $M = MC$, $S = C^{-1} S$
Slide adapted from S. Lazebnik, M. Hebert
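A possible NumPy sketch of this upgrade: the symmetric matrix L has six unknowns, each image contributes three linear equations, and the Cholesky factor gives C. The function name `metric_upgrade` and the row layout of M (rows 2i and 2i+1 belong to image i) are assumptions for the example.

```python
import numpy as np

def metric_upgrade(M):
    """Return C such that the row pairs of M @ C are orthonormal.

    M: (2m, 3) affine motion matrix, rows 2i and 2i+1 form A_i.
    """
    def coeffs(a, b):
        # Coefficients of a^T L b in the unknowns (l11, l12, l13, l22, l23, l33).
        return np.array([a[0] * b[0],
                         a[0] * b[1] + a[1] * b[0],
                         a[0] * b[2] + a[2] * b[0],
                         a[1] * b[1],
                         a[1] * b[2] + a[2] * b[1],
                         a[2] * b[2]])

    rows, rhs = [], []
    for a1, a2 in zip(M[0::2], M[1::2]):
        rows += [coeffs(a1, a1), coeffs(a2, a2), coeffs(a1, a2)]
        rhs += [1.0, 1.0, 0.0]
    l = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    # Lower-triangular C with L = C C^T; raises LinAlgError if L is not positive definite.
    return np.linalg.cholesky(L)
```

With noisy data the least-squares L is not guaranteed to be positive definite, in which case the Cholesky step fails and some form of regularization or a different parameterization would be needed.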
(47)
Algorithm Summary
• Given: m images and n features $x_{ij}$
• For each image i, center the feature coordinates.
• Construct a 2m × n measurement matrix D:
  - Column j contains the projection of point j in all views
  - Row i contains one coordinate of the projections of all the n points in image i
• Factorize D:
  - Compute SVD: $D = U W V^T$
  - Create $U_3$ by taking the first 3 columns of U
  - Create $V_3$ by taking the first 3 columns of V
  - Create $W_3$ by taking the upper left 3 × 3 block of W
• Create the motion and shape matrices:
  - $M = U_3 W_3^{1/2}$ and $S = W_3^{1/2} V_3^T$ (or $M = U_3$ and $S = W_3 V_3^T$)
• Eliminate affine ambiguity
Slide credit: Martial Hebert
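The summary above, up to but not including the elimination of the affine ambiguity, might look as follows in NumPy. This is a sketch that assumes every point is visible in every view; the array layout `(m, n, 2)` and the function name are choices made for the example.

```python
import numpy as np

def affine_factorization(x):
    """Factorize tracked points into motion M (2m x 3) and shape S (3 x n).

    x: (m, n, 2) array, x[i, j] is the image position of point j in view i.
    """
    m, n, _ = x.shape
    centroids = x.mean(axis=1, keepdims=True)      # per-image centroid
    x_hat = x - centroids                          # centering
    # Rows 2i and 2i+1 of D hold the x- and y-coordinates of all points in image i.
    D = x_hat.transpose(0, 2, 1).reshape(2 * m, n)
    U, w, Vt = np.linalg.svd(D, full_matrices=False)
    W3_sqrt = np.diag(np.sqrt(w[:3]))
    M = U[:, :3] @ W3_sqrt
    S = W3_sqrt @ Vt[:3]
    return M, S, centroids[:, 0, :]

# Synthetic data from assumed random affine cameras.
rng = np.random.default_rng(3)
m, n = 4, 10
A = rng.standard_normal((m, 2, 3))
b = rng.standard_normal((m, 2))
X = rng.standard_normal((3, n))
x = np.einsum("mij,jn->mni", A, X) + b[:, None, :]
M, S, centroids = affine_factorization(x)
```

The returned M and S reproduce the centered measurements, but they agree with the generating cameras and points only up to an invertible 3×3 matrix C, which is exactly the ambiguity the last bullet refers to.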
(48)
Reconstruction Results
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography:
A factorization method. IJCV, 9(2):137-154, November 1992.
(49)
Dealing with Missing Data
• So far, we have assumed that all points are visible in all views
• In reality, the measurement matrix typically looks something like this:
[Figure: sparsely filled measurement matrix; rows: cameras, columns: points]
(51)
Dealing with Missing Data
• Possible solution: decompose matrix into dense sub-blocks, factorize each sub-block, and fuse the results
  - Finding dense maximal sub-blocks of the matrix is NP-complete (equivalent to finding maximal cliques in a graph)
• Incremental bilinear refinement
  (1) Perform factorization on a dense sub-block
  (2) Solve for a new 3D point visible by at least two known cameras (linear least squares)
F. Rothganger, S. Lazebnik, C. Schmid, and J. Ponce. Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects. PAMI 2007.
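Step (2) is a small linear least-squares problem. A minimal sketch, assuming the affine cameras $A_i$ are already known and that the translation $b_i$ has been subtracted from each observation, so that $x_{ij} - b_i = A_i X_j$ (the function name is an assumption):

```python
import numpy as np

def solve_new_point(A_list, x_list):
    """Least-squares 3D point from its observations in known affine cameras.

    A_list: 2x3 camera matrices A_i (at least two views).
    x_list: matching 2-vectors x_ij - b_i.
    """
    A = np.vstack(A_list)          # (2k, 3) stacked cameras
    x = np.concatenate(x_list)     # (2k,) stacked observations
    X, *_ = np.linalg.lstsq(A, x, rcond=None)
    return X
```

Two views give four equations for the three unknown coordinates, which is why at least two known cameras are required.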
(2)
Commercial Software Packages
• boujou (http://www.2d3.com/)
• PFTrack (http://www.thepixelfarm.co.uk/)
• MatchMover (http://www.realviz.com/)
• SynthEyes (http://www.ssontech.com/)
• Icarus (http://aig.cs.man.ac.uk/research/reveal/icarus/)
• Voodoo Camera Tracker (http://www.digilab.uni-hannover.de/)
(3)
boujou demo
(We have a license available, so if you want
to try it for interesting projects, contact us.)
(4)
Applications: Matchmoving
• Putting virtual objects into real-world videos
[Figure panels: Original sequence, Tracked features, SfM results, Final video]
(5)
Another Example: The Campanile Movie
Video from SIGGRAPH’97 Animation Theatre
http://www.debevec.org/Campanile/#movie
(6)
References and Further Reading
• A (relatively short) treatment of affine and projective SfM and the basic ideas and algorithms can be found in Chapters 12 and 13 of
  D. Forsyth, J. Ponce, Computer Vision – A Modern Approach. Prentice Hall, 2003
• More detailed information (if you really want to implement this) and better explanations can be found in Chapters 10, 18 (factorization) and 19 (self-calibration) of
  R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Ed., Cambridge Univ. Press, 2004