cv08 part17 reconstruction3

(1)

Perceptual and Sensory Augmented Computing

Computer Vision WS 08/09

Computer Vision – Lecture 17

Structure-from-Motion

21.01.2009

Bastian Leibe

RWTH Aachen

http://www.umic.rwth-aachen.de/multimedia

[email protected]


(2)


Course Outline

Image Processing Basics

Segmentation & Grouping

Object Recognition

Local Features & Matching

Object Categorization

3D Reconstruction

Epipolar Geometry and Stereo Basics

Camera calibration & Uncalibrated Reconstruction

Structure-from-Motion

Motion and Tracking


(3)

Recap: A General Point

Equations of the form

$$A x = 0$$

How do we solve them? (always!)

Apply SVD:

$$A = U D V^T = U \begin{bmatrix} d_{11} & & \\ & \ddots & \\ & & d_{NN} \end{bmatrix} \begin{bmatrix} v_{11} & \cdots & v_{1N} \\ \vdots & \ddots & \vdots \\ v_{N1} & \cdots & v_{NN} \end{bmatrix}^T$$

The diagonal entries $d_{ii}$ are the singular values; the columns of $V$ are the singular vectors.

Singular values of A = square roots of the eigenvalues of $A^T A$.

The solution of $Ax = 0$ is the nullspace vector of A.

This corresponds to the singular vector of A that belongs to the smallest singular value.
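A minimal NumPy sketch of this recipe, assuming `numpy.linalg.svd` (which returns singular values in descending order, so the last row of `Vt` belongs to the smallest one). The helper name `solve_homogeneous` and the example matrix are illustrative choices, not part of the lecture.

```python
import numpy as np

def solve_homogeneous(A):
    """Unit vector x minimizing ||A x||, i.e. the least-squares solution of A x = 0."""
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]  # right singular vector of the smallest singular value

# Rank-2 example matrix whose null space is spanned by (1, -2, 1)
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
x = solve_homogeneous(A)
residual = np.linalg.norm(A @ x)
```

The solution is only defined up to sign and scale; the SVD fixes the scale to unit length.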


(4)

Properties of SVD

Frobenius norm

Generalization of the Euclidean norm to matrices:

$$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2} = \sqrt{\sum_{i=1}^{\min(m,n)} \sigma_i^2}$$

Partial reconstruction property of SVD

Let $\sigma_i$, i = 1,…,N be the singular values of A.

Let $A_p = U_p D_p V_p^T$ be the reconstruction of A when we set $\sigma_{p+1}, \dots, \sigma_N$ to zero.

Then $A_p = U_p D_p V_p^T$ is the best rank-p approximation of A in the sense of the Frobenius norm (i.e. the best least-squares approximation).
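Both properties can be checked numerically. The following is a small sketch on a random matrix; the function name `best_rank_p` and the test matrix are assumptions made for illustration.

```python
import numpy as np

def best_rank_p(A, p):
    """Rank-p reconstruction of A: keep only the p largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :p] @ np.diag(s[:p]) @ Vt[:p, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
s = np.linalg.svd(A, compute_uv=False)

# Frobenius norm from the entries and from the singular values
fro_entries = np.sqrt(np.sum(A ** 2))
fro_singular = np.sqrt(np.sum(s ** 2))

# Error of the rank-2 approximation equals the norm of the discarded singular values
A2 = best_rank_p(A, 2)
err = np.linalg.norm(A - A2, "fro")
expected_err = np.sqrt(np.sum(s[2:] ** 2))
```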


(5)

Recap: Camera Parameters

Intrinsic parameters

  Principal point coordinates

  Focal length

  Pixel magnification factors

  Skew (non-rectangular pixels)

  Radial distortion

$$K = \begin{bmatrix} m_x & & \\ & m_y & \\ & & 1 \end{bmatrix} \begin{bmatrix} f & s & p_x \\ & f & p_y \\ & & 1 \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 \\ & \alpha_y & y_0 \\ & & 1 \end{bmatrix}$$

with $\alpha_x = m_x f$, $\alpha_y = m_y f$, $x_0 = m_x p_x$, $y_0 = m_y p_y$ (the skew s in the right-hand matrix is expressed in pixel units).

Extrinsic parameters

  Rotation R

  Translation t

  (both relative to world coordinate system)

Camera projection matrix

  General pinhole camera: 9 DoF

  CCD camera with rectangular (possibly non-square) pixels and no skew: 10 DoF

  General camera: 11 DoF


(6)

Recap: Calibrating a Camera

Goal

Compute intrinsic and extrinsic parameters using observed camera data.

Main idea

Place “calibration object” with known geometry in the scene

Get correspondences

Solve for mapping from scene to image: estimate $P = P_{int} P_{ext}$

[Figure: 3D points $X_i$ on the calibration object, their image points $x_i$, and the unknown projection P]

Slide credit: Kristen Grauman


(7)

Recap: Camera Calibration (DLT Algorithm)

$$\begin{bmatrix} 0^T & X_1^T & -y_1 X_1^T \\ X_1^T & 0^T & -x_1 X_1^T \\ \vdots & \vdots & \vdots \\ 0^T & X_n^T & -y_n X_n^T \\ X_n^T & 0^T & -x_n X_n^T \end{bmatrix} \begin{pmatrix} P_1 \\ P_2 \\ P_3 \end{pmatrix} = 0, \qquad A p = 0$$

Here $P_1$, $P_2$, $P_3$ are the rows of P written as column vectors, and $(x_i, y_i)$ is the image point of the 3D point $X_i$.

P has 11 degrees of freedom.

Two linearly independent equations per independent 2D/3D correspondence.

Solve with SVD (similar to homography estimation)

Solution corresponds to smallest singular vector.

5 ½ correspondences needed for a minimal solution.

Slide adapted from Svetlana Lazebnik
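The DLT system can be assembled and solved in a few lines. This is a sketch under simplifying assumptions: noise-free synthetic data, no coordinate normalization, and made-up camera values (`K`, `Rt`); the function names `calibrate_dlt` and `project` are illustrative.

```python
import numpy as np

def calibrate_dlt(X, x):
    """Estimate a 3x4 projection matrix (up to scale) from n >= 6 correspondences.
    X: (n, 3) world points, x: (n, 2) image points."""
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])  # homogeneous 3D points
    zero = np.zeros(4)
    rows = []
    for Xi, (xi, yi) in zip(Xh, x):
        rows.append(np.concatenate([zero, Xi, -yi * Xi]))  # [0^T, X^T, -y X^T]
        rows.append(np.concatenate([Xi, zero, -xi * Xi]))  # [X^T, 0^T, -x X^T]
    _, _, Vt = np.linalg.svd(np.array(rows))
    return Vt[-1].reshape(3, 4)  # p = (P_1, P_2, P_3) stacked row by row

def project(P, X):
    """Project (n, 3) points with P and return (n, 2) pixel coordinates."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])
    xh = Xh @ P.T
    return xh[:, :2] / xh[:, 2:3]

# Synthetic test camera (values chosen only for illustration)
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
P_true = K @ Rt
X = rng.uniform(-1.0, 1.0, size=(10, 3))  # non-coplanar calibration points
x = project(P_true, X)

P_est = calibrate_dlt(X, x)
reproj_error = np.abs(project(P_est, X) - x).max()
```

The estimate equals `P_true` only up to scale, which is why the check compares reprojections rather than matrix entries.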


(8)

Recap: Triangulation – Linear Algebraic Approach

[Figure: camera centers $O_1$, $O_2$, rays $R_1$, $R_2$ through image points $x_1$, $x_2$, and the unknown 3D point X]

$$\lambda_1 x_1 = P_1 X, \quad \lambda_2 x_2 = P_2 X \;\Rightarrow\; x_1 \times P_1 X = 0, \quad x_2 \times P_2 X = 0 \;\Rightarrow\; [x_{1\times}] P_1 X = 0, \quad [x_{2\times}] P_2 X = 0$$

Two independent equations each in terms of three unknown entries of X.

Stack equations and solve with SVD.

This approach nicely generalizes to multiple cameras.

Slide credit: Svetlana Lazebnik
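A short sketch of the linear triangulation above, assuming image points with non-zero third homogeneous coordinate so that the first two rows of $[x_\times]$ are the two independent equations. The camera matrices and the test point are invented for the example.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v_x], so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def triangulate(points, cameras):
    """points: homogeneous image points (3,), cameras: matching 3x4 matrices."""
    # Two independent rows of [x_x] P per view, stacked over all views
    A = np.vstack([skew(x)[:2] @ P for x, P in zip(points, cameras)])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]  # scale so the last homogeneous coordinate is 1

P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.3, -0.2, 4.0, 1.0])
x1, x2 = P1 @ X_true, P2 @ X_true

X_est = triangulate([x1, x2], [P1, P2])
```

Passing more than two points and cameras uses the same code, which is the multi-camera generalization mentioned on the slide.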


(9)

Recap: Epipolar Geometry – Calibrated Case

[Figure: 3D point X projects to x in the first image and to x’ in the second; the two cameras are related by rotation R and translation t]

First camera matrix: [I|0], with $X = (u, v, w, 1)^T$ and $x = (u, v, w)^T$

Second camera matrix: $[R^T \mid -R^T t]$

Vector x’ in second coord. system has coordinates Rx’ in the first one.

The vectors x, t, and Rx’ are coplanar


(10)

Recap: Epipolar Geometry – Calibrated Case

[Figure: 3D point X with its projections x and x’]

$$x \cdot [t \times (R x')] = 0 \;\Rightarrow\; x^T E x' = 0 \quad \text{with} \quad E = [t_\times] R$$

Essential Matrix (Longuet-Higgins, 1981)

Slide credit: Svetlana Lazebnik


(11)

Recap: Epipolar Geometry – Uncalibrated Case

[Figure: 3D point X with its projections x and x’]

The calibration matrices K and K’ of the two cameras are unknown

We can write the epipolar constraint in terms of unknown normalized coordinates:

$$\hat{x}^T E \hat{x}' = 0$$

Slide credit: Svetlana Lazebnik


(12)

Recap: Epipolar Geometry – Uncalibrated Case

[Figure: 3D point X with its projections x and x’]

$$\hat{x}^T E \hat{x}' = 0 \;\Rightarrow\; x^T F x' = 0 \quad \text{with} \quad F = K^{-T} E K'^{-1}$$

where $x = K \hat{x}$ and $x' = K' \hat{x}'$.

Fundamental Matrix (Faugeras and Luong, 1992)

Slide credit: Svetlana Lazebnik
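The relation between E and F can be verified on a synthetic point. In this sketch the first camera is [I|0] and the second is $[R^T \mid -R^T t]$ as in the calibrated recap; all numeric values (R, t, K, K’, the 3D point) are assumptions chosen for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t_x]."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

theta = 0.1  # rotation about the y axis
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])
E = skew(t) @ R  # essential matrix

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Kp = np.array([[700.0, 0.0, 300.0], [0.0, 700.0, 250.0], [0.0, 0.0, 1.0]])
F = np.linalg.inv(K).T @ E @ np.linalg.inv(Kp)  # F = K^{-T} E K'^{-1}

X = np.array([0.5, -0.3, 4.0])  # 3D point in the first camera frame
x_hat = X / X[2]                # normalized coordinates, first image
xp_cam = R.T @ (X - t)          # the same point in the second camera frame
xp_hat = xp_cam / xp_cam[2]     # normalized coordinates, second image
x, xp = K @ x_hat, Kp @ xp_hat  # pixel coordinates

res_E = x_hat @ E @ xp_hat      # calibrated epipolar constraint
res_F = x @ F @ xp              # uncalibrated epipolar constraint
```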


(13)

Recap: The Eight-Point Algorithm

$x = (u, v, 1)^T$, $x' = (u', v', 1)^T$

Minimize:

$$\sum_{i=1}^{N} (x_i^T F x_i')^2$$

under the constraint $|F|^2 = 1$

Problem: poor numerical conditioning


(14)

Recap: Normalized Eight-Point Algorithm

1. Center the image data at the origin, and scale it so the mean squared distance between the origin and the data points is 2 pixels.

2. Use the eight-point algorithm to compute F from the normalized points.

3. Enforce the rank-2 constraint using SVD:

$$F = U D V^T = U \begin{bmatrix} d_{11} & & \\ & d_{22} & \\ & & d_{33} \end{bmatrix} \begin{bmatrix} v_{11} & \cdots & v_{13} \\ \vdots & \ddots & \vdots \\ v_{31} & \cdots & v_{33} \end{bmatrix}^T$$

Set $d_{33}$ to zero and reconstruct F.

4. Transform the fundamental matrix back to original units: if T and T’ are the normalizing transformations in the two images, then the fundamental matrix in original coordinates is $T^T F T'$.

[Hartley, 1995]

Slide credit: Svetlana Lazebnik
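The four steps can be sketched as follows, using the convention $x^T F x' = 0$ from the recap slides. This is an illustrative implementation on noise-free synthetic data, not a production estimator; the function names and all scene values are assumptions.

```python
import numpy as np

def normalize(pts):
    """Step 1: center (n, 2) points and scale to mean squared distance 2."""
    c = pts.mean(axis=0)
    msd = np.mean(np.sum((pts - c) ** 2, axis=1))
    s = np.sqrt(2.0 / msd)
    T = np.array([[s, 0.0, -s * c[0]], [0.0, s, -s * c[1]], [0.0, 0.0, 1.0]])
    ph = np.hstack([pts, np.ones((len(pts), 1))])
    return ph @ T.T, T

def eight_point_normalized(x, xp):
    """Estimate F with x_i^T F x'_i = 0 from n >= 8 correspondences."""
    xn, T = normalize(x)
    xpn, Tp = normalize(xp)
    # Step 2: each row is kron(x, x'), so row . vec(F) = x^T F x' (row-major vec)
    A = np.array([np.kron(a, b) for a, b in zip(xn, xpn)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Step 3: enforce rank 2 by zeroing the smallest singular value
    U, d, Vt = np.linalg.svd(F)
    d[2] = 0.0
    F = U @ np.diag(d) @ Vt
    # Step 4: undo the normalization
    return T.T @ F @ Tp

# Synthetic scene: first camera K[I|0], second camera K[R^T | -R^T t]
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(20, 3)) + np.array([0.0, 0.0, 5.0])
theta = 0.2
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.1, 0.0])
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
xh = X @ K.T
xph = (X - t) @ R @ K.T  # row form of K R^T (X - t)
x = xh[:, :2] / xh[:, 2:]
xp = xph[:, :2] / xph[:, 2:]

F = eight_point_normalized(x, xp)
x1 = np.hstack([x, np.ones((20, 1))])
xp1 = np.hstack([xp, np.ones((20, 1))])
max_residual = np.abs(np.einsum("ij,jk,ik->i", x1, F, xp1)).max() / np.linalg.norm(F)
```

With noisy correspondences the residual will not vanish; the normalization mainly improves the conditioning of the matrix `A`.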


(15)

Recap: Comparison of Estimation Algorithms

              8-point        Normalized 8-point    Nonlinear least squares
Av. Dist. 1   2.33 pixels    0.92 pixel            0.86 pixel
Av. Dist. 2   2.18 pixels    0.85 pixel            0.80 pixel


(16)

Recap: Epipolar Transfer

Assume the epipolar geometry is known

Given projections of the same point in two images, how can we compute the projection of that point in a third image?

[Figure: points $x_1$, $x_2$, $x_3$ in three images; epipolar lines $l_{31}$ and $l_{32}$ in the third image]

$$l_{31} = F_{13}^T x_1, \qquad l_{32} = F_{23}^T x_2$$

The point $x_3$ lies at the intersection of the two epipolar lines $l_{31}$ and $l_{32}$.
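A sketch of epipolar transfer on three synthetic cameras. Here $F_{ab}$ is built directly from known camera matrices so that $x_a^T F_{ab} x_b = 0$; the helper `fundamental`, the cameras and the test point are assumptions for illustration. The intersection of two lines in homogeneous coordinates is their cross product.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v_x]."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental(Pa, Pb):
    """F_ab with x_a^T F_ab x_b = 0 for projections of the same 3D point."""
    _, _, Vt = np.linalg.svd(Pb)
    Cb = Vt[-1]      # center of camera b (null vector of Pb)
    ea = Pa @ Cb     # epipole: image of that center in view a
    return skew(ea) @ Pa @ np.linalg.pinv(Pb)

P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
P3 = np.hstack([np.eye(3), np.array([[0.0], [-1.0], [0.0]])])
X = np.array([0.4, 0.3, 5.0, 1.0])  # not in the plane of the three camera centers
x1, x2, x3_true = P1 @ X, P2 @ X, P3 @ X

l31 = fundamental(P1, P3).T @ x1    # epipolar line of x1 in image 3
l32 = fundamental(P2, P3).T @ x2    # epipolar line of x2 in image 3
x3 = np.cross(l31, l32)             # intersection of the two lines
x3 = x3 / x3[2]
```

The transfer breaks down when the two epipolar lines coincide, which happens for 3D points in the plane through the three camera centers.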


(17)

Recap: Active Stereo with Structured Light

Optical triangulation

Project a single stripe of laser light

Scan it across the surface of the object

This is a very precise version of structured light scanning

Digital Michelangelo Project

http://graphics.stanford.edu/projects/mich/


(18)

Topics of This Lecture

Structure from Motion (SfM)
  Motivation
  Ambiguity

Affine SfM
  Affine cameras
  Affine factorization
  Euclidean upgrade
  Dealing with missing data

Projective SfM
  Two-camera case
  Projective factorization
  Bundle adjustment
  Practical considerations

Applications


(19)

Structure from Motion

Given: m images of n fixed 3D points

$$x_{ij} = P_i X_j, \quad i = 1, \dots, m, \quad j = 1, \dots, n$$

Problem: estimate m projection matrices $P_i$ and n 3D points $X_j$ from the mn correspondences $x_{ij}$

[Figure: 3D point $X_j$ observed as $x_{1j}$, $x_{2j}$, $x_{3j}$ by cameras $P_1$, $P_2$, $P_3$]


(20)

What Can We Use This For?

E.g. movie special effects

[Video]


(21)

Structure from Motion Ambiguity

If we scale the entire scene by some factor k and, at the same time, scale the camera matrices by the factor of 1/k, the projections of the scene points in the image remain exactly the same:

$$x = P X = \left(\frac{1}{k} P\right)(k X)$$

It is impossible to recover the absolute scale of the scene!

(22)

Structure from Motion Ambiguity

More generally: if we transform the scene using a transformation Q and apply the inverse transformation to the camera matrices, then the images do not change:

$$x = P X = (P Q^{-1})(Q X)$$

Slide credit: Svetlana Lazebnik
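The ambiguity is easy to confirm numerically. A small sketch with a random camera, a random point and an invertible 4×4 matrix Q (all values are arbitrary; adding a multiple of the identity is just a convenient way to keep Q well conditioned):

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.standard_normal((3, 4))                    # arbitrary camera matrix
X = np.append(rng.standard_normal(3), 1.0)         # homogeneous 3D point
Q = rng.standard_normal((4, 4)) + 3.0 * np.eye(4)  # invertible 4x4 transformation

x = P @ X
x_transformed = (P @ np.linalg.inv(Q)) @ (Q @ X)   # transformed camera and scene

k = 2.5
x_scaled = (P / k) @ (k * X)                       # the pure scale case
```

Both alternative reconstructions produce exactly the same image point, so the images alone cannot distinguish between them.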


(23)

Reconstruction Ambiguity: Similarity

$$x = P X = (P Q_S^{-1})(Q_S X)$$


(24)

Reconstruction Ambiguity: Affine

$$x = P X = (P Q_A^{-1})(Q_A X)$$

Slide credit: Svetlana Lazebnik


(25)

Reconstruction Ambiguity: Projective

$$x = P X = (P Q_P^{-1})(Q_P X)$$


(26)

Projective Ambiguity


(27)

From Projective to Affine


(28)

From Affine to Similarity


(29)

Hierarchy of 3D Transformations

Projective (15 dof): $\begin{bmatrix} A & t \\ v^T & v \end{bmatrix}$. Preserves intersection and tangency

Affine (12 dof): $\begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$. Preserves parallelism, volume ratios

Similarity (7 dof): $\begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix}$. Preserves angles, ratios of length

Euclidean (6 dof): $\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}$. Preserves angles, lengths

With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction.

Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean.


(30)

Topics of This Lecture

Structure from Motion (SfM)
  Motivation
  Ambiguity

Affine SfM
  Affine cameras
  Affine factorization
  Euclidean upgrade
  Dealing with missing data

Projective SfM
  Two-camera case
  Projective factorization
  Bundle adjustment
  Practical considerations

Applications


(31)

Structure from Motion

Let’s start with affine cameras (the math is easier)

[Figure: affine camera with its center at infinity]


(32)

Orthographic Projection

Special case of perspective projection

Distance from center of projection to image plane is infinite

Projection matrix:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Slide credit: Steve Seitz


(33)

Affine Cameras

[Figures: Orthographic Projection, Parallel Projection]


(34)

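Affine Cameras

A general affine camera combines the effects of an affine transformation of the 3D space, orthographic projection, and an affine transformation of the image:

$$P = [3 \times 3\ \text{affine}] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} [4 \times 4\ \text{affine}] = \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix}$$

Affine projection is a linear mapping + translation in inhomogeneous coordinates:

$$x = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = A X + b$$

b is the projection of the world origin.

[Figure: 3D point X projected to the image point x, with image axes $a_1$ and $a_2$]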


(35)

Affine Structure from Motion

Given: m images of n fixed 3D points:

$$x_{ij} = A_i X_j + b_i, \quad i = 1, \dots, m, \quad j = 1, \dots, n$$

Problem: use the mn correspondences $x_{ij}$ to estimate m projection matrices $A_i$ and translation vectors $b_i$, and n points $X_j$

The reconstruction is defined up to an arbitrary affine transformation Q (12 degrees of freedom):

$$\begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} \to \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} Q^{-1}, \qquad \begin{pmatrix} X \\ 1 \end{pmatrix} \to Q \begin{pmatrix} X \\ 1 \end{pmatrix}$$

We have 2mn knowns and 8m + 3n unknowns (minus 12 dof for affine ambiguity).

Thus, we must have $2mn \geq 8m + 3n - 12$.

For two views, we need four point correspondences.
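As a quick check of the counting argument, substituting small numbers of views into the inequality gives the minimum number of points (a worked example added for illustration, not taken from the slide):

```latex
\begin{align*}
m = 2:&\quad 4n \ge 16 + 3n - 12 \;\Rightarrow\; n \ge 4 \\
m = 3:&\quad 6n \ge 24 + 3n - 12 \;\Rightarrow\; n \ge 4
\end{align*}
```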


(36)

Affine Structure from Motion

Centering: subtract the centroid of the image points

$$\hat{x}_{ij} = x_{ij} - \frac{1}{n} \sum_{k=1}^{n} x_{ik} = A_i X_j + b_i - \frac{1}{n} \sum_{k=1}^{n} (A_i X_k + b_i) = A_i \left( X_j - \frac{1}{n} \sum_{k=1}^{n} X_k \right) = A_i \hat{X}_j$$

For simplicity, assume that the origin of the world coordinate system is at the centroid of the 3D points.

After centering, each normalized point $\hat{x}_{ij}$ is related to the 3D point $X_j$ by

$$\hat{x}_{ij} = A_i X_j$$


(37)

Affine Structure from Motion

Let’s create a 2m × n data (measurement) matrix:

$$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix}$$

Rows: cameras (2m). Columns: points (n).

C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.


(38)

Affine Structure from Motion

Let’s create a 2m × n data (measurement) matrix:

$$D = \begin{bmatrix} \hat{x}_{11} & \hat{x}_{12} & \cdots & \hat{x}_{1n} \\ \hat{x}_{21} & \hat{x}_{22} & \cdots & \hat{x}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{x}_{m1} & \hat{x}_{m2} & \cdots & \hat{x}_{mn} \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}$$

The first factor holds the cameras (2m × 3), the second the points (3 × n).

The measurement matrix D = MS must have rank 3!

C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.

Slide credit: Svetlana Lazebnik


(39)

Factorizing the Measurement Matrix


(40)

Factorizing the Measurement Matrix

Singular value decomposition of D:

$$D = U W V^T$$

Slide credit: Martial Hebert


(41)

Factorizing the Measurement Matrix

Singular value decomposition of D: since D has rank 3 in the noise-free case, only the three largest singular values are non-zero, so

$$D = U_3 W_3 V_3^T$$

Slide credit: Martial Hebert


(42)

Factorizing the Measurement Matrix

Obtaining a factorization from SVD:

$$M = U_3 W_3^{1/2}, \qquad S = W_3^{1/2} V_3^T$$

Slide credit: Martial Hebert


(43)

Factorizing the Measurement Matrix

Obtaining a factorization from SVD:

$$M = U_3 W_3^{1/2}, \qquad S = W_3^{1/2} V_3^T$$

This decomposition minimizes $\|D - MS\|_F^2$

Slide credit: Martial Hebert


(44)

Affine Ambiguity

The decomposition is not unique. We get the same D by using any invertible 3×3 matrix C and applying the transformations $M \to MC$, $S \to C^{-1} S$.

That is because we have only an affine transformation and we have not enforced any Euclidean constraints (like forcing the image axes to be perpendicular, for example). We need a Euclidean upgrade.


(45)

Estimating the Euclidean Upgrade

Orthographic assumption: image axes are perpendicular and scale is 1.

[Figure: 3D point X projected to the image point x, with image axes $a_1$ and $a_2$]

$$a_1 \cdot a_2 = 0, \qquad |a_1|^2 = |a_2|^2 = 1$$

This can be converted into a system of 3m equations:

$$\begin{aligned} \hat{a}_{i1} \cdot \hat{a}_{i2} = 0 \;&\Leftrightarrow\; a_{i1}^T C C^T a_{i2} = 0 \\ |\hat{a}_{i1}| = 1 \;&\Leftrightarrow\; a_{i1}^T C C^T a_{i1} = 1 \\ |\hat{a}_{i2}| = 1 \;&\Leftrightarrow\; a_{i2}^T C C^T a_{i2} = 1 \end{aligned} \qquad i = 1, \dots, m$$

Slide adapted from S. Lazebnik, M. Hebert


(46)

Estimating the Euclidean Upgrade

The constraints $a_{i1}^T C C^T a_{i2} = 0$, $a_{i1}^T C C^T a_{i1} = 1$, $a_{i2}^T C C^T a_{i2} = 1$, $i = 1, \dots, m$ form a system of 3m equations.

Let

$$L = C C^T, \qquad A_i = \begin{bmatrix} a_{i1}^T \\ a_{i2}^T \end{bmatrix}, \quad i = 1, \dots, m$$

Then this translates to 3m equations in L:

$$A_i L A_i^T = I, \quad i = 1, \dots, m$$

Solve for L

Recover C from L by Cholesky decomposition: $L = C C^T$

Update M and S: $M = MC$, $S = C^{-1} S$

Slide adapted from S. Lazebnik, M. Hebert
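The upgrade can be sketched as a small linear least-squares problem in the six distinct entries of the symmetric matrix L. This is an illustrative implementation: the helper names, the synthetic orthographic cameras and the distortion `C0` are assumptions, and `numpy.linalg.cholesky` only succeeds when the estimated L is positive definite, which noisy data does not guarantee.

```python
import numpy as np

def sym_row(a, b):
    """Coefficients of a^T L b in the unknowns (l11, l12, l13, l22, l23, l33)."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def euclidean_upgrade(M):
    """M: (2m, 3) affine motion matrix; rows 2i and 2i+1 hold a_i1^T and a_i2^T.
    Returns C with L = C C^T."""
    rows, rhs = [], []
    for a1, a2 in zip(M[0::2], M[1::2]):
        rows += [sym_row(a1, a2), sym_row(a1, a1), sym_row(a2, a2)]
        rhs += [0.0, 1.0, 1.0]  # perpendicular axes, unit scale
    l, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)  # lower-triangular factor

# Synthetic check: orthographic cameras distorted by a known affine transformation
rng = np.random.default_rng(0)
M_true = np.vstack([np.linalg.qr(rng.standard_normal((3, 3)))[0][:2]
                    for _ in range(4)])
C0 = np.array([[2.0, 0.3, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.5]])
M_affine = M_true @ np.linalg.inv(C0)

C = euclidean_upgrade(M_affine)
M_upgraded = M_affine @ C
```

The recovered C equals the true distortion only up to a rotation, which matches the remaining (unavoidable) choice of world coordinate frame.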


(47)

Algorithm Summary

Given: m images and n features $x_{ij}$

For each image i, center the feature coordinates.

Construct a 2m × n measurement matrix D:

  Column j contains the projection of point j in all views

  Row i contains one coordinate of the projections of all the n points in image i

Factorize D:

  Compute SVD: $D = U W V^T$

  Create $U_3$ by taking the first 3 columns of U

  Create $V_3$ by taking the first 3 columns of V

  Create $W_3$ by taking the upper left 3 × 3 block of W

Create the motion and shape matrices:

  $M = U_3 W_3^{1/2}$ and $S = W_3^{1/2} V_3^T$ (or $M = U_3$ and $S = W_3 V_3^T$)

Eliminate affine ambiguity

Slide credit: Martial Hebert
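The centering, stacking and factorization steps of the summary can be sketched as follows (the Euclidean upgrade is left out). The array layout `x[i, j]` for point j in image i, the function name and the synthetic affine cameras are assumptions made for this example.

```python
import numpy as np

def affine_factorization(x):
    """x: (m, n, 2) array, x[i, j] = image point j in view i.
    Returns motion M (2m, 3), shape S (3, n) and the measurement matrix D."""
    m, n, _ = x.shape
    centered = x - x.mean(axis=1, keepdims=True)       # subtract per-image centroid
    D = centered.transpose(0, 2, 1).reshape(2 * m, n)  # rows 2i, 2i+1: x and y in view i
    U, w, Vt = np.linalg.svd(D, full_matrices=False)
    W3_sqrt = np.diag(np.sqrt(w[:3]))
    M = U[:, :3] @ W3_sqrt                             # M = U_3 W_3^{1/2}
    S = W3_sqrt @ Vt[:3, :]                            # S = W_3^{1/2} V_3^T
    return M, S, D

# Synthetic data: 4 random affine cameras observing 10 random 3D points
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2, 3))
b = rng.standard_normal((4, 2))
X = rng.standard_normal((10, 3))
x = np.einsum("ikl,jl->ijk", A, X) + b[:, None, :]     # x_ij = A_i X_j + b_i

M, S, D = affine_factorization(x)
rank_D = np.linalg.matrix_rank(D, tol=1e-9)
```

With noisy measurements D has full rank, and truncating to three singular values gives the closest rank-3 matrix in the Frobenius norm.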


(48)

Reconstruction Results

C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.


(49)

Dealing with Missing Data

So far, we have assumed that all points are visible in all views

In reality, the measurement matrix typically looks something like this:

[Figure: sparsely filled measurement matrix, rows = cameras, columns = points]


(50)

Dealing with Missing Data

Possible solution: decompose matrix into dense sub-blocks, factorize each sub-block, and fuse the results

Finding dense maximal sub-blocks of the matrix is NP-complete (equivalent to finding maximal cliques in a graph)

Incremental bilinear refinement

(1) Perform factorization on a dense sub-block

F. Rothganger, S. Lazebnik, C. Schmid, and J. Ponce. Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects. PAMI 2007.


(51)

Dealing with Missing Data

Incremental bilinear refinement

(1) Perform factorization on a dense sub-block

(2) Solve for a new 3D point visible by at least two known cameras (linear least squares)

F. Rothganger, S. Lazebnik, C. Schmid, and J. Ponce. Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects. PAMI 2007.


(1)

Topics of This Lecture

Structure from Motion (SfM)
  Motivation
  Ambiguity

Affine SfM
  Affine cameras
  Affine factorization
  Euclidean upgrade
  Dealing with missing data

Projective SfM
  Two-camera case
  Projective factorization
  Bundle adjustment
  Practical considerations

Applications


(2)

Commercial Software Packages

boujou (http://www.2d3.com/)

PFTrack (http://www.thepixelfarm.co.uk/)

MatchMover (http://www.realviz.com/)

SynthEyes (http://www.ssontech.com/)

Icarus (http://aig.cs.man.ac.uk/research/reveal/icarus/)

Voodoo Camera Tracker (http://www.digilab.uni-hannover.de/)


(3)

boujou demo

(We have a license available, so if you want to try it for interesting projects, contact us.)


(4)

Applications: Matchmoving

Putting virtual objects into real-world videos

[Figure panels: Original sequence, Tracked features, SfM results, Final video]


(5)

Another Example: The Campanile Movie

Video from SIGGRAPH’97 Animation Theatre

http://www.debevec.org/Campanile/#movie


(6)

References and Further Reading

A (relatively short) treatment of affine and projective SfM and the basic ideas and algorithms can be found in Chapters 12 and 13 of

D. Forsyth, J. Ponce, Computer Vision – A Modern Approach. Prentice Hall, 2003

More detailed information (if you really want to implement this) and better explanations can be found in Chapters 10, 18 (factorization) and 19 (self-calibration) of

R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Ed., Cambridge Univ. Press, 2004