• Keine Ergebnisse gefunden

Robust Geometry Estimation

N/A
N/A
Protected

Academic year: 2022

Aktie "Robust Geometry Estimation"

Copied!
73
0
0

Wird geladen.... (Jetzt Volltext ansehen)

Volltext

(1)

Computer Vision I - Robust Geometry Estimation from two Cameras

Carsten Rother

Computer Vision I: Image Formation Process 16/01/2015

(2)

FYI

• Microsoft Research Cambridge
• Excellent 8-week paid summer internship
• Deadline: 16th January 2015
• If interested, please contact me

• The last lecture on 6.2.2015 will mainly be a Q&A session.
  Please look through all lectures again and prepare questions!

(3)

Roadmap for the next five lectures

• Appearance-based matching (sec. 4.1)
• How do we get an RGB image? (sec. 2.2-2.3)
  • The human eye
  • The camera and image formation model
• Projective geometry - basics (sec. 2.1.1-2.1.4)
• Geometry of a single camera (sec. 2.1.5, 2.1.6)
  • Pinhole camera
  • Lens effects
• Geometry of two cameras (sec. 7.2)
• Robust geometry estimation for two cameras (sec. 6.1.4)
• Multi-view 3D reconstruction (sec. 7.3-7.4)

(4)

Camera parameters - Summary

• Camera matrix P has 11 DoF:

• Intrinsic parameters:
  • Focal length f
  • Pixel y-direction magnification factor m
  • Skew (non-rectangular pixels) s
  • Principal point coordinates (p_x, p_y)

• Extrinsic parameters:
  • Rotation R (3 DoF) and translation C (3 DoF) relative to the world coordinate system

$$ \tilde{x} = K\,R\,(I_{3\times 3} \mid -C)\,\tilde{X}, \qquad K = \begin{pmatrix} f & s & p_x \\ 0 & mf & p_y \\ 0 & 0 & 1 \end{pmatrix}, \qquad \tilde{x} = P\,\tilde{X} $$

(x̃ is the homogeneous image point, X̃ the homogeneous 3D world point.)
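As a minimal sketch of the formula above (NumPy, with illustrative parameter values that are not from the slides), the 11-DoF camera matrix can be assembled and applied to a homogeneous world point:

```python
import numpy as np

def camera_matrix(f, m, s, px, py, R, C):
    """P = K R (I | -C) from the 5 intrinsic and 6 extrinsic parameters."""
    K = np.array([[f, s, px],
                  [0.0, m * f, py],
                  [0.0, 0.0, 1.0]])
    return K @ R @ np.hstack([np.eye(3), -C.reshape(3, 1)])

# Illustrative values: f = 500, square pixels (m = 1), no skew, camera at the origin
P = camera_matrix(500.0, 1.0, 0.0, 320.0, 240.0, np.eye(3), np.zeros(3))
X = np.array([0.1, 0.2, 1.0, 1.0])   # homogeneous 3D world point
x = P @ X
x = x / x[2]                          # de-homogenize to pixel coordinates
```

With the identity pose, this projects (0.1, 0.2, 1) to pixel (370, 340).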

(5)

Reminder: The upcoming tasks

Two-view transformations we look at:

• Homography H: between two views
• Camera matrix P: mapping from 3D to 2D
• Fundamental matrix F: between two un-calibrated views
• Essential matrix E: between two calibrated views

• Derive geometrically: H, P, F, E, i.e. what do they mean?
• Calibration: take primitives (points, lines, planes, cones, …) to compute H, P, F, E:
  • What is the minimal number of points to compute them? (very important for the next lecture on robust methods)
  • If we have many points with noise, what is the best way to compute them: algebraic error versus geometric error?
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from H, P, F, E?
• What can we do with H, P, F, E? (e.g. panoramic stitching)

(6)

Homography 𝐻 : Summary

• Derive geometrically H
• Calibration: take measurements (points) to compute H
  • Minimum of 4 points. Solution: right null space of A h = 0
  • Many points: use SVD to solve h* = argmin_h ‖A h‖
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from H?
  -> hard, not discussed much
• What can we do with H?
  -> augmented reality on planes, panoramic stitching

(7)

Camera Matrix 𝑃 : Summary

• Derive geometrically P
• Calibration: take measurements (points) to compute P
  • 6 or more points: use SVD to solve p* = argmin_p ‖A p‖
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from P?
  -> yes, use SVD and RQ decomposition
• What can we do with P?
  -> very many things (robotics, photogrammetry, augmented reality, …)

x̃ = K R (I_{3×3} | -C) X̃,  x̃ = P X̃

(8)

Topic 3: Fundamental/Essential Matrix 𝐹/𝐸

• Derive geometrically F/E
• Calibration: take measurements (points) to compute F/E
  • How do we do that with a minimal number of points?
  • How do we do that with many points?
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from F/E?
• What can we do with F/E?

(9)

Reminder: Matching two Images

• Find interest points
• Find oriented patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture)

We will discover in the next slides:
Seven point matches define how all other points match between the two views!

(10)

Reminder: 3D Geometry

Figure: a non-moving scene seen by two cameras P and P', versus a rigidly (6D) moving scene seen by one camera.
Both cases are equivalent for the following derivations.

(11)

Reminder: Epipolar Geometry

• Epipole: image location of the optical center of the other camera.
  (Can be outside of the visible area.)

(12)

Reminder: Epipolar Geometry

• Epipolar plane: plane through both camera centers and the world point.

(13)

Reminder: Epipolar Geometry

• Epipolar line: constrains the location where a particular point (here p1) from one view can be found in the other.

(14)

Reminder: Epipolar Geometry

• Epipolar lines:
  • Intersect at the epipoles
  • Are in general not parallel

(15)

Reminder: Example: Converging Cameras


(16)

Reminder: Example: Motion Parallel to Camera

• We will use this idea when it comes to stereo matching

(17)

Reminder: Example: Forward Motion

• Epipoles have the same coordinates in both images
• Points move along lines radiating from the epipole: the "focus of expansion"

(18)

The maths behind it: Fundamental/Essential Matrix

(derivation on blackboard)

Figure: world point X̃ and relative pose (R, T̃) between Camera 0 and Camera 1.

(19)

The maths behind it: Fundamental/Essential Matrix

The 3 vectors are in the same plane (co-planar):
1) T = C1 - C0
2) X - C0
3) X - C1

Set the camera matrices: x̃0 = K0 (I | 0) X̃ and x̃1 = K1 R^{-1} (I | -C1) X̃.
The three vectors can be re-written using x̃0, x̃1 and the K's:
1) T
2) K0^{-1} x̃0
3) R K1^{-1} x̃1 + C1 - C1 = R K1^{-1} x̃1

We know that co-planarity makes the triple product vanish:

$$ \left(K_0^{-1}\tilde{x}_0\right)^T [T]_\times \left(R\,K_1^{-1}\tilde{x}_1\right) = 0 \quad\text{which gives}\quad \tilde{x}_0^T\, K_0^{-T}\, [T]_\times\, R\, K_1^{-1}\, \tilde{x}_1 = 0 $$

(20)

The maths behind it: Fundamental/Essential Matrix

• In an un-calibrated setting (the K's are not known):
  x0^T K0^{-T} [T]× R K1^{-1} x1 = 0

• In short: x0^T F x1 = 0, where F is called the Fundamental Matrix
  (discovered by Faugeras and Luong 1992, Hartley 1992)

• In a calibrated setting (the K's are known):
  we use rays x̂_i = K_i^{-1} x_i and then get: x̂0^T [T]× R x̂1 = 0

  In short: x̂0^T E x̂1 = 0, where E is called the Essential Matrix
  (discovered by Longuet-Higgins 1981)

(21)

Fundamental Matrix: Properties

• We have x0^T F x1 = 0, where F is called the Fundamental Matrix

• It is det(F) = 0, hence F has 7 DoF.
  Proof: F = K0^{-T} [T]× R K1^{-1} has rank 2 since [T]× has rank 2.

The cross-product matrix is

$$ [x]_\times = \begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix} $$

Check: det([x]_×) = x_3 (x_3 · 0 - x_1 x_2) + x_2 (x_1 x_3 + x_2 · 0) = 0
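The rank-2 property of the cross-product matrix can be checked with a small NumPy sketch (names and values are illustrative, not from the slides):

```python
import numpy as np

def skew(x):
    """Cross-product matrix [x]_x, so that skew(x) @ v == np.cross(x, v)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

x = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])
```

For any non-zero x, skew(x) has rank 2 and determinant 0, which is exactly why F inherits rank 2 from [T]×.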

(22)

Fundamental Matrix: Properties

• For any two matching points (i.e. they image the same 3D point) we have: x0^T F x1 = 0

• Epipolar line in camera 1 of a point x0: l1^T = x0^T F
  (since l1^T x1 = x0^T F x1 = 0)

• Epipolar line in camera 0 of a point x1: l0 = F x1
  (since x0^T l0 = x0^T F x1 = 0)
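A minimal sketch of the epipolar-line relations, under an assumed two-camera setup (identical hypothetical K, pure translation along x) that is not from the slides:

```python
import numpy as np

def skew(t):
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Assumed setup: F = K0^{-T} [T]x R K1^{-1} with K0 = K1, R = I, T = (1, 0, 0)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
F = np.linalg.inv(K).T @ skew(np.array([1.0, 0.0, 0.0])) @ np.eye(3) @ np.linalg.inv(K)

x0 = np.array([300.0, 200.0, 1.0])   # point in image 0
l1 = F.T @ x0                         # epipolar line in image 1 (l1^T = x0^T F)
x1 = np.array([450.0, 200.0, 1.0])    # same image row -> lies on the epipolar line
```

For this pure x-translation the epipolar lines are horizontal, so any x1 on the same row satisfies both x0^T F x1 = 0 and l1^T x1 = 0.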

(23)

Fundamental Matrix: Properties

• For any two matching points (i.e. they image the same 3D point) we have: x0^T F x1 = 0

• Epipole e0 is the left null space of F (can be computed with SVD):
  e0^T F x_i = 0 for all points x_i, since all lines l0 = F x_i go through e0.
  This means: e0^T F = 0

• Epipole e1 is the right null space of F (F e1 = 0)

(24)

How can we compute 𝐹 (2-view calibration) ?

• Each pair of matching points gives one linear constraint x^T F x' = 0 on F. For a match (x, x') we get one row of the system:

$$ \begin{pmatrix} x_1 x_1' & x_1 x_2' & x_1 x_3' & x_2 x_1' & x_2 x_2' & x_2 x_3' & x_3 x_1' & x_3 x_2' & x_3 x_3' \\ & & & & \vdots & & & & \end{pmatrix} \begin{pmatrix} f_{11} \\ f_{12} \\ f_{13} \\ f_{21} \\ f_{22} \\ f_{23} \\ f_{31} \\ f_{32} \\ f_{33} \end{pmatrix} = 0 $$

• Given m ≥ 8 matching points (x, x') we can compute F in a simple way.

(25)

How can we compute 𝐹 (2-view calibration) ?

Method (normalized 8-point algorithm):
1) Take m ≥ 8 points
2) Compute T, T' and condition the points: x̂ = T x; x̂' = T' x'
3) Assemble A with A f = 0; here A is of size m × 9 and f is the vectorized F
4) Compute f* = argmin_f ‖A f‖ subject to ‖f‖ = 1. Use SVD to do this.
5) Get F for the unconditioned points: T^T F̂ T' (note: (T x)^T F̂ (T' x') = 0)
6) Make rank(F) = 2

[See HZ page 282]
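The steps above can be sketched in NumPy. The synthetic check at the end (two identical hypothetical cameras, pure translation) is an assumption for illustration, not from the slides:

```python
import numpy as np

def normalize_points(pts):
    """Similarity T that centers the points and scales mean distance to sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - c, axis=1))
    return np.array([[s, 0.0, -s * c[0]],
                     [0.0, s, -s * c[1]],
                     [0.0, 0.0, 1.0]])

def hom(p):
    return np.hstack([p, np.ones((len(p), 1))])

def eight_point(x0, x1):
    """Normalized 8-point algorithm for F with x0_i^T F x1_i = 0 (m >= 8 matches)."""
    T0, T1 = normalize_points(x0), normalize_points(x1)
    y0, y1 = hom(x0) @ T0.T, hom(x1) @ T1.T                 # conditioned points
    A = np.stack([np.kron(a, b) for a, b in zip(y0, y1)])   # m x 9 system A f = 0
    _, _, Vt = np.linalg.svd(A)
    Fc = Vt[-1].reshape(3, 3)                               # right null vector of A
    U, D, Vt2 = np.linalg.svd(Fc)                           # enforce rank 2
    Fc = U @ np.diag([D[0], D[1], 0.0]) @ Vt2
    return T0.T @ Fc @ T1                                   # undo the conditioning

# Synthetic check: identical cameras, camera 1 shifted by (1, 0, 0), exact matches
rng = np.random.default_rng(1)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
p0 = (K @ X.T).T
p1 = (K @ (X - np.array([1.0, 0.0, 0.0])).T).T
x0, x1 = p0[:, :2] / p0[:, 2:], p1[:, :2] / p1[:, 2:]
F = eight_point(x0, x1)
residuals = np.einsum('ij,jk,ik->i', hom(x0), F, hom(x1))
```

On exact correspondences the recovered F satisfies all epipolar constraints to machine precision and has rank 2 by construction.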

(26)

How to make 𝐹 Rank 2

• (Again) use SVD: set the last singular value σ_p to 0; then A has rank p - 1 and not p (assuming A originally has full rank p).

Proof: the diagonal matrix D then has rank p - 1, hence A = U D V^T has rank p - 1.
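This SVD truncation is a one-liner in NumPy (the example matrix is an arbitrary full-rank one, chosen only for illustration):

```python
import numpy as np

def project_to_rank2(A):
    """Zero the smallest singular value: nearest rank-2 matrix in Frobenius norm."""
    U, D, Vt = np.linalg.svd(A)
    D[-1] = 0.0
    return U @ np.diag(D) @ Vt

A = np.array([[2.0, 0.0, 1.0], [1.0, 3.0, 0.0], [0.0, 1.0, 4.0]])  # full rank
A2 = project_to_rank2(A)
```

The result is singular (det = 0), which is exactly the constraint required of a fundamental matrix.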

(27)

Can we compute 𝐹 with just 7 points?

Method (7-point algorithm):
1) Take m = 7 points
2) Assemble A with A f = 0; here A is of size 7 × 9 and f is the vectorized F
3) Compute the 2D right null space: F1 and F2 from the last two rows of V^T
   (use the SVD decomposition A = U D V^T)
4) Choose: F = α F1 + (1 - α) F2 (see comments on the next slide)
5) Determine α (either 1 or 3 solutions) by using the constraint det(α F1 + (1 - α) F2) = 0.
   (This is a cubic polynomial equation in α, which has one or three real-valued solutions.)

• Note: an 8th point would determine which of these 3 solutions is the correct one.
• We will see later that the 7-point algorithm is the best choice for the robust case.
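A sketch of the 7-point algorithm (the cubic in α is recovered by interpolation rather than symbolically; the synthetic camera setup at the end is an assumption for illustration):

```python
import numpy as np

def hom(p):
    return np.hstack([p, np.ones((len(p), 1))])

def seven_point(x0, x1):
    """7-point algorithm: returns 1 or 3 real solutions F with x0_i^T F x1_i = 0."""
    A = np.stack([np.kron(a, b) for a, b in zip(hom(x0), hom(x1))])  # 7 x 9
    _, _, Vt = np.linalg.svd(A)
    F1, F2 = Vt[-1].reshape(3, 3), Vt[-2].reshape(3, 3)   # 2D right null space
    # det(a*F1 + (1-a)*F2) is a cubic in a: interpolate it from 4 samples
    a_s = np.array([0.0, 1.0, 2.0, 3.0])
    d_s = [np.linalg.det(a * F1 + (1.0 - a) * F2) for a in a_s]
    roots = np.roots(np.polyfit(a_s, d_s, 3))
    real = roots[np.abs(roots.imag) < 1e-8].real
    return [a * F1 + (1.0 - a) * F2 for a in real]

# Synthetic data: identical hypothetical cameras, camera 1 shifted by (1, 0.3, 0)
rng = np.random.default_rng(2)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(7, 3))
p0 = (K @ X.T).T
p1 = (K @ (X - np.array([1.0, 0.3, 0.0])).T).T
x0, x1 = p0[:, :2] / p0[:, 2:], p1[:, :2] / p1[:, 2:]
solutions = seven_point(x0, x1)
max_residual = max(np.max(np.abs(np.einsum('ij,jk,ik->i', hom(x0), Fs, hom(x1))))
                   for Fs in solutions)
```

Every returned matrix satisfies the 7 epipolar constraints and the rank-2 (zero determinant) condition; an 8th point would single out the correct one.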

(28)

Comments to previous slide

Step 4) Choose: F = α F1 + (1 - α) F2

• The full null space is given by: F = α F1 + β F2
• We are free to require: ‖α F1 + β F2‖ ≥ 1
  (here F1, F2 are in vectorized form; note that this is the same as having F1, F2 in matrix form and using the Frobenius norm)
• It is: |α| ‖F1‖ + |β| ‖F2‖ ≥ ‖α F1 + β F2‖ (triangle inequality)
• Hence it suffices to require: |α| ‖F1‖ + |β| ‖F2‖ ≥ 1
• Hence: |α| + |β| ≥ 1 (since F1, F2 are unit-norm rows of V^T)
• Hence we can choose: β = 1 - α

(29)

Comments to previous slide

Step 5) Solve det(α F1 + (1 - α) F2) = 0 for α.

(30)

Can we get 𝐾’𝑠, 𝑅, 𝑇 from 𝐹 ?

• Assume we have F = K0^{-T} [T]× R K1^{-1}; can we get out K1, R, K0, T?

• F has 7 DoF
• K1, R, K0, T have together 16 DoF

• Not directly possible. Only with assumptions such as:
  • External constraints
  • The camera does not change over several frames

• This is an important topic (more than 10 years of research!) called auto-calibration or self-calibration. We look at it in detail in the next lecture.

(31)

Coming back to Essential Matrix

• In a calibrated setting (the K's are known):
  we use rays x̂_i = K_i^{-1} x_i and then get: x̂0^T [T]× R x̂1 = 0

  In short: x̂0^T E x̂1 = 0, where E is called the Essential Matrix

• E has 5 DoF, since T has 3 DoF and R has 3 DoF, but the overall scale of T is unknown
• E also has rank 2

(32)

How to compute 𝐸

• We have: x̂0^T E x̂1 = 0 (reminder: x0^T F x1 = 0)

• Given m ≥ 8 matches, run the 8-point algorithm (as for F)
• Given m = 7, run the 7-point algorithm and get 1 or 3 solutions
• Given m = 5, run the 5-point algorithm to get up to 10 solutions.
  This is the minimal case since E has 5 DoF.

• 5-point algorithm history:
  • Kruppa, "Zur Ermittlung eines Objektes aus zwei Perspektiven mit innerer Orientierung," Sitz.-Ber. Akad. Wiss., Wien, Math.-Naturw. Kl., Abt. IIa, (122):1939-1948, 1913. Found 11 solutions.
  • M. Demazure, "Sur deux problemes de reconstruction," Technical Report 882, INRIA, Les Chesnay, France, 1988. Showed that only 10 valid solutions exist.
  • D. Nister, "An Efficient Solution to the Five-Point Relative Pose Problem," IEEE Conference on Computer Vision and Pattern Recognition, Volume 2, pp. 195-202, 2003. Fast method which obtains the 10 solutions of a degree-10 polynomial.

(33)

Can we get 𝑅, 𝑇 from 𝐸 ?

• Assume we have E = [T]× R; can we get out R, T?

• E has 5 DoF
• R, T have together 6 DoF

• We can get T up to scale, and a unique R

(34)

How to get a unique 𝑇, 𝑅?

1) Compute T:
Note: E has rank 2, and T is in the left null space of E, since T^T [T]× = (0, 0, 0).
This means that an SVD of E must look like:

$$ E = U D V^T = \begin{pmatrix} \boldsymbol{u}_0 & \boldsymbol{u}_1 & T \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \boldsymbol{v}_0^T \\ \boldsymbol{v}_1^T \\ \boldsymbol{v}_2^T \end{pmatrix} $$

This fixes the norm of T to 1; the correct sign (+/- T) is determined in step 3.

2) Compute 4 possible solutions for R:

R_{1,2} = ±U R_{90}^T V^T;  R_{3,4} = ±U R_{-90}^T V^T (see derivation HZ page 259; Szeliski page 310)

where E = U D V^T and

$$ R_{90} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad R_{-90} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$

3) Derive the unique solution for R and the sign of T:
1) det(R) = 1
2) Reconstruct a 3D point and choose the solution where it lies in front of both cameras.
   (In the robust case: take the solution where most (≥ 5) points lie in front of the cameras.)
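A sketch of the four-candidate extraction (following HZ's W-matrix convention; the ground-truth rotation and translation below are illustrative assumptions). Selecting the unique solution via the cheirality test of step 3 is not shown:

```python
import numpy as np

def skew(t):
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Four (R, t) candidates from E (HZ page 259); t is returned with unit norm."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:          # make sure the factors are proper rotations
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    t = U[:, 2]                        # left null vector of E
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]

# Illustrative ground truth: rotation about z by 0.3 rad, translation along x
a = 0.3
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, 0.0, 0.0])
E = skew(t_true) @ R_true
candidates = decompose_essential(E)
```

One of the four candidates reproduces the true rotation, with t recovered up to sign (and, in general, scale).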

(35)

Visualization of the 4 solutions for 𝑅, 𝑇

The property that points must lie in front of the camera is known as chirality (Hartley 1998).

Figure: the four combinations of (T, -T) with the two rotations; the correct solution is the one where the point is in front of both cameras.

(36)

What can we do with 𝐹, 𝐸 ?

• F/E encode the geometry of the two cameras
• They can be used to find matching points (dense or sparse) between two views
• F/E encode the essential information needed for 3D reconstruction

(37)

Fundamental and Essential Matrix: Summary

• Derive geometrically F, E:
  • F for un-calibrated cameras
  • E for calibrated cameras

• Calibration: take measurements (points) to compute F, E:
  • F: minimum of 7 points -> 1 or 3 real solutions
  • F: many points -> least-squares solution with SVD
  • E: minimum of 5 points -> up to 10 solutions
  • E: many points -> least-squares solution with SVD

• Can we derive the intrinsic (K) and extrinsic (R, T) parameters from F, E?
  -> F: next lecture
  -> E: yes, can be done (translation up to scale)

• What can we do with F, E?
  -> essential tool for 3D reconstruction

(38)

Half-way slide

3 Minutes break


(39)

In last lecture we asked (for rotating camera)…

Question 1: If a match is completely wrong, then argmin_h ‖A h‖ is a bad idea.

Question 2: If a match is slightly wrong, then argmin_h ‖A h‖ might not be perfect.
Better might be a geometric error: argmin_H Σ_i ‖H x_i - x'_i‖

(40)

Robust model fitting

RANSAC:

Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and

Automated Cartography

Martin A. Fischler and Robert C. Bolles (June 1981).

[Slide credits: Dimitri Schlesinger]

(41)

Example Tasks

Search for a straight line in a clutter of points,
i.e. search for the parameters of the line model given a training set of points (x, y).

(42)

Example Tasks

Estimate the fundamental matrix,
i.e. the parameters F satisfying x_l^T F x_r = 0, given a training set of corresponding pairs (x_l, x_r).

For the homography of a rotating camera we have: H x_li = x_ri

(43)

Two sources of errors

1. Noise: the coordinates deviate from the true ones according to some "rule" (probability); the farther away, the less confident.
2. Outliers: the data have nothing in common with the model to be estimated.

Ignoring outliers can lead to a wrong estimation.
→ The way out: find the outliers explicitly and estimate the model from the inliers only.

(44)

Task formulation

Let X be the input space and Y be the parameter space.
The training data consist of data points x_i ∈ X.

Let an evaluation function f(x, y) be given that checks the consistency of a point x with a model y:

• Straight line:
  f(x1, x2, a, b) = 0 if |a x1 + b x2 - 1| ≤ t (e.g. 0.1), and 1 otherwise

• Fundamental matrix:
  f(x_l, x_r, F) = 0 if |x_l^T F x_r| ≤ t (e.g. 0.1), and 1 otherwise

The task is to find the parameter y that is consistent with the majority of the data points:
y* = argmin_y Σ_i f(x_i, y)

(45)

First Idea: 2D Line estimation

A naïve approach: enumerate all parameter values

→ known as the Hough Transform (very time consuming, and not possible at all for many free parameters, i.e. a high-dimensional parameter space)

Question: how to compute y* = argmin_y Σ_i f(x_i, y)?

Encode all lines with two parameters (r, Θ).
Figure: an image with 3 points; each point votes for all lines (r, Θ) that pass through it.
Goal: find the "point" in (r, Θ) space where most of these curves meet (Hough transform).

(46)

First Idea: 2D Line estimation

• Observation: most cells of the parameter space have very low counts
• Idea: do not try all values but only some of them. Which ones?

Figure: sketched accumulator space over (Θ, r) with the number of inliers per cell; most cells hold only a handful of votes, with a pronounced peak (e.g. 200) at the true line (Hough transform).
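A minimal Hough-transform sketch for a single line (the data — 30 collinear points plus two outliers — is an illustrative assumption):

```python
import numpy as np

def hough_line(points, n_theta=180, n_r=200, r_max=10.0):
    """Vote over (r, Θ) cells with r = x cos(Θ) + y sin(Θ); return the peak."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_r, n_theta), dtype=int)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    for x, y in points:
        r = x * cos_t + y * sin_t                       # one sinusoid per point
        idx = np.round((r + r_max) / (2.0 * r_max) * (n_r - 1)).astype(int)
        ok = (idx >= 0) & (idx < n_r)
        acc[idx[ok], np.arange(n_theta)[ok]] += 1
    i, j = np.unravel_index(acc.argmax(), acc.shape)    # cell where most curves meet
    return -r_max + 2.0 * r_max * i / (n_r - 1), thetas[j], acc

# 30 points on the line y = x + 1 (true Θ = 3π/4, r = 1/sqrt(2)) plus 2 outliers
xs = np.linspace(-3.0, 3.0, 30)
pts = np.vstack([np.column_stack([xs, xs + 1.0]), [[2.0, -2.0], [-1.0, 3.0]]])
r_best, theta_best, acc = hough_line(pts)
```

All 30 collinear points vote for the same (Θ, r) cell, so the accumulator peak recovers the line parameters up to bin quantization, which illustrates the "very time consuming" aspect: the whole (Θ, r) grid must be filled.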

(47)

Data-driven Oracle

An oracle is a function that predicts a parameter given the minimum number of data points (a d-tuple).

Examples:
• A line can be estimated from d = 2 points
• The fundamental matrix from d = 7 or 8 point correspondences
• A homography can be computed from d = 4 point correspondences

First idea: do not enumerate all parameter values but all d-tuples of data points.
That is then on the order of n^d tests, e.g. n^2 for lines (with n points).
The optimization y* = argmin_y Σ_i f(x_i, y) is performed over a discrete domain.

Second idea: do not try all subsets, but sample them randomly.

(48)

RANSAC

Basic RANSAC method:

Repeat many times (can be done in parallel!):
  1. Select a d-tuple, e.g. (x_1, x_2) for lines
  2. Compute the parameter(s) y, e.g. the line y = g(x_1, x_2)
  3. Evaluate f'(y) = Σ_i f(x_i, y)
  4. If f'(y) ≤ f'(y*), set y* = y and keep the value f'(y*)

• Sometimes we get a discrete set of intermediate solutions y. For example, for the F-matrix computation from 7 points we have up to 3 solutions. Then we simply evaluate f'(y) for all solutions.

• How many times do we have to sample in order to reliably estimate the true model?

[Random Sample Consensus, Fischler and Bolles 1981]
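The loop above, instantiated for 2D lines with the algebraic inlier test from the task-formulation slide, can be sketched as follows (the synthetic data — points near y = 2x - 1 plus uniform outliers — is an illustrative assumption):

```python
import numpy as np

def ransac_line(points, n_iter=300, t=0.1, rng=None):
    """Basic RANSAC for a 2D line a*x + b*y = 1, minimizing f'(y) = #outliers."""
    rng = np.random.default_rng(0) if rng is None else rng
    best, best_outliers = None, np.inf
    for _ in range(n_iter):
        i, j = rng.choice(len(points), size=2, replace=False)
        try:
            a, b = np.linalg.solve(points[[i, j]], np.ones(2))  # oracle from 2-tuple
        except np.linalg.LinAlgError:
            continue                                            # degenerate sample
        # f'(y) = sum_i f(x_i, y): count points inconsistent with the model
        n_out = np.sum(np.abs(points @ np.array([a, b]) - 1.0) > t)
        if n_out < best_outliers:
            best, best_outliers = (a, b), n_out
    return best, best_outliers

rng = np.random.default_rng(4)
xs = rng.uniform(-2.0, 2.0, 40)
inliers = np.column_stack([xs, 2.0 * xs - 1.0 + rng.normal(0.0, 0.01, 40)])
outliers = rng.uniform(-3.0, 3.0, size=(20, 2))
pts = np.vstack([inliers, outliers])
(a, b), n_out = ransac_line(pts)
```

Despite one third of the points being outliers, the winning 2-tuple produces a line close to the true (a, b) = (2, -1) with roughly only the 20 outliers rejected.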

(49)

Convergence

Observation: it is sufficient to sample a d-tuple of inliers just once in order to estimate the model correctly.

Example: 1000 points overall, 800 inliers, 200 outliers, so the outlier probability is ε ~ 0.2.

Let ε be the probability of outliers.
• The probability to sample d inliers is (1 - ε)^d (here 0.8^2 = 0.64)
• The probability of a "wrong" d-tuple is 1 - (1 - ε)^d (here 0.36)
• The probability to sample n times only wrong tuples is (1 - (1 - ε)^d)^n (here 0.36^20 ≈ 0.0000000013)
• The probability to sample the "right" tuple at least once during the process (i.e. to estimate the correct model, under the assumptions) is 1 - (1 - (1 - ε)^d)^n (here 99.999999866 %)
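The slide's numbers can be reproduced with a few lines of arithmetic:

```python
# Slide example: eps = 0.2 outlier rate, d = 2 (line oracle), n = 20 samples
eps, d, n = 0.2, 2, 20
p_good_tuple = (1 - eps) ** d     # 0.8^2 = 0.64
p_bad_tuple = 1 - p_good_tuple    # 0.36
p_all_bad = p_bad_tuple ** n      # 0.36^20, about 1.3e-9
p_success = 1 - p_all_bad         # about 99.9999999 %
```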

(50)

Convergence

Figure: success probability as a function of the number of samples n and the outlier probability ε.

(51)

Comment

• In our derivation of p we were slightly optimistic, since "degenerate" inliers may give rise to bad lines
• However, these bad lines have little support w.r.t. the number of inliers
• We will also define later a refinement procedure which can correct such bad lines

(52)

The choice of the oracle is crucial

Example: the fundamental matrix:
a) 8-point algorithm: probability 70% (n = 300; ε = 0.5; d = 8)
b) 7-point algorithm: probability 90% (n = 300; ε = 0.5; d = 7)

Success probability after n trials: p = 1 - (1 - (1 - ε)^d)^n

Number of trials n to reach success probability p (here 99%):
n = log(1 - p) / log(1 - (1 - ε)^d)
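Both formulas above are easy to evaluate, and doing so shows why the 7-point oracle is preferable in the robust setting:

```python
import math

def ransac_trials(p, eps, d):
    """n = log(1 - p) / log(1 - (1 - eps)^d), rounded up."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - (1.0 - eps) ** d))

def success_probability(n, eps, d):
    """p = 1 - (1 - (1 - eps)^d)^n"""
    return 1.0 - (1.0 - (1.0 - eps) ** d) ** n

n8 = ransac_trials(0.99, 0.5, 8)          # trials needed with the 8-point oracle
n7 = ransac_trials(0.99, 0.5, 7)          # trials needed with the 7-point oracle
p8 = success_probability(300, 0.5, 8)     # slide: ~70%
p7 = success_probability(300, 0.5, 7)     # slide: ~90%
```

With 50% outliers, the smaller 7-tuple halves the required number of trials for the same confidence.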

(53)

The choice of evaluation function is crucial

• Algebraic error: a measure that has no geometric meaning. Examples:
  For a line: d(x1, x2, a, b) = |a x1 + b x2 - 1|
  For a homography: d(x, x', H) = ‖A h‖ (where A is the row block for this match, derived as above)
  For the F-matrix: d(x_l, x_r, F) = |x_l^T F x_r|

• Geometric error: a measure that considers a distance in the image plane. Example:
  For a line: d(x1, x2, a, b) = d((x1, x2), l(a, b)), where d is the Euclidean distance between point and line.
  (Geometric errors for the homography and the F-matrix are still to come.)

• Evaluation function: f(x1, x2, a, b) = 1 if |a x1 + b x2 - 1| ≤ t (e.g. 0.1), 0 otherwise

(54)

The choice of confidence interval is crucial

Examples:
• Large confidence interval: "right" model, 2 outliers
• Large confidence interval: "wrong" model, 2 outliers
• Small confidence interval: almost all points are outliers (independent of the model)

(55)

Extension: Adaptive number of samples 𝑛

Choose n in an adaptive way:
1) Fix p = 99.9% (very large value)
2) Set n = ∞ and ε = 0.9 (large value for the outlier ratio)
3) During RANSAC, adapt n and ε:
   1) Re-compute ε from the current best solution: ε = outliers / all points
   2) Re-compute the new n:
      n = log(1 - p) / log(1 - (1 - ε)^d)
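The adaptive update can be sketched directly (the sequence of observed outlier ratios below is a made-up example):

```python
import math

def required_samples(p, eps, d):
    """n = log(1 - p) / log(1 - (1 - eps)^d)"""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - (1.0 - eps) ** d))

p, d = 0.999, 7                    # fix a very large success probability
n, eps = float("inf"), 0.9         # start pessimistic
# Hypothetical outlier ratios measured from successive best models:
for eps_observed in [0.8, 0.6, 0.5, 0.5]:
    if eps_observed < eps:         # a better model lowers eps and hence n
        eps = eps_observed
        n = required_samples(p, eps, d)
```

As soon as a model with 50% inliers is found, the bound drops from "infinite" to under a thousand samples.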

(56)

MSAC (M-Estimator SAmple Consensus)

If a data point is an inlier, the penalty is not 0 but depends on the "distance" to the model.

Example for the fundamental matrix:

f(x_l, x_r, F) = 0 if |x_l^T F x_r| ≤ t (e.g. 0.1), 1 otherwise

becomes the "robust function"

f(x_l, x_r, F) = |x_l^T F x_r| if |x_l^T F x_r| ≤ t (e.g. 0.1), t otherwise

→ the task is to find the model with the minimum average penalty

[P.H.S. Torr and A. Zisserman 1996]
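The truncated MSAC penalty is a one-line vectorized function (the residual values are illustrative):

```python
import numpy as np

def msac_penalty(residuals, t=0.1):
    """Truncated penalty: |r| if |r| <= t, else the constant t."""
    r = np.abs(residuals)
    return np.where(r <= t, r, t)

pen = msac_penalty(np.array([0.01, 0.05, 0.3, 2.0]))
```

Inliers are penalized by their actual distance, so among models with the same inlier set the tighter fit wins; outliers all pay the same capped cost t.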

(57)

Randomized RANSAC

Evaluating a hypothesis y, i.e. computing f'(y), is often time consuming.

Randomized RANSAC: instead of checking all data points:
1. Sample m points from the data
2. If all of them are good, check all others as before
3. If there is at least one bad point among the m, reject the hypothesis

It is possible that good hypotheses are rejected.
However, it saves time (bad hypotheses are recognized fast)
→ one can sample more often
→ overall often profitable (depends on the application).

(58)

Refinement after RANSAC

Typical procedure:
1. RANSAC: compute the model y in a robust way
2. Find all inliers x_inliers
3. Refine the model y from the inliers x_inliers
4. Go to step 2
   (until the number of inliers or the model does not change much)

(59)

In last lecture we asked (for rotating camera)…

Question 1: If a match is completely wrong, then argmin_h ‖A h‖ is a bad idea.
Answer: RANSAC with d = 4

Question 2: If a match is slightly wrong, then argmin_h ‖A h‖ might not be perfect.
Better might be a geometric error: argmin_H Σ_i ‖H x_i - x'_i‖
Answer: see next slides

(60)

Reminder from last Lecture: Homography for rotating camera

Algorithm:
1) Take m ≥ 4 point matches (x, x')
2) Assemble A with A h = 0
3) Compute h* = argmin_h ‖A h‖ subject to ‖h‖ = 1; use SVD to do this.
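The 4-point DLT above can be sketched in a few lines (the homography and the five matches used for the check are made-up values):

```python
import numpy as np

def dlt_homography(x, xp):
    """DLT: H with xp ~ H x from m >= 4 matches; h is the right null vector of A."""
    rows = []
    for (u, v), (up, vp) in zip(x, xp):
        rows.append([u, v, 1.0, 0.0, 0.0, 0.0, -up * u, -up * v, -up])
        rows.append([0.0, 0.0, 0.0, u, v, 1.0, -vp * u, -vp * v, -vp])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Illustrative check: recover a known homography from 5 exact matches
H_true = np.array([[1.2, 0.1, 3.0], [-0.2, 0.9, 1.0], [0.001, 0.002, 1.0]])
x = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.3]])
q = np.column_stack([x, np.ones(len(x))]) @ H_true.T
xp = q[:, :2] / q[:, 2:]
H = dlt_homography(x, xp)
```

With exact correspondences, the recovered H equals the ground truth up to the overall scale, which the normalization by H[2,2] removes.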

(61)

Refine Hypothesis 𝐻 with inliers

1. Algebraic error: argmin_h ‖A h‖

2. First geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i)
   where d(a, b) is the 2D geometric distance ‖a - b‖.
   This is not symmetric.

(62)

Refine Hypothesis 𝐻 with inliers

1. Algebraic error: argmin_h ‖A h‖

2. First geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i)
   where d(a, b) is the 2D geometric distance ‖a - b‖

3. Second, symmetric geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i) + d(x_i, H^{-1} x'_i)

(63)

Refine Hypothesis 𝐻 with inliers

1. Algebraic error: argmin_h ‖A h‖

2. First geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i)
   where d(a, b) is the 2D geometric distance ‖a - b‖

3. Second, symmetric geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i) + d(x_i, H^{-1} x'_i)

4. Third, optimal geometric error (gold standard error):
   {H*, x̂_i, x̂'_i} = argmin_{H, x̂_i, x̂'_i} Σ_i d(x_i, x̂_i) + d(x'_i, x̂'_i) subject to x̂'_i = H x̂_i
   (implicitly, the true 3D points X are searched for)

Comment: this is optimal in the sense that it is the maximum-likelihood (ML) estimate under an isotropic Gaussian noise assumption for x̂ (see page 103 HZ).

(64)

Full Homography Method (HZ page 123)

[See details on page 114ff in HZ]

Annotations to the method:
• Interest points: we discussed the Harris corner detector
• Matching: we discussed kd-trees to make it fast
• The RANSAC inner loop uses a geometric error for fixed H (see next slides); depending on runtime, one can choose different ones.
• The final estimation step uses the optimal geometric error (see next slides).

(65)

Example


Input images

~500 interest points

(66)

Example

268 putative matches; 117 outliers found; 151 inliers found; 262 inliers after guided matching

Guided matching variant: use the given H and look for new inliers. Here we also double the threshold on appearance feature matches to get more inliers.

(67)

Geometric derivation of the confidence interval

Assume Gaussian noise for a point, with standard deviation σ and mean 0 (see page 119 HZ).

To have a 95% chance that an inlier is inside the confidence interval, we require:
1. For a 2D line: d(x, l) ≤ σ √3.84 = t
2. For a homography: d(x_l, x_r, H) ≤ σ √5.99 = t
3. For an F-matrix: d(x_l, x_r, F) ≤ σ √3.84 = t

(68)

π‘₯, π‘₯β€²^ ^

π‘Žπ‘Ÿπ‘”π‘šπ‘–π‘›

𝑖

𝑑(π‘₯𝑖, π‘₯𝑖) + 𝑑(π‘₯′𝑖, π‘₯′𝑖)

Methods for 𝐹/𝐸/𝐻 Matrix computation - Summary

Procedure (as mentioned above):

1. RASNAC: compute model 𝐹/𝐸/𝐻 in a robust way 2. Find all inliers π‘₯π‘–π‘›π‘™π‘–π‘’π‘Ÿπ‘  (with potential relaxed criteria) 3. Refine model 𝐹/𝐸/𝐻 from inliers π‘₯π‘–π‘›π‘™π‘–π‘’π‘Ÿπ‘ 

4. Go to Step 2.

(until numbers of inliers or model does not change much)

16/01/2015

Computer Vision I: Image Formation Process 68

1. For a Homography: 𝑑 π‘₯, π‘₯β€², 𝐻 = min 𝑑 π‘₯, π‘₯ + 𝑑 π‘₯β€², π‘₯β€²^ ^ subject to π‘₯β€² = 𝐻π‘₯^ ^

2. For an 𝐹/𝐸-matrix: 𝑑 π‘₯, π‘₯β€², 𝐹/𝐸 = min 𝑑 π‘₯, π‘₯ + 𝑑 π‘₯β€², π‘₯β€²^ ^ subject to π‘₯^′𝑑𝐹/𝐸π‘₯ = 0^

We need geometric error for model refinement 𝐹/𝐸/𝐻 :

1. For a Homography: {π»βˆ—, π‘₯𝑖, π‘₯𝑖′} = π‘Žπ‘Ÿπ‘”π‘šπ‘–π‘›

𝑖

𝑑(π‘₯𝑖, π‘₯𝑖) + 𝑑(π‘₯′𝑖, π‘₯′𝑖) subject toπ‘₯′𝑖 = 𝐻π‘₯𝑖 2. For an 𝐹/𝐸-matrix: {πΉβˆ—/πΈβˆ—, π‘₯𝑖, π‘₯𝑖′}= sbj. to π‘₯′𝑖𝑑𝐹/𝐸π‘₯𝑖 = 0

^ ^

^

𝐻, π‘₯𝑖, π‘₯β€²^𝑖

𝐹/𝐸, π‘₯𝑖, π‘₯β€²^ 𝑖

^

^

^

^

^ ^

^

β€œwe see in next lecture that this one can be computed in closed-form”

^ ^

^ ^

We need geometric error for a fixed model 𝐹/𝐸/𝐻 :

^ π‘₯, π‘₯β€²^

(69)

A few words on iterative continuous optimization

So far we had linear (least-squares) optimization problems:
x* = argmin_x ‖A x‖

For non-linear (arbitrary) optimization problems:
x* = argmin_x f(x)

• Iterative estimation methods (see Appendix 6 in HZ; page 597ff):
  • Gradient descent method: good to get roughly to the solution
  • Newton methods (e.g. Gauss-Newton): second-order methods (Hessian); good to find an accurate result
  • Levenberg-Marquardt method: a mix of Newton's method and gradient descent

Figure: red, Newton's method; green, gradient descent.

(70)

Application: Automatic Panoramic Stitching


An unordered set of images: run a homography search between all pairs of images ...

(71)

Application: Automatic Panoramic Stitching


... automatically create a panorama

(72)

Application: Automatic Panoramic Stitching


... automatically create a panorama

(73)

Application: Automatic Panoramic Stitching


... automatically create a panorama
