• Keine Ergebnisse gefunden

Robust Geometry Estimation

N/A
N/A
Protected

Academic year: 2022

Aktie "Robust Geometry Estimation"

Copied!
73
0
0

Wird geladen.... (Jetzt Volltext ansehen)

Volltext

(1)

Computer Vision I - Robust Geometry Estimation from two Cameras

Carsten Rother

Computer Vision I: Image Formation Process 16/01/2015

(2)

FYI

• Microsoft Research Cambridge
• Excellent 8-week paid summer internship
• Deadline: 16th January 2015
• If interested, please contact me

• The last lecture on 6.2.2015 will mainly be a Q&A session.
  Please look through all lectures again and prepare questions!

(3)

Roadmap for the next five lectures

• Appearance-based matching (sec. 4.1)
• How do we get an RGB image? (sec. 2.2-2.3)
  • The human eye
  • The camera and image formation model
• Projective geometry - basics (sec. 2.1.1-2.1.4)
• Geometry of a single camera (sec. 2.1.5, 2.1.6)
  • Pinhole camera
  • Lens effects
• Geometry of two cameras (sec. 7.2)
• Robust geometry estimation for two cameras (sec. 6.1.4)
• Multi-view 3D reconstruction (sec. 7.3-7.4)

(4)

Camera parameters - Summary

• Camera matrix P has 11 DoF:

• Intrinsic parameters:
  • Focal length f
  • Pixel y-direction magnification factor m
  • Skew (non-rectangular pixels) s
  • Principal point coordinates (p_x, p_y)

• Extrinsic parameters:
  • Rotation R (3 DoF) and translation C (3 DoF) relative to the world coordinate system

$$ \tilde{x} = K\,R\,(I_{3\times 3} \mid -C)\,\tilde{X}, \qquad K = \begin{pmatrix} f & s & p_x \\ 0 & mf & p_y \\ 0 & 0 & 1 \end{pmatrix}, \qquad \tilde{x} = P\,\tilde{X} $$

(x̃ is the homogeneous image point, X̃ the homogeneous 3D world point.)
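As a minimal sketch of the formula above (NumPy, with illustrative parameter values that are not from the slides), the 11-DoF camera matrix can be assembled and applied to a homogeneous world point:

```python
import numpy as np

def camera_matrix(f, m, s, px, py, R, C):
    """P = K R (I | -C) from the 5 intrinsic and 6 extrinsic parameters."""
    K = np.array([[f, s, px],
                  [0.0, m * f, py],
                  [0.0, 0.0, 1.0]])
    return K @ R @ np.hstack([np.eye(3), -C.reshape(3, 1)])

# Illustrative values: f = 500, square pixels (m = 1), no skew, camera at the origin
P = camera_matrix(500.0, 1.0, 0.0, 320.0, 240.0, np.eye(3), np.zeros(3))
X = np.array([0.1, 0.2, 1.0, 1.0])   # homogeneous 3D world point
x = P @ X
x = x / x[2]                          # de-homogenize to pixel coordinates
```

With the identity pose, this projects (0.1, 0.2, 1) to pixel (370, 340).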

(5)

Reminder: The upcoming tasks

Two-view transformations we look at:

• Homography H: between two views
• Camera matrix P: mapping from 3D to 2D
• Fundamental matrix F: between two un-calibrated views
• Essential matrix E: between two calibrated views

• Derive geometrically: H, P, F, E, i.e. what do they mean?
• Calibration: take primitives (points, lines, planes, cones, …) to compute H, P, F, E:
  • What is the minimal number of points to compute them? (very important for the next lecture on robust methods)
  • If we have many points with noise, what is the best way to compute them: algebraic error versus geometric error?
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from H, P, F, E?
• What can we do with H, P, F, E? (e.g. panoramic stitching)

(6)

Homography 𝐻 : Summary

• Derive geometrically H
• Calibration: take measurements (points) to compute H
  • Minimum of 4 points. Solution: right null space of A h = 0
  • Many points: use SVD to solve h* = argmin_h ‖A h‖
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from H?
  -> hard, not discussed much
• What can we do with H?
  -> augmented reality on planes, panoramic stitching

(7)

Camera Matrix 𝑃 : Summary

• Derive geometrically P
• Calibration: take measurements (points) to compute P
  • 6 or more points: use SVD to solve p* = argmin_p ‖A p‖
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from P?
  -> yes, use SVD and RQ decomposition
• What can we do with P?
  -> very many things (robotics, photogrammetry, augmented reality, …)

x̃ = K R (I_{3×3} | -C) X̃,  x̃ = P X̃

(8)

Topic 3: Fundamental/Essential Matrix 𝐹/𝐸

• Derive geometrically F/E
• Calibration: take measurements (points) to compute F/E
  • How do we do that with a minimal number of points?
  • How do we do that with many points?
• Can we derive the intrinsic (K) and extrinsic (R, C) parameters from F/E?
• What can we do with F/E?

(9)

Reminder: Matching two Images

• Find interest points
• Find oriented patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture)

We will discover in the next slides:
Seven point matches define how all other points match between the two views!

(10)

Reminder: 3D Geometry

Figure: a non-moving scene seen by two cameras P and P', versus a rigidly (6D) moving scene seen by one camera.
Both cases are equivalent for the following derivations.

(11)

Reminder: Epipolar Geometry

• Epipole: image location of the optical center of the other camera.
  (Can be outside of the visible area.)

(12)

Reminder: Epipolar Geometry

• Epipolar plane: plane through both camera centers and the world point.

(13)

Reminder: Epipolar Geometry

• Epipolar line: constrains the location where a particular point (here p1) from one view can be found in the other.

(14)

Reminder: Epipolar Geometry

• Epipolar lines:
  • Intersect at the epipoles
  • Are in general not parallel

(15)

Reminder: Example: Converging Cameras


(16)

Reminder: Example: Motion Parallel to Camera

• We will use this idea when it comes to stereo matching

(17)

Reminder: Example: Forward Motion

• Epipoles have the same coordinates in both images
• Points move along lines radiating from the epipole: the "focus of expansion"

(18)

The maths behind it: Fundamental/Essential Matrix

(derivation on blackboard)

Figure: world point X̃ and relative pose (R, T̃) between Camera 0 and Camera 1.

(19)

The maths behind it: Fundamental/Essential Matrix

The 3 vectors are in the same plane (co-planar):
1) T = C1 - C0
2) X - C0
3) X - C1

Set the camera matrices: x̃0 = K0 (I | 0) X̃ and x̃1 = K1 R^{-1} (I | -C1) X̃.
The three vectors can be re-written using x̃0, x̃1 and the K's:
1) T
2) K0^{-1} x̃0
3) R K1^{-1} x̃1 + C1 - C1 = R K1^{-1} x̃1

We know that co-planarity makes the triple product vanish:

$$ \left(K_0^{-1}\tilde{x}_0\right)^T [T]_\times \left(R\,K_1^{-1}\tilde{x}_1\right) = 0 \quad\text{which gives}\quad \tilde{x}_0^T\, K_0^{-T}\, [T]_\times\, R\, K_1^{-1}\, \tilde{x}_1 = 0 $$

(20)

The maths behind it: Fundamental/Essential Matrix

• In an un-calibrated setting (the K's are not known):
  x0^T K0^{-T} [T]× R K1^{-1} x1 = 0

• In short: x0^T F x1 = 0, where F is called the Fundamental Matrix
  (discovered by Faugeras and Luong 1992, Hartley 1992)

• In a calibrated setting (the K's are known):
  we use rays x̂_i = K_i^{-1} x_i and then get: x̂0^T [T]× R x̂1 = 0

  In short: x̂0^T E x̂1 = 0, where E is called the Essential Matrix
  (discovered by Longuet-Higgins 1981)

(21)

Fundamental Matrix: Properties

• We have x0^T F x1 = 0, where F is called the Fundamental Matrix

• It is det(F) = 0, hence F has 7 DoF.
  Proof: F = K0^{-T} [T]× R K1^{-1} has rank 2 since [T]× has rank 2.

The cross-product matrix is

$$ [x]_\times = \begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix} $$

Check: det([x]_×) = x_3 (x_3 · 0 - x_1 x_2) + x_2 (x_1 x_3 + x_2 · 0) = 0
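The rank-2 property of the cross-product matrix can be checked with a small NumPy sketch (names and values are illustrative, not from the slides):

```python
import numpy as np

def skew(x):
    """Cross-product matrix [x]_x, so that skew(x) @ v == np.cross(x, v)."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

x = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])
```

For any non-zero x, skew(x) has rank 2 and determinant 0, which is exactly why F inherits rank 2 from [T]×.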

(22)

Fundamental Matrix: Properties

• For any two matching points (i.e. they image the same 3D point) we have: x0^T F x1 = 0

• Epipolar line in camera 1 of a point x0: l1^T = x0^T F
  (since l1^T x1 = x0^T F x1 = 0)

• Epipolar line in camera 0 of a point x1: l0 = F x1
  (since x0^T l0 = x0^T F x1 = 0)
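A minimal sketch of the epipolar-line relations, under an assumed two-camera setup (identical hypothetical K, pure translation along x) that is not from the slides:

```python
import numpy as np

def skew(t):
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Assumed setup: F = K0^{-T} [T]x R K1^{-1} with K0 = K1, R = I, T = (1, 0, 0)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
F = np.linalg.inv(K).T @ skew(np.array([1.0, 0.0, 0.0])) @ np.eye(3) @ np.linalg.inv(K)

x0 = np.array([300.0, 200.0, 1.0])   # point in image 0
l1 = F.T @ x0                         # epipolar line in image 1 (l1^T = x0^T F)
x1 = np.array([450.0, 200.0, 1.0])    # same image row -> lies on the epipolar line
```

For this pure x-translation the epipolar lines are horizontal, so any x1 on the same row satisfies both x0^T F x1 = 0 and l1^T x1 = 0.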

(23)

Fundamental Matrix: Properties

• For any two matching points (i.e. they image the same 3D point) we have: x0^T F x1 = 0

• Epipole e0 is the left null space of F (can be computed with SVD):
  e0^T F x_i = 0 for all points x_i, since all lines l0 = F x_i go through e0.
  This means: e0^T F = 0

• Epipole e1 is the right null space of F (F e1 = 0)

(24)

How can we compute 𝐹 (2-view calibration) ?

• Each pair of matching points gives one linear constraint x^T F x' = 0 on F. For a match (x, x') we get one row of the system:

$$ \begin{pmatrix} x_1 x_1' & x_1 x_2' & x_1 x_3' & x_2 x_1' & x_2 x_2' & x_2 x_3' & x_3 x_1' & x_3 x_2' & x_3 x_3' \\ & & & & \vdots & & & & \end{pmatrix} \begin{pmatrix} f_{11} \\ f_{12} \\ f_{13} \\ f_{21} \\ f_{22} \\ f_{23} \\ f_{31} \\ f_{32} \\ f_{33} \end{pmatrix} = 0 $$

• Given m ≥ 8 matching points (x, x') we can compute F in a simple way.

(25)

How can we compute 𝐹 (2-view calibration) ?

Method (normalized 8-point algorithm):
1) Take m ≥ 8 points
2) Compute T, T' and condition the points: x̂ = T x; x̂' = T' x'
3) Assemble A with A f = 0; here A is of size m × 9 and f is the vectorized F
4) Compute f* = argmin_f ‖A f‖ subject to ‖f‖ = 1. Use SVD to do this.
5) Get F for the unconditioned points: T^T F̂ T' (note: (T x)^T F̂ (T' x') = 0)
6) Make rank(F) = 2

[See HZ page 282]
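The steps above can be sketched in NumPy. The synthetic check at the end (two identical hypothetical cameras, pure translation) is an assumption for illustration, not from the slides:

```python
import numpy as np

def normalize_points(pts):
    """Similarity T that centers the points and scales mean distance to sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - c, axis=1))
    return np.array([[s, 0.0, -s * c[0]],
                     [0.0, s, -s * c[1]],
                     [0.0, 0.0, 1.0]])

def hom(p):
    return np.hstack([p, np.ones((len(p), 1))])

def eight_point(x0, x1):
    """Normalized 8-point algorithm for F with x0_i^T F x1_i = 0 (m >= 8 matches)."""
    T0, T1 = normalize_points(x0), normalize_points(x1)
    y0, y1 = hom(x0) @ T0.T, hom(x1) @ T1.T                 # conditioned points
    A = np.stack([np.kron(a, b) for a, b in zip(y0, y1)])   # m x 9 system A f = 0
    _, _, Vt = np.linalg.svd(A)
    Fc = Vt[-1].reshape(3, 3)                               # right null vector of A
    U, D, Vt2 = np.linalg.svd(Fc)                           # enforce rank 2
    Fc = U @ np.diag([D[0], D[1], 0.0]) @ Vt2
    return T0.T @ Fc @ T1                                   # undo the conditioning

# Synthetic check: identical cameras, camera 1 shifted by (1, 0, 0), exact matches
rng = np.random.default_rng(1)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
p0 = (K @ X.T).T
p1 = (K @ (X - np.array([1.0, 0.0, 0.0])).T).T
x0, x1 = p0[:, :2] / p0[:, 2:], p1[:, :2] / p1[:, 2:]
F = eight_point(x0, x1)
residuals = np.einsum('ij,jk,ik->i', hom(x0), F, hom(x1))
```

On exact correspondences the recovered F satisfies all epipolar constraints to machine precision and has rank 2 by construction.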

(26)

How to make 𝐹 Rank 2

• (Again) use SVD: set the last singular value σ_p to 0; then A has rank p - 1 and not p (assuming A originally has full rank p).

Proof: the diagonal matrix D then has rank p - 1, hence A = U D V^T has rank p - 1.
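This SVD truncation is a one-liner in NumPy (the example matrix is an arbitrary full-rank one, chosen only for illustration):

```python
import numpy as np

def project_to_rank2(A):
    """Zero the smallest singular value: nearest rank-2 matrix in Frobenius norm."""
    U, D, Vt = np.linalg.svd(A)
    D[-1] = 0.0
    return U @ np.diag(D) @ Vt

A = np.array([[2.0, 0.0, 1.0], [1.0, 3.0, 0.0], [0.0, 1.0, 4.0]])  # full rank
A2 = project_to_rank2(A)
```

The result is singular (det = 0), which is exactly the constraint required of a fundamental matrix.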

(27)

Can we compute 𝐹 with just 7 points?

Method (7-point algorithm):
1) Take m = 7 points
2) Assemble A with A f = 0; here A is of size 7 × 9 and f is the vectorized F
3) Compute the 2D right null space: F1 and F2 from the last two rows of V^T
   (use the SVD decomposition A = U D V^T)
4) Choose: F = α F1 + (1 - α) F2 (see comments on the next slide)
5) Determine α (either 1 or 3 solutions) by using the constraint det(α F1 + (1 - α) F2) = 0.
   (This is a cubic polynomial equation in α, which has one or three real-valued solutions.)

• Note: an 8th point would determine which of these 3 solutions is the correct one.
• We will see later that the 7-point algorithm is the best choice for the robust case.
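A sketch of the 7-point algorithm (the cubic in α is recovered by interpolation rather than symbolically; the synthetic camera setup at the end is an assumption for illustration):

```python
import numpy as np

def hom(p):
    return np.hstack([p, np.ones((len(p), 1))])

def seven_point(x0, x1):
    """7-point algorithm: returns 1 or 3 real solutions F with x0_i^T F x1_i = 0."""
    A = np.stack([np.kron(a, b) for a, b in zip(hom(x0), hom(x1))])  # 7 x 9
    _, _, Vt = np.linalg.svd(A)
    F1, F2 = Vt[-1].reshape(3, 3), Vt[-2].reshape(3, 3)   # 2D right null space
    # det(a*F1 + (1-a)*F2) is a cubic in a: interpolate it from 4 samples
    a_s = np.array([0.0, 1.0, 2.0, 3.0])
    d_s = [np.linalg.det(a * F1 + (1.0 - a) * F2) for a in a_s]
    roots = np.roots(np.polyfit(a_s, d_s, 3))
    real = roots[np.abs(roots.imag) < 1e-8].real
    return [a * F1 + (1.0 - a) * F2 for a in real]

# Synthetic data: identical hypothetical cameras, camera 1 shifted by (1, 0.3, 0)
rng = np.random.default_rng(2)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(7, 3))
p0 = (K @ X.T).T
p1 = (K @ (X - np.array([1.0, 0.3, 0.0])).T).T
x0, x1 = p0[:, :2] / p0[:, 2:], p1[:, :2] / p1[:, 2:]
solutions = seven_point(x0, x1)
max_residual = max(np.max(np.abs(np.einsum('ij,jk,ik->i', hom(x0), Fs, hom(x1))))
                   for Fs in solutions)
```

Every returned matrix satisfies the 7 epipolar constraints and the rank-2 (zero determinant) condition; an 8th point would single out the correct one.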

(28)

Comments to previous slide

Step 4) Choose: F = α F1 + (1 - α) F2

• The full null space is given by: F = α F1 + β F2
• We are free to require: ‖α F1 + β F2‖ ≥ 1
  (here F1, F2 are in vectorized form; note that this is the same as having F1, F2 in matrix form and using the Frobenius norm)
• It is: |α| ‖F1‖ + |β| ‖F2‖ ≥ ‖α F1 + β F2‖ (triangle inequality)
• Hence it suffices to require: |α| ‖F1‖ + |β| ‖F2‖ ≥ 1
• Hence: |α| + |β| ≥ 1 (since F1, F2 are unit-norm rows of V^T)
• Hence we can choose: β = 1 - α

(29)

Comments to previous slide

Step 5) Solve det(α F1 + (1 - α) F2) = 0 for α.

(30)

Can we get 𝐾’𝑠, 𝑅, 𝑇 from 𝐹 ?

• Assume we have F = K0^{-T} [T]× R K1^{-1}; can we get out K1, R, K0, T?

• F has 7 DoF
• K1, R, K0, T have together 16 DoF

• Not directly possible. Only with assumptions such as:
  • External constraints
  • The camera does not change over several frames

• This is an important topic (more than 10 years of research!) called auto-calibration or self-calibration. We look at it in detail in the next lecture.

(31)

Coming back to Essential Matrix

• In a calibrated setting (the K's are known):
  we use rays x̂_i = K_i^{-1} x_i and then get: x̂0^T [T]× R x̂1 = 0

  In short: x̂0^T E x̂1 = 0, where E is called the Essential Matrix

• E has 5 DoF, since T has 3 DoF and R has 3 DoF, but the overall scale of T is unknown
• E also has rank 2

(32)

How to compute 𝐸

• We have: x̂0^T E x̂1 = 0 (reminder: x0^T F x1 = 0)

• Given m ≥ 8 matches, run the 8-point algorithm (as for F)
• Given m = 7, run the 7-point algorithm and get 1 or 3 solutions
• Given m = 5, run the 5-point algorithm to get up to 10 solutions.
  This is the minimal case since E has 5 DoF.

• 5-point algorithm history:
  • Kruppa, "Zur Ermittlung eines Objektes aus zwei Perspektiven mit innerer Orientierung," Sitz.-Ber. Akad. Wiss., Wien, Math.-Naturw. Kl., Abt. IIa, (122):1939-1948, 1913. Found 11 solutions.
  • M. Demazure, "Sur deux problemes de reconstruction," Technical Report 882, INRIA, Les Chesnay, France, 1988. Showed that only 10 valid solutions exist.
  • D. Nister, "An Efficient Solution to the Five-Point Relative Pose Problem," IEEE Conference on Computer Vision and Pattern Recognition, Volume 2, pp. 195-202, 2003. Fast method which obtains the 10 solutions of a degree-10 polynomial.

(33)

Can we get 𝑅, 𝑇 from 𝐸 ?

• Assume we have E = [T]× R; can we get out R, T?

• E has 5 DoF
• R, T have together 6 DoF

• We can get T up to scale, and a unique R

(34)

How to get a unique 𝑇, 𝑅?

1) Compute T:
Note: E has rank 2, and T is in the left null space of E, since T^T [T]× = (0, 0, 0).
This means that an SVD of E must look like:

$$ E = U D V^T = \begin{pmatrix} \boldsymbol{u}_0 & \boldsymbol{u}_1 & T \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \boldsymbol{v}_0^T \\ \boldsymbol{v}_1^T \\ \boldsymbol{v}_2^T \end{pmatrix} $$

This fixes the norm of T to 1; the correct sign (+/- T) is determined in step 3.

2) Compute 4 possible solutions for R:

R_{1,2} = ±U R_{90}^T V^T;  R_{3,4} = ±U R_{-90}^T V^T (see derivation HZ page 259; Szeliski page 310)

where E = U D V^T and

$$ R_{90} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad R_{-90} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$

3) Derive the unique solution for R and the sign of T:
1) det(R) = 1
2) Reconstruct a 3D point and choose the solution where it lies in front of both cameras.
   (In the robust case: take the solution where most (≥ 5) points lie in front of the cameras.)
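A sketch of the four-candidate extraction (following HZ's W-matrix convention; the ground-truth rotation and translation below are illustrative assumptions). Selecting the unique solution via the cheirality test of step 3 is not shown:

```python
import numpy as np

def skew(t):
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Four (R, t) candidates from E (HZ page 259); t is returned with unit norm."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:          # make sure the factors are proper rotations
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    t = U[:, 2]                        # left null vector of E
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]

# Illustrative ground truth: rotation about z by 0.3 rad, translation along x
a = 0.3
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, 0.0, 0.0])
E = skew(t_true) @ R_true
candidates = decompose_essential(E)
```

One of the four candidates reproduces the true rotation, with t recovered up to sign (and, in general, scale).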

(35)

Visualization of the 4 solutions for 𝑅, 𝑇

The property that points must lie in front of the camera is known as chirality (Hartley 1998).

Figure: the four combinations of (T, -T) with the two rotations; the correct solution is the one where the point is in front of both cameras.

(36)

What can we do with 𝐹, 𝐸 ?

• F/E encode the geometry of the two cameras
• They can be used to find matching points (dense or sparse) between two views
• F/E encode the essential information needed for 3D reconstruction

(37)

Fundamental and Essential Matrix: Summary

• Derive geometrically F, E:
  • F for un-calibrated cameras
  • E for calibrated cameras

• Calibration: take measurements (points) to compute F, E:
  • F: minimum of 7 points -> 1 or 3 real solutions
  • F: many points -> least-squares solution with SVD
  • E: minimum of 5 points -> up to 10 solutions
  • E: many points -> least-squares solution with SVD

• Can we derive the intrinsic (K) and extrinsic (R, T) parameters from F, E?
  -> F: next lecture
  -> E: yes, can be done (translation up to scale)

• What can we do with F, E?
  -> essential tool for 3D reconstruction

(38)

Half-way slide

3 Minutes break


(39)

In last lecture we asked (for rotating camera)…

Question 1: If a match is completely wrong, then argmin_h ‖A h‖ is a bad idea.

Question 2: If a match is slightly wrong, then argmin_h ‖A h‖ might not be perfect.
Better might be a geometric error: argmin_H Σ_i ‖H x_i - x'_i‖

(40)

Robust model fitting

RANSAC:

Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and

Automated Cartography

Martin A. Fischler and Robert C. Bolles (June 1981).

[Slide credits: Dimitri Schlesinger]

(41)

Example Tasks

Search for a straight line in a clutter of points,
i.e. search for the parameters of the line model given a training set of points (x, y).

(42)

Example Tasks

Estimate the fundamental matrix,
i.e. the parameters F satisfying x_l^T F x_r = 0, given a training set of corresponding pairs (x_l, x_r).

For the homography of a rotating camera we have: H x_li = x_ri

(43)

Two sources of errors

1. Noise: the coordinates deviate from the true ones according to some "rule" (probability); the farther away, the less confident.
2. Outliers: the data have nothing in common with the model to be estimated.

Ignoring outliers can lead to a wrong estimation.
→ The way out: find the outliers explicitly and estimate the model from the inliers only.

(44)

Task formulation

Let X be the input space and Y be the parameter space.
The training data consist of data points x_i ∈ X.

Let an evaluation function f(x, y) be given that checks the consistency of a point x with a model y:

• Straight line:
  f(x1, x2, a, b) = 0 if |a x1 + b x2 - 1| ≤ t (e.g. 0.1), and 1 otherwise

• Fundamental matrix:
  f(x_l, x_r, F) = 0 if |x_l^T F x_r| ≤ t (e.g. 0.1), and 1 otherwise

The task is to find the parameter y that is consistent with the majority of the data points:
y* = argmin_y Σ_i f(x_i, y)

(45)

First Idea: 2D Line estimation

A naïve approach: enumerate all parameter values

→ known as the Hough Transform (very time consuming, and not possible at all for many free parameters, i.e. a high-dimensional parameter space)

Question: how to compute y* = argmin_y Σ_i f(x_i, y)?

Encode all lines with two parameters (r, Θ).
Figure: an image with 3 points; each point votes for all lines (r, Θ) that pass through it.
Goal: find the "point" in (r, Θ) space where most of these curves meet (Hough transform).

(46)

First Idea: 2D Line estimation

• Observation: most cells of the parameter space have very low counts
• Idea: do not try all values but only some of them. Which ones?

Figure: sketched accumulator space over (Θ, r) with the number of inliers per cell; most cells hold only a handful of votes, with a pronounced peak (e.g. 200) at the true line (Hough transform).
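A minimal Hough-transform sketch for a single line (the data — 30 collinear points plus two outliers — is an illustrative assumption):

```python
import numpy as np

def hough_line(points, n_theta=180, n_r=200, r_max=10.0):
    """Vote over (r, Θ) cells with r = x cos(Θ) + y sin(Θ); return the peak."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_r, n_theta), dtype=int)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    for x, y in points:
        r = x * cos_t + y * sin_t                       # one sinusoid per point
        idx = np.round((r + r_max) / (2.0 * r_max) * (n_r - 1)).astype(int)
        ok = (idx >= 0) & (idx < n_r)
        acc[idx[ok], np.arange(n_theta)[ok]] += 1
    i, j = np.unravel_index(acc.argmax(), acc.shape)    # cell where most curves meet
    return -r_max + 2.0 * r_max * i / (n_r - 1), thetas[j], acc

# 30 points on the line y = x + 1 (true Θ = 3π/4, r = 1/sqrt(2)) plus 2 outliers
xs = np.linspace(-3.0, 3.0, 30)
pts = np.vstack([np.column_stack([xs, xs + 1.0]), [[2.0, -2.0], [-1.0, 3.0]]])
r_best, theta_best, acc = hough_line(pts)
```

All 30 collinear points vote for the same (Θ, r) cell, so the accumulator peak recovers the line parameters up to bin quantization, which illustrates the "very time consuming" aspect: the whole (Θ, r) grid must be filled.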

(47)

Data-driven Oracle

An oracle is a function that predicts a parameter given the minimum number of data points (a d-tuple).

Examples:
• A line can be estimated from d = 2 points
• The fundamental matrix from d = 7 or 8 point correspondences
• A homography can be computed from d = 4 point correspondences

First idea: do not enumerate all parameter values but all d-tuples of data points.
That is then on the order of n^d tests, e.g. n^2 for lines (with n points).
The optimization y* = argmin_y Σ_i f(x_i, y) is performed over a discrete domain.

Second idea: do not try all subsets, but sample them randomly.

(48)

RANSAC

Basic RANSAC method:

Repeat many times (can be done in parallel!):
  1. Select a d-tuple, e.g. (x_1, x_2) for lines
  2. Compute the parameter(s) y, e.g. the line y = g(x_1, x_2)
  3. Evaluate f'(y) = Σ_i f(x_i, y)
  4. If f'(y) ≤ f'(y*), set y* = y and keep the value f'(y*)

• Sometimes we get a discrete set of intermediate solutions y. For example, for the F-matrix computation from 7 points we have up to 3 solutions. Then we simply evaluate f'(y) for all solutions.

• How many times do we have to sample in order to reliably estimate the true model?

[Random Sample Consensus, Fischler and Bolles 1981]
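The loop above, instantiated for 2D lines with the algebraic inlier test from the task-formulation slide, can be sketched as follows (the synthetic data — points near y = 2x - 1 plus uniform outliers — is an illustrative assumption):

```python
import numpy as np

def ransac_line(points, n_iter=300, t=0.1, rng=None):
    """Basic RANSAC for a 2D line a*x + b*y = 1, minimizing f'(y) = #outliers."""
    rng = np.random.default_rng(0) if rng is None else rng
    best, best_outliers = None, np.inf
    for _ in range(n_iter):
        i, j = rng.choice(len(points), size=2, replace=False)
        try:
            a, b = np.linalg.solve(points[[i, j]], np.ones(2))  # oracle from 2-tuple
        except np.linalg.LinAlgError:
            continue                                            # degenerate sample
        # f'(y) = sum_i f(x_i, y): count points inconsistent with the model
        n_out = np.sum(np.abs(points @ np.array([a, b]) - 1.0) > t)
        if n_out < best_outliers:
            best, best_outliers = (a, b), n_out
    return best, best_outliers

rng = np.random.default_rng(4)
xs = rng.uniform(-2.0, 2.0, 40)
inliers = np.column_stack([xs, 2.0 * xs - 1.0 + rng.normal(0.0, 0.01, 40)])
outliers = rng.uniform(-3.0, 3.0, size=(20, 2))
pts = np.vstack([inliers, outliers])
(a, b), n_out = ransac_line(pts)
```

Despite one third of the points being outliers, the winning 2-tuple produces a line close to the true (a, b) = (2, -1) with roughly only the 20 outliers rejected.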

(49)

Convergence

Observation: it is sufficient to sample a d-tuple of inliers just once in order to estimate the model correctly.

Example: 1000 points overall, 800 inliers, 200 outliers, so the outlier probability is ε ~ 0.2.

Let ε be the probability of outliers.
• The probability to sample d inliers is (1 - ε)^d (here 0.8^2 = 0.64)
• The probability of a "wrong" d-tuple is 1 - (1 - ε)^d (here 0.36)
• The probability to sample n times only wrong tuples is (1 - (1 - ε)^d)^n (here 0.36^20 ≈ 0.0000000013)
• The probability to sample the "right" tuple at least once during the process (i.e. to estimate the correct model, under the assumptions) is 1 - (1 - (1 - ε)^d)^n (here 99.999999866 %)
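The slide's numbers can be reproduced with a few lines of arithmetic:

```python
# Slide example: eps = 0.2 outlier rate, d = 2 (line oracle), n = 20 samples
eps, d, n = 0.2, 2, 20
p_good_tuple = (1 - eps) ** d     # 0.8^2 = 0.64
p_bad_tuple = 1 - p_good_tuple    # 0.36
p_all_bad = p_bad_tuple ** n      # 0.36^20, about 1.3e-9
p_success = 1 - p_all_bad         # about 99.9999999 %
```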

(50)

Convergence

Figure: success probability as a function of the number of samples n and the outlier probability ε.

(51)

Comment

• In our derivation of p we were slightly optimistic, since "degenerate" inliers may give rise to bad lines
• However, these bad lines have little support w.r.t. the number of inliers
• We will also define later a refinement procedure which can correct such bad lines

(52)

The choice of the oracle is crucial

Example: the fundamental matrix:
a) 8-point algorithm: probability 70% (n = 300; ε = 0.5; d = 8)
b) 7-point algorithm: probability 90% (n = 300; ε = 0.5; d = 7)

Success probability after n trials: p = 1 - (1 - (1 - ε)^d)^n

Number of trials n to reach success probability p (here 99%):
n = log(1 - p) / log(1 - (1 - ε)^d)
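Both formulas above are easy to evaluate, and doing so shows why the 7-point oracle is preferable in the robust setting:

```python
import math

def ransac_trials(p, eps, d):
    """n = log(1 - p) / log(1 - (1 - eps)^d), rounded up."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - (1.0 - eps) ** d))

def success_probability(n, eps, d):
    """p = 1 - (1 - (1 - eps)^d)^n"""
    return 1.0 - (1.0 - (1.0 - eps) ** d) ** n

n8 = ransac_trials(0.99, 0.5, 8)          # trials needed with the 8-point oracle
n7 = ransac_trials(0.99, 0.5, 7)          # trials needed with the 7-point oracle
p8 = success_probability(300, 0.5, 8)     # slide: ~70%
p7 = success_probability(300, 0.5, 7)     # slide: ~90%
```

With 50% outliers, the smaller 7-tuple halves the required number of trials for the same confidence.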

(53)

The choice of evaluation function is crucial

• Algebraic error: a measure that has no geometric meaning. Examples:
  For a line: d(x1, x2, a, b) = |a x1 + b x2 - 1|
  For a homography: d(x, x', H) = ‖A h‖ (where A is the row block for this match, derived as above)
  For the F-matrix: d(x_l, x_r, F) = |x_l^T F x_r|

• Geometric error: a measure that considers a distance in the image plane. Example:
  For a line: d(x1, x2, a, b) = d((x1, x2), l(a, b)), where d is the Euclidean distance between point and line.
  (Geometric errors for the homography and the F-matrix are still to come.)

• Evaluation function: f(x1, x2, a, b) = 1 if |a x1 + b x2 - 1| ≤ t (e.g. 0.1), 0 otherwise

(54)

The choice of confidence interval is crucial

Examples:
• Large confidence interval: "right" model, 2 outliers
• Large confidence interval: "wrong" model, 2 outliers
• Small confidence interval: almost all points are outliers (independent of the model)

(55)

Extension: Adaptive number of samples 𝑛

Choose n in an adaptive way:
1) Fix p = 99.9% (very large value)
2) Set n = ∞ and ε = 0.9 (large value for the outlier ratio)
3) During RANSAC, adapt n and ε:
   1) Re-compute ε from the current best solution: ε = outliers / all points
   2) Re-compute the new n:
      n = log(1 - p) / log(1 - (1 - ε)^d)
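The adaptive update can be sketched directly (the sequence of observed outlier ratios below is a made-up example):

```python
import math

def required_samples(p, eps, d):
    """n = log(1 - p) / log(1 - (1 - eps)^d)"""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - (1.0 - eps) ** d))

p, d = 0.999, 7                    # fix a very large success probability
n, eps = float("inf"), 0.9         # start pessimistic
# Hypothetical outlier ratios measured from successive best models:
for eps_observed in [0.8, 0.6, 0.5, 0.5]:
    if eps_observed < eps:         # a better model lowers eps and hence n
        eps = eps_observed
        n = required_samples(p, eps, d)
```

As soon as a model with 50% inliers is found, the bound drops from "infinite" to under a thousand samples.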

(56)

MSAC (M-Estimator SAmple Consensus)

If a data point is an inlier, the penalty is not 0 but depends on the "distance" to the model.

Example for the fundamental matrix:

f(x_l, x_r, F) = 0 if |x_l^T F x_r| ≤ t (e.g. 0.1), 1 otherwise

becomes the "robust function"

f(x_l, x_r, F) = |x_l^T F x_r| if |x_l^T F x_r| ≤ t (e.g. 0.1), t otherwise

→ the task is to find the model with the minimum average penalty

[P.H.S. Torr and A. Zisserman 1996]
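The truncated MSAC penalty is a one-line vectorized function (the residual values are illustrative):

```python
import numpy as np

def msac_penalty(residuals, t=0.1):
    """Truncated penalty: |r| if |r| <= t, else the constant t."""
    r = np.abs(residuals)
    return np.where(r <= t, r, t)

pen = msac_penalty(np.array([0.01, 0.05, 0.3, 2.0]))
```

Inliers are penalized by their actual distance, so among models with the same inlier set the tighter fit wins; outliers all pay the same capped cost t.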

(57)

Randomized RANSAC

Evaluating a hypothesis y, i.e. computing f'(y), is often time consuming.

Randomized RANSAC: instead of checking all data points:
1. Sample m points from the data
2. If all of them are good, check all others as before
3. If there is at least one bad point among the m, reject the hypothesis

It is possible that good hypotheses are rejected.
However, it saves time (bad hypotheses are recognized fast)
→ one can sample more often
→ overall often profitable (depends on the application).

(58)

Refinement after RANSAC

Typical procedure:
1. RANSAC: compute the model y in a robust way
2. Find all inliers x_inliers
3. Refine the model y from the inliers x_inliers
4. Go to step 2
   (until the number of inliers or the model does not change much)

(59)

In last lecture we asked (for rotating camera)…

Question 1: If a match is completely wrong, then argmin_h ‖A h‖ is a bad idea.
Answer: RANSAC with d = 4

Question 2: If a match is slightly wrong, then argmin_h ‖A h‖ might not be perfect.
Better might be a geometric error: argmin_H Σ_i ‖H x_i - x'_i‖
Answer: see next slides

(60)

Reminder from last Lecture: Homography for rotating camera

Algorithm:
1) Take m ≥ 4 point matches (x, x')
2) Assemble A with A h = 0
3) Compute h* = argmin_h ‖A h‖ subject to ‖h‖ = 1; use SVD to do this.
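The 4-point DLT above can be sketched in a few lines (the homography and the five matches used for the check are made-up values):

```python
import numpy as np

def dlt_homography(x, xp):
    """DLT: H with xp ~ H x from m >= 4 matches; h is the right null vector of A."""
    rows = []
    for (u, v), (up, vp) in zip(x, xp):
        rows.append([u, v, 1.0, 0.0, 0.0, 0.0, -up * u, -up * v, -up])
        rows.append([0.0, 0.0, 0.0, u, v, 1.0, -vp * u, -vp * v, -vp])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Illustrative check: recover a known homography from 5 exact matches
H_true = np.array([[1.2, 0.1, 3.0], [-0.2, 0.9, 1.0], [0.001, 0.002, 1.0]])
x = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.3]])
q = np.column_stack([x, np.ones(len(x))]) @ H_true.T
xp = q[:, :2] / q[:, 2:]
H = dlt_homography(x, xp)
```

With exact correspondences, the recovered H equals the ground truth up to the overall scale, which the normalization by H[2,2] removes.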

(61)

Refine Hypothesis 𝐻 with inliers

1. Algebraic error: argmin_h ‖A h‖

2. First geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i)
   where d(a, b) is the 2D geometric distance ‖a - b‖.
   This is not symmetric.

(62)

Refine Hypothesis 𝐻 with inliers

1. Algebraic error: argmin_h ‖A h‖

2. First geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i)
   where d(a, b) is the 2D geometric distance ‖a - b‖

3. Second, symmetric geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i) + d(x_i, H^{-1} x'_i)

(63)

Refine Hypothesis 𝐻 with inliers

1. Algebraic error: argmin_h ‖A h‖

2. First geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i)
   where d(a, b) is the 2D geometric distance ‖a - b‖

3. Second, symmetric geometric error:
   H* = argmin_H Σ_i d(x'_i, H x_i) + d(x_i, H^{-1} x'_i)

4. Third, optimal geometric error (gold standard error):
   {H*, x̂_i, x̂'_i} = argmin_{H, x̂_i, x̂'_i} Σ_i d(x_i, x̂_i) + d(x'_i, x̂'_i) subject to x̂'_i = H x̂_i
   (implicitly, the true 3D points X are searched for)

Comment: this is optimal in the sense that it is the maximum-likelihood (ML) estimate under an isotropic Gaussian noise assumption for x̂ (see page 103 HZ).

(64)

Full Homography Method (HZ page 123)

[See details on page 114ff in HZ]

Annotations to the method:
• Interest points: we discussed the Harris corner detector
• Matching: we discussed kd-trees to make it fast
• The RANSAC inner loop uses a geometric error for fixed H (see next slides); depending on runtime, one can choose different ones.
• The final estimation step uses the optimal geometric error (see next slides).

(65)

Example


Input images

~500 interest points

(66)

Example

268 putative matches; 117 outliers found; 151 inliers found; 262 inliers after guided matching

Guided matching variant: use the given H and look for new inliers. Here we also double the threshold on appearance feature matches to get more inliers.

(67)

Geometric derivation of the confidence interval

Assume Gaussian noise for a point, with standard deviation σ and mean 0 (see page 119 HZ).

To have a 95% chance that an inlier is inside the confidence interval, we require:
1. For a 2D line: d(x, l) ≤ σ √3.84 = t
2. For a homography: d(x_l, x_r, H) ≤ σ √5.99 = t
3. For an F-matrix: d(x_l, x_r, F) ≤ σ √3.84 = t

(68)

π‘₯, π‘₯β€²^ ^

π‘Žπ‘Ÿπ‘”π‘šπ‘–π‘›

𝑖

𝑑(π‘₯𝑖, π‘₯𝑖) + 𝑑(π‘₯′𝑖, π‘₯′𝑖)

Methods for 𝐹/𝐸/𝐻 Matrix computation - Summary

Procedure (as mentioned above):

1. RASNAC: compute model 𝐹/𝐸/𝐻 in a robust way 2. Find all inliers π‘₯π‘–π‘›π‘™π‘–π‘’π‘Ÿπ‘  (with potential relaxed criteria) 3. Refine model 𝐹/𝐸/𝐻 from inliers π‘₯π‘–π‘›π‘™π‘–π‘’π‘Ÿπ‘ 

4. Go to Step 2.

(until numbers of inliers or model does not change much)

16/01/2015

Computer Vision I: Image Formation Process 68

1. For a Homography: 𝑑 π‘₯, π‘₯β€², 𝐻 = min 𝑑 π‘₯, π‘₯ + 𝑑 π‘₯β€², π‘₯β€²^ ^ subject to π‘₯β€² = 𝐻π‘₯^ ^

2. For an 𝐹/𝐸-matrix: 𝑑 π‘₯, π‘₯β€², 𝐹/𝐸 = min 𝑑 π‘₯, π‘₯ + 𝑑 π‘₯β€², π‘₯β€²^ ^ subject to π‘₯^′𝑑𝐹/𝐸π‘₯ = 0^

We need geometric error for model refinement 𝐹/𝐸/𝐻 :

1. For a Homography: {π»βˆ—, π‘₯𝑖, π‘₯𝑖′} = π‘Žπ‘Ÿπ‘”π‘šπ‘–π‘›

𝑖

𝑑(π‘₯𝑖, π‘₯𝑖) + 𝑑(π‘₯′𝑖, π‘₯′𝑖) subject toπ‘₯′𝑖 = 𝐻π‘₯𝑖 2. For an 𝐹/𝐸-matrix: {πΉβˆ—/πΈβˆ—, π‘₯𝑖, π‘₯𝑖′}= sbj. to π‘₯′𝑖𝑑𝐹/𝐸π‘₯𝑖 = 0

^ ^

^

𝐻, π‘₯𝑖, π‘₯β€²^𝑖

𝐹/𝐸, π‘₯𝑖, π‘₯β€²^ 𝑖

^

^

^

^

^ ^

^

β€œwe see in next lecture that this one can be computed in closed-form”

^ ^

^ ^

We need geometric error for a fixed model 𝐹/𝐸/𝐻 :

^ π‘₯, π‘₯β€²^

(69)

A few words on iterative continuous optimization

So far we had linear (least-squares) optimization problems:
x* = argmin_x ‖A x‖

For non-linear (arbitrary) optimization problems:
x* = argmin_x f(x)

• Iterative estimation methods (see Appendix 6 in HZ; page 597ff):
  • Gradient descent method: good to get roughly to the solution
  • Newton methods (e.g. Gauss-Newton): second-order methods (Hessian); good to find an accurate result
  • Levenberg-Marquardt method: a mix of Newton's method and gradient descent

Figure: red, Newton's method; green, gradient descent.

(70)

Application: Automatic Panoramic Stitching


An unordered set of images: run a homography search between all pairs of images ...

(71)

Application: Automatic Panoramic Stitching


... automatically create a panorama

(72)

Application: Automatic Panoramic Stitching


... automatically create a panorama

(73)

Application: Automatic Panoramic Stitching


... automatically create a panorama
