Computer Vision I - Image Matching and
Image Formation
Carsten Rother
10/12/2014
Roadmap for the next five lectures
• Appearance based matching (sec. 4.1)
• How do we get an RGB image? (sec 2.2-2.3)
• The Human eye
• The Camera and Image formation model
• Projective Geometry - Basics (sec. 2.1.1-2.1.4)
• Geometry of a single camera (sec 2.1.5, 2.1.6)
• Pinhole camera
• Lens effects
• Geometry of two cameras (sec. 7.2)
• Robust Geometry estimation for two cameras (sec. 6.1.4)
• Multi-View 3D reconstruction (sec. 7.3-7.4)
10/12/2014 2 Computer Vision I: Image Formation Process
Matching two Images
• Find interest points
• Find orientated patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture)
Examples: Appearance-based matching
Applications
• 3D reconstruction:
• Augmented Reality:
• Panoramic Stitching:
Roadmap: matching two Images (appearance & geometry)
• Find interest points
• Find orientated patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture!)
Reminder Lecture 3: Harris Corner Detector
Idea was a so-called auto-correlation function:
Compute:
1.
2.
Roadmap: matching two Images (appearance & geometry)
• Find interest points
• Find orientated patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture)
How to deal with orientation
Orientate with image gradient:
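One way to orient a patch with the image gradient can be sketched as follows. This is a minimal illustration, not the detector used in the lecture: it assumes central differences instead of smoothed derivative filters, and the function name `dominant_orientation` is hypothetical.

```python
import math

def dominant_orientation(patch):
    """Return the angle (radians) of the summed image gradient of a patch.

    patch: list of rows of gray values. Central differences stand in for
    smoothed derivative filters (an assumption of this sketch).
    """
    gx_sum = 0.0
    gy_sum = 0.0
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx_sum += (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy_sum += (patch[y + 1][x] - patch[y - 1][x]) / 2.0
    return math.atan2(gy_sum, gx_sum)

# Intensity grows along x only, so the gradient points along the x-axis.
ramp_x = [[float(x) for x in range(5)] for _ in range(5)]
# Intensity grows along y only, so the gradient points along the y-axis.
ramp_y = [[float(y) for _ in range(5)] for y in range(5)]
angle_x = dominant_orientation(ramp_x)
angle_y = dominant_orientation(ramp_y)
```

Rotating the patch by the negative of this angle before computing a descriptor is what makes the descriptor independent of in-plane rotation.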
Choose a patch around each point
How to deal with scale?
Choose a patch around each point
Choose a patch around each point
How to deal with scale?
Reminder: Edge detection via image gradient
Image gradient: ((𝐷𝑥 ∗ 𝐺) ∗ 𝐼, (𝐷𝑦 ∗ 𝐺) ∗ 𝐼)
[Figure panels: image; gradient result using small sigma; gradient result using large sigma; final result with Canny edge detector]
Alternative Edge Detector via LoG Operator
• To find an edge we first smooth the image, then find the zero-crossings of its second derivative
• 𝛻²𝐺 is called the LoG (Laplacian of Gaussian) operator
• 1D example
• 2D example (Mexican hat)
Alternative: Edge detection with LoG Filter
Called Laplacian of Gaussian (LoG)
Scale selection (illustration)
$\nabla^2 G(\sigma) * I = \frac{\partial^2 (G(\sigma) * I)}{\partial x^2} + \frac{\partial^2 (G(\sigma) * I)}{\partial y^2}$
$f$ is the Laplacian of Gaussian (LoG) operator.
Measures an average edge-ness in all directions
(details on page 191)
We could match up these curves and find unique corresponding points
Simpler: Find maxima / minima in the curves, respectively
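Finding the extremum of the LoG curve over scale can be sketched on a synthetic blob. The multiplication by σ² (scale normalization) is an assumption of this sketch rather than something stated on the slides; it is commonly used so that responses at different scales are comparable. For an ideal bright disk of radius r the extremum is expected near σ = r/√2, which for r = 10 is about 7.07. All names below are hypothetical.

```python
import math

def log_kernel_value(x, y, sigma):
    """Laplacian of Gaussian at offset (x, y) for scale sigma."""
    r2 = x * x + y * y
    s2 = sigma * sigma
    gauss = math.exp(-r2 / (2.0 * s2)) / (2.0 * math.pi * s2)
    return (r2 - 2.0 * s2) / (s2 * s2) * gauss

def normalized_log_response(image, cx, cy, sigma):
    """Scale-normalized LoG response sigma^2 * (LoG * I) at pixel (cx, cy)."""
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value != 0.0:
                total += log_kernel_value(x - cx, y - cy, sigma) * value
    return sigma * sigma * total

# Synthetic bright disk of radius 10 on a dark background.
size, radius = 41, 10
c = size // 2
image = [[1.0 if (x - c) ** 2 + (y - c) ** 2 <= radius ** 2 else 0.0
          for x in range(size)] for y in range(size)]

sigmas = [3.0, 5.0, 7.0, 9.0, 11.0]
responses = [normalized_log_response(image, c, c, s) for s in sigmas]
# A bright blob gives a negative LoG response, so look for the minimum.
best_sigma = sigmas[responses.index(min(responses))]
```

Among the candidate scales, the one closest to r/√2 gives the strongest response, so the selected scale follows the size of the blob.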
Roadmap: matching 2 Images (appearance & geometry)
• Find interest points
• Find orientated patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry
SIFT feature
• 4*4=16 cells
• Each cell has an 8-bin histogram (smoothed across cells)
• In total: 16*8 values, i.e. a 128D vector
• The patch has 64x64 pixels; a cell has 16x16 pixels (here 8x8 for illustration only)
• The blue circle in the figure shows the center weighting
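The layout of the 128D vector can be made concrete with a simplified, SIFT-like descriptor. This sketch is not the real SIFT: it omits the center weighting and the smoothing across cells, uses central differences for the gradient, and runs on a small hypothetical 16x16 patch (4x4 pixels per cell) to keep the example short.

```python
import math

def sift_like_descriptor(patch, cells=4, bins=8):
    """Simplified SIFT-style descriptor: cells*cells histograms of `bins` bins.

    Omits the center weighting and the smoothing across cells and bins
    (assumption of this sketch).
    """
    size = len(patch)
    cell_size = size // cells
    hist = [0.0] * (cells * cells * bins)
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Gradient orientation in [0, 2*pi), quantized into `bins` bins.
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * bins) % bins
            cell = (y // cell_size) * cells + (x // cell_size)
            hist[cell * bins + b] += magnitude
    # Normalize to unit length to reduce the effect of contrast changes.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Hypothetical 16x16 test patch with a diagonal intensity ramp.
patch = [[float(x + y) for x in range(16)] for y in range(16)]
descriptor = sift_like_descriptor(patch)
```

Each of the 16 cells contributes 8 consecutive entries, which gives the 16*8 = 128 values.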
SIFT feature is very popular
• Fast to compute
• Can handle large changes in viewpoint well (up to 60° out-of-plane rotation)
• Can handle photometric changes (even day versus night)
Many other feature descriptors
• MOPS [Brown, Szeliski and Winder 2005]
• SURF [Herbert Bay et al. 2006]
• DAISY [Tola, Lepetit, Fua 2010]
• Shape Context
• …
[Figure: DAISY]
Roadmap: matching 2 Images (appearance & geometry)
• Find interest points
• Find orientated patches around interest points to capture appearance
• Encode patch appearance in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture)
Find matching patches fast
Goal (N patches in each image, e.g. N = 1000):
1) Find for each patch in the left image the closest patch in the right image
2) Accept all those matches where the descriptors are similar enough
Methods:
• Naïve: N² tests (here 1 million)
• Hashing (not discussed)
• Kd-tree: on average N log N tests (here ~10,000)
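The naïve method can be sketched as two nested loops. The function names and the acceptance threshold `max_sq_dist` are hypothetical; the tiny 2D "descriptors" only illustrate the two steps of the goal (closest patch, then accept if similar enough).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def match_brute_force(left, right, max_sq_dist):
    """For each left descriptor find the closest right one (N*N tests)."""
    matches = []
    for i, d_left in enumerate(left):
        best_j, best_dist = None, float("inf")
        for j, d_right in enumerate(right):
            dist = squared_distance(d_left, d_right)
            if dist < best_dist:
                best_j, best_dist = j, dist
        # Step 2 of the goal: accept only if the descriptors are similar enough.
        if best_j is not None and best_dist <= max_sq_dist:
            matches.append((i, best_j))
    return matches

left = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
right = [[1.1, 0.9], [0.1, 0.0], [9.0, 9.0]]
matches = match_brute_force(left, right, max_sq_dist=0.5)
```

The third left descriptor has a closest candidate, but it is too far away and is rejected by the threshold. For N = 1000 the inner test runs 1000 * 1000 = 1,000,000 times, compared with roughly 1000 * log2(1000) ≈ 10,000 for the Kd-tree.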
[Figure: Hashing. A hashing function maps a patch to an index into the patch DB]
Subtask: Find one patch
[Figure: query patch and database image]
Nearest Neighbor Search
• Tracking in video: the video is the database
• Image retrieval: the whole image has one descriptor; the database is an image collection
Kd-tree (d stands for dimension)
• Build the tree over database image:
1) Cycle over dimensions: x,y,z,x,y,z,….
2) Put in axis-aligned hyper-planes (split at median of point set)
• Result: balanced tree
• Nearest Neighbour search (to come)
Example in 3D
[Invented by Jon Louis Bentley 1975]
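The two build steps (cycle over dimensions, split at the median) can be sketched recursively. The nested-dict representation, the function names and the sample points are assumptions of this sketch; each leaf holds a single point.

```python
def build_kdtree(points, depth=0):
    """Build a Kd-tree as nested dicts; each leaf holds a single point."""
    if not points:
        return None
    if len(points) == 1:
        return {"point": points[0]}
    axis = depth % len(points[0])             # 1) cycle over dimensions
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                   # 2) split at the median
    return {
        "axis": axis,
        "threshold": ordered[mid][axis],
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid:], depth + 1),
    }

def tree_depth(node):
    """Number of splits on the longest path from this node to a leaf."""
    if node is None or "point" in node:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

# Eight hypothetical 3D points.
points = [(2.0, 3.0, 1.0), (5.0, 4.0, 7.0), (9.0, 6.0, 2.0), (4.0, 7.0, 9.0),
          (8.0, 1.0, 5.0), (7.0, 2.0, 6.0), (1.0, 9.0, 3.0), (6.0, 8.0, 4.0)]
tree = build_kdtree(points)
depth = tree_depth(tree)
```

Because every split halves the point set, eight points give a tree with three levels of splits, which is the balanced result named above.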
Examples
[Figure sequence: six points, numbered 1 to 6, in 2D (Dimension 1 from 0 to 10, Dimension 2 from 0 to 8), shown next to the Kd-tree as it is built]
• Split on d1>5: {1,2,3} and {4,5,6}
• Split on d2>4.8: {1} and {2,3}; split on d2>6.5: {5} and {4,6}
• Split on d1>1: {2} and {3}; split on d1>5.5: {4} and {6}
Examples: nearest neighbor search
[Figure sequence: six points in 2D and their Kd-tree (splits d1>5, d2>4.8, d2>6.5, d1>1, d1>5.5) with a query point; the figures mark visited nodes, the current best and the search radius]
• “Find leaf where query is in”
• “Radius not intersected”
• “Radius intersects. Go down this subtree and find in all touching pockets (here one) the closest point. Take it as new candidate if closer than current one”
• “Better solution found”
• “No need to visit these subtrees”
• Done, since the root node is marked in both ways
Nearest neighbour search – pseudo code
Input: query point
Step 1: Find the leaf node (bucket) that contains the query point
Step 2: Make a hyper-sphere around the query point with radius equal to the distance between the current best and the query point
Step 3: Go up the tree and check whether the hyper-plane intersects the hyper-sphere
Step 3a: No intersection: mark the tree branch as visited, since no better point can be found there
Step 3b: Intersection: if not yet marked as visited, go down the other branch to find a potentially better point (in all “touching” pockets). If one is found, mark it as current best and go to Step 2; otherwise mark the branch as visited
Stop criterion: the root node has both branches marked as visited
[Figure: query, current best and hyper-plane positions; beyond a hyper-plane that is not intersected nothing better is possible]
On average O(log N)
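A recursive variant of the search can be sketched as follows. It is not a literal transcription of the steps: instead of explicit "visited" marks, the unwinding of the recursion plays the role of going up the tree, and the far branch is skipped whenever the hyper-sphere does not reach the splitting hyper-plane. The nested-dict tree, the function names and the random test points are assumptions of this sketch.

```python
import math
import random

def build_kdtree(points, depth=0):
    """Median-split Kd-tree as nested dicts; each leaf holds one point."""
    if not points:
        return None
    if len(points) == 1:
        return {"point": points[0]}
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {"axis": axis, "threshold": ordered[mid][axis],
            "left": build_kdtree(ordered[:mid], depth + 1),
            "right": build_kdtree(ordered[mid:], depth + 1)}

def nearest(node, query, best=None):
    """Return (distance, point) of the nearest neighbor of query."""
    if node is None:
        return best
    if "point" in node:                       # leaf (bucket with one point)
        dist = math.dist(node["point"], query)
        return (dist, node["point"]) if best is None or dist < best[0] else best
    diff = query[node["axis"]] - node["threshold"]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, best)         # descend towards the query's leaf
    # The hyper-sphere around the query has radius best[0]. Only if it
    # reaches the splitting hyper-plane can the far side hold a closer point.
    if best is None or abs(diff) <= best[0]:
        best = nearest(far, query, best)
    return best

random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
tree = build_kdtree(points)
query = (0.3, 0.7)
kd_dist, kd_point = nearest(tree, query)
# Reference result from an exhaustive scan over all points.
brute_dist = min(math.dist(p, query) for p in points)
```

The exhaustive scan is only there as a check: both searches return the same distance, but the tree search skips every subtree whose hyper-plane lies outside the current search radius.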
Example with many points in 2D
[Figure sequence: animation frames for many points in 2D, ending with “done”]
Roadmap: matching 2 Images (appearance & geometry)
• Find interest points (including different scales)
• Find orientated patches around interest points to capture appearance
• Encode patch in a descriptor
• Find matching patches according to appearance (similar descriptors)
• Verify matching patches according to geometry (later lecture)
Halfway
3 minute break. Questions?
Roadmap for the next five lectures
• Appearance based matching (sec. 4.1)
• How do we get an RGB image? (sec 2.2-2.3)
• The Human eye
• The Camera and Image formation model
• Projective Geometry - Basics (sec. 2.1.1-2.1.4)
• Geometry of a single camera (sec 2.1.5, 2.1.6)
• Pinhole camera
• Lens effects
• Geometry of two cameras (sec. 7.2)
• Robust Geometry estimation for two cameras (sec. 6.1.4)
The Human eye
• The retina contains different types of sensors:
cones “Zapfen” (colors, 6 million) and rods “Stäbchen” (gray levels, 120 million)
• The resolution is much higher in fovea centralis
• Light first passes through the layer of neurons before it reaches the photo sensors (smoothing). Only in the “fovea centralis” does the light hit the photo sensors directly
• Signal goes out the other way:
Retina → Ganglion cells (1 million) → Optic nerve → 1 MPixel camera?
Spatial resolution
[Figures: 2MP camera, far from the screen; 5MP camera, close to the screen]
Spatial resolution (secrets)
• The resolution is much higher in the fovea centralis
• The information is pre-processed by ganglion cells
(Compare: 3072×2304 = 7 MPixel, 2.4 MB RGB JPEG lossless)
• No still image, but a “video” (super-resolution)
• Scanning technique: saccades
Eye Saccades
Eyes never move uniformly, but jump in saccades
(approximately 15-100 ms duration between fixation points)
Saccades are driven by the “importance” of the scene parts
The Human eye
What is light?
Spectrum, i.e. a function of the wavelength
Spectral resolution of the eye is relatively poor due to projection
Different colors can be computed by adding/subtracting the signals of the cones (“Zapfen”)
Roadmap for the next five lectures
• Appearance based matching (sec. 4.1)
• How do we get an RGB image? (sec 2.2-2.3)
• The Human eye
• The Camera and Image formation model
• Projective Geometry - Basics (sec. 2.1.1-2.1.4)
• Geometry of a single camera (sec 2.1.5, 2.1.6)
• Pinhole camera
• Lens effects
• Geometry of two cameras (sec. 7.2)
• Robust Geometry estimation for two cameras (sec. 6.1.4)
Image formation Process
Model of the Surface
BRDF (Bidirectional Reflectance Distribution Function)
Rendering equation, term by term:
• Outgoing energy (𝒕: time, 𝒙: position)
• Emitted energy (shining surface)
• BRDF function (4 DoF: only 𝜔𝑖, 𝜔𝑜 considered)
• Incoming light
• Foreshortening angle (always ≥ 0)
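These terms correspond to the rendering equation in its commonly used form. The symbols L_o, L_e, L_i, Ω, λ (wavelength) and 𝒏 (surface normal) are notation assumed here, not taken from the slide:

```latex
L_o(\mathbf{x}, \omega_o, \lambda, t)
  = L_e(\mathbf{x}, \omega_o, \lambda, t)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o, \lambda, t)\,
    L_i(\mathbf{x}, \omega_i, \lambda, t)\,
    \max(0,\, \omega_i \cdot \mathbf{n})\; d\omega_i
```

Here L_o is the outgoing energy, L_e the energy emitted by a shining surface, f_r the BRDF, L_i the incoming light, and max(0, ω_i · n) the foreshortening term, which is never negative.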
[Figure: incoming direction 𝜔𝑖 and outgoing direction 𝜔𝑜]
Example: diffuse reflectance
DIFFUSE: 𝑓𝑟 is the same for all 𝜔𝑜
Single light source: shading effects come from max(0, 𝜔𝑖 · 𝒏)
[Rendering equation with annotations: the terms are marked “0” or “constant” (one 𝜔𝑖)]
(also known as the “Lambertian world assumption”)
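Under the Lambertian assumption with a single light source, shading reduces to a constant times max(0, 𝜔𝑖 · 𝒏). A minimal sketch follows; the BRDF value albedo/π is an assumption of this sketch (a common energy-conserving choice), and the function names and parameters are hypothetical.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambertian_shading(normal, light_dir, albedo=1.0, light_intensity=1.0):
    """Outgoing radiance of a diffuse surface lit by one distant light source.

    The BRDF is the constant albedo / pi (assumption); shading comes only
    from max(0, w_i . n). The outgoing direction does not appear at all.
    """
    n = normalize(normal)
    w_i = normalize(light_dir)
    cos_term = max(0.0, sum(a * b for a, b in zip(w_i, n)))
    return (albedo / math.pi) * light_intensity * cos_term

facing = lambertian_shading((0, 0, 1), (0, 0, 1))    # light along the normal
tilted = lambertian_shading((0, 0, 1), (0, 1, 1))    # light at 45 degrees
behind = lambertian_shading((0, 0, 1), (0, 0, -1))   # light from below
```

The tilted case is darker than the facing case by a factor of cos(45°), and light arriving from behind the surface contributes nothing.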
Examples: BRDFs
[Figure panels: DIFFUSE; SPECULAR; DIFFUSE + ROUGH SPECULAR; each panel shows 𝑓𝑟 for one 𝜔𝑖 and all 𝜔𝑜]
[Rendering equation (single light source) with annotations “0” and “constant” (one 𝜔𝑖)]
Inside a graphics engine
• Rendering equation
• Ray tracing
• Radiosity
Example
[Figure panels: texture only; 3D scene in Blender]
Capturing BRDF / BTF
[Christopher Schwartz, Ralf Sarlette, Michael Weinmann and Reinhard Klein]
BTF – Bidirectional Texture function. At every spatial location 𝒙 a different BRDF
Camera types
• RGB cameras
• Depth cameras:
• Passive RGB stereo
• Active Structured Light stereo
• Active Time of Flight
• Lightfield cameras
Digital RGB cameras
• A digital camera replaces film with a sensor array
• Each cell in the sensor array is a light-sensitive diode that converts photons to electrons
• Two common types:
• Charge Coupled Device (CCD)
• Complementary Metal Oxide Semiconductor (CMOS)
Color - Filters
Problem with de-mosaicing: color moiré
Cause of color moiré
There is a lot of image processing in camera
Image taken
zoom
In-camera processing
Depth camera: Time of Flight
[Figure panels: intensity image; depth image]
PMD Camera
Sends out infra-red light and looks at changes in phase and magnitude
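How a phase change translates into depth can be sketched for a continuous-wave time-of-flight model. The model itself and the 20 MHz modulation frequency are assumptions of this sketch, not specifications of the PMD camera: the light travels to the scene and back, so a phase shift φ at modulation frequency f corresponds to a depth of c · φ / (4π · f).

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def phase_to_depth(phase_shift, modulation_hz):
    """Depth in metres from the phase shift (radians) of the returned signal."""
    return SPEED_OF_LIGHT * phase_shift / (4.0 * math.pi * modulation_hz)

def unambiguous_range(modulation_hz):
    """Largest depth before the phase wraps around at 2*pi."""
    return SPEED_OF_LIGHT / (2.0 * modulation_hz)

depth = phase_to_depth(math.pi / 2.0, 20e6)   # quarter-cycle shift at 20 MHz
max_depth = unambiguous_range(20e6)
```

A quarter-cycle shift corresponds to a quarter of the unambiguous range; beyond that range the phase wraps and depths become ambiguous.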
Depth Camera – passive RGB stereo
[Figure panels: easy to match; hard to match]
Depth Camera – Structured Light