(1)

Computer Vision I -

Algorithms and Applications:

Basics of Digital Image Processing – Part 2

Carsten Rother

Computer Vision I: Basics of Image Processing 06/11/2013

(2)

Roadmap: Basics of Digital Image Processing

• Images

• Point operators (ch. 3.1)

• Filtering: (ch. 3.2, ch. 3.3, ch. 3.4) – main focus

• Linear filtering

• Non-linear filtering

• Fourier Transformation (ch. 3.4)

• Multi-scale image representation (ch. 3.5)

• Edge detection and linking (ch. 4.2)

• Line detection (ch. 4.3)

• Interest Point detection (ch. 4.1.1)

• Using multiple Images: Define Challenges

(3)

Reminder: Convolution

• Replace each pixel by a linear combination of its neighbours and itself.

• 2D convolution (discrete) 𝑔 = 𝑓 ∗ ℎ

$$g(x, y) = \sum_{k,l} f(x - k, y - l)\, h(k, l) = \sum_{k,l} f(k, l)\, h(x - k, y - l)$$

[Figure: input f(x, y), kernel h(x, y) centred at (0, 0), output g(x, y)]
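The convolution sum above can be sketched directly in NumPy. This is a minimal illustration, not an optimised implementation: the function name `convolve2d`, the zero padding at the border, and the 3x3 box kernel are assumptions made for the example.

```python
import numpy as np

def convolve2d(f, h):
    """Discrete 2D convolution g = f * h, zero padded, same size as f.

    h must have odd height and width so that it can be centred at (0, 0).
    """
    kh, kw = h.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(f.astype(float), ((ph, ph), (pw, pw)))
    flipped = h[::-1, ::-1]  # convolution (unlike correlation) flips the kernel
    g = np.zeros(f.shape, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            window = padded[dy:dy + f.shape[0], dx:dx + f.shape[1]]
            g += flipped[dy, dx] * window
    return g

# A single bright pixel filtered with a 3x3 box kernel is spread over 9 pixels.
image = np.zeros((5, 5))
image[2, 2] = 9.0
box = np.full((3, 3), 1.0 / 9.0)
blurred = convolve2d(image, box)

# Convolving a unit impulse with any kernel reproduces the kernel itself.
impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
kernel = np.arange(9, dtype=float).reshape(3, 3)
response = convolve2d(impulse, kernel)
```

The impulse check is a quick way to confirm that the kernel is flipped as the definition requires.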

(4)

Reminder – Linear Filters

(5)

Roadmap: Basics of Digital Image Processing

• Images

(6)

Gaussian Image Pyramid

• Represent the image at multiple resolutions

[Figure: image pyramid, from high resolution to low resolution]

(7)

A naive approach

[From the book: Computer Vision: A Modern Approach, Forsyth and Ponce]

Take every second pixel – bad!

(8)

Problem: Aliasing Effect

Problem: High frequencies (sharp transitions) cannot be represented after subsampling and reappear as false low-frequency patterns

(9)

Solution: Smooth before downsampling
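A small sketch of the difference between the naive approach and smoothing before downsampling. The helper names (`gaussian_kernel_1d`, `smooth`, `gaussian_pyramid`), the choice sigma = 1 and the border handling (edge replication) are assumptions for illustration only.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing; the border is handled by edge replication."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def gaussian_pyramid(image, levels, sigma=1.0):
    """Level 0 is the input; each further level is smoothed, then halved."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(smooth(pyramid[-1], sigma)[::2, ::2])
    return pyramid

# Checkerboard with a period of 2 pixels: the highest frequency an image can hold.
image = (np.indices((16, 16)).sum(axis=0) % 2).astype(float)

naive = image[::2, ::2]               # every second pixel: only black pixels survive
pyramid = gaussian_pyramid(image, levels=3)
```

Taking every second pixel turns the checkerboard into a uniformly black image, whereas the smoothed level is close to the mean gray value 0.5 away from the border.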

(10)

Application: template search

Search template:

(11)

Application: Large Image Segmentation

Banded Segmentation

Small image (100x100)

Small segmentation result

Large image, e.g. 10 MPixel

Trimap: created from small image

Segmentation large image

(12)

Application: Large Label Space

Approach:

1. Solve the problem on a small image (5× downscale) with a 40 × 40 label space (coarse motion; 1600 labels)

2. Solve on full resolution only in a 5 × 5 neighbourhood around each coarse solution (add fine motion; 25 labels)

Color coding 2 images

(overlaid)

Motion

200 × 200 possible discrete movements (40,000 labels)

(problem: small objects)
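The label counts on this slide can be checked with a few lines of arithmetic. The variable names and the particular displacement range (-100 to 99 pixels per axis) are assumptions chosen to match the 200 × 200 label space.

```python
full_range = 200                          # 200 x 200 possible discrete movements
downscale = 5
coarse_range = full_range // downscale    # 40 coarse displacements per axis
fine_range = 5                            # 5 x 5 refinement neighbourhood

labels_direct = full_range ** 2           # solving directly at full resolution
labels_coarse = coarse_range ** 2         # step 1, on the small image
labels_fine = fine_range ** 2             # step 2, per pixel at full resolution

# Every full-resolution displacement along one axis can be written as
# 5 * (coarse displacement) + (fine offset in -2..2), so nothing is lost.
reachable = {downscale * c + r
             for c in range(-coarse_range // 2, coarse_range // 2)
             for r in range(-(fine_range // 2), fine_range // 2 + 1)}
```

The two-stage scheme evaluates 1600 labels on the small image and 25 labels per full-resolution pixel instead of 40,000.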

(13)

Roadmap: Basics of Digital Image Processing

• Images

(14)

Goal: Find long edge chains

What do we want:

• Good detection: we want to find edges, not noise

• Good localization: find the true edge position

• Single response: one response per edge (independent of edge sharpness)

• Long edge chains

(15)

Idealized edge types

We focus on this

(16)

What are edges?

• Edges correspond to fast changes in the image

• The magnitude of the derivative is large

Image of 2 step edges

Slice through the image

Image of 2 ramp edges

Slice through the image

(17)

What are fast changes in the image?

Image

Scanline 250

Scanline 250 smoothed with Gaussian

Texture or many edges?

Edges defined after smoothing

(18)

Edges and Derivatives

We will look at this first

(19)

Edge filters in 1D

We can implement this as a linear filter:

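Forward differences: $\frac{\partial f}{\partial x} \approx f(x+1) - f(x)$, mask $[-1 \;\; 1]$

Central differences: $\frac{\partial f}{\partial x} \approx \frac{1}{2}\big(f(x+1) - f(x-1)\big)$, mask $\frac{1}{2}\,[-1 \;\; 0 \;\; 1]$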
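Both difference schemes are one-liners on an array. The sketch below uses hypothetical helper names and also shows the equivalent call to `np.convolve`; note that a true convolution flips the kernel, so the central-difference mask has to be passed as `[1, 0, -1]`.

```python
import numpy as np

def forward_difference(f):
    """f(x+1) - f(x); the output is one sample shorter than the input."""
    f = np.asarray(f, dtype=float)
    return f[1:] - f[:-1]

def central_difference(f):
    """(f(x+1) - f(x-1)) / 2 for the interior samples."""
    f = np.asarray(f, dtype=float)
    return 0.5 * (f[2:] - f[:-2])

step = np.array([0, 0, 0, 1, 1, 1], dtype=float)   # ideal step edge
d_forward = forward_difference(step)
d_central = central_difference(step)

# Same result as a linear filter (convolution flips the kernel).
d_central_conv = 0.5 * np.convolve(step, [1, 0, -1], mode="valid")

# On a parabola f(x) = x^2 the central difference gives exactly 2x.
x = np.arange(6, dtype=float)
d_parabola = central_difference(x**2)
```

The central difference is symmetric around the pixel it is evaluated at, while the forward difference is effectively centred half a pixel to the right.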

(20)

Reminder: Separable Filters

This is the central differences operator

(21)

Edge Filter in 1D: Example

Based on 1st derivative

• Smooth with Gaussian – to filter out noise

• Calculate derivative

• Find its extrema
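The three steps above can be sketched for a noisy 1D step edge as follows. The function names, sigma = 2, the threshold of 0.05 and the noise level are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def detect_edges_1d(signal, sigma=2.0, threshold=0.05):
    """Return indices where the smoothed derivative has a strong local extremum."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(np.asarray(signal, dtype=float), r, mode="edge")
    smoothed = np.convolve(padded, k, mode="valid")          # 1. smooth
    deriv = np.zeros_like(smoothed)
    deriv[1:-1] = 0.5 * (smoothed[2:] - smoothed[:-2])        # 2. derivative
    mag = np.abs(deriv)
    is_peak = ((mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
               & (mag[1:-1] > threshold))                     # 3. extrema
    return np.flatnonzero(is_peak) + 1

rng = np.random.default_rng(0)
signal = np.zeros(100)
signal[50:] = 1.0                                # step between samples 49 and 50
signal += rng.normal(0.0, 0.02, size=100)        # mild noise
edges = detect_edges_1d(signal)
```

Without the smoothing step the derivative of the noise alone would produce many spurious extrema; with it, a single response remains next to the step.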

(22)

Edge Filtering in 1D

Simplification:

(saves one operation)

Derivative of Gaussian
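The formula on this slide did not survive extraction. The simplification presumably refers to the standard identity below, which holds because differentiation and convolution are both linear and shift-invariant; G denotes the Gaussian and f the signal.

```latex
\begin{align}
\frac{d}{dx}\,(G * f) &= \Big(\frac{dG}{dx}\Big) * f, \\
G(x) &= \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2 / (2\sigma^2)}, \qquad
\frac{dG}{dx}(x) = -\frac{x}{\sigma^2}\, G(x)
\end{align}
```

Instead of smoothing first and differentiating afterwards, the signal is filtered once with the precomputed derivative-of-Gaussian kernel.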

(23)

Edge Filtering in 2D

(24)

Edge Filter in 2D: Example

(25)

Edge Filter in 2D

x-derivatives with different Gaussian smoothing

(26)

What is a gradient

(27)

What is a gradient

(28)

What is a gradient

(29)

What is a Gradient

(30)

Example – Gradient magnitude

Our goal was to get thin edge chains?

[Figure: gradient magnitude, first smoothed with a Gaussian (left) and first smoothed with a broad Gaussian (right)]

(31)

How to get edge chains

1. Compute robust gradient image: $\big((D_x * G) * I,\ (D_y * G) * I\big)$

2. Find edge-points (“edgels”): non-maximum suppression

3. Link up edge-points to get chains

4. Do hysteresis to clean up chains

Rough Outline of a good edge detector (such as Canny)

edge-points or edgel
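Step 1 of the outline can be sketched with separable derivative-of-Gaussian filters. The function names, sigma = 1 and the edge-replicating border are assumptions for this example; the sampled kernel `dg` plays the role of D ∗ G.

```python
import numpy as np

def gaussian_and_derivative(sigma):
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    g /= g.sum()
    dg = -x / sigma**2 * g            # sampled derivative of the Gaussian
    return g, dg

def filter_separable(image, k_rows, k_cols):
    """Convolve every row with k_rows, then every column with k_cols."""
    r = len(k_rows) // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k_rows, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k_cols, mode="valid"), 0, tmp)

def gradient_image(image, sigma=1.0):
    g, dg = gaussian_and_derivative(sigma)
    ix = filter_separable(image, dg, g)   # derivative along x, smoothing along y
    iy = filter_separable(image, g, dg)   # smoothing along x, derivative along y
    magnitude = np.hypot(ix, iy)
    orientation = np.arctan2(iy, ix)      # radians, 0 means "pointing along +x"
    return ix, iy, magnitude, orientation

# Vertical step edge: dark on the left, bright on the right.
image = np.zeros((16, 16))
image[:, 8:] = 1.0
ix, iy, magnitude, orientation = gradient_image(image)
```

For this test image the gradient points along the x axis and its magnitude peaks at the two columns next to the step.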

(32)

Non-maximum suppression

1. Check if pixel is local maximum at gradient orientation (interpolate values for p,r)

2. Accept edge-point if above a threshold

Which pixel is an edge point? (non-max suppression)

Edge point Not Edge point
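A simplified sketch of the two steps above. Instead of interpolating the values p and r along the exact gradient direction, as the slide suggests, this version quantises the direction to 0, 45, 90 or 135 degrees and reads p and r from the nearest neighbouring pixels. The function name and the tie-breaking rule are assumptions.

```python
import numpy as np

def non_maximum_suppression(magnitude, orientation, threshold):
    """Mark pixels that are local maxima along the (quantised) gradient direction.

    orientation is in radians as returned by arctan2(iy, ix), with y the row
    index and x the column index.
    """
    h, w = magnitude.shape
    edge = np.zeros((h, w), dtype=bool)
    # Neighbour offsets (dy, dx) for the directions 0, 45, 90 and 135 degrees.
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1)]
    sector = np.round(orientation / (np.pi / 4)).astype(int) % 4
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dy, dx = offsets[sector[y, x]]
            m = magnitude[y, x]
            p = magnitude[y + dy, x + dx]
            r = magnitude[y - dy, x - dx]
            # Strict on one side so that a two-pixel plateau yields one response.
            edge[y, x] = bool(m > threshold and m > p and m >= r)
    return edge

# Synthetic ridge: a strong response in column 3 with weaker flanks.
magnitude = np.zeros((7, 7))
magnitude[:, 2] = 0.5
magnitude[:, 3] = 1.0
magnitude[:, 4] = 0.5
orientation = np.zeros((7, 7))            # gradient points along +x everywhere
edge = non_maximum_suppression(magnitude, orientation, threshold=0.1)
```

Only the centre column of the ridge survives, even though its flanks are above the threshold as well. Border pixels are skipped because they lack a neighbour on one side.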

(33)

Link up edge-points to get chains

1. Link-up neighbouring pixels if both are edge-points.

Edge point Not Edge point

(34)

Clean up chains with Hysteresis

High start threshold

Low threshold along the chain

Keep a chain:
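The exact rule on this slide did not survive extraction. A common formulation of hysteresis, used here as an assumption, keeps every edge point that is connected to a point above the high threshold through points above the low threshold. The function name and the 8-neighbourhood are also choices made for this sketch.

```python
import numpy as np
from collections import deque

def hysteresis(magnitude, edge_points, low, high):
    """Grow chains from strong seeds through weaker, connected edge points."""
    candidates = edge_points & (magnitude > low)
    keep = candidates & (magnitude > high)          # seeds: high start threshold
    queue = deque(zip(*np.nonzero(keep)))
    h, w = magnitude.shape
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and candidates[ny, nx] and not keep[ny, nx]):
                    keep[ny, nx] = True             # low threshold along the chain
                    queue.append((ny, nx))
    return keep

magnitude = np.zeros((5, 10))
magnitude[1, 1:5] = [0.3, 0.9, 0.3, 0.3]   # chain A: contains one strong point
magnitude[3, 5:9] = 0.3                    # chain B: weak everywhere
edge_points = magnitude > 0
kept = hysteresis(magnitude, edge_points, low=0.2, high=0.8)
```

Chain A is kept completely, including its weak points, while chain B is dropped although its points have the same strength as the weak part of chain A.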

(35)

Final Result

[Figure panels: image; not much smoothing (fine scale); much smoothing (coarse scale), small threshold; much smoothing (coarse scale), large threshold]

(36)

Alternative: Edge detection with Laplacian Filter

Called Laplacian of Gaussian (LoG)

(37)

Laplacian example in 1D

Find zero-crossing
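A 1D sketch of the zero-crossing idea: the signal is filtered with the second derivative of a Gaussian, and an edge is reported where the response changes sign. The helper names, sigma = 2 and the `min_slope` guard against near-zero responses in flat regions are assumptions for this example.

```python
import numpy as np

def log_kernel_1d(sigma):
    """Sampled second derivative of a 1D Gaussian (1D Laplacian of Gaussian)."""
    radius = int(4 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return (x**2 / sigma**4 - 1.0 / sigma**2) * g

def zero_crossings(values, min_slope=0.0):
    """Indices i where values change sign between i and i + 1."""
    v = np.asarray(values, dtype=float)
    sign_change = v[:-1] * v[1:] < 0
    strong = np.abs(v[:-1] - v[1:]) > min_slope
    return np.flatnonzero(sign_change & strong)

signal = np.zeros(100)
signal[50:] = 1.0                                   # step between samples 49 and 50

k = log_kernel_1d(sigma=2.0)
padded = np.pad(signal, len(k) // 2, mode="edge")
response = np.convolve(padded, k, mode="valid")
crossings = zero_crossings(response, min_slope=1e-3)
```

The response is positive just before the step and negative just after it, so the zero-crossing localises the edge between samples 49 and 50.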

(38)

Approximate LoG with Difference of Gaussian (DoG)

Solid line: Difference of Gaussian (DoG)

Dashed line: Laplacian of Gaussian
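The quality of the approximation can be checked numerically in 1D. The sketch relies on the known relation G(kσ) − G(σ) ≈ (k − 1) σ² ∂²G/∂x² for k close to 1; the values σ = 2 and k = 1.1 and all names are assumptions for the example.

```python
import numpy as np

def gaussian(x, sigma):
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

x = np.linspace(-10.0, 10.0, 401)
sigma, k = 2.0, 1.1

dog = gaussian(x, k * sigma) - gaussian(x, sigma)              # Difference of Gaussian
log = (x**2 / sigma**4 - 1.0 / sigma**2) * gaussian(x, sigma)  # second derivative of G
scaled_log = (k - 1.0) * sigma**2 * log

correlation = np.corrcoef(dog, scaled_log)[0, 1]
```

Both curves are negative at the centre and have the same shape up to a small change of scale, which is why the DoG is a cheap stand-in for the LoG.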

(39)

Laplacian example in 2D

(40)

Future Lecture: Segmentation

• So far we looked at “jumps” in gray-scale images

• Humans perceive edges very differently (edges depend on semantics)

“average human drawing”

Hard for a computer vision method that operates only locally!

(41)

Try to learn semantically meaningful image edges

• Features: brightness gradient; color gradient; texture gradient; weighted combination, etc.

(42)

Roadmap: Basics of Digital Image Processing

• Images

(43)

Hough voting with edgels (edge-points)

Hough Space

(line is now a point)

Algorithm:

1. Empty all cells in Hough space

2. For each edgel, put its (𝜃, 𝑟) into a cell of the Hough space

3. Find peaks in Hough space (use non-max suppression)

4. Re-fit all edgels that voted for a peak to a single line
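Steps 1 and 2 of the algorithm, plus a bare-bones version of step 3 that only takes the global maximum, can be sketched as follows. The line parametrisation r = x cos θ + y sin θ with θ the gradient direction, the bin counts, and the name `hough_edgels` are assumptions; step 4 (re-fitting) is omitted.

```python
import numpy as np

def hough_edgels(edgels, image_diagonal, n_theta=180, n_r=200):
    """Each edgel (x, y, theta) casts exactly one vote in (theta, r) space."""
    accumulator = np.zeros((n_theta, n_r), dtype=int)       # 1. empty cells
    for x, y, theta in edgels:                              # 2. one vote per edgel
        theta = theta % np.pi        # (theta, r) and (theta + pi, -r) are one line
        r = x * np.cos(theta) + y * np.sin(theta)
        ti = int(theta / np.pi * n_theta) % n_theta
        ri = int((r + image_diagonal) / (2.0 * image_diagonal) * n_r)
        ri = min(max(ri, 0), n_r - 1)
        accumulator[ti, ri] += 1
    return accumulator

# Three edgels on the vertical line x = 5 (gradient along x) and one outlier.
edgels = [(5, 1, 0.0), (5, 4, 0.0), (5, 9, 0.0), (2, 2, np.pi / 2)]
accumulator = hough_edgels(edgels, image_diagonal=20)

# 3. Simplest possible peak search: the global maximum.
peak = tuple(int(i) for i in np.unravel_index(np.argmax(accumulator),
                                              accumulator.shape))
```

Because each edgel carries a direction, it votes for a single cell, which keeps the accumulator sparse compared with voting for all lines through a point.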

Hough transform

3 edgels

(edge-points with direction)

(44)

Hough Voting: original

Goal: find all points with many “votes” in accumulator space

Hough transform

Goal: find all lines

Image with just 3 points

All lines that go through the 3 points

This idea of transformation to a voting space can be used for many scenarios

Image with points

(45)

Hough transform: original

[From Wikipedia]

All possible lines at each point
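In the original Hough transform a point has no direction, so it votes for every line passing through it, which traces a sinusoid in (θ, r) space. A minimal sketch, with the bin counts, the parametrisation r = x cos θ + y sin θ and the name `hough_points` as assumptions:

```python
import numpy as np

def hough_points(points, image_diagonal, n_theta=180, n_r=200):
    """Each point (x, y) votes once per theta bin, for the line through it."""
    thetas = np.arange(n_theta) * np.pi / n_theta
    accumulator = np.zeros((n_theta, n_r), dtype=int)
    rows = np.arange(n_theta)
    for x, y in points:
        r = x * np.cos(thetas) + y * np.sin(thetas)       # one sinusoid per point
        ri = np.floor((r + image_diagonal) / (2.0 * image_diagonal) * n_r).astype(int)
        ri = np.clip(ri, 0, n_r - 1)
        accumulator[rows, ri] += 1
    return accumulator, thetas

# Three points on the vertical line x = 5.
points = [(5, 1), (5, 4), (5, 9)]
accumulator, thetas = hough_points(points, image_diagonal=20)
```

The three sinusoids intersect in the cell for θ = 0, r = 5, which therefore collects three votes. Neighbouring θ bins may collect as many votes at this coarse resolution, which is one reason peak finding uses non-maximum suppression.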

(46)

Example: Orthogonal Vanishing point detection

846 line segments found

Found 3 orthogonal vanishing points

Algorithm: RANSAC (explained later)

Application: Camera Calibration (see later)

(47)

Roadmap: Basics of Digital Image Processing

• Images

(48)

What region should we try to match?

Look for a region that is unique, i.e. not ambiguous

We want to find a few regions where this image pair matches: Applications later

(49)

Goal: Interest Point Detection

• Goal: predict a few “interest points” in order to remove redundant data efficiently

• Should be invariant against:

a. Geometric transformation – scaling, rotation, translation, affine transformation, projective transformation etc.

b. Color transformation – additive (lighting change), multiplicative (contrast), linear (both), monotone etc.

c. Discretization (e.g. spatial resolution, focus)

(sketch)

(50)

Points versus Lines

“Aperture problem”

Lines are not as good as points

?

(51)

Harris Detector

[Szeliski and Seitz]

Local measure of feature uniqueness:

Shifting the window in any direction: how does it change

Shift left Shift top,left

(52)

Harris Detector

How similar is the image to itself?

Autocorrelation function:

The sum runs over a small vicinity (window) around (𝑥, 𝑦)

𝑤(𝑢, 𝑣) is a convolution kernel, used to decrease the influence of pixels far from (𝑥, 𝑦), e.g. the Gaussian

For simplicity we use 𝑤(𝑢, 𝑣) = 1

[Figure: window around (𝑥, 𝑦), shifted by (Δ𝑥, Δ𝑦)]

(53)

Harris Detector

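One is interested in properties of c at each position (x, y). Let us look at a linear approximation of I(u + Δx, v + Δy).

Taylor expansion around (u, v):

$$I(u + \Delta x, v + \Delta y) = I(u, v) + \big(I_x(u, v),\ I_y(u, v)\big) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} + \epsilon(\Delta x, \Delta y)$$

where (I_x(u, v), I_y(u, v)) is the gradient at (𝑢, 𝑣)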

(54)

Harris Detector

Put it together:

with

Q: Structure Tensor

We compute this at any image location (𝑥, 𝑦)
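The formulas of the last three slides were images and are missing from this text. The standard derivation they appear to follow is reproduced here as a reference sketch; the window symbol W(x, y) is an assumed name, and w(u, v) = 1 is used from the second line on, as stated above.

```latex
\begin{align}
c(x, y, \Delta x, \Delta y)
  &= \sum_{(u,v) \in W(x,y)} w(u,v)\,\big(I(u,v) - I(u + \Delta x, v + \Delta y)\big)^2 \\
  &\approx \sum_{(u,v) \in W(x,y)} \big(I_x(u,v)\,\Delta x + I_y(u,v)\,\Delta y\big)^2
   = (\Delta x,\ \Delta y)\; Q(x, y) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}, \\
Q(x, y) &= \sum_{(u,v) \in W(x,y)}
  \begin{pmatrix} I_x(u,v)^2 & I_x(u,v)\, I_y(u,v) \\ I_x(u,v)\, I_y(u,v) & I_y(u,v)^2 \end{pmatrix}
\end{align}
```

The second line follows by inserting the first-order Taylor expansion of I(u + Δx, v + Δy) and dropping the remainder term.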

(55)

Harris Detector

The autocorrelation function

Function c is (after approximation) a quadratic function in Δ𝑥 and Δ𝑦

• Isolines are ellipses (Q is symmetric and positive semi-definite);

• Eigenvector 𝑥₁ with (larger) Eigenvalue 𝜆₁ is the direction of fastest change in function 𝑐

• Eigenvector 𝑥₂ with (smaller) Eigenvalue 𝜆₂ is direction of slowest change in function 𝑐

[Figure: function c over (Δ𝑥, Δ𝑦), with the eigenvector directions 𝑥₁ and 𝑥₂]

Note 𝑐 = 0 for Δ𝑥 = Δy = 0

(56)

Harris Detector

Some examples – isolines for c:

(a) Flat (b) Edges (c) Corners

a. Homogeneous regions: both 𝜆's are small

b. Edges: one 𝜆 is small, the other one is large

c. Corners: both 𝜆's are large (this is what we are looking for!)

(57)

Harris Detector

Zoom in

[Figure panels: larger eigenvalue 𝜆₁; smaller eigenvalue 𝜆₂]

Image

(58)

Harris detector

“Cornerness” is a characteristic of Q

Proposition by Harris:

Downweights edges where 𝜆₁ ≫ 𝜆₂
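The formula proposed by Harris is missing from the extracted text. The sketch below uses the commonly cited form H = det(Q) − α · trace(Q)² = 𝜆₁𝜆₂ − α(𝜆₁ + 𝜆₂)², which avoids computing the eigenvalues explicitly. The value α = 0.05, the 5 × 5 window with w(u, v) = 1, the unsmoothed central-difference gradients and the function names are assumptions for this example.

```python
import numpy as np

def box_sum(a, radius):
    """Sum over a (2r+1) x (2r+1) window, i.e. w(u, v) = 1, with zero padding."""
    k = np.ones(2 * radius + 1)
    padded = np.pad(a, radius)
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, tmp)

def harris_score(image, radius=2, alpha=0.05):
    image = image.astype(float)
    iy, ix = np.gradient(image)              # axis 0 is y (rows), axis 1 is x
    # Entries of the structure tensor Q at every pixel.
    qxx = box_sum(ix * ix, radius)
    qxy = box_sum(ix * iy, radius)
    qyy = box_sum(iy * iy, radius)
    det = qxx * qyy - qxy**2                 # lambda_1 * lambda_2
    trace = qxx + qyy                        # lambda_1 + lambda_2
    return det - alpha * trace**2

# Bright square in the lower right part of the image: one corner near (8, 8).
image = np.zeros((20, 20))
image[8:, 8:] = 1.0
score = harris_score(image)
peak = np.unravel_index(np.argmax(score), score.shape)
```

The score is positive near the corner, negative along the straight edges (one large and one small eigenvalue), and zero in the flat region, which matches the three cases listed earlier.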

(59)

Harris Corners - example

(60)

H-score (red: high, blue: low)

(61)

Threshold (H-score > value)

(62)

Non-maximum suppression

(63)

Harris corners in red

(64)

Other examples

(65)

Maximally stable extremal regions

• Invariant to affine transformation of gray-values

• Both small and large structures are detected

(66)

Literature

There is a large body of literature on detectors and descriptors (later lecture)

A comparison paper

(e.g. which is the most robust corner detector):

K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir: A Comparison of Affine Region Detectors (IJCV 2006)

(67)

Roadmap: Basics of Digital Image Processing

• Images

(68)

Difference between Appearance and Geometry

Does this look like a correct match?

Image 1 Image 2 Image 1 Image 2

Does this look like a correct match?

Appearance based matching: Neighbourhood looks quite similar (descriptor)

Geometry based matching:

1) Assume a sensible camera model. We will see that 4 matching points on a surface define how other points on the surface will match

(69)

The 3D case

Illustration of the general 3D case

Appearance based matching:

Geometry based matching:

1) Assume a sensible camera model. We will see that 7 matching 3D points define how other 3D points match.

(70)

Sparse versus Dense Matching: Tasks and Applications

Tasks:

• Find places where we could match features (points, lines, regions, etc)

• Extract appearance: feature descriptors

• Find all possible (putative) appearance matches between images

• Verify with geometry

For what applications is sparse matching enough:

• Sparse 3D reconstruction of a rigid scene

• Panoramic stitching of a rotating / translating camera

(71)

Building Rome on a cloudless day

[Frahm et al., ECCV '10]

The old city of Dubrovnik

(72)

Sparse versus Dense Matching

3D view interpolation

Kinect RGB and Depth data input

Dense flow: frame 1->2

Dense flow: frame 2->1

Flow encoding

(73)

Sparse versus Dense Matching: Tasks and Applications

Tasks (all in one):

• Find for each pixel the 2D/3D/6D displacement (using both appearance and geometry)

• Find points which are occluded

For what applications is dense matching needed:

• Dense reconstruction of rigid scene

• 3D reconstruction of a non-rigid scene

[Goesele et al. ICCV 07]

(74)

Using multiple Images: Define Challenges

A road map for the next five lectures

• L4: Geometry of a Single Camera and Image Formation Process

• L5: Sparse Matching two images: Appearance

• L6: Sparse Matching two images: Geometry

• L7: Sparse reconstruction of the world (Geometry of n-views)

• L8: Dense Geometry estimation

(stereo, flow and scene flow, registration)

(75)

Outlook – matching 2 Images (appearance & geometry)

• Find interest points (including different scales)

• Find oriented patches around interest points to capture appearance

• Encode patch in a descriptor

• Find matching patches according to appearance (similar descriptors)

• Verify matching patches according to geometry

(76)

Reading for next class

This lecture:

• Chapter 3.5: multi-scale representation

• Chapter 4.2 and 4.3 - Edge and Line detection

• Chapter 4.1.1: Interest Point Detection

Next lecture:

• Chapter 2 (in particular: 2.1, 2.2) – Image formation process

• And a bit of Hartley and Zisserman – chapter 2