(1)

Computer Vision I - Algorithms and Applications: Image Formation Process – Part 2

Carsten Rother

17/11/2013

(2)

Roadmap this lecture

• Geometric primitives and transformations (sec. 2.1.1-2.1.4)

• Geometric image formation (sec 2.1.5, 2.1.6)

• Pinhole camera

• Lens effects

• Photometric image formation process (sec 2.2)

• The Human eye

• Camera Types and Hardware (sec 2.3)

• Appearance based matching

(3)

Reminder: Pinhole Camera (Geometry)

• Camera matrix P has 11 DoF

• Intrinsic parameters

• Principal point coordinates (𝑝_𝑥, 𝑝_𝑦)

• Focal length 𝑓

• Pixel magnification factors 𝑚

• Skew (non-rectangular pixels) 𝑠

• Extrinsic parameters

• Rotation 𝑹 (3DoF) and translation 𝐂 (3DoF) relative to world coordinate system

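$$\mathbf{x} = \mathbf{K}\,\mathbf{R}\,(\mathbf{I}_{3\times 3} \mid -\tilde{\mathbf{C}})\,\mathbf{X}$$

$$\mathbf{K} = \begin{pmatrix} f & s & p_x \\ 0 & mf & p_y \\ 0 & 0 & 1 \end{pmatrix}$$

$$\mathbf{x} = \mathbf{P}\,\mathbf{X}$$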
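The projection above can be sketched in a few lines of Python. This is only an illustration under assumed values: the function names `camera_matrix` and `project` and all numbers (focal length 500, principal point (320, 240), identity rotation, camera centre at (0, 0, -2)) are made up for the example and do not come from the lecture.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def camera_matrix(f, m, s, px, py, R, C):
    """Build P = K R (I | -C) for a camera centred at C (inhomogeneous)."""
    K = [[f, s, px], [0.0, m * f, py], [0.0, 0.0, 1.0]]
    I_minus_C = [[1.0, 0.0, 0.0, -C[0]],
                 [0.0, 1.0, 0.0, -C[1]],
                 [0.0, 0.0, 1.0, -C[2]]]
    return matmul(matmul(K, R), I_minus_C)

def project(P, X):
    """Project an inhomogeneous 3D point X to pixel coordinates."""
    Xh = [[X[0]], [X[1]], [X[2]], [1.0]]   # homogeneous column vector
    x = matmul(P, Xh)
    w = x[2][0]                            # divide by the third coordinate
    return (x[0][0] / w, x[1][0] / w)

# Assumed example values (not from the slides).
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P = camera_matrix(f=500.0, m=1.0, s=0.0, px=320.0, py=240.0,
                  R=R_identity, C=(0.0, 0.0, -2.0))
pixel = project(P, (0.5, -0.25, 3.0))
print(pixel)
```

P is a 3×4 matrix defined only up to scale, which is where the 11 DoF come from (12 entries minus one overall scale).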

(4)

Reminder: Lens effects

(6)

Image formation Process

(7)

Model of the Surface

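BRDF (Bidirectional Reflectance Distribution Function)

Rendering equation:

$$L_o(\mathbf{x}, \omega_o, \lambda, t) = L_e(\mathbf{x}, \omega_o, \lambda, t) + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o, \lambda, t)\, L_i(\mathbf{x}, \omega_i, \lambda, t)\, (\omega_i \cdot \mathbf{n})\, d\omega_i$$

• 𝐿_𝑜: outgoing energy

• 𝐿_𝑒: outgoing energy of the surface itself (shining surface)

• 𝑓_𝑟: BRDF function (4 DoF if only 𝜔_𝑖, 𝜔_𝑜 are considered)

• 𝐿_𝑖: incoming light

• (𝜔_𝑖 · 𝒏): foreshortening term (always ≥ 0)

• 𝑡: time, 𝒙: position, 𝜆: wavelength, 𝜔_𝑖 / 𝜔_𝑜: incoming / outgoing direction

[Figure: light hitting a surface point 𝒙, with incoming direction 𝜔_𝑖 and outgoing direction 𝜔_𝑜]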

(8)

Example: diffuse illumination

DIFFUSE: 𝑓_𝑟 is the same for all 𝜔_𝑜

Single light source:

Shading effects come from: max(0, 𝜔_𝑖 · 𝒏)

[Figure: rendering equation with its terms annotated "0", "constant" and "constant (fixed 𝜔_𝑖)"]
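A minimal sketch of diffuse shading under a single distant light, assuming that all constant factors of the rendering equation are folded into two illustrative parameters (`albedo` and `light_intensity`, both hypothetical names). Only the foreshortening term max(0, 𝜔_𝑖 · 𝒏) then varies over the surface.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def diffuse_shading(normal, light_dir, albedo=1.0, light_intensity=1.0):
    """Brightness of a diffuse surface point under one distant light.

    albedo and light_intensity are assumed constants standing in for the
    constant BRDF and the constant incoming light.
    """
    n = normalize(normal)
    w_i = normalize(light_dir)  # direction from the surface towards the light
    foreshortening = max(0.0, sum(a * b for a, b in zip(w_i, n)))
    return albedo * light_intensity * foreshortening

facing = diffuse_shading((0, 0, 1), (0, 0, 1))   # light along the normal
tilted = diffuse_shading((0, 0, 1), (0, 1, 1))   # light at 45 degrees
behind = diffuse_shading((0, 0, 1), (0, 0, -1))  # light behind the surface
print(facing, tilted, behind)
```

The result does not depend on the viewing direction 𝜔_𝑜, which is exactly the "same for all 𝜔_𝑜" property of a diffuse surface.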

(9)

Examples: BRDFs

[Figure: 𝑓_𝑟 and rendered result for three cases: DIFFUSE, SPECULAR, DIFFUSE + ROUGH SPECULAR; single light source (top, right)]

[Figure: rendering equation (single light source) with its terms annotated "0" and "constant (fixed 𝜔_𝑖)"]

(10)

Inside a graphics engine

• Rendering equation

• Ray tracing

• Radiosity

(11)

Example

Texture only

Direct light Global (direct + indirect) light

3D scene in blender

(12)

How to model light?

• Point light source

• Area light source

• Non-local illumination situation can be approximated with spherical harmonics

First 9 spherical harmonics (gray scale)

They can produce this

(13)

Capturing BRDF / BTF

BTF – Bidirectional Texture Function: a different BRDF at every spatial location 𝒙

(14)

Fabricating BRDFs

[Levin et al. Siggraph 2013]

(15)

Single Image

[Barron and Malik 2012]

(16)

Single Image … far from being solved

[Barron and Malik 2012]

(17)

Reconstruction and recognition

[Vineet, Rother, Torr, NIPS ’13]

(19)

The Human eye

• The retina contains different types of sensors: cones (“Zapfen”; colors, 6 million) and rods (“Staebchen”; gray levels, 120 million)

• The resolution is much higher in the fovea centralis

• Light first passes through the layer of neurons before it reaches the photo sensors (smoothing). Only in the fovea centralis does the light hit the photo sensors directly

• The signal goes out the other way: Retina → Ganglion cells (1 million) → Optic nerve → 1 MPixel camera?

(20)

Spatial resolution

(21)

Spatial resolution

[Image: 2 MP camera, far from the screen]

(22)

Spatial resolution

[Image: 5 MP camera, close to the screen]

(23)

Spatial resolution (secrets)

• The resolution is much higher in the fovea centralis

• The information is pre-processed by ganglion cells (compare: 3072×2304 = 7 MPixel, 2.4 MB RGB JPEG lossless)

• No still image, but a “video” (super-resolution)

• Scanning technique: saccades

(24)

Eye Saccades

Eyes never move uniformly, but jump in saccades (approximately 15-100 ms duration between fixation points)

Saccades are driven by the “importance” of the scene parts (eyes, mouth, etc.)

(25)

The Human eye

What is light?

Spectrum, i.e. a function of the wavelength

The spectral resolution of the eye is relatively poor, because the spectrum is projected onto the responses of only a few sensor types

Different colors can be computed by adding/subtracting the signals of the cones (“Zapfen”)

(27)

Camera types

• RGB cameras

• Depth cameras:

• Passive RGB stereo

• Active Structured light

• Active Time of Flight

• Lightfield cameras

(28)

Digital RGB cameras

• A digital camera replaces film with a sensor array

• Each cell in the sensor array is a light-sensitive diode that converts photons to electrons

• Two common types:

• Charge-coupled device (CCD)

• Complementary metal-oxide-semiconductor (CMOS)

(29)

CCD versus CMOS

• CCD: moves the charge from pixel to pixel and then converts it to a voltage and a digital signal

• Negative: more expensive

• CMOS: converts the charge to a voltage inside each pixel

• Negative: slower to read out the image (“rolling shutter” effect)

(30)

Color - Filters

(31)

Problem with de-mosaicing: color moiré

(32)

Cause of color moiré

(33)

General problems with RGB cameras

• Noise

• low light is where you most notice noise

• light sensitivity (ISO) / noise tradeoff

• stuck pixels

• Resolution: Are more megapixels better?

• requires higher quality lens

• noise issues

• In-camera processing

• oversharpening can produce halos

• RAW vs. compressed

• file size vs. quality tradeoff

• Blooming

• charge overflowing into neighboring pixels

• More info online:

• http://electronics.howstuffworks.com/

• http://www.dpreview.com/

• http://www.dxomark.com/

(34)

In-camera processing

(35)

Historical context of cameras

(37)

Passive Depth Camera – RGB stereo

[Figure: stereo image pair with regions that are easy to match and regions that are hard to match]

(38)

Active Depth Camera – structured light

IR projector

IR camera

(39)

Depth Camera – structured light

• We have to perform matching (as with a passive stereo camera), but now we have a very textured image

[Figure: reference image from the IR projector]

• Only works well when the external IR light is not too strong (not under sunlight)

(40)

Depth Camera – structured light

(41)

Halfway

3 Minutes Break

Question?

(42)

Depth camera – time of flight

[Figure: PMD camera with its two outputs, an intensity image and a depth image]

(43)

Principle Time of Flight

Modulated Light Source

Sensor Pixel

Cam Lens

[slide credits: Rahul Nair, Daniel Kondermann]

(49)

Principle Time of Flight

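Measure:

• Phase shift ∆Φ

• Amplitude and offset

Output:

• Depth image (from the phase shift); the modulation wavelength determines the range of depth values (around 3.5 meters)

• Intensity image (from amplitude and offset)

The depth is ambiguous: one ∆Φ means d = 0.3 m or 3.8 m or 7.3 m or … (resolve this by taking differently modulated frequencies)

[Figure: signal at the sensor pixel with phase shift ∆Φ, offset and amplitude marked; one modulation period corresponds to 3.5 meters]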
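The phase-to-depth relation and its ambiguity can be sketched as follows. This assumes continuous-wave modulation, where a phase shift ∆Φ at modulation frequency f corresponds to the depth d = c·∆Φ / (4π·f), repeating every c / (2f). The function names and the 8 m cut-off are illustrative choices; the frequency is picked so that the range is the 3.5 m quoted on the slide.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def unambiguous_range(mod_freq_hz):
    """Depth interval after which the measured phase wraps around."""
    return SPEED_OF_LIGHT / (2.0 * mod_freq_hz)

def depth_candidates(phase_shift, mod_freq_hz, max_depth):
    """All depths up to max_depth that produce the same phase shift."""
    rng = unambiguous_range(mod_freq_hz)
    base = phase_shift / (2.0 * math.pi) * rng   # depth within the first period
    candidates = []
    k = 0
    while base + k * rng <= max_depth:
        candidates.append(base + k * rng)
        k += 1
    return candidates

# Assumed setup: frequency chosen so that the range is 3.5 m (about 42.8 MHz).
freq = SPEED_OF_LIGHT / (2.0 * 3.5)
phase = 2.0 * math.pi * 0.3 / 3.5    # phase shift caused by a target at 0.3 m
candidates = depth_candidates(phase, freq, max_depth=8.0)
print(candidates)
```

A second measurement at a different modulation frequency yields a different candidate list; the depth that appears in both lists is the true one.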

(50)

Depth camera – time of flight

One of the biggest problems is multi-path

(51)

Lightfield cameras

Capture all light:

Refocus and change perspective with one image

http://www.lytro.com/camera/

(53)

Roadmap: matching 2 Images (appearance & geometry)

• Find interest points

• Find orientated patches around interest points to capture appearance

• Encode patch appearance in a descriptor

• Find matching patches according to appearance (similar descriptors)

• Verify matching patches according to geometry

(55)

Reminder Lecture 3: Harris Corner Detector

Auto-correlation function:

Harris measure:
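As a reminder in code form, here is a minimal sketch of a Harris measure. It assumes the common formulation R = det(A) − k·trace(A)², where A sums the products of the image gradients over a window; the value k = 0.04, the central-difference gradients and the use of the whole small image as one unweighted window are illustrative assumptions, not necessarily what Lecture 3 used.

```python
def harris_response(img, k=0.04):
    """Harris measure of a small image, treated as a single window."""
    h, w = len(img), len(img[0])
    sxx = sxy = syy = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (img[y][x + 1] - img[y][x - 1]) / 2.0   # horizontal gradient
            iy = (img[y + 1][x] - img[y - 1][x]) / 2.0   # vertical gradient
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

flat = [[1.0] * 7 for _ in range(7)]
edge = [[0.0 if x < 3 else 1.0 for x in range(7)] for _ in range(7)]
corner = [[1.0 if (x >= 3 and y >= 3) else 0.0 for x in range(7)]
          for y in range(7)]

r_flat = harris_response(flat)      # no gradients at all
r_edge = harris_response(edge)      # gradients in one direction only
r_corner = harris_response(corner)  # gradients in both directions
print(r_flat, r_edge, r_corner)
```

The sign pattern is the useful part: zero on flat regions, negative on edges, positive at corners.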

(57)

How to deal with orientation

Orientate with image gradient:
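One simple way to derive an orientation from the image gradient is to sum the gradient vectors over the patch and take the angle of the result. The sketch below assumes exactly that; it is not the only option (SIFT, for instance, takes the peak of a gradient orientation histogram instead), and the function name is hypothetical.

```python
import math

def dominant_orientation(patch):
    """Angle in radians of the summed gradient vector over a patch."""
    gx_sum = gy_sum = 0.0
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx_sum += (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy_sum += (patch[y + 1][x] - patch[y - 1][x]) / 2.0
    return math.atan2(gy_sum, gx_sum)

# Two synthetic patches: brightness increasing along x, and along y.
ramp_x = [[float(x) for x in range(5)] for _ in range(5)]
ramp_y = [[float(y) for _ in range(5)] for y in range(5)]
angle_x = dominant_orientation(ramp_x)
angle_y = dominant_orientation(ramp_y)
print(angle_x, angle_y)
```

Rotating each patch by the negative of this angle before computing a descriptor makes the descriptor independent of in-plane rotation.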

(58)

Choose a patch around each point

How to deal with scale?

(61)

Scale selection (illustration)

$$\nabla^2 G(\sigma) * I = \frac{\partial^2 (G(\sigma) * I)}{\partial x^2} + \frac{\partial^2 (G(\sigma) * I)}{\partial y^2}$$

𝒇 is the Laplacian of Gaussian (LoG) operator. It measures an average edge-ness in all directions.

(65)

Scale selection (illustration)

We could match-up these curves and find unique corresponding points

(66)

Scale selection (illustration)

Simpler: Find maxima /minima in image
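Scale selection by searching for an extremum of the LoG response can be sketched in 1D. The example assumes a scale-normalised response (the second derivative of a Gaussian multiplied by σ²), which is a common choice but is not stated on the slides; the box-shaped test signal and the candidate scales are made up for illustration.

```python
import math

def normalized_log_1d(signal, center, sigma):
    """Scale-normalised second derivative of a Gaussian, applied at `center`."""
    total = 0.0
    for i, value in enumerate(signal):
        x = float(i - center)
        g = math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        g2 = (x * x / sigma ** 4 - 1.0 / sigma ** 2) * g   # second derivative of g
        total += value * sigma ** 2 * g2                   # sigma^2 normalisation
    return total

def select_scale(signal, center, sigmas):
    """Return the scale with the strongest (positive or negative) response."""
    return max(sigmas, key=lambda s: abs(normalized_log_1d(signal, center, s)))

# A bright 1D "blob" of 17 samples centred at index 100.
signal = [1.0 if 92 <= i <= 108 else 0.0 for i in range(201)]
sigmas = [float(s) for s in range(1, 21)]
best_sigma = select_scale(signal, 100, sigmas)
print(best_sigma)
```

The selected σ comes out close to the half-width of the blob, so the same structure seen at twice the size would select roughly twice the σ.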

(67)

Extensions: general affine transformations

(69)

SIFT feature

[Lowe 2004]

• 4*4=16 cells

• Each cell has an 8 bin histogram (smoothed across cells)

• In total: 16*8 values, i.e. 128D vector

[Figure: patch of 64 × 64 pixels; a cell has 16×16 pixels (8×8 shown for illustration only); the blue circle shows the center weighting]
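The cell and histogram layout can be sketched as below. This is a simplified, SIFT-like descriptor and not Lowe's full method: it assumes hard binning and omits the center weighting, the smoothing across cells and the alignment to a dominant orientation. The function name is hypothetical.

```python
import math

def sift_like_descriptor(patch, cells=4, bins=8):
    """Concatenate per-cell gradient orientation histograms of a square patch."""
    size = len(patch)
    cell_size = size // cells
    hist = [0.0] * (cells * cells * bins)
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            b = min(int(angle / (2.0 * math.pi) * bins), bins - 1)
            cell = (y // cell_size) * cells + (x // cell_size)
            hist[cell * bins + b] += magnitude   # vote weighted by gradient strength
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Synthetic 64 x 64 patch whose brightness increases from left to right.
patch = [[float(x) for x in range(64)] for _ in range(64)]
descriptor = sift_like_descriptor(patch)
print(len(descriptor))
```

Normalising the vector to unit length removes the effect of a global contrast change, which is one reason such descriptors tolerate photometric changes.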

(70)

SIFT feature is very popular

• Fast to compute

• Can handle large changes in viewpoint well (up to 60° out-of-plane rotation)

• Can handle photometric changes (even day versus night)

(71)

Many other feature descriptors

• MOPS [Brown, Szeliski and Winder 2005]

• SURF [Herbert Bay et al. 2006]

• DAISY [Tola, Lepetit, Fua 2010]

• Shape Context

• …

[Figure: DAISY descriptor]

(73)

Appearance-based matching

(74)

Appearance-based matching

[Figure: left and right image, each with N patches (e.g. N = 1000)]

Goal:

1) For each patch in the left image, find the closest patch in the right image

2) Accept all matches where the descriptors are similar enough

Methods:

• Naïve: 𝑁² tests (here 1 million)

• Hashing (locality-sensitive hashing)

• Kd-tree: on average N log N tests (here 10,000)

[Figure: hashing; a patch is mapped by a hashing function to an index into the patch database]
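The naïve method with its N² descriptor comparisons can be sketched as follows. The squared Euclidean distance and the threshold name `max_sq_dist` are assumptions made for the example, and the 2D descriptors are toy data.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_descriptors(left, right, max_sq_dist):
    """For each left descriptor find the closest right one; keep close pairs."""
    matches = []
    for i, d_left in enumerate(left):
        # Step 1: closest descriptor in the right image (N tests per patch).
        j_best = min(range(len(right)),
                     key=lambda j: squared_distance(d_left, right[j]))
        # Step 2: accept only if the descriptors are similar enough.
        if squared_distance(d_left, right[j_best]) <= max_sq_dist:
            matches.append((i, j_best))
    return matches

left = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
right = [(1.1, 0.9), (0.1, 0.0), (9.0, 9.0)]
matches = match_descriptors(left, right, max_sq_dist=0.05)
print(matches)
```

The third left descriptor has a nearest neighbour, but it is too far away and is therefore rejected by the threshold.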

(75)

Subtask: Search for one patch

?

Database image

Query patch

(76)

Nearest Neighbor Search

• Tracking in video: the video is the database

• Image retrieval: the whole image has one descriptor; the database is an image collection

(77)

Kd-tree (d stands for dimension)

• Build the tree over the database image:

1) Cycle over the dimensions: x, y, z, x, y, z, …

2) Insert axis-aligned hyper-planes (split at the median of the point set)

• Result: balanced tree

• Nearest neighbour search (to come)

Example in 3D

[Invented by Jon Louis Bentley 1975]
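The construction can be sketched recursively. Note one assumption: this variant stores the median point in every node, whereas the slides keep the points in leaf buckets; the splitting rule (cycle over the dimensions, split at the median) is the same. The dictionary layout and the sample points are illustrative.

```python
def build_kdtree(points, depth=0):
    """Recursively split `points` at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])            # x, y, x, y, ... in 2D
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                  # median along the current axis
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid + 1:], depth + 1),
    }

def tree_depth(node):
    """Number of levels in the tree."""
    if node is None:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

points = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build_kdtree(points)
print(tree["point"], tree_depth(tree))
```

Because every split halves the point set, six points need only three levels, which is what "balanced tree" means here.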

(78)

Examples

[Figure: six points, labelled 1 to 6, in a 2D plane]

(79)

Examples

[Figure: points 1 to 6 over the axes Dimension 1 (0 to 10) and Dimension 2 (0 to 8); the kd-tree root split d1>5 separates points 1,2,3 from points 4,5,6]

(80)

Examples

[Figure: second-level splits d2>4.8 and d2>6.5; the leaf groups are {1}, {2,3}, {5} and {4,6}]

(81)

Examples

[Figure: third-level splits d1>1 (separating 2 and 3) and d1>5.5 (separating 4 and 6); every leaf now holds a single point]

(82)

Examples: nearest neighbor search

[Figure: a query point is added to the plane and located in the kd-tree]

(83)

Examples: nearest neighbor search

[Figure: visited leaf, current best point and search radius around the query]

(84)

Examples: nearest neighbor search

[Figure: visited nodes, current best and search radius; “Radius not intersected”]

(85)

Examples: nearest neighbor search

[Figure: visited nodes, current best and search radius; “Radius intersects so go down the subtree”]

(86)

Examples: nearest neighbor search

[Figure: visited nodes, current best and search radius; “better solution found”]

(87)

Examples: nearest neighbor search

[Figure: visited nodes and current best; “no need to visit these subtrees”]

(88)

Examples: nearest neighbor search

[Figure: visited nodes and current best]

Done since the root node is marked in both ways

(89)

Nearest neighbour search – pseudo code

Input: query point

Step 1: Find the leaf node (bucket) that contains the query point.

Step 2: Make a hyper-sphere around the query point whose radius is the distance between the current best and the query point.

Step 3: Go up the tree and check whether the hyper-plane intersects the hyper-sphere.

Step 3a: No intersection: mark the tree branch as visited, since no better point can be found there. If the node is the root node, then stop.

Step 3b: Intersection: go down the branch to find a potentially better point. If one is found, mark it as the current best and go to Step 2.

[Figure: query, current best and hyper-plane positions; where the hyper-sphere does not reach a hyper-plane, nothing better is possible]

On average O(log N)
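The steps above can be sketched as a recursive search. This is one common way to write it, under stated assumptions: the recursion first descends to the region of the query (Step 1) and performs the intersection test of Step 3 while unwinding, and the tree stores a point in every node instead of using leaf buckets. The tuple layout and function names are illustrative.

```python
def build(points, depth=0):
    """Kd-tree as nested tuples: (point, axis, left subtree, right subtree)."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid], axis,
            build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    point, axis, left, right = node
    if best is None or sq_dist(query, point) < sq_dist(query, best):
        best = point                          # new current best
    diff = query[axis] - point[axis]          # signed distance to the hyper-plane
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)         # side that contains the query
    if diff * diff < sq_dist(query, best):    # hyper-sphere crosses the hyper-plane
        best = nearest(far, query, best)      # a closer point may lie on the far side
    return best

points = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build(points)
query = (9.0, 2.0)
result = nearest(tree, query)
brute = min(points, key=lambda p: sq_dist(p, query))  # reference answer
print(result)
```

Whenever the intersection test fails, the whole far subtree is skipped; this pruning is what gives the O(log N) average behaviour.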

(90)

Example with many points in 2D

From Andrew Moore: http://www.cs.cmu.edu/~awm/animations/kdtree/

(123)

Roadmap: matching 2 Images (appearance & geometry)

• Find interest points (including different scales)

• Find orientated patches around interest points to capture appearance

• Encode patch in a descriptor

• Find matching patches according to appearance (similar descriptors)

• Verify matching patches according to geometry

(124)

Reading for next class

This lecture:

• Photometric image formation (sec 2.2)

• Camera Types and Hardware (sec 2.3)

• Appearance matching: (sec. 4.1.2-4.1.3)

Next lecture:

• Two-view Geometry (Hartley and Zisserman)
