• Keine Ergebnisse gefunden

Probabilistic Fitting

N/A
N/A
Protected

Academic year: 2022

Aktie "Probabilistic Fitting"

Copied!
28
0
0

Wird geladen.... (Jetzt Volltext ansehen)

Volltext

(1)

Probabilistic Fitting

Marcel Lüthi, University of Basel


(2)

Analysis by Synthesis – Idea

Belief: Understanding an object means being able to synthesize it

[Diagram: parameters θ → synthesis φ(θ) → comparison with data → update θ]

(3)

Analysis by Synthesis – Modelling problem

Modelling problem: What are p(θ) and p(D | θ)?

[Diagram: prior p(θ) → parameters θ → synthesis φ(θ) → comparison p(D | θ) → update θ]

(4)

Analysis by Synthesis – Conceptual problem

[Diagram: prior p(θ) → parameters θ → synthesis φ(θ) → comparison p(D | θ) → update θ]

Conceptual problem: how should the belief about θ be updated after the comparison?

Updating beliefs through Bayesian inference

𝑝 πœƒ D = 𝑝 πœƒ 𝑝 D πœƒ

∫ 𝑝 πœƒ 𝑝 𝐷 πœƒ π‘‘πœƒ

(5)

Analysis by synthesis – Computational problem

p(θ | D) = p(θ) p(D | θ) / ∫ p(θ) p(D | θ) dθ

• The likelihood p(D | θ) is usually non-linear and expensive to evaluate
• The normalization constant is a high-dimensional integral:

∫ … ∫ p(θ_1, …, θ_n) p(D | θ_1, …, θ_n) dθ_1 … dθ_n

• It can only be approximated (e.g., a quadrature grid with 10 points per dimension already needs 10^n evaluations)

(6)

Outline

• Basic idea: Sampling methods and MCMC
• The Metropolis algorithm
• Implementing the Metropolis algorithm
• The Metropolis-Hastings algorithm
• Example: 3D Landmark fitting

(7)

Approximate Bayesian Inference

Variational methods
• Function approximation: arg min_q KL(q(θ) || p(θ|D))
  (KL: Kullback-Leibler divergence)

Sampling methods
• Numeric approximation through simulation

(8)

Sampling Methods

• Simulate a distribution p through random samples x_i
• Evaluate the expectation (of some function f of a random variable X):

E[f(X)] = ∫ f(x) p(x) dx

E[f(X)] ≈ f̂ = (1/N) Σ_{i=1}^N f(x_i),  x_i ~ p(x)

V[f̂] ~ O(1/N)

• "Independent" of the dimensionality of X
• More samples increase accuracy
• But drawing the samples x_i ~ p(x) is difficult!
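A minimal sketch of this estimator (assumption on my part: plain Scala, matching the Random.nextDouble()/Random.nextGaussian() calls quoted on the next slide). It estimates E[f(X)] for f(x) = x² with X ~ N(0, 1), where the true value is 1:

```scala
// Minimal sketch: Monte Carlo estimate of E[f(X)] by simulation.
// Assumptions: plain Scala, X ~ N(0,1), f(x) = x^2 (true expectation: 1).
import scala.util.Random

object MonteCarloEstimate extends App {
  val rng = new Random(42)
  val f: Double => Double = x => x * x

  def estimate(n: Int): Double = {
    val samples = Seq.fill(n)(rng.nextGaussian())  // x_i ~ p(x) = N(0, 1)
    samples.map(f).sum / n                         // (1/N) * sum_i f(x_i)
  }

  // The estimator's variance shrinks as O(1/N), independent of the
  // dimensionality of X: more samples increase accuracy.
  for (n <- Seq(100, 10000, 1000000))
    println(f"N = $n%7d  estimate = ${estimate(n)}%.4f")
}
```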

(9)

Sampling from a Distribution

• Easy for standard distributions … is it?
  • Uniform: Random.nextDouble()
  • Gaussian: Random.nextGaussian()
• How to sample from more complex distributions?
  • Beta, Exponential, Chi-square, Gamma, …
• Posteriors are very often not in a "nice" standard textbook form
• We need to sample from an unknown posterior with only unnormalized, expensive point-wise evaluation
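For some of these distributions, sampling still reduces to transforming Random.nextDouble(). A minimal sketch (my example, not from the slides): inverse-transform sampling for the Exponential distribution. This route needs a closed-form inverse CDF, which is exactly what an unnormalized posterior does not offer:

```scala
// Minimal sketch: inverse-transform sampling.
// If U ~ Uniform(0,1), then -ln(1 - U) / lambda ~ Exponential(lambda).
import scala.util.Random

object InverseTransform extends App {
  val rng = new Random(42)

  def nextExponential(lambda: Double): Double =
    -math.log(1.0 - rng.nextDouble()) / lambda

  val xs = Seq.fill(100000)(nextExponential(2.0))
  println(f"sample mean = ${xs.sum / xs.size}%.3f (expected 1/lambda = 0.5)")
}
```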

(10)

Markov Chain Monte Carlo

Markov Chain Monte Carlo Methods (MCMC)

Idea: Design a Markov chain such that its samples x obey the target distribution p

Concept: "Use an already existing sample to produce the next one"

• Many successful practical applications
• Proven: developed in the 1950s–1970s (Metropolis/Hastings)

β€’ Direct mapping of computing power to approximation accuracy

(11)

MCMC: An ingenious mathematical construction

[Concept map: An MCMC algorithm induces a Markov chain. If the Markov chain is aperiodic and irreducible, it converges to an equilibrium distribution, and that equilibrium distribution is the target distribution p(x), from which the algorithm generates samples.]

No need to understand this now: more details follow!

(12)

The Metropolis Algorithm

• Initialize with sample x
• Generate next sample, with current sample x:
  1. Draw a sample x′ from Q(x′|x) ("proposal")
  2. With probability α = min( P(x′)/P(x), 1 ) accept x′ as new state x
  3. Emit current state x as sample

Requirements:
• Proposal distribution Q(x′|x) – must generate samples, symmetric
• Target distribution P(x) – with point-wise evaluation

Result:
• Stream of samples approximately from P(x)
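A minimal sketch of the algorithm in plain Scala, ahead of the notebook on the next slide (assumptions on my part: a 1D target, a symmetric Gaussian random-walk proposal, and a made-up two-bump target; note that P only ever appears in the ratio P(x′)/P(x), so it need not be normalized):

```scala
// Minimal sketch: the Metropolis algorithm for a 1D unnormalized target.
import scala.util.Random

object Metropolis extends App {
  val rng = new Random(42)

  // Unnormalized target P(x): a made-up mixture of two Gaussian bumps.
  def targetP(x: Double): Double =
    math.exp(-0.5 * math.pow(x - 2, 2)) + 0.5 * math.exp(-0.5 * math.pow(x + 2, 2))

  val stepSize = 1.0  // std. dev. of the symmetric random-walk proposal Q(x'|x)

  // One transition: propose, then accept with probability min(P(x')/P(x), 1).
  def nextSample(x: Double): Double = {
    val xNew = x + stepSize * rng.nextGaussian()       // draw x' ~ Q(x'|x)
    val alpha = math.min(targetP(xNew) / targetP(x), 1.0)
    if (rng.nextDouble() < alpha) xNew else x          // on rejection, emit x again
  }

  // Stream of samples approximately from P(x).
  val samples = Iterator.iterate(0.0)(nextSample).drop(1000).take(10000).toSeq
  println(f"sample mean = ${samples.sum / samples.size}%.3f")
}
```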

(13)

Jupyter notebook – Metropolis-Hastings.ipynb

(14)

The Metropolis-Hastings Algorithm

• Initialize with sample x
• Generate next sample, with current sample x:
  1. Draw a sample x′ from Q(x′|x) ("proposal")
  2. With probability α = min( P(x′)/P(x) · Q(x|x′)/Q(x′|x), 1 ) accept x′ as new state x
  3. Emit current state x as sample

• Generalization of the Metropolis algorithm to asymmetric proposal distributions: Q(x′|x) ≠ Q(x|x′)
• Requirement: Q(x′|x) > 0 ⇔ Q(x|x′) > 0
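A minimal sketch of one Metropolis-Hastings transition with a genuinely asymmetric proposal (my example, not from the slides): a multiplicative log-normal random walk on a positive parameter, where the ratio Q(x|x′)/Q(x′|x) does not cancel and must be included:

```scala
// Minimal sketch: Metropolis-Hastings with an asymmetric proposal.
// Assumptions: target is an unnormalized Gamma(2,1) density on x > 0;
// proposal is x' = x * exp(s * eps), eps ~ N(0,1), i.e. log-normal around x.
import scala.util.Random

object MetropolisHastingsStep extends App {
  val rng = new Random(42)
  val s = 0.5  // scale of the multiplicative random walk

  def targetP(x: Double): Double = x * math.exp(-x)  // Gamma(2,1), unnormalized

  // log Q(xTo | xFrom), dropping constants shared by both directions.
  def logQ(xTo: Double, xFrom: Double): Double = {
    val z = (math.log(xTo) - math.log(xFrom)) / s
    -0.5 * z * z - math.log(xTo)  // change-of-variables term: 1/xTo
  }

  def next(x: Double): Double = {
    val xNew = x * math.exp(s * rng.nextGaussian())
    // alpha = min( P(x')/P(x) * Q(x|x')/Q(x'|x), 1 ), computed in log space
    val logAlpha = math.log(targetP(xNew) / targetP(x)) + logQ(x, xNew) - logQ(xNew, x)
    if (math.log(rng.nextDouble()) < logAlpha) xNew else x
  }

  val samples = Iterator.iterate(1.0)(next).drop(1000).take(20000).toSeq
  println(f"sample mean = ${samples.sum / samples.size}%.3f (Gamma(2,1) mean: 2)")
}
```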

(15)

Properties

• Approximation: samples x_1, x_2, … approximate P(x)
  • Unbiased but correlated (not i.i.d.)
• Normalization: P(x) does not need to be normalized
  • The algorithm only considers ratios P(x′)/P(x)
• Dependent proposals: Q(x′|x) depends on the current sample x
  • The algorithm adapts to the target with a simple 1-step memory

(16)

Metropolis-Hastings: Limitations

• Highly correlated targets
  • The proposal should match the target to avoid too many rejections
• Serial correlation
  • Results from rejections and too-small steps
  • Remedy: subsampling (see the sketch below)

Bishop. PRML, Springer, 2006
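A minimal sketch of subsampling (assumption: `chain` stands in for raw, serially correlated MCMC output): thinning keeps every k-th sample, trading sample count for lower serial correlation:

```scala
// Minimal sketch: thinning an MCMC chain by keeping every k-th sample.
object Thinning extends App {
  // Placeholder for a serially correlated chain; in practice this is MCMC output.
  val chain: Seq[Double] = Seq.tabulate(10000)(i => math.sin(i / 50.0))
  val k = 10
  val thinned = chain.grouped(k).map(_.head).toSeq  // keep samples 0, k, 2k, ...
  println(s"kept ${thinned.size} of ${chain.size} samples")
}
```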

(17)

Propose-and-Verify Algorithm

• The Metropolis algorithm formalizes propose-and-verify
• The steps are completely independent:

Propose: draw a sample x′ from Q(x′|x)

Verify: with probability α = min( P(x′)/P(x) · Q(x|x′)/Q(x′|x), 1 ) accept x′ as new sample
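A minimal sketch of this decoupling in code (the traits are hypothetical, not a specific library's API): the verify step only needs point-wise target evaluation and transition probabilities, so proposals can be swapped freely:

```scala
// Minimal sketch: propose-and-verify as two independent interfaces.
import scala.util.Random

trait ProposalGenerator[A] {
  def propose(current: A): A
  // log Q(from | to) - log Q(to | from); zero for symmetric proposals
  def logTransitionRatio(from: A, to: A): Double
}

trait DistributionEvaluator[A] {
  def logValue(sample: A): Double  // log of the unnormalized target P
}

class MetropolisHastings[A](q: ProposalGenerator[A], p: DistributionEvaluator[A], rng: Random) {
  // Verify: accept with probability min( P(x')/P(x) * Q(x|x')/Q(x'|x), 1 ).
  def next(x: A): A = {
    val xNew = q.propose(x)  // Propose: completely independent of the verify step
    val logAlpha = p.logValue(xNew) - p.logValue(x) + q.logTransitionRatio(x, xNew)
    if (math.log(rng.nextDouble()) < logAlpha) xNew else x
  }
}
```

Any more "informed" proposal (detected landmarks, an ICP or ASM update, a gradient step, as on the next slide) then just becomes another ProposalGenerator.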

(18)

MH as Propose and Verify

• Decouples the step of finding a solution from validating a solution
• Natural way to integrate uncertain proposals Q (e.g. automatically detected landmarks, …)
• Possibility to include "local optimization" (e.g. ICP or ASM updates, a gradient step, …) as a proposal

Anything more "informed" than a random walk should improve convergence.

(19)

Fitting 3D Landmarks

3D Alignment with Shape and Pose


(20)

3D Fitting Example

[Figure: target face with annotated landmarks right.eye.corner_outer, left.eye.corner_outer, right.lips.corner, left.lips.corner]

(21)

3D Fitting Setup

Observations
• Observed target positions l_1^T, …, l_n^T
• Corresponding reference points l_1^R, …, l_n^R

Parameters
θ = (α, φ, ψ, ϑ, t)

Posterior distribution:
P(θ | l_1^T, …, l_n^T) ∝ p(l_1^T, …, l_n^T | θ) P(θ)

Shape transformation
φ_S[α](x) = μ(x) + Σ_{i=1}^r α_i √λ_i Φ_i(x)

Rigid transformation
• 3 angles (pitch, yaw, roll): φ, ψ, ϑ
• Translation t = (t_x, t_y, t_z)
φ_R[φ, ψ, ϑ, t](x) = R_ϑ R_ψ R_φ x + t

Full transformation
φ_θ(x) = (φ_R ∘ φ_S)[θ](x)

Goal: Find the posterior distribution for arbitrary pose and shape
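A minimal sketch of the composed transformation φ_θ = (φ_R ∘ φ_S) (toy assumptions on my part: a single shape component, a yaw-only rotation, and made-up mean and basis functions; it only illustrates the composition order, shape first, then pose):

```scala
// Minimal sketch: full transformation phi_theta(x) = (phi_R o phi_S)[theta](x).
object FullTransformation extends App {
  type P = (Double, Double, Double)

  // Shape: phi_S[alpha](x) = mu(x) + alpha_1 * sqrt(lambda_1) * Phi_1(x)  (toy r = 1 model)
  val lambda1 = 2.0
  def mu(x: P): P = x                      // toy mean shape: identity
  def Phi1(x: P): P = (0.0, 1.0, 0.0)      // toy basis function
  def phiS(alpha1: Double)(x: P): P = {
    val (m, p) = (mu(x), Phi1(x))
    val s = alpha1 * math.sqrt(lambda1)
    (m._1 + s * p._1, m._2 + s * p._2, m._3 + s * p._3)
  }

  // Rigid: phi_R[psi, t](x) = R_psi x + t  (yaw about the y-axis only)
  def phiR(psi: Double, t: P)(x: P): P = {
    val (c, s) = (math.cos(psi), math.sin(psi))
    (c * x._1 + s * x._3 + t._1, x._2 + t._2, -s * x._1 + c * x._3 + t._3)
  }

  // Full transformation: apply the shape deformation first, then the pose.
  def phiTheta(alpha1: Double, psi: Double, t: P)(x: P): P =
    phiR(psi, t)(phiS(alpha1)(x))

  println(phiTheta(0.4, 0.5, (-1.0, 0.0, 0.0))((1.0, 0.0, 0.0)))
}
```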

(22)

Proposals

• Gaussian random walk proposals:
  Q(θ′|θ) = N(θ′ | θ, Σ_θ)
• Update different parameter types block-wise:
  • Shape: N(α′ | α, σ_S² I_{m×m})
  • Rotation: N(φ′ | φ, σ_φ²), N(ψ′ | ψ, σ_ψ²), N(ϑ′ | ϑ, σ_ϑ²)
  • Translation: N(t′ | t, σ_t² I_{3×3})
• Large mixture distributions as proposals (see the sketch below):
  • Choose proposal Q_i with probability c_i
  Q(θ′|θ) = Σ_i c_i Q_i(θ′|θ)

(23)

3DMM Landmarks Likelihood

Simple models: Independent Gaussians

Observation of L landmark locations l_i^T in the image

• Single landmark position model:
  p(l^T | θ, l^R) = N(φ_θ(l^R), σ² I_{3×3})
• Independent model (conditional independence):
  p(l_1^T, …, l_L^T | θ) = Π_{i=1}^L p_i(l_i^T | θ)
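A minimal sketch of this likelihood (assumptions on my part: points as plain tuples, σ in mm; evaluated in log space, since a product of many small densities underflows and the MH ratio only needs log differences anyway):

```scala
// Minimal sketch: independent Gaussian landmark likelihood in log space.
object LandmarkLikelihood extends App {
  type Point3D = (Double, Double, Double)

  // log N(observed | predicted, sigma^2 I_3x3)
  def logGaussian(observed: Point3D, predicted: Point3D, sigma: Double): Double = {
    val (dx, dy, dz) =
      (observed._1 - predicted._1, observed._2 - predicted._2, observed._3 - predicted._3)
    val d2 = dx * dx + dy * dy + dz * dz
    -0.5 * d2 / (sigma * sigma) - 3.0 * math.log(sigma) - 1.5 * math.log(2 * math.Pi)
  }

  // Conditional independence: the log-likelihood is a sum over landmarks,
  // with predicted_i = phi_theta(l_i^R).
  def logLikelihood(observed: Seq[Point3D], predicted: Seq[Point3D], sigma: Double): Double =
    observed.zip(predicted).map { case (o, p) => logGaussian(o, p, sigma) }.sum

  val obs  = Seq((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))   // made-up observed landmarks (mm)
  val pred = Seq((1.1, 0.0, 0.0), (0.0, 0.9, 0.1))   // made-up model predictions (mm)
  println(f"log-likelihood (sigma = 1 mm): ${logLikelihood(obs, pred, 1.0)}%.3f")
}
```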

(24)

3D Fit to landmarks

• What is the influence of landmark uncertainty on the final posterior?
  • σ_LM = 1 mm
  • σ_LM = 4 mm
  • σ_LM = 10 mm
• Only 4 landmark observations:
  • Expect only a weak impact on shape
  • Should still constrain the pose
  • More uncertain landmarks should give a looser fit

(25)

Posterior: Pose & Shape, 4mm

μ̂_yaw = 0.511,  σ̂_yaw = 0.073 (4°)
μ̂_tx = −1 mm,  σ̂_tx = 4 mm
μ̂_α1 = 0.4,  σ̂_α1 = 0.6

(Estimation from samples)

(26)

Posterior: Pose & Shape, 1mm

ΖΈπœ‡ yaw = 0.50

ො

𝜎 = 0.041 (2.4°)

ΖΈπœ‡ t

x

= βˆ’2 mm

ො

𝜎 = 0.8 mm

ΖΈπœ‡ 𝛼

1

= 1.5

ො

𝜎 = 0.35

(27)

Posterior: Pose & Shape, 10mm

μ̂_yaw = 0.49,  σ̂_yaw = 0.11 (7°)
μ̂_tx = −5 mm,  σ̂_tx = 10 mm
μ̂_α1 = 0,  σ̂_α1 = 0.6

(28)

Summary: MCMC for 3D Fitting

• Probabilistic inference for fitting probabilistic models
  • Bayesian inference: the posterior distribution
• Probabilistic inference is often intractable
  • Use approximate inference methods
• MCMC methods provide a powerful sampling framework
• Metropolis-Hastings algorithm
  • Propose an update step
  • Verify it and accept with probability α
• Samples converge to the true distribution: more about this later!
