(1)

How to deal with…

non-submodular and higher-order energies (Part 1)

Carsten Rother

27/06/2014 Machine Learning 2

(2)

Advertisement

Main Research Theme:

• Combining physics-based vision with machine learning: generative models meet discriminative models

Theoretical Side:

• Optimization and learning in discrete-domain models (CRFs, higher-order models, continuous label spaces, loss-based learning, etc.)

Application Side:

• Scene recovery from multiple images

• 3D Scene understanding

• Bio Imaging

(3)

State-of-the-art CRF models

Energy:

$E(\mathbf{y}, \mathbf{x}, \mathbf{w}) = \sum_F E_F(y_F, \mathbf{x}, w_F)$

Gibbs distribution:

$p(\mathbf{y} \mid \mathbf{x}, \mathbf{w}) = \frac{1}{Z(\mathbf{x}, \mathbf{w})} \, e^{-E(\mathbf{y}, \mathbf{x}, \mathbf{w})}$

[Figure: factor graph and compact factor-graph notation over the variables $y_i$.]

(4)

Deconvolution

Input: $x = K * y$    Output: $y$

Combine physics and machine learning:

1) Using physics: add a Gaussian "likelihood" term $(x - K * y)^2$

2) Put into a deep-learning approach: $x \rightarrow \text{RTF}_1 \rightarrow y^1 \rightarrow \text{RTF}_2 \rightarrow y^2 \rightarrow \dots$ (stacked RTFs)

[Schmidt, Rother, Nowozin, Jancsary, Roth, CVPR 2013]

(5)

Scene recovery from multiple images

[Figure: input of 2 RGBD images.]

(6)

Scene recovery from single images

(7)

BioImaging

Joint work with the Myers group (Dagmar, Florian, and others)

[Figure: atlas and instance segmentation.]

(8)

3D Scene Understanding

• Training time: 3D objects

• Test time:

(9)

Advertisement


• If you are excited about any of these topics … come to us for a "Forschungspraktikum" (research internship), master thesis, diploma thesis, etc.

• If you want to collaborate with top industry labs or universities … come to us. Examples:

• BMW, Adobe, Microsoft Research, Daimler, etc.

• Top universities: in Israel, Oxford, Heidelberg, etc.


(10)

Advertisement

Smart 3D point cloud processing:

- 3D fine-grained recognition: type of aircraft, vehicle, objects, …

- Tracking: 3D models with varying degrees of information

- Structured data: how to define a CRF/RTF?

- Combine physics-based vision (generative models) with machine learning

There is an opening for a master project / PhD student – if you are interested, talk to me after the lecture!

Joint project with the "Institut für Luftfahrt und Logistik" (Lidar scanner)

(11)

Reminder: Pairwise energies

$E(x) = \sum_{i \in V} \theta_i(x_i) + \sum_{(i,j) \in E} \theta_{ij}(x_i, x_j) + \theta_{const}$

$G = (V, E)$ undirected graph

For now, $x_i \in \{0,1\}$

Visualization of the full energy: unaries $\theta_i(0), \theta_i(1)$ and the pairwise table

              x_j = 0        x_j = 1
x_i = 0    θ_ij(0,0)      θ_ij(0,1)
x_i = 1    θ_ij(1,0)      θ_ij(1,1)

Submodular Condition:

$\theta_{ij}(0,0) + \theta_{ij}(1,1) \le \theta_{ij}(1,0) + \theta_{ij}(0,1)$

(a small code check of this condition is sketched below)

• If all terms are submodular then the global optimum can be computed in polynomial time with graph cut

• If not … this lecture

$\theta_{ij}(0,0)$ is also sometimes written as $\theta_{ij;00}$
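To make the submodular condition concrete, here is a minimal Python sketch (not from the slides) that checks it for every pairwise term of a binary energy; the dictionary-based representation is purely an illustrative assumption.

```python
# Minimal sketch: check the submodular condition
#   theta(0,0) + theta(1,1) <= theta(1,0) + theta(0,1)
# for every pairwise term of a binary energy.
# The representation (dict of 2x2 tables keyed by edge) is an assumption
# made for illustration, not the lecture's data structure.

def is_submodular(pairwise):
    """pairwise: dict mapping (i, j) -> {(0,0): c, (0,1): c, (1,0): c, (1,1): c}"""
    return all(
        t[(0, 0)] + t[(1, 1)] <= t[(1, 0)] + t[(0, 1)]
        for t in pairwise.values()
    )

if __name__ == "__main__":
    ising_like = {(0, 1): {(0, 0): 0.0, (1, 1): 0.0, (0, 1): 2.0, (1, 0): 2.0}}
    repulsive  = {(0, 1): {(0, 0): 2.0, (1, 1): 2.0, (0, 1): 0.0, (1, 0): 0.0}}
    print(is_submodular(ising_like))  # True  -> solvable exactly with graph cut
    print(is_submodular(repulsive))   # False -> needs QPBO & friends (this lecture)
```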

(12)

How often do we have submodular terms?

Label smoothness is often the natural condition: neighboring pixels have the same label more often than not. We may choose:

$\theta_{ij}(0,0) = \theta_{ij}(1,1) = 0; \quad \theta_{ij}(1,0) = \theta_{ij}(0,1) \ge 0$

which satisfies $\theta_{ij}(0,0) + \theta_{ij}(1,1) \le \theta_{ij}(1,0) + \theta_{ij}(0,1)$.

In alpha expansion (reminder later) the energy is often "naturally" submodular.

[Figure: stereo pair – left image (a), right image (b) – and the resulting labelling; plot of cost over $|x_i - x_j|$.]

(13)

Importance of good optimization

Problem: minimize a binary 4-connected, non-submodular energy (choose a colour mode at each pixel)

Input: image sequence    Output: new view

[Data courtesy of Oliver Woodford]

(14)

Importance of good optimization

[Figure: results of Ground Truth; Belief Propagation; ICM; Simulated Annealing; Graph Cut with truncation [Rother et al. '05]; QPBO [Hammer '84] (black = unknown); QPBOP [Boros '06, see Rother '07]; Global Minimum.]

(15)

Simplest idea to deal with non-submodular terms

• Truncate all non-submodular terms, i.e. those with

$\theta_{ij}(0,0) + \theta_{ij}(1,1) > \theta_{ij}(1,0) + \theta_{ij}(0,1)$

so that after truncation

$\theta_{ij}(0,0) - \delta + \theta_{ij}(1,1) - \delta = \theta_{ij}(1,0) + \delta + \theta_{ij}(0,1) + \delta$

with $\delta = \frac{1}{4}\left[\theta_{ij}(0,0) + \theta_{ij}(1,1) - \theta_{ij}(1,0) - \theta_{ij}(0,1)\right]$ (sketched in code below)

Better techniques to come…
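As a rough sketch (assuming the same dictionary representation as in the earlier example, which is not from the slides), the truncation rule can be written as:

```python
# Sketch of the truncation trick: shift each non-submodular 2x2 table by delta
# so that it becomes exactly modular (equality in the submodular condition).
# The table representation is an illustrative assumption.

def truncate(table):
    """table: {(0,0): c, (0,1): c, (1,0): c, (1,1): c} -> truncated copy."""
    gap = table[(0, 0)] + table[(1, 1)] - table[(1, 0)] - table[(0, 1)]
    if gap <= 0:              # already submodular, leave unchanged
        return dict(table)
    delta = gap / 4.0
    return {
        (0, 0): table[(0, 0)] - delta,
        (1, 1): table[(1, 1)] - delta,
        (1, 0): table[(1, 0)] + delta,
        (0, 1): table[(0, 1)] + delta,
    }

if __name__ == "__main__":
    t = {(0, 0): 3.0, (1, 1): 3.0, (1, 0): 1.0, (0, 1): 1.0}   # non-submodular
    print(truncate(t))  # both sides of the condition now equal 4.0
```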

(16)

How often do we have non-submodular terms?

• Learning (unconstrained parameters)

[Figure: MRF vs. DTF with graph connectivity 64, on training and test data; red = non-submodular terms, blue = submodular terms.]

(17)

Texture Denoising

[Figure: training images; test image; test image (60% noise); result MRF 4-connected (neighbours); result MRF 4-connected; result MRF 9-connected (7 attractive; 2 repulsive).]

(18)

How often do we have non-submodular terms?

Deconvolution and hand-crafted scenarios:

[Figure: input image, user input, global optimum.]

Many more examples later: diagram recognition, fusion move, etc.

(19)

Reparametrization

Two reparametrizations we need:

• Unary transform: add $\delta$ to both unary entries $\theta_p(0)$ and $\theta_p(1)$ and subtract $\delta$ from $\theta_{const}$ (the energy is unchanged).

• Pairwise transform: shift $\delta$ between a pairwise table and the unaries of its endpoints (the energy is unchanged).

[Minimizing non-submodular energies with graph cut, Kolmogorov, Rother, PAMI 2007]

(20)

Put energies into “normal form”

1) Apply pairwise transformations until $\min\left(\theta_{pq}(0,j), \theta_{pq}(1,j)\right) = 0$ for all directed edges $p \rightarrow q$ and all $j \in \{0,1\}$.

2) Apply unary transformations until $\min\left(\theta_p(0), \theta_p(1)\right) = 0$ for all $p$ (the unary step is sketched below).
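A minimal sketch of the unary half of this normalization, assuming the toy representation used in the earlier examples (not the lecture's implementation):

```python
# Sketch of the unary normal-form step: shift the minimum of each unary pair
# into theta_const so that min(theta_p(0), theta_p(1)) = 0 for every node p.
# Representation (list of [cost_0, cost_1] plus a scalar constant) is assumed
# for illustration only.

def normalize_unaries(unaries, theta_const):
    """unaries: list of [theta_p(0), theta_p(1)]; returns (new_unaries, new_const)."""
    new_unaries = []
    for u0, u1 in unaries:
        delta = min(u0, u1)
        new_unaries.append([u0 - delta, u1 - delta])
        theta_const += delta          # energy value is preserved
    return new_unaries, theta_const

if __name__ == "__main__":
    print(normalize_unaries([[2.0, 5.0], [4.0, 1.0]], 0.0))
    # -> ([[0.0, 3.0], [3.0, 0.0]], 3.0)
```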

(21)

Construct the graph


The minimum cut through the graph gives the solution $x^* = \arg\min_x E(x)$.

(22)

Construct the graph

The minimum cut through the graph gives the solution $x^* = \arg\min_x E(x)$.

(23)

QPBO method

[Hammer et al. '84, Boros et al. '91; see Kolmogorov, Rother '07]

• Start from the energy

$E(\{x_p\}) = \sum_p E_p(x_p) + \sum_{pq} E_{pq}(x_p, x_q) + \sum_{pq} E_{pq}(x_p, x_q)$

(unary terms, pairwise submodular terms, pairwise non-submodular terms)

• Double the number of variables: $x_p \rightarrow (x_p, \bar{x}_p)$

$E'(\{x_p\},\{\bar{x}_p\}) = \sum_p \frac{E_p(x_p) + E_p(1-\bar{x}_p)}{2} + \sum_{pq} \frac{E_{pq}(x_p, x_q) + E_{pq}(1-\bar{x}_p, 1-\bar{x}_q)}{2} + \sum_{pq} \frac{E_{pq}(x_p, 1-\bar{x}_q) + E_{pq}(1-\bar{x}_p, x_q)}{2}$

(the second sum runs over the submodular terms, the third over the non-submodular ones; a small consistency check is sketched below)

• $E'$ is submodular!

• Construct the graph and solve with graph cut: less than double the runtime of graph cut

• Method is called QPBO: Quadratic Pseudo-Boolean Optimization (not a good name)
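As a plausibility check (a sketch under the toy representation from the earlier examples, not an actual QPBO solver), the doubled energy can be evaluated and compared against the original on consistent assignments $\bar{x}_p = 1 - x_p$:

```python
# Sketch: evaluate the doubled QPBO energy E'(x, xbar) and verify that on
# consistent assignments (xbar_p = 1 - x_p) it equals the original E(x).
# Energy representation (unary list + pairwise dicts split into submodular /
# non-submodular) is assumed for illustration.
from itertools import product

def E(x, unary, pair_sub, pair_nonsub):
    val = sum(unary[p][xp] for p, xp in enumerate(x))
    for (p, q), t in list(pair_sub.items()) + list(pair_nonsub.items()):
        val += t[(x[p], x[q])]
    return val

def E_doubled(x, xbar, unary, pair_sub, pair_nonsub):
    val = sum(0.5 * (unary[p][x[p]] + unary[p][1 - xbar[p]]) for p in range(len(x)))
    for (p, q), t in pair_sub.items():       # submodular: both terms on the same "copy"
        val += 0.5 * (t[(x[p], x[q])] + t[(1 - xbar[p], 1 - xbar[q])])
    for (p, q), t in pair_nonsub.items():    # non-submodular: mix the two copies
        val += 0.5 * (t[(x[p], 1 - xbar[q])] + t[(1 - xbar[p], x[q])])
    return val

if __name__ == "__main__":
    unary = [[0.0, 1.0], [2.0, 0.5]]
    pair_sub = {(0, 1): {(0, 0): 0.0, (1, 1): 0.0, (0, 1): 1.0, (1, 0): 1.0}}
    pair_nonsub = {}
    for x in product([0, 1], repeat=2):
        xbar = tuple(1 - xp for xp in x)
        assert abs(E(x, unary, pair_sub, pair_nonsub)
                   - E_doubled(x, xbar, unary, pair_sub, pair_nonsub)) < 1e-9
    print("E'(x, 1-x) == E(x) for all binary x")
```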

(24)

Read out the solution

• Assign labels based on the minimum cut in the auxiliary graph:

$x_p = 0, \; \bar{x}_p = 1 \;\Rightarrow\; x_p = 0$

$x_p = 1, \; \bar{x}_p = 0 \;\Rightarrow\; x_p = 1$

$x_p = 0, \; \bar{x}_p = 0 \;\Rightarrow\; x_p = \,?\;$ (unlabeled)

$x_p = 1, \; \bar{x}_p = 1 \;\Rightarrow\; x_p = \,?\;$ (unlabeled)

(25)

Properties

• Autarky (Persistency) Property: fusing the partial labeling $x$ with any complete labeling $y$ never increases the energy, i.e. $E(\text{FUSE}(x, y)) \le E(y)$ (the FUSE operation is sketched below)

• Partial Optimality: labeled pixels in $x$ belong to a global minimum

• Labeled nodes have the same result as the LP relaxation of the problem $E$ (but QPBO is a very fast solver)

[Hammer et al. '84, Schlesinger '76, Werner '07, Kolmogorov, Wainwright '05; Kolmogorov '06]

[Figure: x (partial), y (any complete labeling), z = FUSE(x, y), and the global optimum.]
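A minimal sketch of the FUSE operation used here (my own illustration; labels of the partial result x override y, and None stands for the "?" of unlabeled nodes):

```python
# Sketch of FUSE(x, y): keep the QPBO label where x is labeled, otherwise fall
# back to the complete labeling y. None marks unlabeled nodes ("?").
# By the autarky/persistency property, E(FUSE(x, y)) <= E(y).

def fuse(x_partial, y_complete):
    """x_partial: list of 0/1/None; y_complete: list of 0/1."""
    return [xi if xi is not None else yi for xi, yi in zip(x_partial, y_complete)]

if __name__ == "__main__":
    x = [0, 1, None, None]      # partial QPBO output
    y = [1, 1, 0, 1]            # any complete labeling (e.g. from BP)
    print(fuse(x, y))           # -> [0, 1, 0, 1]
```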

(26)

When do we get all nodes labeled?

• The function is submodular

• t

• If there exists a flipping that makes the energy fully submodular, then QPBO will find it

• We can be simply “lucky”

• What to do with unlabelled nodes: run some other method

(e.g. BP)

(27)

Extension: QPBOP ("P" stands for "Probing")

QPBO on the chain of nodes p, q, r, s, t labels only one node; the rest remain "?".

Probe node p: fix $x_p = 0$ and $x_p = 1$ in turn, run QPBO for both cases, and compare the outcomes:

• If a node takes the same label in both cases (e.g. $x_q = 0$), it takes this label in a global minimum → remove the node from the energy

• If a node always copies the probed node (e.g. $x_r = x_p$) → remove the node from the energy

• For remaining dependencies (e.g. between $x_s$ and $x_p$) → add a directed link

• Why did QPBO not find this solution? Probing enforces the integer constraint on $x_p$ (a tighter relaxation).

(28)

Two extensions: QPBOP, QPBOI

1. Run QPBO – gives the set of unlabeled nodes U

2. Probe a node p ∈ U

3. Simplify the energy: remove nodes and add links

4. Run QPBO, update U

5. Stop if the energy stays the same for all p ∈ U, otherwise go to 2. (the loop is sketched below)

Properties:

- The new energy preserves global optimality and (sometimes) gives the global minimum

- The probing order may affect the result
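A rough sketch of this probing loop in Python; `qpbo_solve` and `probe_and_simplify` are hypothetical stand-ins (not a real library API) for the steps described above:

```python
# Sketch of the QPBOP outer loop as described on this slide.
# `qpbo_solve(energy)` and `probe_and_simplify(energy, p)` are HYPOTHETICAL
# placeholders: the first returns a partial labeling (None = unlabeled),
# the second returns the simplified energy after probing node p (or None
# if probing p changed nothing).

def qpbop(energy, qpbo_solve, probe_and_simplify):
    labeling = qpbo_solve(energy)
    unlabeled = {p for p, xp in enumerate(labeling) if xp is None}
    changed = True
    while changed:
        changed = False
        for p in list(unlabeled):
            simplified = probe_and_simplify(energy, p)
            if simplified is None:          # probing p did not simplify anything
                continue
            energy = simplified             # nodes removed / links added
            labeling = qpbo_solve(energy)   # re-run QPBO on the simplified energy
            unlabeled = {q for q, xq in enumerate(labeling) if xq is None}
            changed = True
    return energy, labeling
```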

(29)

QPBO versus QPBOP

[Figure: QPBO leaves 73% of the nodes unlabeled (0.08 sec); QPBOP reaches the global minimum (0.4 sec).]

(30)

Extension: QPBOI ("I" stands for "Improve")

• Property [persistency property]: the fused labeling $y' = \text{FUSE}(x, y)$ satisfies $E(y') \le E(y)$

[Figure: x (partial), y (e.g. from BP), y' = FUSE(x, y).]

(31)

Extension: QPBOI ("I" stands for "Improve")

• Property [autarky property]: fusing the partial labeling $x$ with the current solution $y'$ never increases the energy, i.e. $E(y'') \le E(y')$ for $y'' = \text{FUSE}(x, y')$

• QPBOI algorithm: choose a sequence of nested sets of nodes

• QPBO-stable: no set changes the labelling – sometimes this is a global minimum

[Figure: x (partial), y', y'' = FUSE(x, y').]

(32)

Results

Three important factors:

• Degree of non-submodularity (NS)

• Unary strength

• Connectivity (av. degree of a node)

(33)

Results – Diagram Recognition

[Figure: ground truth; GraphCut E=119 (0 sec); ICM E=999 (0 sec); BP E=25 (0 sec); QPBO: 56.3% unlabeled (0 sec); QPBOP (0 sec) – global minimum; P+BP+I and BP+I E=0 (0 sec); Sim. Ann. E=0 (0.28 sec).]

2700 test cases: QPBOP solved all of them

(34)

Results - Deconvolution

[Figure: ground truth; input; GC E=999 (0 sec); ICM E=14 (0 sec); BP E=5 (0.5 sec); QPBO 45% unlabeled (red) (0.01 sec); QPBO-C 43% unlabeled (red) (0.4 sec); BP+I E=3.6 (1 sec); C+BP+I and Sim. Ann. E=0 (0.4 sec).]

(35)

Move on to multi-label

• Let’s apply QPBO(P/I) methods to multi-label problems

• In particular alpha expansion


(36)

Reminder: Alpha expansion

[Figure: labels Sky, House, Tree, Ground; initialize with Tree; status after expanding Ground, House, and Sky.]

• Variables take the label α or retain their current label

(37)

Reminder: Alpha expansion

• Given the original energy $E(x)$

• At each step we have two solutions: $x^0, x^1$

• Define the (variable-wise) combination: $x_i^{01} = (1 - x'_i)\, x_i^0 + x'_i\, x_i^1$ (where $x'_i \in \{0,1\}$ is the selection variable)

• Construct a new energy $E'$ such that $E'(x') = E(x^{01})$

• The move energy $E'(x')$ is submodular if:

$\theta_{ij}(x_a, x_b) = 0 \iff x_a = x_b$

$\theta_{ij}(x_a, x_b) = \theta_{ij}(x_b, x_a) \ge 0$

$\theta_{ij}(x_a, x_b) + \theta_{ij}(x_b, x_c) \ge \theta_{ij}(x_a, x_c)$

Examples: Potts model, truncated linear (not truncated quadratic)

Other move strategies: alpha-beta swap, range move, etc.

[Boykov, Veksler and Zabih 2001]

(38)

Reminder: Alpha Expansion

• What to do if the move energy is non-submodular?

• Run QPBO

• For unlabeled pixels: choose the solution ($x^0$ or $x^1$) that has the lower energy $E$

• Replace the unlabeled nodes with the values of the chosen solution

• This guarantees that the new solution has equal or better energy than both $E(x^0)$ and $E(x^1)$ (see persistency property; a small sketch follows below)
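A compact sketch of this fallback rule (my own illustration; `energy` and `move_energy_qpbo` are hypothetical placeholders, the latter returning the partial labeling of the selection variables):

```python
# Sketch of the fallback for a non-submodular expansion move: run QPBO on the
# move energy and, for unlabeled selection variables, fall back to whichever of
# the two candidate solutions (x0 or x1) has lower total energy E.
# `energy(x)` and `move_energy_qpbo(x0, x1)` are hypothetical placeholders.

def expansion_step(x0, x1, energy, move_energy_qpbo):
    selection = move_energy_qpbo(x0, x1)          # list of 0/1/None per variable
    fallback = 0 if energy(x0) <= energy(x1) else 1
    x_new = [
        (x0[i] if s == 0 else x1[i]) if s is not None
        else (x0[i] if fallback == 0 else x1[i])
        for i, s in enumerate(selection)
    ]
    # Persistency guarantees E(x_new) <= min(E(x0), E(x1)).
    return x_new
```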

(39)

Fusion Move

• Given the original energy $E(x)$

• At each step we have two arbitrary solutions: $x^0, x^1$

• Define the (variable-wise) combination: $x_i^{01} = (1 - x'_i)\, x_i^0 + x'_i\, x_i^1$ (where $x'_i \in \{0,1\}$ is the selection variable; sketched below)

• Construct a new energy $E'$ such that $E'(x') = E(x^{01})$

• Run QPBO and fix unlabeled nodes as above

• Comment: in practice the move energy is often submodular if both solutions are good (since the energy prefers neighboring nodes to be similar)
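The variable-wise combination itself is a one-liner; a small sketch (using NumPy purely for illustration):

```python
# Sketch of the fusion-move combination x01 = (1 - x') * x0 + x' * x1:
# the binary selection x' picks, per variable, which of the two candidate
# solutions survives in the fused labeling.
import numpy as np

def fuse_solutions(x0, x1, selection):
    """x0, x1: candidate labelings; selection: binary array of the same shape."""
    x0, x1, selection = map(np.asarray, (x0, x1, selection))
    return (1 - selection) * x0 + selection * x1

if __name__ == "__main__":
    x0 = np.array([3, 3, 7, 1])
    x1 = np.array([2, 5, 7, 0])
    print(fuse_solutions(x0, x1, np.array([0, 1, 0, 1])))  # -> [3 5 7 0]
```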

(40)

Fusion move to make alpha expansion parallel

One processor needs 7 sequential alpha expansions for 8 labels: 1, 2, 3, 4, 5, 6, 7, 8.

Four processors (p1–p4) need only 3 sequential steps (still 7 alpha expansions), as sketched below:

Step 1: ∎(1-2)  ∎(3-4)  ∎(5-6)  ∎(7-8)
Step 2: ∎(1-4)  ∎(5-8)
Step 3: ∎(1-8)

(∎ means fusion)
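A small sketch of this binary-tree schedule (my own illustration; `fuse` is a hypothetical placeholder that fuses two labelings, e.g. via a QPBO fusion move):

```python
# Sketch of the parallel fusion schedule: fuse per-label proposals pairwise in a
# binary tree, so 8 proposals need only 3 sequential rounds (7 fusions total).
# `fuse` is a hypothetical placeholder that fuses two labelings.

def parallel_fusion(proposals, fuse):
    current = list(proposals)
    while len(current) > 1:
        # All fusions within one round are independent -> can run in parallel.
        next_round = [fuse(current[i], current[i + 1])
                      for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:          # odd one out moves up unchanged
            next_round.append(current[-1])
        current = next_round
    return current[0]

if __name__ == "__main__":
    # Trivial demo with a fake `fuse` that keeps the elementwise minimum.
    result = parallel_fusion([[3], [1], [4], [2]],
                             fuse=lambda a, b: [min(x, y) for x, y in zip(a, b)])
    print(result)   # -> [1]
```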

(41)

Fusion move for continuous label-spaces

Local gradient cost: $x_i - x_{i+1}$

[Victor Lempitsky, Stefan Roth, and Carsten Rother, FusionFlow: Discrete-Continuous Optimization for Optical Flow Estimation, CVPR 2008]

(42)

FusionFlow - comparisons

(43)

LogCut – Dealing efficiently with large label spaces

[Victor Lempitsky, Carsten Rother, and Andrew Blake, LogCut – Efficient Graph Cut Optimization for Markov Random Fields, ICCV 2007]

Optical flow: 1024 discrete labels

[Figure: ground truth.]

(44)

Log Cut – basic idea

$E(\mathbf{x}) = \sum_p E_p(x_p) + \sum_{p,q} E_{pq}(x_p, x_q)$ with $x_p \in [0, K]$

• Encode the label space $K$ (e.g. $K = 64$) with $\log K$ bits (e.g. 6 bits). Example: 44 = 101100 (sketched below)

• Alpha expansion: we need $K - 1$ binary decisions to get a labeling out

• Here we only need $\log K$ (here 6) binary decisions to get a labeling out
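For illustration (my own sketch, not the paper's code), the bit encoding and the count of binary decisions look like this:

```python
# Sketch of the LogCut idea on the label side: a label in [0, K) is encoded by
# log2(K) bits, so a full labeling is determined by log2(K) binary decisions
# (one sweep per bit) instead of the K-1 expansions of alpha expansion.
import math

K = 64
bits = int(math.log2(K))

label = 44
encoding = format(label, f"0{bits}b")
print(encoding)                                                  # -> '101100'
print(f"alpha expansion: {K - 1} binary decisions per sweep")    # 63
print(f"LogCut:          {bits} binary decisions per sweep")     # 6
```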

(45)

Example stereo matching

Stereo (Tsukuba) – 16 labels:

Bit 4: 0xxx (labels 0-7 versus 8-15)
Bit 3: 00xx (labels 0-3 versus 4-7)
Bit 2: 001x (labels 0-1 versus 2-3)
Bit 1: 0010 (label 2 versus 3)

(46)

How to choose the energy?

e.g. bit 3:

Unary: $E'_p(0) = \min_{x_p \in [0,3]} E_p(x_p), \qquad E'_p(1) = \min_{x_p \in [4,7]} E_p(x_p)$ (sketched below)

$E'(\mathbf{x}') = \sum_p E'_p(x'_p) + \sum_{p,q} E'_{pq}(x'_p, x'_q)$ with $x'_p \in \{0, 1\}$

$E'$ is a lower bound of $E$ (tight if there are no pairwise terms)
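A tiny sketch of that bit-level unary (my own illustration; the per-label cost vector is made up for the example):

```python
# Sketch: build the binary unary for one bit decision by taking the minimum of
# the original per-label unary over each half of the label range, as on the
# slide (E'_p(0) = min over [0,3], E'_p(1) = min over [4,7] for bit 3 of 8 labels).
# In the full LogCut sweep the range would already be restricted by the bits
# decided so far; the cost vector here is made up for the example.

def bit_unary(unary_costs, bit):
    """E'_p for the given bit (1 = least significant bit)."""
    low  = min(c for label, c in enumerate(unary_costs) if not (label >> (bit - 1)) & 1)
    high = min(c for label, c in enumerate(unary_costs) if (label >> (bit - 1)) & 1)
    return low, high

if __name__ == "__main__":
    costs = [5.0, 2.0, 4.0, 9.0, 1.0, 8.0, 3.0, 7.0]   # E_p(x_p) for x_p = 0..7
    print(bit_unary(costs, bit=3))                      # -> (2.0, 1.0)
```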

(47)

How to choose the energy?

Pairwise: e.g. truncated linear $E_{pq}(x_p, x_q) = \min\left[\, a\,|x_p - x_q|,\; b \,\right]$

How to set the binary pairwise term $E'_{pq}(x'_p, x'_q)$ from $E_{pq}$ (e.g. $E'_{pq}(0,0)$ from $E_{pq}(0,0)$ over the corresponding label ranges)? Approximations:

1. Choose One
2. Min
3. Mean
4. Weighted Mean
5. Training

[Figure: truncated-linear cost over $|x_p - x_q|$ with truncation at $b$, and example label grids.]

(48)

Comparison

Image restoration (2 different models):

[Figure: results for Choose One, Min, Mean, Weighted Mean, Training, and alpha expansion (aExp) on both models.]

(49)

LogCut

Iterative LogCut:

1. One sweep – log(K) optimizations
2. Shift labels
3. One sweep – log(K) optimizations
4. Fuse with the current solution
5. Go to 2.

Label shift example (sketched below): labels 1,2,3,4,5,6,7,8 shifted by 3 become 6,7,8,1,2,3,4,5.

[Figure: energy for no shift, ½ shift, and full shift.]
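A one-line sketch of the label shift (my own illustration, matching the slide's example):

```python
# Sketch of the cyclic label shift between LogCut sweeps: rotating the label set
# changes which labels fall into the same bit ranges, so a new sweep explores
# different moves; the shifted result is then fused with the current solution.

def shift_labels(labels, shift, K):
    """Cyclically shift labels in 1..K as on the slide: 1..8 shifted by 3 -> 6,7,8,1,2,3,4,5."""
    return [((l - 1 - shift) % K) + 1 for l in labels]

if __name__ == "__main__":
    print(shift_labels([1, 2, 3, 4, 5, 6, 7, 8], shift=3, K=8))
    # -> [6, 7, 8, 1, 2, 3, 4, 5]
```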

(50)

Results

Speed-up factor: 20.7

[Figure (training and test images, with ground truth):]

LogCut (2 iter.):  8 sec,   E = 8767
LogCut (64 iter.): 150 sec, E = 8469
AExp (6 iter.):    390 sec, E = 8773

(51)

Results

[Figure: Train image (out of 10) and Test image (out of 10).]

LogCut:    1.5 sec
Effic. BP: 2.1 sec
AExp:      4.7 sec
TRW:       90 sec
