VC dimension of bisectors between curves

(1)

Master’s Thesis Seminar

Carolin Kaffine 3rd April 2020

(2)

Overview

• Recap of basic definitions

• Main goals

• Upper bounds:

• Approach 1: via composition lemma for halfspaces

• Approach 2: via VC dimension of function spaces

• Lower bounds

• Summary of results

• Outlook

(3)

Basic definitions

Definition:

• curve in R^d: continuous function V: [0,1]→R^d

• polygonal curve: piecewise linear

• X^d_k: space of all piecewise linear curves in R^d with k vertices

(4)

Basic definitions

Hausdorff and discrete Hausdorff distance:

d_H(V,W) := \max\Big( \sup_{p \in V} \inf_{q \in W} d(p,q),\ \sup_{q \in W} \inf_{p \in V} d(p,q) \Big)

d_{dH}(V,W) := \max\Big( \max_{v \in V} \min_{w \in W} d(v,w),\ \max_{w \in W} \min_{v \in V} d(v,w) \Big)
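The discrete Hausdorff distance can be sketched in a few lines of Python. This is only an illustration under the assumption that a curve is given as a list of vertex tuples and that d is the Euclidean distance; the function name and the sample curves are made up for this example.

```python
import math

def discrete_hausdorff(V, W):
    """Discrete Hausdorff distance between two vertex lists (Euclidean d assumed)."""
    def directed(A, B):
        # max over a in A of the distance from a to its nearest vertex in B
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(directed(V, W), directed(W, V))

# Hypothetical sample curves in R^2
V = [(0.0, 0.0), (1.0, 0.0)]
W = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_hausdorff(V, W))  # sqrt(2): vertex (2,1) of W is sqrt(2) away from V
```

Only vertices enter the computation, which is what distinguishes d_dH from the continuous d_H.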

(5)

Basic definitions

Fréchet distance:

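d_F(V,W) = \inf_{f,g} \max_{\alpha \in [0,1]} \| V(f(\alpha)) - W(g(\alpha)) \|,

where f and g range over all continuous, non-decreasing, surjective functions [0,1] → [0,1].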
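The continuous Fréchet distance above is not computed here. As a hedged illustration, the sketch below computes the discrete variant d_dF (listed among the distance functions later in the talk) with the standard dynamic program over vertex pairs. Curves as lists of vertex tuples and the Euclidean distance are assumptions of this example.

```python
import math

def discrete_frechet(V, W):
    """Discrete Fréchet distance of two vertex lists via dynamic programming."""
    n, m = len(V), len(W)
    # ca[i][j] = discrete Fréchet distance of the prefixes V[:i+1] and W[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(V[i], W[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # advance on V, on W, or on both; keep the cheapest predecessor
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

# Same vertex set, opposite orientation: the traversal order matters.
print(discrete_frechet([(0.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (0.0, 0.0)]))  # 1.0
```

The example pair has discrete Hausdorff distance 0 but discrete Fréchet distance 1, because the Fréchet distances respect the order along the curves.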

(6)

Basic definitions

• Range space (X, ℛ): ground set X, set of ranges ℛ ⊆ 2^X

• A subset Y ⊆ X is shattered by ℛ if {R ∩ Y | R ∈ ℛ} = 2^Y

• VC dimension: largest cardinality of a shattered subset of X
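For a finite range space the definitions can be checked by brute force. The sketch below is illustrative only: the helper names are hypothetical, and the example range space (integer intervals on {1, ..., 5}) is not from the talk.

```python
from itertools import combinations

def is_shattered(Y, ranges):
    """True if every subset of Y arises as R ∩ Y for some range R."""
    Y = frozenset(Y)
    traces = {frozenset(R) & Y for R in ranges}
    return len(traces) == 2 ** len(Y)

def vc_dimension(X, ranges):
    """Largest size of a shattered subset of the finite ground set X."""
    best = 0
    for size in range(1, len(X) + 1):
        if any(is_shattered(Y, ranges) for Y in combinations(X, size)):
            best = size
        else:
            break  # subsets of shattered sets are shattered, so larger sizes fail too
    return best

# Example range space (an assumption for illustration): integer intervals on {1,...,5}
X = range(1, 6)
intervals = [set(range(a, b)) for a in range(1, 7) for b in range(a, 7)]
print(vc_dimension(X, intervals))  # 2: no interval picks {a, c} out of a < b < c
```

The search is exponential in |X| and is meant only to make the definition concrete.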

(7)

Examples for VC dimension

Ground set X = R², ranges are disks:

• some set of 3 points is shattered ⇒ VCdim ≥ 3

• no set of 4 points is shattered ⇒ VCdim < 4

In general: balls and halfspaces in R^d have VCdim = d+1.

(8)

Basic definitions

Shatter function (or growth function) for a range space (X, ℛ):

\pi_{(X,\mathcal{R})}(m) = \max_{Y \subseteq X,\ |Y| = m} \big| \{ R \cap Y \mid R \in \mathcal{R} \} \big|

Shatter function lemma (or Sauer’s lemma)

For a range space (X, ℛ) with VC dimension at most δ, we have

\pi_{(X,\mathcal{R})}(m) \le \Phi_\delta(m) := \binom{m}{0} + \binom{m}{1} + \dots + \binom{m}{\delta}

⇒ polynomial growth in m, since Φ_δ(m) ≤ (em/δ)^δ ∈ O(m^δ)
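The bound Φ_δ(m) and its polynomial estimate can be evaluated directly. A minimal sketch, with the function name `phi` chosen for this example; the estimate (em/δ)^δ is checked only for m ≥ δ ≥ 1, the range in which it is normally stated.

```python
from math import comb, e

def phi(delta, m):
    """Φ_δ(m): sum of the binomial coefficients C(m, i) for i = 0..δ."""
    return sum(comb(m, i) for i in range(delta + 1))

print(phi(2, 5))  # 1 + 5 + 10 = 16

# Numerical check of Φ_δ(m) <= (e·m/δ)^δ on a small grid with m >= δ >= 1
ok = all(phi(d, m) <= (e * m / d) ** d for d in range(1, 6) for m in range(d, 40))
print(ok)
```

For m ≤ δ the sum covers all binomial coefficients, so Φ_δ(m) = 2^m and the bound carries no restriction.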

(9)

Bisectors between curves

Bisector range space: (X^d_m, B_{d,k})

• ground set X^d_m = set of all curves in R^d with m vertices

• range set B_{d,k} with ranges

R_{(V,W)} = {S ∈ X^d_m | d(V,S) ≤ d(W,S)}

for (V,W) ∈ X^d_k × X^d_k.
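Membership in a bisector range is a single comparison once a distance function is fixed. The sketch below keeps the distance as a parameter; `point_dist` is a stand-in for the case k = m = 1 (curves with one vertex) and is an assumption of this example, as are the sample values.

```python
import math

def in_bisector_range(V, W, S, dist):
    """True if S lies in R_(V,W), i.e. dist(V, S) <= dist(W, S)."""
    return dist(V, S) <= dist(W, S)

def point_dist(A, B):
    # Stand-in distance for curves that consist of a single vertex (k = m = 1)
    return math.dist(A[0], B[0])

V, W = [(0.0, 0.0)], [(2.0, 0.0)]
sample = [[(0.5, 1.0)], [(0.9, -3.0)], [(1.5, 0.0)]]
print([in_bisector_range(V, W, S, point_dist) for S in sample])  # [True, True, False]
```

Any of the distances d_dH, d_H, d_dF, d_F could be passed as `dist`; only the comparison defines the range.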

(10)

Bisectors in R² for m = 1 (i.e. all distance functions are the same):

Figure panels: (X^2_1, B_{d,1}) and (X^2_1, B_{d,2})

(11)

For m > 1: the ranges can no longer be drawn, because the dimension of the ground set exceeds 3

(12)

Goal of Master’s thesis

Main goal:

Find upper and lower bounds on the VC dimension of the bisector range space, depending on

• m (complexity of shattered curves)

• k (complexity of curves that define bisectors)

• d (dimension)

• d_dH, d_H, d_dF, or d_F (distance function used)

Main paper:

“The VC Dimension of Metric Balls under Fréchet and Hausdorff Distances” by A. Driemel, A. Nusser, J.M. Phillips, and I. Psarros

(13)

Upper bound 1: via composition lemma

Composition lemma (simplified):

For a range space (X, ℛ) with VCdim = δ, the range space of all unions/intersections of n ranges in ℛ has VC dimension

O(nδ log n).

Idea (for m = 1):

• Write bisector ranges as unions and intersections of halfspaces

• By the composition lemma, we can bound the VC dimension of the bisector range space via the VC dimension of halfspaces (which is d+1)

(14)

Step 1: For two points: bisector range R_{(v₁,w₁)} is a halfspace

Step 2: Ranges get intersected when adding a point to one curve

Step 3: Final range of two curves is a union of intersections

(24)

For general d:

• range R_{(V,W)} = ⋃_{w∈W} ⋂_{v∈V} h(v,w), where h(v,w) is the halfspace of points that are closer to v than to w

• if V, W have length k, we took (k−1)² ∈ O(k²) unions and intersections

⇒ VC dimension is in

O((k−1)² (d+1) log((k−1)²)) = O(k² d log k)
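The decomposition into halfspaces can be sanity-checked numerically. The sketch assumes that for a single point s (m = 1) the distance to a curve is the maximum distance to its vertices, which is what the discrete Hausdorff definition gives; function names and the random test setup are hypothetical.

```python
import math
import random

def closer_to(v, w, s):
    """Halfspace h(v, w): s is at least as close to v as to w."""
    return math.dist(v, s) <= math.dist(w, s)

def in_range_direct(V, W, s):
    # For m = 1: d(V, s) = max over the vertices v of d(v, s) (assumption stated above)
    return max(math.dist(v, s) for v in V) <= max(math.dist(w, s) for w in W)

def in_range_halfspaces(V, W, s):
    # Union over w in W of the intersection over v in V of h(v, w)
    return any(all(closer_to(v, w, s) for v in V) for w in W)

rng = random.Random(0)  # fixed seed so the check is reproducible

def random_point(d):
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(d))

def agree_on_random_inputs(trials=500, k=4, d=3):
    for _ in range(trials):
        V = [random_point(d) for _ in range(k)]
        W = [random_point(d) for _ in range(k)]
        s = random_point(d)
        if in_range_direct(V, W, s) != in_range_halfspaces(V, W, s):
            return False
    return True

print(agree_on_random_inputs())  # True
```

The two tests agree because max_v d(v,s) ≤ max_w d(w,s) holds exactly when some w satisfies d(v,s) ≤ d(w,s) for all v.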

(25)

Upper bound 2: via Thm on VC dimension of function spaces

Theorem:

Let h: R^a × R^b → {0,1} and

H = {x ↦ h(α,x) | α ∈ R^a}.

Suppose h can be computed by an algorithm that takes (α,x) ∈ R^a × R^b as input and returns h(α,x) after no more than t simple operations.

Then the VC dimension of H is at most 4a(t+2).

(26)

• Write bisector range R_{(V,W)} as a function h((V,W), ·) that takes a curve S and outputs 1 if S is closer to V than to W, and 0 otherwise

⇒ Bisector range space can be written as (X^d_m, H) for

H = { X^d_m ∋ S ↦ h((V,W), S) | (V,W) ∈ X^d_k × X^d_k }

• so h: (X^d_k × X^d_k) × X^d_m → {0,1} with X^d_k × X^d_k ≅ R^{2dk}, i.e. a = 2dk

• it remains to determine t, i.e. to check how fast h can be computed

(27)

Upper bound 2: via Thm on VC dimension of function spaces

Example for discrete Hausdorff distance:

Step 1: Calculate d(v,s)² for all v ∈ V, s ∈ S

Step 2: Find d_dH(V,S) = max( max_{s∈S} min_{v∈V} d(v,s), max_{v∈V} min_{s∈S} d(v,s) )

Step 3: Do the same for W and compare d_dH(V,S) with d_dH(W,S): output 1 if d_dH(V,S) ≤ d_dH(W,S), and 0 otherwise
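The three steps can be written out with an explicit operation counter to see where t ∈ O(mkd) comes from. This is a sketch under assumptions: curves are lists of vertex tuples, squared distances are compared throughout (squaring preserves the order), and the way operations are tallied is one possible convention, not the one from the thesis.

```python
def closer_to_V(V, W, S):
    """Return (h, ops): h = 1 if d_dH(V, S) <= d_dH(W, S), else 0."""
    ops = 0

    def sq_dist(p, q):
        nonlocal ops
        ops += 3 * len(p)  # d subtractions, d multiplications, at most d additions
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def sq_ddh(A, S):
        nonlocal ops
        # Step 1: all |A|·|S| squared vertex distances
        D = [[sq_dist(a, s) for s in S] for a in A]
        # Step 2: two directed max-min terms and their maximum
        ops += 2 * len(A) * len(S) - 1  # number of pairwise comparisons needed
        a_to_S = max(min(row) for row in D)
        s_to_A = max(min(D[i][j] for i in range(len(A))) for j in range(len(S)))
        return max(a_to_S, s_to_A)

    # Step 3: same for W, then one final comparison
    result = 1 if sq_ddh(V, S) <= sq_ddh(W, S) else 0
    ops += 1
    return result, ops

# Hypothetical input with k = 2, m = 3, d = 2
V = [(0.0, 0.0), (1.0, 0.0)]
W = [(0.0, 5.0), (1.0, 5.0)]
S = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(closer_to_V(V, W, S))  # (1, 95), i.e. 6dkm + 4km - 1 counted operations
```

With this convention the count is 6dkm + 4km − 1, which is linear in each of m, k and d.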

(36)

• In total: calculation of 2mk squared Euclidean distances between vertices, each in O(d)

• O(mk) comparisons to find d_dH(V,S)

• All in all: t ∈ O(mkd) simple operations

⇒ VCdim ≤ 4·2dk·(c·mkd + 2) ∈ O(mk²d²) for a suitable constant c
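Written out, the bound follows by inserting a = 2dk and t ≤ c·mkd into the theorem's estimate 4a(t+2). The constant c is the unspecified constant hidden in t ∈ O(mkd); the derivation below is a sketch and assumes the `amsmath` package.

```latex
\begin{align*}
a &= 2dk, \qquad t \le c \cdot mkd \\
\operatorname{VCdim} &\le 4a(t+2) \le 4 \cdot 2dk \,(c \cdot mkd + 2) \\
&= 8c \cdot mk^2d^2 + 16\,dk \in O(mk^2d^2)
\end{align*}
```

The additive term 16dk is dominated by the first term, so it does not affect the asymptotic bound.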

(37)

Lower bounds

Idea: Find lower bounds for k = 1 and/or m = 1

⇒ valid lower bound for all distance functions and all k and m

Easy lower bound (for m = k = 1):

Bisector ranges look like halfspaces, so VCdim ≥ d+1
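The easy lower bound can be made concrete: the origin together with the d unit vectors is shattered by bisectors of point pairs. The construction below is an illustration chosen for this example (the talk only states that the ranges look like halfspaces); for each subset it builds a pair (v, w) whose bisector is the hyperplane n·x = c.

```python
from itertools import product

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def bisector_pair(mask):
    """Points v, w such that 'closer to v than to w' selects exactly the marked points.

    mask[0] refers to the origin, mask[i] to the unit vector e_i (i = 1..d).
    """
    d = len(mask) - 1
    # Halfspace n·x <= c: n_i = -1 keeps e_i inside, n_i = +1 puts it outside
    n = [-1.0 if inside else 1.0 for inside in mask[1:]]
    c = 0.5 if mask[0] else -0.5      # decides whether the origin is inside
    p0 = [c * ni / d for ni in n]     # a point on the hyperplane n·x = c (|n|² = d)
    v = [p - ni for p, ni in zip(p0, n)]
    w = [p + ni for p, ni in zip(p0, n)]
    return v, w

def shatters_simplex(d):
    """Check that all 2^(d+1) subsets of {0, e_1, ..., e_d} are bisector ranges."""
    points = [[0.0] * d] + [[1.0 if j == i else 0.0 for j in range(d)] for i in range(d)]
    for mask in product([False, True], repeat=d + 1):
        v, w = bisector_pair(mask)
        picked = tuple(sq_dist(v, p) <= sq_dist(w, p) for p in points)
        if picked != mask:
            return False
    return True

print(all(shatters_simplex(d) for d in range(1, 6)))  # True
```

Since v and w are mirror images across the hyperplane n·x = c, a point is closer to v exactly when n·x ≤ c, which matches the halfspace picture.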

(38)

Lower bounds

Lower bound (for m = 1):

• VC dimension of (open) k-gons is 2k+1

• Bisector ranges can look like open k-gons, so their VC dimension is ≥ 2k+1

⇒ Combining the two lower bounds we get VCdim ∈ Ω(max(k,d))

(46)

Summary of results so far:

Upper bounds:

Distance function     m arbitrary    m = 1
discrete Hausdorff    O(mk²d²)       O(dk² log k), O(k²d²)
Hausdorff             –              O(dk² log k), O(k²d²)
discrete Fréchet      –              O(dk² log k), O(k²d²)
Fréchet               –              O(dk² log k), O(k²d²)

Lower bound:

Ω(max(k,d))

(47)

Outlook

Further goals:

• establish upper bounds that depend on m for distance functions other than the discrete Hausdorff distance

• establish better lower bounds by using geometric properties of bisector range spaces

• reduce gap between upper and lower bounds