Homology inference for scalar fields - Topological Analysis of Point Clouds

6.4 Homology inference for scalar fields

Suppose we are given only a finite sample P ⊂ X from a smooth manifold X ⊂ R^d, together with a potentially noisy version f̂ of a smooth function f : X → R, presented as a vertex function f̂ : P → R. We are interested in recovering the persistent homology of the sublevel-set filtration of f from f̂. That is, the goal is to approximate the persistent homology induced by f from the discrete sample P and the values of f̂ at the points in P.

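6.4.1 Problem setup

Let F_α = {x ∈ X | f(x) ≤ α} denote the sublevel set of f at α ∈ R; these sets are nested, with F_α ⊆ F_β for α ≤ β. For simplicity, we often write the filtration and the corresponding persistence module as F_f = {F_α}_{α∈R} and H_p F_f = {H_p(F_α)}_{α∈R} when the choices of maps connecting their elements are clear.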

Our goal is to approximate the persistence diagram Dgm_p(F_f) from the point sample P and f̂ : P → R. Intuitively, we construct a specific Čech (or Rips) complex C^r(P), use f̂ to induce a filtration of C^r(P), and then use its persistent homology to approximate Dgm_p(F_f). More specifically, we need to consider a nested pair filtration for either C^r(P) or VR^r(P).

Nested pair filtration. Let P_α = {p ∈ P | f̂(p) ≤ α} be the set of sample points whose f̂-value is at most α, which presumably samples the sublevel set F_α of X w.r.t. f. To estimate the topology of F_α from the discrete sample P_α, we consider either the Čech complex C^r(P_α) or the Rips complex VR^r(P_α). For the time being, consider VR^r(P_α). As we already saw in previous sections, the topological information of F_α can be inferred from a pair of nested complexes i_α : VR^r(P_α) ↪ VR^{r′}(P_α) for some appropriate r < r′. To study F_f, we need to inspect F_α → F_β for α ≤ β. For fixed α ≤ β, the inclusions among the four complexes involved induce the following diagram in homology:

    H_p(VR^r(P_α))  ──i_α∗──→  H_p(VR^{r′}(P_α))
          │                           │ j^β_α∗
          ↓                           ↓
    H_p(VR^r(P_β))  ──i_β∗──→  H_p(VR^{r′}(P_β))

Let φ^β_α : im(i_α∗) → im(i_β∗) be the restriction of j^β_α∗ to im(i_α∗). This map is well defined because the diagram above commutes. This gives rise to a persistence module {im(i_α∗); φ^β_α}_{α≤β}, that is, a family of vector spaces im(i_α∗) indexed by the totally ordered set R, with commuting homomorphisms φ^β_α between any two elements. We formalize and generalize the above construction below.

Definition 6.11 (Nested pair filtration). A nested pair filtration is a sequence of pairs of complexes {AB_α = (A_α, B_α)}_{α∈R} where (i) i_α : A_α ↪ B_α is an inclusion for every α, and (ii) AB_α ↪ AB_β for α ≤ β is given by the inclusions A_α ↪ A_β and j^β_α : B_α ↪ B_β. The p-th persistence module of the filtration {AB_α}_{α∈R} is given by the homology module {im(H_p(A_α) → H_p(B_α)); φ^β_α}_{α≤β}, where φ^β_α is the restriction of j^β_α∗ to im(i_α∗). For simplicity, we say the module is induced by the nested pair filtration {A_α ↪ B_α}.
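In dimension 0 the image appearing in Definition 6.11 can be computed directly, which may help build intuition. The following is a minimal sketch and not a general persistence algorithm: it assumes A is a subcomplex of B given by vertex and edge lists, and the helper names `components` and `image_rank_h0` are hypothetical. It relies on the fact that im(H_0(A) → H_0(B)) is spanned by the classes of the vertices of A, so its rank is the number of connected components of B that contain a vertex of A.

```python
def components(vertices, edges):
    """Union-find: map every vertex to a representative of its connected component."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[ru] = rw
    return {v: find(v) for v in vertices}


def image_rank_h0(a_vertices, b_vertices, b_edges):
    """Rank of im(H_0(A) -> H_0(B)) for a subcomplex A of B (dimension 0 only).

    The edges of A do not matter here: the image is generated by the vertex
    classes of A, so we count the components of B that meet the vertex set of A.
    """
    rep = components(b_vertices, b_edges)
    return len({rep[v] for v in a_vertices})
```

For instance, take B to be the graph on vertices 0, 1, 2, 3 with edges {0, 1} and {2, 3}. If A has vertices 0, 1, 2, both components of B are hit and the image has rank 2; if A has only vertices 0 and 1, the rank drops to 1.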

The high-level approach to inferring the persistent homology of a scalar field f : X → R from a set of points P equipped with f̂ : P → R involves the following steps:

Step 1. Sort all points of P in non-decreasing order of f̂-value, P = {p_1, ..., p_n}. Set α_i = f̂(p_i) for i ∈ [1, n].

Step 2. Compute the persistence diagram induced by the filtration of nested pairs {VR^r(P_{α_i}) ↪ VR^{r′}(P_{α_i})}_{i∈[1,n]} (or {C^r(P_{α_i}) ↪ C^{r′}(P_{α_i})}_{i∈[1,n]}) for appropriate parameters 0 < r < r′. The persistent homology (as well as the persistence diagram) induced by the filtration of nested pairs is computed via the algorithm in [105].

To obtain an approximation guarantee for the above approach, we consider an intermediate object defined by the intrinsic Riemannian metric on the manifold X. Indeed, note that the filtration of X w.r.t. f is intrinsic in the sense that it is independent of how X is embedded in R^d. Hence it is more natural to approximate its persistent homology with an object defined intrinsically for X.
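Returning to Steps 1 and 2, their input can be set up along the following lines. This is only a sketch under stated assumptions: the function name `nested_pair_edges` is hypothetical, only vertices and edges are listed, and the convention that {p, q} is an edge of VR^s(P) when d_E(p, q) ≤ 2s is assumed. The persistence computation for nested pairs is not reproduced; the sketch only shows how f̂ orders the sample and assigns each edge the value α at which it enters VR^r(P_α) and VR^{r′}(P_α).

```python
from itertools import combinations
from math import dist


def nested_pair_edges(points, fhat, r, r_prime):
    """Order the sample by fhat and list the edges of VR^r and VR^{r'} with entry values.

    Assumed convention: {p, q} is an edge of VR^s(P) when d_E(p, q) <= 2 * s.
    An edge enters at alpha = max(fhat[p], fhat[q]), the first alpha for which
    both endpoints belong to P_alpha.
    """
    if not 0 < r < r_prime:
        raise ValueError("need 0 < r < r_prime")
    # Step 1: indices of the points in non-decreasing order of fhat.
    order = sorted(range(len(points)), key=lambda i: fhat[i])
    alphas = [fhat[i] for i in order]
    inner, outer = [], []  # edges of VR^r and of VR^{r'}
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])
        entry = max(fhat[i], fhat[j])
        if d <= 2 * r_prime:
            outer.append((entry, i, j))
            if d <= 2 * r:
                inner.append((entry, i, j))
    return order, alphas, sorted(inner), sorted(outer)
```

Every edge of the inner complex also appears in the outer one with the same entry value, which is what makes the pair nested at every α.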

Given a compact Riemannian manifold X ⊂ R^d embedded in R^d, let d_X be the Riemannian metric of X inherited from the Euclidean metric d_E of R^d. Let B_X(x, r) := {y ∈ X | d_X(x, y) ≤ r} be the geodesic ball on X centered at x with radius r, and let B^o_X(x, r) be the open geodesic ball. In contrast, B_E(x, r) (or simply B(x, r)) denotes the Euclidean ball in R^d. A ball B^o_X(x, r) is strongly convex if for every pair y, y′ ∈ B_X(x, r), there exists a unique minimizing geodesic between y and y′ whose interior is contained within B^o_X(x, r). For details on these concepts, see [76, 164].

Definition 6.12 (Strong convexity). For x ∈ X, let ρ_c(x; X) denote the supremum of the radii r such that the geodesic ball B^o_X(x, r) is strongly convex. The strong convexity radius of (X, d_X) is defined as ρ_c(X) := inf_{x∈X} ρ_c(x; X).

Let d_X(x, P) := inf_{p∈P} d_X(x, p) denote the geodesic distance from x to the closest point of the set P ⊆ X.

Definition 6.13 (ε-geodesic sample). A point set P ⊂ X is an ε-geodesic sample of (X, d_X) if d_X(x, P) ≤ ε for all x ∈ X.
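Definition 6.13 can be checked numerically on a simple example. The sketch below assumes X is the unit circle, parameterized by angle, so that the geodesic distance is arc length; the function names and the probing resolution `n_probe` are hypothetical choices, and the supremum over x ∈ X is only approximated on a finite grid of probe angles.

```python
import math


def geodesic_dist_circle(s, t):
    """Arc-length distance between the angles s and t on the unit circle."""
    d = abs(s - t) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def covering_radius(sample, n_probe=10_000):
    """Approximate sup_x d_X(x, P) by probing n_probe evenly spaced angles."""
    probes = (2 * math.pi * k / n_probe for k in range(n_probe))
    return max(min(geodesic_dist_circle(x, p) for p in sample) for x in probes)


def is_eps_geodesic_sample(sample, eps, n_probe=10_000):
    """True if every probed point of the circle is within eps of the sample."""
    return covering_radius(sample, n_probe) <= eps
```

For n evenly spaced sample angles, the farthest points of the circle are the midpoints between neighbors, so the sample is an ε-geodesic sample exactly when ε ≥ π/n.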

Recall that P_α is the set of points in P with f̂-value at most α. The union of geodesic balls P^{δ;X}_α = ⋃_{p∈P_α} B_X(p, δ) is intuitively the “δ-thickening” of P_α within the manifold X. We use two kinds of Čech and Rips complexes. One kind is defined with the metric d_E of the ambient Euclidean space; we call these the (extrinsic) Čech complex C^δ(P_α) and the (extrinsic) Rips complex VR^δ(P_α). The other kind, the intrinsic Čech complex C^δ_X(P_α) and the intrinsic Rips complex VR^δ_X(P_α), is defined with the intrinsic metric d_X. Note that C^δ_X(P_α) is the nerve complex of the union of geodesic balls forming P^{δ;X}_α. Also, the interleaving relation between the Čech and Rips complexes remains the same as for general geodesic spaces; that is, C^δ_X(P_α) ⊆ VR^δ_X(P_α) ⊆ C^{2δ}_X(P_α) for any α and δ.

6.4.2 Inference guarantees

Recall from Chapter 4.1 that two ε-interleaved filtrations lead to ε-interleaved persistence modules, which further means that the bottleneck distance between their persistence diagrams is bounded by ε. Here we first relate the space filtration to the intrinsic Čech filtration, and then relate the intrinsic filtration to the extrinsic Čech or Rips filtrations of nested pairs, as illustrated in Eqn. (6.24) below.

    {F_α} ↔ {P^{r;X}_α} ↔ {C^r_X(P_α)} ↔ {C^r(P_α) ↪ C^{r′}(P_α)} or {VR^r(P_α) ↪ VR^{r′}(P_α)}    (6.24)

Proposition 6.14. Let X ⊂ R^d be a compact Riemannian manifold with intrinsic metric d_X, and let f : X → R be a C-Lipschitz function. Suppose P ⊂ X is an ε-geodesic sample of X, equipped with f̂ : P → R so that f̂ = f|_P. Then, for any fixed δ ≥ ε, the filtration {F_α}_α and the filtration {P^{δ;X}_α}_α are (Cδ)-interleaved w.r.t. inclusions.
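The two inclusions behind Proposition 6.14, F_α ⊆ P^{δ;X}_{α+Cδ} and P^{δ;X}_α ⊆ F_{α+Cδ}, can be checked numerically in a toy setting. The sketch below is an illustration under assumptions, not a proof: X is taken to be the unit circle with arc-length metric, the function name `check_interleaving` and the tolerance `tol` are hypothetical, and both inclusions are tested pointwise on a finite grid of probe angles rather than for all x and α.

```python
import math


def circle_dist(s, t):
    """Arc-length distance between the angles s and t on the unit circle."""
    d = abs(s - t) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def check_interleaving(f, lipschitz_c, sample, delta, n_probe=2000, tol=1e-12):
    """Pointwise check of F_a ⊆ P^δ_{a+Cδ} and P^δ_a ⊆ F_{a+Cδ} on the unit circle."""
    shift = lipschitz_c * delta
    for k in range(n_probe):
        x = 2 * math.pi * k / n_probe
        # Values of f at the sample points whose geodesic ball of radius delta contains x.
        near = [f(p) for p in sample if circle_dist(x, p) <= delta + tol]
        # F_a ⊆ P^δ_{a+Cδ}: some sample point within delta of x has value <= f(x) + C*delta.
        if not near or min(near) > f(x) + shift + tol:
            return False
        # P^δ_a ⊆ F_{a+Cδ}: f(x) <= f(p) + C*delta for every sample point p within delta of x.
        if f(x) > min(near) + shift + tol:
            return False
    return True
```

With f = sin, which is 1-Lipschitz with respect to arc length, and 12 evenly spaced sample angles (so ε = π/12), the check succeeds for δ = 0.3 ≥ ε. It fails for δ = 0.1 < ε, because some points of the circle are not covered, and it fails if the Lipschitz constant is understated, for example for 3·sin with C = 1.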

The intrinsic Čech complex C^δ_X(P_α) is the nerve complex of {B_X(p, δ)}_{p∈P_α}. Furthermore, for δ < ρ_c(X), the family of geodesic balls {B_X(p, δ)}_{p∈P_α} forms a cover of the union P^{δ;X}_α that satisfies the condition of the Nerve Theorem (Theorem 2.1). Hence, there is a homotopy equivalence between the nerve complex C^δ_X(P_α) and P^{δ;X}_α. Furthermore, using the same argument as for showing that the diagram in Eqn. (6.16) commutes (Lemma 3.4 of [91]), one can show that the following diagram commutes for any α ≤ β ∈ R and δ ≤ ξ < ρ_c(X):

    H_∗(P^{δ;X}_α)   ────→   H_∗(P^{ξ;X}_β)
          │ ≅                      │ ≅
          ↓                        ↓
    H_∗(C^δ_X(P_α))  ────→   H_∗(C^ξ_X(P_β))        (6.25)

Here the horizontal homomorphisms are induced by inclusions, and the vertical ones are isomorphisms induced by the homotopy equivalence between a union of geodesic balls and its nerve complex. The above diagram leads to the following result (see Lemma 2 of [87] for details):

Corollary 6.15. Let X, f, and P be as in Proposition 6.14 (although f does not need to be C-Lipschitz). For any δ < ρ_c(X), {P^{δ;X}_α}_{α∈R} and {C^δ_X(P_α)}_{α∈R} are 0-interleaved. Hence they induce isomorphic persistence modules, which have identical persistence diagrams.

Combined with Proposition 6.14, this implies that the filtration {C^δ_X(P_α)}_α and the filtration {F_α}_α are Cδ-interleaved for ε ≤ δ < ρ_c(X).

However, we cannot access the intrinsic metric d_X of the manifold X and thus cannot directly construct intrinsic Čech complexes. It turns out that for points that are sufficiently close, their Euclidean distance is a constant-factor approximation of the geodesic distance between them on X.

Proposition 6.16. Let X ⊂ R^d be an embedded Riemannian manifold with reach ρ_X. For any two points x, y ∈ X with d_E(x, y) ≤ ρ_X/2, we have that:

    d_E(x, y) ≤ d_X(x, y) ≤ (1 + 4 d_E(x, y)² / (3 ρ_X²)) d_E(x, y) ≤ (4/3) d_E(x, y).
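A quick numerical sanity check of this kind of comparison is possible on the unit circle, where everything is explicit. The sketch below assumes that the comparison has the form d_E ≤ d_X ≤ (1 + 4 d_E² / (3 ρ_X²)) d_E ≤ (4/3) d_E, and uses the facts that the unit circle has reach 1, that two points at angle θ ≤ π apart have geodesic distance θ, and that their chord has length 2 sin(θ/2). The function names are hypothetical.

```python
import math

REACH = 1.0  # reach of the unit circle in R^2 (its radius)


def chord_and_arc(theta):
    """Euclidean (chord) and geodesic (arc) distance for two points at angle theta apart."""
    return 2 * math.sin(theta / 2), theta


def check_metric_comparison(n=1000):
    """Check d_E <= d_X <= (1 + 4 d_E^2 / (3 rho^2)) d_E <= (4/3) d_E for d_E <= rho / 2."""
    theta_max = 2 * math.asin(REACH / 4)  # angle at which the chord length equals REACH / 2
    for k in range(1, n + 1):
        d_e, d_x = chord_and_arc(theta_max * k / n)
        middle = (1 + 4 * d_e**2 / (3 * REACH**2)) * d_e
        # The small slack absorbs rounding at the endpoint d_E = REACH / 2,
        # where the last inequality holds with equality.
        if not (d_e <= d_x <= middle <= 4 * d_e / 3 + 1e-12):
            return False
    return True
```

On the circle the geodesic distance exceeds the chord length only slightly in this range, so the factor 4/3 is far from tight there; it is a uniform bound over all manifolds with the given reach.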

Proposition 6.16 implies the following nested relation between the extrinsic and intrinsic Čech complexes:

    C^δ_X(P_α) ⊆ C^δ(P_α) ⊆ C^{4δ/3}_X(P_α)   for any δ < (3/8) ρ_X.    (6.26)

Note that a similar relation also holds between the intrinsic Čech filtration and the extrinsic Rips complexes, due to the nested relation between extrinsic Čech and Rips complexes. To infer persistent homology from nested pair filtrations for complexes constructed under the Euclidean metric, we use the following key lemma from [87], which can be thought of as a persistent version as well as a generalization of Fact 6.3.

Proposition 6.17. Let X, f, and P be as in Proposition 6.14. Suppose that there exist ε′ ≤ ε″ ∈ [ε, ρ_c(X)) and two filtrations {G_α}_α and {G′_α}_α so that

    for all α ∈ R,  C^ε_X(P_α) ⊆ G_α ⊆ C^{ε′}_X(P_α) ⊆ G′_α ⊆ C^{ε″}_X(P_α).

Then the persistence module induced by the filtration {F_α}_α for f and that induced by the nested pair filtration {G_α ↪ G′_α}_α are Cε″-interleaved, where f is C-Lipschitz.

Combining this proposition with the sequences in Eqn. (6.26), we obtain the following results on inferring the persistent homology induced by a function f : X → R.

Theorem 6.18. Let X ⊂ R^d be a compact Riemannian manifold with intrinsic metric d_X, and let f : X → R be a C-Lipschitz function on X. Let ρ_X and ρ_c(X) be the reach and the strong convexity radius of (X, d_X), respectively. Suppose P ⊂ X is an ε-geodesic sample of X, equipped with f̂ : P → R such that f̂ = f|_P. Then:

(i) for any fixed r such that ε ≤ r ≤ min{(9/16) ρ_c(X), (9/32) ρ_X}, the persistent homology module induced by the sublevel-set filtration of f : X → R and that induced by the filtration of nested pairs {C^r(P_α) ↪ C^{4r/3}(P_α)}_α are (16/9)Cr-interleaved; and

(ii) for any fixed r such that 2ε ≤ r ≤ min{(9/32) ρ_c(X), (9/64) ρ_X}, the persistent homology module induced by the sublevel-set filtration of f and that induced by the filtration of nested pairs {VR^r(P_α) ↪ VR^{8r/3}(P_α)}_α are (32/9)Cr-interleaved.

In particular, in each case above, the bottleneck distance between their respective persistence diagrams is bounded by the stated interleaving distance between persistence modules.
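The constants of Theorem 6.18 are easy to misread, so a small helper that evaluates them may be useful. The sketch below is only bookkeeping under assumptions: it takes the ranges of r, the outer radii 4r/3 and 8r/3, and the interleaving constants 16/9 and 32/9 as stated in parts (i) and (ii), and it presumes that ε, ρ_c(X), ρ_X, and C are known, which in practice they usually are not. The function name `nested_pair_bounds` is hypothetical.

```python
def nested_pair_bounds(r, eps, rho_c, rho_x, lipschitz_c, complex_type="cech"):
    """Return (outer_radius, interleaving_bound) for the nested pair at radius r.

    Raises ValueError if r lies outside the range allowed by Theorem 6.18.
    """
    if complex_type == "cech":      # part (i): C^r -> C^{4r/3}
        lo, hi = eps, min(9 * rho_c / 16, 9 * rho_x / 32)
        outer, bound = 4 * r / 3, 16 * lipschitz_c * r / 9
    elif complex_type == "rips":    # part (ii): VR^r -> VR^{8r/3}
        lo, hi = 2 * eps, min(9 * rho_c / 32, 9 * rho_x / 64)
        outer, bound = 8 * r / 3, 32 * lipschitz_c * r / 9
    else:
        raise ValueError("complex_type must be 'cech' or 'rips'")
    if not lo <= r <= hi:
        raise ValueError(f"r must lie in [{lo}, {hi}]")
    return outer, bound
```

The returned bound is also the bound on the bottleneck distance between the two persistence diagrams. Since the bound grows linearly in r, the smallest admissible r (ε for Čech, 2ε for Rips) gives the best guarantee.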

6.5 Notes and Exercises

Part of Theorem 6.3 is proved in [77, 78]. A complete proof as well as a thorough treatment of geometric complexes such as Rips and Čech complexes can be found in [81]. The first approach to data sparsification for Rips filtrations was proposed by Sheehy [274]. The presentation of Chapter 6.2.1 is based on a combination of the treatments of sparsification in [56] and [275] (in [275], a net-tower created via a net-tree data structure (e.g., [182]) is used for constructing the sparse Rips filtration). An extension of such sparsification to Čech complexes and a geometric interpretation are provided in [70]. The Rips sparsification is extended to handle weighted Rips complexes derived from distance to measures in [56]. Sparsification via simplicial towers is introduced in [125]. This is an application of the algorithm we presented in Section 4.2 for computing persistent homology for a simplicial tower. Simplicial maps allow batch collapse of vertices and lead to more aggressive sparsification. However, in practice it is observed that this approach also suffers from over-connection as vertices are collapsed. This issue is addressed in [135]. In particular, the SimBa algorithm of [135] exploits simplicial maps for sparsification, but connects vertices at sparser levels based on a certain distance between two sets (each of which is intuitively the set of original points mapped to a vertex at the present sparsified level). While SimBa has similar approximation guarantees in sparsification, in practice its sparsified sequence of complexes is much smaller than those of prior approaches.

Much of the material in Section 6.3 is taken from [81, 87, 91, 245]. We remark that there are different variations of the medial axis in the literature. We follow the notation from [119]. We also note that there exists a robust version of the medial axis, called the λ-medial axis, proposed in [89]. The concept of the local feature size was originally proposed in [270] in the context of mesh generation, and a different version, the one we describe in this chapter, was introduced in [8] in the context of curve/surface reconstruction. The local feature size has been widely used in the field of surface reconstruction and mesh generation; see the books [98, 119]. Critical points of the distance field were originally studied in [177]. See [89, 90, 225] for further studies as well as the development of weak feature sizes.

In homology inference for manifolds, we note that Niyogi, Smale and Weinberger in [245] provide two deformation retract results from a union of balls over P to a manifold X; Proposition 3.1 holds for the case when P ⊂ X, while Proposition 7.1 holds when P is within a tubular neighborhood of X. The latter has a much stronger requirement on the radius α. In our presentation, Proposition 6.10 uses a corollary of Proposition 3.1 of [245] to obtain an isomorphism between the homology groups of the union of balls and of X. This allows a better range of the parameter α; however, we lose the deformation retraction here (see the footnote above Proposition 6.10).

Results in Chapter 6.4 are mostly based on the work in [87].

This chapter focuses on presenting the main framework behind homology (or persistent homology) inference from point cloud data. The current theoretical guarantees hold when input points sample the hidden domain well within Hausdorff distance. For more general noise models that include outliers and statistical noise, we need a more robust notion of distance field than what we used in Section 6.3.1. To this end, an elegant concept called distance to measures (DTM) has been proposed in [79], which has many nice properties and can lead to more robust homological inferences; see, e.g., [82]. An alternative approach using kernel distance is proposed in [256]. See also [56, 79, 246] for data sparsification or homology inference for points corrupted with more general noise, and [55] for persistent homology inference under more general noise for input scalar fields.

Exercises

1. Prove Part (i) of Theorem 6.3.

2. Prove the bound on the Rips pseudo-distance d_Rips(P, Q) in Part (ii) of Theorem 6.3.

3. Given two finite sets of points P, Q ⊂ R^d, let d_P and d_Q denote the restriction of the Euclidean metric to P and Q, respectively. Consider the Hausdorff distance δ_H = d_H(P, Q) between P and Q, as well as the Gromov-Hausdorff distance δ_GH = d_GH((P, d_P), (Q, d_Q)).

(i) Prove that δ_GH ≤ δ_H.

(ii) Assume P, Q ⊂ R². Let T stand for the set of rigid transformations over R² (rotations, reflections, translations, and their combinations). Let δ*_H := inf_{t∈T} d_H(P, t(Q)) denote the smallest Hausdorff distance possible between P and a copy of Q under rigid transformation. Give an example of P, Q ⊂ R² such that δ*_H is much larger than δ_GH, say δ*_H ≥ 10 δ_GH (in fact, this can hold for any fixed constant).

4. Prove Proposition 6.5.

5. Consider the greedy permutation approach introduced in Chapter 6.2, and the assignment of exit-times for points p ∈ P. Construct the open tower {N_γ} and the closed tower {N̄_γ} as described in the chapter. Prove that both N_γ and N̄_γ are γ-nets for P.

6. Suppose we are given P_0 ⊃ P_1 sampled from a metric space (Z, d), where P_1 is a γ-net of P_0. Define π : P_0 → P_1 by p ↦ argmin_{q∈P_1} d(p, q) (if argmin_{q∈P_1} d(p, q) contains more than one point, then set π(p) to be any point q that minimizes d(p, q)).

(a) Prove that the vertex map π induces a simplicial map π : VR^α(P_0) → VR^{α+γ}(P_1).

(b) Consider the following diagram. Prove that the map j ∘ π is contiguous to the inclusion map i.

    VR^α(P_0) ────i────→ VR^{α+γ}(P_0)
          \                    ↑
         π \                   │ j
            ↘                  │
             VR^{α+γ}(P_1)                    (6.27)

7. Let P be a set of points in R^d. Let d_2 and d_1 denote the distance metrics under the L_2 norm and the L_1 norm, respectively. Let C_2(P) and C_1(P) be the Čech filtrations over P induced by d_2 and d_1, respectively. Show the relation between the log-scaled versions of the persistence diagrams Dgm_log C_2(P) and Dgm_log C_1(P); that is, bound d_b(Dgm_log C_2(P), Dgm_log C_1(P)) (see the discussion above Corollary 4.4 in Chapter 4).

8. Prove Proposition 6.14. Using the fact that Diagram 6.25 commutes, prove Corollary 6.15.

From the document Computational Topology for Data Analysis (pages 177-184)