
Homology inference for scalar fields

In the document Computational Topology for Data Analysis (pages 177-184)

Topological Analysis of Point Clouds

6.4 Homology inference for scalar fields

Suppose we are only given a finite sample $P \subset X$ from a smooth manifold $X \subset \mathbb{R}^d$ together with a potentially noisy version $\hat f$ of a smooth function $f : X \to \mathbb{R}$, presented as a vertex function $\hat f : P \to \mathbb{R}$. We are interested in recovering the persistent homology of the sublevel filtration of $f$ from $\hat f$. That is, the goal is to approximate the persistent homology induced by $f$ from the discrete sample $P$ and the values of $\hat f$ at the points of $P$.

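6.4.1 Problem setup

Let $F_\alpha = f^{-1}((-\infty, \alpha])$ denote the sublevel set of $f$ at $\alpha$; these sets are nested by inclusion, $F_\alpha \subseteq F_\beta$ for $\alpha \le \beta$. For simplicity, we often write the filtration and the corresponding persistence module as $F_f = \{F_\alpha\}_{\alpha\in\mathbb{R}}$ and $H_p F_f = \{H_p(F_\alpha)\}_{\alpha\in\mathbb{R}}$, when the choices of maps connecting their elements are clear.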

Our goal is to approximate the persistence diagram $\mathrm{Dgm}_p(F_f)$ from the point sample $P$ and $\hat f : P \to \mathbb{R}$. Intuitively, we construct a specific Čech (or Rips) complex $C^r(P)$, use $\hat f$ to induce a filtration of $C^r(P)$, and then use its persistent homology to approximate $\mathrm{Dgm}_p(F_f)$. More specifically, we need to consider a nested pair filtration for either $C^r(P)$ or $\mathrm{VR}^r(P)$.

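Nested pair filtration. Let $P_\alpha = \{p \in P \mid \hat f(p) \le \alpha\}$ be the set of sample points whose $\hat f$-value is at most $\alpha$, which presumably samples the sublevel set $F_\alpha$ of $X$ w.r.t. $f$. To estimate the topology of $F_\alpha$ from the discrete sample $P_\alpha$, we consider either the Čech complex $C^r(P_\alpha)$ or the Rips complex $\mathrm{VR}^r(P_\alpha)$. For the time being, consider $\mathrm{VR}^r(P_\alpha)$. As we already saw in previous sections, the topological information of $F_\alpha$ can be inferred from a pair of nested complexes $\mathrm{VR}^r(P_\alpha) \overset{i_\alpha}{\hookrightarrow} \mathrm{VR}^{r'}(P_\alpha)$ for some appropriate $r < r'$. To study $F_f$, we need to inspect $F_\alpha \to F_\beta$ for $\alpha \le \beta$. Since $P_\alpha \subseteq P_\beta$, the inclusions among the four complexes $\mathrm{VR}^r(P_\alpha)$, $\mathrm{VR}^{r'}(P_\alpha)$, $\mathrm{VR}^r(P_\beta)$, and $\mathrm{VR}^{r'}(P_\beta)$ form a commutative square, and so do the induced homomorphisms in homology. The homomorphism induced by $\mathrm{VR}^{r'}(P_\alpha) \hookrightarrow \mathrm{VR}^{r'}(P_\beta)$ therefore restricts to a map $\phi_\alpha^\beta : \mathrm{im}(i_{\alpha*}) \to \mathrm{im}(i_{\beta*})$; this map is well-defined as the square commutes. This gives rise to a persistence module $\{\mathrm{im}(i_{\alpha*}); \phi_\alpha^\beta\}_{\alpha\le\beta}$, that is, a family of vector spaces $\mathrm{im}(i_{\alpha*})$, totally ordered by $\alpha$, with commuting homomorphisms $\phi_\alpha^\beta$ between any two elements. We formalize and generalize the above construction below.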

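Definition 6.11 (Nested pair filtration). A nested pair filtration is a sequence of pairs of complexes $\{AB_\alpha = (A_\alpha, B_\alpha)\}_{\alpha\in\mathbb{R}}$ where (i) $A_\alpha \overset{i_\alpha}{\hookrightarrow} B_\alpha$ is an inclusion for every $\alpha$ and (ii) $AB_\alpha \hookrightarrow AB_\beta$ for $\alpha \le \beta$ is given by $A_\alpha \hookrightarrow A_\beta$ and $B_\alpha \overset{j_\alpha^\beta}{\hookrightarrow} B_\beta$. The $p$-th persistence module of the filtration $\{AB_\alpha\}_{\alpha\in\mathbb{R}}$ is given by the homology module $\{\mathrm{im}(H_p(A_\alpha) \to H_p(B_\alpha)); \phi_\alpha^\beta\}_{\alpha\le\beta}$, where $\phi_\alpha^\beta$ is the restriction of $j^\beta_{\alpha*}$ to $\mathrm{im}\, i_{\alpha*}$. For simplicity, we say the module is induced by the nested pair filtration $\{A_\alpha \hookrightarrow B_\alpha\}$.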

The high-level approach to inferring the persistent homology of a scalar field $f : X \to \mathbb{R}$ from a set of points $P$ equipped with $\hat f : P \to \mathbb{R}$ involves the following steps:

Step 1. Sort all points of $P$ in non-decreasing order of $\hat f$-value, $P = \{p_1, \ldots, p_n\}$. Set $\alpha_i = \hat f(p_i)$ for $i \in [1, n]$.

Step 2. Compute the persistence diagram induced by the filtration of nested pairs $\{\mathrm{VR}^r(P_{\alpha_i}) \hookrightarrow \mathrm{VR}^{r'}(P_{\alpha_i})\}_{i\in[1,n]}$ (or $\{C^r(P_{\alpha_i}) \hookrightarrow C^{r'}(P_{\alpha_i})\}_{i\in[1,n]}$) for appropriate parameters $0 < r < r'$.

The persistent homology (as well as the persistence diagram) induced by the filtration of nested pairs is computed via the algorithm in [105]. To obtain an approximation guarantee for the above approach, we consider an intermediate object defined by the intrinsic Riemannian metric on the manifold $X$. Indeed, note that the filtration of $X$ w.r.t. $f$ is intrinsic in the sense that it is independent of how $X$ is embedded in $\mathbb{R}^d$. Hence it is more natural to approximate its persistent homology with an object defined intrinsically for $X$.
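Steps 1 and 2 can be set up as in the following minimal sketch, which sorts the sample and records, for each threshold $\alpha_i$, the edge sets of the two Rips complexes in the nested pair (a Rips complex is the clique complex of its edges, so the edges determine it). The function names, the use of prefixes of the sorted order (ties in $\hat f$ are broken arbitrarily), and the edge convention $d(p, q) \le 2r$ are assumptions made for illustration. The sketch stops short of computing the persistence of the nested pairs, for which the text refers to the algorithm in [105].

```python
import math
from itertools import combinations


def rips_edges(points, r):
    """Edges of the Rips complex on `points`.

    Assumed convention: {p, q} is an edge when d(p, q) <= 2r.
    Some sources use d(p, q) <= r instead; adjust if needed.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= 2 * r
    }


def nested_pair_filtration(points, f_hat, r, r_prime):
    """Return [(alpha_i, edges of VR^r, edges of VR^r')] for i = 1..n.

    Vertex indices refer to the points after sorting by f_hat (Step 1).
    """
    assert 0 < r < r_prime
    order = sorted(range(len(points)), key=lambda i: f_hat[i])  # Step 1
    pts = [points[i] for i in order]
    alphas = [f_hat[i] for i in order]
    pairs = []
    for k in range(1, len(pts) + 1):
        prefix = pts[:k]  # the first k sorted points stand in for P_{alpha_k}
        pairs.append((alphas[k - 1], rips_edges(prefix, r), rips_edges(prefix, r_prime)))
    return pairs


pairs = nested_pair_filtration(
    points=[(0, 0), (1, 0), (2, 0), (5, 0)],
    f_hat=[0.3, 0.1, 0.2, 0.9],
    r=0.5,
    r_prime=1.0,
)
for alpha, inner, outer in pairs:
    print(alpha, sorted(inner), sorted(outer))
```

At every threshold the inner edge set is contained in the outer one, and both grow with $\alpha_i$; these are the two inclusions required of a nested pair filtration.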

Given a compact Riemannian manifold $X \subset \mathbb{R}^d$ embedded in $\mathbb{R}^d$, let $d_X$ be the Riemannian metric of $X$ inherited from the Euclidean metric $d_E$ of $\mathbb{R}^d$. Let $B_X(x, r) := \{y \in X \mid d_X(x, y) \le r\}$ be the geodesic ball on $X$ centered at $x$ with radius $r$, and let $B_X^o(x, r)$ be the open geodesic ball. In contrast, $B_E(x, r)$ (or simply $B(x, r)$) denotes the Euclidean ball in $\mathbb{R}^d$. A ball $B_X^o(x, r)$ is strongly convex if for every pair $y, y' \in B_X(x, r)$, there exists a unique minimizing geodesic between $y$ and $y'$ whose interior is contained within $B_X^o(x, r)$. For details on these concepts, see [76, 164].

Definition 6.12 (Strong convexity). For $x \in X$, let $\rho_c(x; X)$ denote the supremum of the radii $r$ such that the geodesic ball $B_X^o(x, r)$ is strongly convex. The strong convexity radius of $(X, d_X)$ is defined as $\rho_c(X) := \inf_{x\in X} \rho_c(x; X)$.

Let $d_X(x, P) := \inf_{p\in P} d_X(x, p)$ denote the geodesic distance from $x$ to the closest point of the set $P \subseteq X$.

Definition 6.13 ($\varepsilon$-geodesic sample). A point set $P \subset X$ is an $\varepsilon$-geodesic sample of $(X, d_X)$ if for all $x \in X$, $d_X(x, P) \le \varepsilon$.
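As a concrete instance of Definition 6.13, take $X$ to be the unit circle, where the geodesic distance is arc length. The sketch below computes the smallest $\varepsilon$ for which a set of points, given by their angles, is an $\varepsilon$-geodesic sample: it is half of the largest angular gap between consecutive sample points. The choice of manifold and the function names are assumptions made for illustration only.

```python
import math


def circle_cover_radius(angles):
    """sup over x on the unit circle of d_X(x, P), for P given by angles in radians."""
    a = sorted(t % (2 * math.pi) for t in angles)
    # Gaps between consecutive points, plus the gap that wraps around 2*pi.
    gaps = [hi - lo for lo, hi in zip(a, a[1:])]
    gaps.append(a[0] + 2 * math.pi - a[-1])
    # The point farthest from P sits in the middle of the largest gap.
    return max(gaps) / 2


def is_eps_geodesic_sample(angles, eps):
    """True when every point of the circle is within arc length eps of the sample."""
    return circle_cover_radius(angles) <= eps


angles = [2 * math.pi * k / 8 for k in range(8)]  # 8 equally spaced points
print(circle_cover_radius(angles))
print(is_eps_geodesic_sample(angles, 0.4), is_eps_geodesic_sample(angles, 0.39))
```

For $n$ equally spaced points the covering radius is $\pi/n$, so eight points form an $\varepsilon$-geodesic sample exactly when $\varepsilon \ge \pi/8$.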

Recall that $P_\alpha$ is the set of points in $P$ with $\hat f$-value at most $\alpha$. The union of geodesic balls $P_\alpha^{\delta;X} = \bigcup_{p\in P_\alpha} B_X(p, \delta)$ is intuitively the "$\delta$-thickening" of $P_\alpha$ within the manifold $X$. We use two kinds of Čech and Rips complexes. One kind is defined with the metric $d_E$ of the ambient Euclidean space; we call these the (extrinsic) Čech complex $C^\delta(P_\alpha)$ and the (extrinsic) Rips complex $\mathrm{VR}^\delta(P_\alpha)$. The other kind, the intrinsic Čech complex $C_X^\delta(P_\alpha)$ and the intrinsic Rips complex $\mathrm{VR}_X^\delta(P_\alpha)$, is defined with the intrinsic metric $d_X$. Note that $C_X^\delta(P_\alpha)$ is the nerve complex of the union of geodesic balls forming $P_\alpha^{\delta;X}$. Also, the interleaving relation between the Čech and Rips complexes remains the same as for general geodesic spaces; that is, $C_X^\delta(P_\alpha) \subseteq \mathrm{VR}_X^\delta(P_\alpha) \subseteq C_X^{2\delta}(P_\alpha)$ for any $\alpha$ and $\delta$.

6.4.2 Inference guarantees

Recall from Chapter 4.1 that two $\varepsilon$-interleaved filtrations lead to $\varepsilon$-interleaved persistence modules, which in turn means that the bottleneck distance between their persistence diagrams is bounded by $\varepsilon$. Here we first relate the space filtration to the intrinsic Čech filtrations and then relate these intrinsic ones to the extrinsic Čech or Rips filtrations of nested pairs, as illustrated in Eqn. (6.24) below.

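$$\{F_\alpha\} \longleftrightarrow \{P_\alpha^{r;X}\} \longleftrightarrow \{C_X^r(P_\alpha)\} \longleftrightarrow \{C^r(P_\alpha) \hookrightarrow C^{r'}(P_\alpha)\} \ \text{or}\ \{\mathrm{VR}^r(P_\alpha) \hookrightarrow \mathrm{VR}^{r'}(P_\alpha)\} \qquad (6.24)$$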

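Proposition 6.14. Let $X \subset \mathbb{R}^d$ be a compact Riemannian manifold with intrinsic metric $d_X$, and let $f : X \to \mathbb{R}$ be a $C$-Lipschitz function. Suppose $P \subset X$ is an $\varepsilon$-geodesic sample of $X$, equipped with $\hat f : P \to \mathbb{R}$ so that $\hat f = f|_P$. Then, for any fixed $\delta \ge \varepsilon$, the filtration $\{F_\alpha\}_\alpha$ and the filtration $\{P_\alpha^{\delta;X}\}_\alpha$ are $(C\delta)$-interleaved w.r.t. inclusions.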
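The two inclusions behind Proposition 6.14, $F_\alpha \subseteq P_{\alpha + C\delta}^{\delta;X}$ and $P_\alpha^{\delta;X} \subseteq F_{\alpha + C\delta}$, can be checked numerically on a small example. The sketch below does so on the unit circle with $f = \cos$, which is 1-Lipschitz with respect to arc length, testing membership on a finite grid of points. The manifold, the function, the grid, and all names are assumptions chosen for illustration; a grid check is evidence, not a proof.

```python
import math


def geodesic(a, b):
    """Arc-length distance between two angles on the unit circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def in_sublevel(f, x, alpha):
    """True when x lies in the sublevel set F_alpha."""
    return f(x) <= alpha


def in_thickening(f, sample, x, alpha, delta):
    """True when x lies in the union of geodesic delta-balls around P_alpha."""
    return any(f(p) <= alpha and geodesic(x, p) <= delta for p in sample)


def is_interleaved(f, sample, shift, delta, alphas, grid):
    """Check both inclusions of a shift-interleaving at the grid points."""
    for alpha in alphas:
        for x in grid:
            if in_sublevel(f, x, alpha) and not in_thickening(f, sample, x, alpha + shift, delta):
                return False
            if in_thickening(f, sample, x, alpha, delta) and not in_sublevel(f, x, alpha + shift):
                return False
    return True


sample = [2 * math.pi * k / 12 for k in range(12)]  # a (pi/12)-geodesic sample
grid = [2 * math.pi * k / 360 for k in range(360)]  # test points on the circle
alphas = [-0.99, -0.5, 0.0, 0.5, 1.0]
delta, C = 0.3, 1.0                                 # delta >= eps = pi/12

print(is_interleaved(math.cos, sample, C * delta, delta, alphas, grid))
print(is_interleaved(math.cos, sample, 0.0, delta, alphas, grid))
```

The second call uses a shift of zero, which fails: a point within arc length $\delta$ of a sample point in $P_\alpha$ can have a function value up to $C\delta$ above $\alpha$.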

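The intrinsic Čech complex $C_X^\delta(P_\alpha)$ is the nerve complex for $\{B_X(p, \delta)\}_{p\in P_\alpha}$. Furthermore, for $\delta < \rho_c(X)$, the family of geodesic balls $\{B_X(p, \delta)\}_{p\in P_\alpha}$ forms a cover of the union $P_\alpha^{\delta;X}$ that satisfies the condition of the Nerve Theorem (Theorem 2.1). Hence, there is a homotopy equivalence between the nerve complex $C_X^\delta(P_\alpha)$ and $P_\alpha^{\delta;X}$. Furthermore, using the same argument as for showing that the diagram in Eqn. (6.16) commutes (Lemma 3.4 of [91]), one can show that the following diagram commutes for any $\alpha \le \beta \in \mathbb{R}$ and $\delta \le \xi < \rho_c(X)$:

$$\begin{array}{ccc}
H_p(P_\alpha^{\delta;X}) & \longrightarrow & H_p(P_\beta^{\xi;X}) \\
\downarrow{\scriptstyle\cong} & & \downarrow{\scriptstyle\cong} \\
H_p(C_X^\delta(P_\alpha)) & \longrightarrow & H_p(C_X^\xi(P_\beta))
\end{array} \qquad (6.25)$$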

Here the horizontal homomorphisms are induced by inclusions, and the vertical ones are isomorphisms induced by the homotopy equivalence between a union of geodesic balls and its nerve complex. The above diagram leads to the following result (see Lemma 2 of [87] for details):

Corollary 6.15. Let $X$, $f$, and $P$ be as in Proposition 6.14 (although $f$ does not need to be $C$-Lipschitz). For any $\delta < \rho_c(X)$, $\{P_\alpha^{\delta;X}\}_{\alpha\in\mathbb{R}}$ and $\{C_X^\delta(P_\alpha)\}_{\alpha\in\mathbb{R}}$ are $0$-interleaved. Hence they induce isomorphic persistence modules, which have identical persistence diagrams.

Combining with Proposition 6.14, this implies that the filtration $\{C_X^\delta(P_\alpha)\}_\alpha$ and the filtration $\{F_\alpha\}_\alpha$ are $C\delta$-interleaved for $\varepsilon \le \delta < \rho_c(X)$.

However, we cannot access the intrinsic metric $d_X$ of the manifold $X$ and thus cannot directly construct intrinsic Čech complexes. It turns out that for points that are sufficiently close, their Euclidean distance is a constant-factor approximation of the geodesic distance between them on $X$.
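A circle of radius $R$, whose reach is $R$, gives a quick sanity check of this comparison: two points separated by central angle $\theta$ have geodesic (arc) distance $R\theta$ and Euclidean (chord) distance $2R\sin(\theta/2)$. The sketch below scans all pairs with $d_E \le \rho_X/2$ and reports the largest ratio $d_X/d_E$. The bound of $4/3$ used for comparison is taken from the nested pair $C^r \hookrightarrow C^{\frac{4}{3}r}$ in Theorem 6.18 and should be treated as an assumption here; the function names are illustrative.

```python
import math


def chord_and_arc(radius, angle):
    """Euclidean (chord) and geodesic (arc) distance on a circle of given radius."""
    return 2 * radius * math.sin(angle / 2), radius * angle


def worst_ratio(radius, frac=0.5, steps=10000):
    """Largest arc/chord ratio over pairs with chord <= frac * reach (reach = radius)."""
    max_angle = 2 * math.asin(frac / 2)  # central angle at which chord = frac * radius
    worst = 1.0
    for k in range(1, steps + 1):
        chord, arc = chord_and_arc(radius, max_angle * k / steps)
        worst = max(worst, arc / chord)
    return worst


ratio = worst_ratio(radius=3.0)
print(ratio, ratio <= 4 / 3)
```

On a circle the ratio stays well below $4/3$ (it is about 1.01 at $d_E = \rho_X/2$); the constant in the general statement has to cover all embedded manifolds of the given reach.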

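Proposition 6.16. Let $X \subset \mathbb{R}^d$ be an embedded Riemannian manifold with reach $\rho_X$. For any two points $x, y \in X$ with $d_E(x, y) \le \rho_X/2$, we have that $d_E(x, y) \le d_X(x, y) \le \frac{4}{3}\, d_E(x, y)$.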

This implies the following nested relation between the extrinsic and intrinsic ˇCech complexes:

$$C_X^\delta(P_\alpha) \subseteq C^\delta(P_\alpha) \subseteq C_X^{\frac{4}{3}\delta}(P_\alpha) \qquad (6.26)$$

for $\delta$ small enough relative to the reach $\rho_X$ that Proposition 6.16 applies. Note that a similar relation also holds between the intrinsic Čech filtration and the extrinsic Rips complexes, due to the nested relation between extrinsic Čech and Rips complexes. To infer persistent homology from nested pair filtrations for complexes constructed under the Euclidean metric, we use the following key lemma from [87], which can be thought of as a persistent version, as well as a generalization, of Fact 6.3.

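Proposition 6.17. Let $X$, $f$, and $P$ be as in Proposition 6.14. Suppose that there exist $\varepsilon' \le \varepsilon'' \in [\varepsilon, \rho_c(X))$ and two filtrations $\{G_\alpha\}_\alpha$ and $\{G'_\alpha\}_\alpha$, so that

for all $\alpha \in \mathbb{R}$, $C_X^{\varepsilon}(P_\alpha) \subseteq G_\alpha \subseteq C_X^{\varepsilon'}(P_\alpha) \subseteq G'_\alpha \subseteq C_X^{\varepsilon''}(P_\alpha)$.

Then the persistence module induced by the filtration $\{F_\alpha\}_\alpha$ for $f$ and that induced by the nested pairs of filtrations $\{G_\alpha \hookrightarrow G'_\alpha\}_\alpha$ are $C\varepsilon''$-interleaved, where $f$ is $C$-Lipschitz.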

Combining this proposition with the sequences in Eqn. (6.26), we obtain the following results on inferring the persistent homology induced by a function $f : X \to \mathbb{R}$.

Theorem 6.18. Let $X \subset \mathbb{R}^d$ be a compact Riemannian manifold with intrinsic metric $d_X$, and $f : X \to \mathbb{R}$ a $C$-Lipschitz function on $X$. Let $\rho_X$ and $\rho_c(X)$ be the reach and the strong convexity radius of $(X, d_X)$, respectively. Suppose $P \subset X$ is an $\varepsilon$-geodesic sample of $X$, equipped with $\hat f : P \to \mathbb{R}$ such that $\hat f = f|_P$. Then:

(i) for any fixed $r$ such that $\varepsilon \le r \le \min\{\frac{9}{16}\rho_c(X), \frac{9}{32}\rho_X\}$, the persistent homology module induced by the sublevel-set filtration of $f : X \to \mathbb{R}$ and that induced by the filtration of nested pairs $\{C^r(P_\alpha) \hookrightarrow C^{\frac{4}{3}r}(P_\alpha)\}_\alpha$ are $\frac{16}{9}Cr$-interleaved; and

(ii) for any fixed $r$ such that $2\varepsilon \le r \le \min\{\frac{9}{32}\rho_c(X), \frac{9}{64}\rho_X\}$, the persistent homology module induced by the sublevel-set filtration of $f$ and that induced by the filtration of nested pairs $\{\mathrm{VR}^r(P_\alpha) \hookrightarrow \mathrm{VR}^{\frac{8}{3}r}(P_\alpha)\}_\alpha$ are $\frac{32}{9}Cr$-interleaved.

In particular, in each case above, the bottleneck distance between their respective persistence diagrams is bounded by the stated interleaving distance between persistence modules.
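For a given sampling density and manifold, the theorem translates into an admissible range for $r$, an outer scale for the nested pair, and a bound on the bottleneck distance. The helper below is a small sketch of that bookkeeping. It reads the fractions in the theorem as $\frac{9}{16}$, $\frac{9}{32}$, $\frac{4}{3}$, $\frac{16}{9}$ for the Čech case and $\frac{9}{32}$, $\frac{9}{64}$, $\frac{8}{3}$, $\frac{32}{9}$ for the Rips case, which is an assumption about the garbled source; the function name, argument names, and the default choice of the smallest admissible $r$ are hypothetical.

```python
from fractions import Fraction

# Constants as read from Theorem 6.18 (an assumption; verify against your copy):
# kind -> (factor on eps, factor on rho_c, factor on rho_X, outer scale, interleaving)
_CONSTANTS = {
    "cech": (1, Fraction(9, 16), Fraction(9, 32), Fraction(4, 3), Fraction(16, 9)),
    "rips": (2, Fraction(9, 32), Fraction(9, 64), Fraction(8, 3), Fraction(32, 9)),
}


def nested_pair_parameters(kind, eps, rho_c, rho_x, lipschitz, r=None):
    """Admissible r, outer scale r', and interleaving bound for the nested pairs."""
    k_eps, k_c, k_x, k_outer, k_bound = _CONSTANTS[kind]
    r_min = k_eps * eps
    r_max = min(k_c * rho_c, k_x * rho_x)
    if r_min > r_max:
        raise ValueError("sample too coarse: no admissible r")
    if r is None:
        r = r_min  # smallest admissible r gives the tightest bound
    if not r_min <= r <= r_max:
        raise ValueError("r outside the admissible range")
    return {"r": r, "r_outer": k_outer * r, "bound": k_bound * lipschitz * r}


print(nested_pair_parameters("cech", Fraction(1, 10), 1, 1, 2))
print(nested_pair_parameters("rips", Fraction(1, 20), 1, 1, 2))
```

Since the bound scales linearly with $r$, and $r$ is bounded below by $\varepsilon$ (or $2\varepsilon$), a denser sample directly yields a tighter guarantee.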

6.5 Notes and Exercises

Part of Theorem 6.3 is proved in [77, 78]. A complete proof as well as a thorough treatment for geometric complexes such as Rips and Čech complexes can be found in [81]. The first approach to data sparsification for Rips filtrations was proposed by Sheehy [274]. The presentation of Chapter 6.2.1 is based on a combination of the treatments of sparsification in [56] and [275] (in [275], a net-tower created via a net-tree data structure (e.g., [182]) is used for constructing a sparse Rips filtration). An extension of such sparsification to Čech complexes and a geometric interpretation are provided in [70]. The Rips sparsification is extended to handle weighted Rips complexes derived from distance to measures in [56]. Sparsification via simplicial towers is introduced in [125]. This is an application of the algorithm we presented in Section 4.2 for computing persistent homology for a simplicial tower. Simplicial maps allow batch collapse of vertices and lead to more aggressive sparsification. However, in practice it is observed that this approach also suffers from over-connection as vertices are collapsed. This issue is addressed in [135]. In particular, the SimBa algorithm of [135] exploits simplicial maps for sparsification, but connects vertices at sparser levels based on a certain distance between two sets (each of which intuitively is the set of original points mapped to a vertex at the present sparsified level). While SimBa has similar approximation guarantees in sparsification, in practice the sparsified sequence of complexes is much smaller than with prior approaches.

Much of the material in Section 6.3 is taken from [81, 87, 91, 245]. We remark that there are different variations of the medial axis in the literature. We follow the notation from [119]. We also note that there exists a robust version of the medial axis, called the $\lambda$-medial axis, proposed in [89]. The concept of the local feature size was originally proposed in [270] in the context of mesh generation, and a different version, the one we describe in this chapter, was introduced in [8] in the context of curve/surface reconstruction. The local feature size has been widely used in the field of surface reconstruction and mesh generation; see the books [98, 119]. Critical points of the distance field were originally studied in [177]. See [89, 90, 225] for further studies as well as the development of weak feature sizes.

In homology inference for manifolds, we note that Niyogi, Smale and Weinberger in [245] provide two deformation retract results from a union of balls over $P$ to a manifold $X$; Proposition 3.1 holds for the case when $P \subset X$, while Proposition 7.1 holds when $P$ is within a tubular neighborhood of $X$. The latter has a much stronger requirement on the radius $\alpha$. In our presentation, Proposition 6.10 uses a corollary of Proposition 3.1 of [245] to obtain an isomorphism between the homology groups of the union of balls and of $X$. This allows a better range of the parameter $\alpha$; however, we lose the deformation retraction here (see the footnote above Proposition 6.10).

Results in Chapter 6.4 are mostly based on the work in [87].

This chapter focuses on presenting the main framework behind homology (or persistent homology) inference from point cloud data. The current theoretical guarantees hold when the input points sample the hidden domain well within Hausdorff distance. For more general noise models that include outliers and statistical noise, we need a more robust notion of distance field than what we used in Section 6.3.1. To this end, an elegant concept called distance to measures (DTM) has been proposed in [79], which has many nice properties and can lead to more robust homological inferences; see, e.g., [82]. An alternative approach using kernel distance is proposed in [256]. See also [56, 79, 246] for data sparsification or homology inference for points corrupted with more general noise, and [55] for persistent homology inference under more general noise for input scalar fields.

Exercise

1. Prove Part (i) of Theorem 6.3.

2. Prove the bound on the Rips pseudo-distance $d_{\mathrm{Rips}}(P, Q)$ in Part (ii) of Theorem 6.3.

3. Given two finite sets of points $P, Q \subset \mathbb{R}^d$, let $d_P$ and $d_Q$ denote the restriction of the Euclidean metric over $P$ and $Q$, respectively. Consider the Hausdorff distance $\delta_H = d_H(P, Q)$ between $P$ and $Q$, as well as the Gromov-Hausdorff distance $\delta_{GH} = d_{GH}((P, d_P), (Q, d_Q))$.

(i) Prove that $\delta_{GH} \le \delta_H$.

(ii) Assume $P, Q \subset \mathbb{R}^2$. Let $T$ stand for the set of rigid transformations over $\mathbb{R}^2$ (rotations, reflections, translations and their combinations). Let $\overline{\delta}_H := \inf_{t\in T} d_H(P, t(Q))$ denote the smallest Hausdorff distance possible between $P$ and a copy of $Q$ under rigid transformation. Give an example of $P, Q \subset \mathbb{R}^2$ such that $\overline{\delta}_H$ is much larger than $\delta_{GH}$, say $\overline{\delta}_H \ge 10\,\delta_{GH}$ (in fact, this can hold for any fixed constant).

4. Prove Proposition 6.5.

5. Consider the greedy permutation approach introduced in Chapter 6.2, and the assignment of exit-times for points $p \in P$. Construct the open tower $\{N_\gamma\}$ and the closed tower $\{\overline{N}_\gamma\}$ as described in the chapter. Prove that both $N_\gamma$ and $\overline{N}_\gamma$ are $\gamma$-nets for $P$.

6. Suppose we are given $P_0 \supset P_1$ sampled from a metric space $(Z, d)$, where $P_1$ is a $\gamma$-net of $P_0$. Define $\pi : P_0 \to P_1$ by $\pi(p) = \arg\min_{q\in P_1} d(p, q)$ (if $\arg\min_{q\in P_1} d(p, q)$ contains more than one point, then set $\pi(p)$ to be any point $q$ that minimizes $d(p, q)$).

(a) Prove that the vertex map $\pi$ induces a simplicial map $\pi : \mathrm{VR}^{\alpha}(P_0) \to \mathrm{VR}^{\alpha+\gamma}(P_1)$.

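(b) Consider the following diagram. Prove that the map $j \circ \pi$ is contiguous to the inclusion map $i$.

$$\begin{array}{ccc}
\mathrm{VR}^{\alpha}(P_0) & \overset{i}{\longrightarrow} & \mathrm{VR}^{\alpha+\gamma}(P_0) \\
 & \searrow{\scriptstyle\pi} & \uparrow{\scriptstyle j} \\
 & & \mathrm{VR}^{\alpha+\gamma}(P_1)
\end{array} \qquad (6.27)$$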

7. Let $P$ be a set of points in $\mathbb{R}^d$. Let $d_2$ and $d_1$ denote the distance metrics under the $L_2$ norm and the $L_1$ norm, respectively. Let $C_2(P)$ and $C_1(P)$ be the Čech filtrations over $P$ induced by $d_2$ and $d_1$, respectively. Show the relation between the log-scaled versions of the persistence diagrams $\mathrm{Dgm}_{\log} C_2(P)$ and $\mathrm{Dgm}_{\log} C_1(P)$; that is, bound $d_b(\mathrm{Dgm}_{\log} C_2(P), \mathrm{Dgm}_{\log} C_1(P))$ (see the discussion above Corollary 4.4 in Chapter 4).

8. Prove Proposition 6.14. Using the fact that Diagram 6.25 commutes, prove Corollary 6.15.
