• Keine Ergebnisse gefunden

Algorithms in the PL-setting

Im Dokument Computational Topology for Data Analysis (Seite 187-197)

Reeb Graphs

7.2 Algorithms in the PL-setting

Piecewise-linear setting. Consider a simplicial complex Kand a PL-function f :|K| →Ron it. SinceRf depends only on the connectivity of each levelset, for a generic function f (where no two vertices have the same function value), the Reeb graph of f depends only on the 2-skeleton

of K. From now on, we assume that f is generic and K = (V,E,T) is a simplicial 2-complex with vertex setV, edge setEand triangle setT. Letnv,neandnt denote the size ofV,E, andT, respectively, and setm= nv+ne+nt. We sketch algorithms to compute the Reeb graph for the PL-function f. Sometimes, they output the so-calledaugmented Reeb graph, which is essentially a refinement of the Reeb graphRf with certain additional degree-2 vertices inserted in arcs ofRf. Definition 7.3 (Augmented Reeb). Given a PL-function f : |K| → R defined on a simplicial complexK=(V,E,T), letRf be its Reeb graph andΦf :|K| →Rf(K) be the associated quotient map. Theaugmented Reeb graphof f :|K| →R, denoted bybRf, is obtained by inserting each point inΦf(V) :={Φf(v)|v∈V}as graph nodes toRf (if it is not already in).

r

q

p

w

Φf(p)

Φf(w)

Φf(p) Φf(q)

Φf(r)

Φf(w)

(a) (b) (c)

Figure 7.4: (a) A simplicial complexK. The set of 2-simpices ofK include4r pq,4r pw,4rqw, as well as the two dark-colored triangles incident to pand tow, respectively. (b) Reeb graph of the height function on|K|. (c) Its augmented Reeb graph.

For a PL-function, each critical point of the Reeb graphRf (w.r.t. ˜f :RfRinduced by f) is necessarily the image of some vertex inK, and thus the critical points form a subset of points in Φf(V). The augmented Reeb graphbRf then includes all remaining points inΦf(V) as (degree-2) graph nodes. See Figure 7.4 for an example, where as a convention, we plot a nodeΦf(v) at the same height (function value) asv.

We now sketch the main ideas behind two algorithms that compute the Reeb graph for a PL-function with the best time complexity, one deterministic and the other randomized.

7.2.1 AnO(mlogm)time algorithm via dynamic graph connectivity

Here we describe an O(mlogm)-time algorithm [252] for computing the Reeb graph of a PL-function f :|K| → R, whose time complexity is the best among all existing algorithms for Reeb graph computation. We assume for simplicity that no two vertices inV share the same f-value.

AsK= (V,E,T) is a simplicial 2-complex, the level set f1(a) for any function valuea con-sists of nodes (intersection of the level set f−1(a) with edges inE) and edges (intersection of the levelset f−1(a) with some triangles in T). This can be viewed as yet another graph, which we denote byGa = (Wa,Fa) and refer to as thepre-image graph: Each vertex inWa corresponds to some edge inE. Each edge inFaconnects two vertices inWaand thus can be associated to a pair of edges in Eadjoining a certain triangle inT. See Figure 7.5 for an example. Obviously, con-nected components inGacorrespond to connected components in f−1(a), and under the quotient mapΦ, each component is mapped to a single point in the Reeb graphRf.

v f

a

Ga

b

Gb

Figure 7.5: As one sweeps past v, the combinatorial structure of the pre-image graph changes.

Gahas 3 connected components (one of which contains a single point only), whileGbhas only 2 components.

A natural idea to construct the Reeb graph Rf of f : |K| → R is to sweep the domain K with increasing value ofa, track the connected components inGa during the course, and record the changes (merging or splitting of components, or creation and removal of components) in the resulting Reeb graph.

Furthermore, as f is a PL-function, the combinatorial structure ofGa can only change when we sweep past a vertex v ∈ V. When that happens, only edges/triangles from K incident tov can incur changes inGa. See Figure 7.5. Let sv denote the total number of simplicies incident on v. It is easy to see that as one sweeps through the vertexv, only O(sv) number of insertions and deletions are needed to update the pre-image graphGa. To be able to build the Reeb graph Rf, we simply need to maintain the connectivity ofGaas we sweep. Assuming we have a data structure to achieve this, the high level framework of the sweep algorithm is then summarized in Algorithm 12:Reeb-SweepAlg.

Algorithm 12Reeb-SweepAlg(K, f) Input:

A simplicial 2-complexKand a vertex function f :V(K)→R Output:

The Reeb graph of the PL-function induced by f

1: Sort vertices inV ={v1, . . . ,vnv} in increasing order of f-values

2: Initialize the Reeb graphRand the pre-image grpahGato be empty

3: fori=1 tonvdo

4: LC=LowerComps(vi)

5: UpdatePreimage(vi) \∗Update the pre-image graphGa∗\

6: UC =UpperComps(vi))

7: UpdateReebgraph(R,LC,UC,vi)

8: end for

9: OutputRas the Reeb graph

In particular, suppose we have a data structure, denoted by DynSF, that maintains a spanning forest of the pre-image graph at any moment. Each connected component in the pre-image graph is associated with a certain vertexvfromV, calledrepresentative vertexof this component, which

indicates that this component is created when passing throughv. We assume that the data structure DynSF allows the following operations: First, assume that a graph nodeea ∈Wain the pre-image graphGais generated by edgee∈K, that is,eais the intersection ofewith the levelset f−1(a).

• Find(e): given an edgee ∈ E, returns the representative vertex of the component in the current pre-image graphGacontaining the nodeea ∈Wagenerated bye.

• Insert(e,e0), Delete(e,e0): inserts an edge (ea,e0a) into Ga and deletes (ea,e0a) from Ga respectively while still maintaining a spanning forest forGaunder these operations.

Using these operations, the pseudo-codes for the subroutines called in algorithm Reeb-SweepAlg are given in Algorithms 13:LowerComps, 14:UpdatePreImage, and 15:UpdateReebGraph. (The routine UpperComps is symmetric to LowerComps and thus omitted.) These codes assume that edges ofKnot intersecting the levelsets are still in the pre-image graphs as isolated nodes; hence there is no need to add or remove isolated nodes.

Algorithm 13LowerComps(v) Input:

a vertexv∈K Output:

A listLcof connected components in the pre-image graph generated by the lower-star ofv

1: LC=empty list

2: forall edgesein the lower-star ofvdo

3: c=DynSF.Find(e)

4: ifcis not marked ‘listed’then

5: LC.add(c); and markcas ’listed’

6: end if

7: end for

Time complexity analysis. Suppose the input simplicial 2-complexK =(V,E,T) hasnvertices and m simplices in total. Sorting the vertices takes O(nlogn) time. Then steps 4 to 7 of the algorithm Reeb-SweepAlgperformsO(m) numbers of Find, Insertand Deleteoperations using the data structure DynSF.

One could use state-of-the-art data structure for dynamic graph connectivity as DynSF – in-deed, this is the approach taken in [146]. However, note that this is an offline version of the dynamic graph connectivity problem, as all insertions/deletions are known in advance and thus can be pre-computed. To this end, we assign each edge in the pre-image graph a weight, which is the time (f-value) it will be deleted from the pre-image graphGa. We then maintain a maximum spanning forest ofGa during the sweeping to maintain connectivity. In general, a deletion of a maximum-spanning tree edge (u,v) can incur expensive search in the pimage graph for a re-placement edge (asuandvmay still be connected). However, because of the specific assignment of edge weights, this expensive search is avoided in this case. If a maximum spanning tree edge is to be deleted, it will simply break the tree in the maximum spanning forest containing this edge, and no replacement edge needs to be identified. One can use a standard dynamic tree data

Algorithm 14UpdatePreImage(v) Input:

A vertexv∈K Output:

Update the pre-image graph after sweeping pastv

1: forall trianglesuvwincident onvdo

2: \∗w.l.o.g. assume f(u)< f(w)∗\

3: if f(v)< f(u)then

4: DynSF.Insert(vu,vw)

5: else

6: if f(v)> f(w)then

7: DynSF.Delete(vu, vw)

8: else

9: DynSF.Delete(uv,uw)

10: DynSF.Insert(vw,uw)

11: end if

12: end if

13: end for

Algorithm 15UpdateReebGraph(R,LC,UC,v) Input:

Current Reeb graphRfor f−1(−∞,f(v)), a vertexv, the listLC(resp. UC) of components in the lower-star (resp. upper-star) ofv

Output:

Update Reeb graphRto be that for sublevel set f−1(−∞, f(v)+ε] for an infinitesimally small ε >0

1: Create a new node ˆvinRcorresponding tov

2: Assign node ˆvto each component inUC

3: Create an arc inRbetween ˆv and the Reeb graph node corresponding to the representative vertex of eachcinLC

4: Return updated Reeb graphR

structure, such as the Link-Cut trees [280], to maintain the maximum spanning forest efficiently in O(logm) amortized time for each find/insertion/deletion operation. Putting everything together, it takesO(mlogm) time to compute the Reeb graph by the sweep.

Theorem 7.1. Given a PL-function f :|K| →R, let m denote the total number of simplices in the 2-skeleton of K. One can compute the (augmented) Reeb graphRf of f in O(mlogm)time.

7.2.2 A randomized algorithm withO(mlogm)expected time

In this section we describe a randomized algorithm [185] whose expected time complexity matches the previous algorithm. However, it uses a strategy different from sweeping: Intuitively, it directly

models the effect of the quotient mapΦ, but does so in a randomized manner so as to obtain a

Figure 7.6: The vertices are randomly ordered. Starting from the initial simplicial complex in (a), the algorithm performs vertex-collapse for vertices in this random order, as shown in (b) – (f).

In general, given f : X → R and associated quotient map Φ : X → Rf, each connected component (contour) C within a level set f−1(a) is mapped (collapsed) to a single point Φ(C) inRf. For the case where X = |K| and f is piecewise-linear over simplices in K, the image of the collection of contours passing through every vertex inV decides the nodes in the augmented Reeb graphbR, and intuitively contains sufficient information for constructingbR. The high level algorithm to compute the augmented Reeb graphbRis given in Algorithm 16:Reeb-RandomAlg. See Figure 7.6 for an illustration of the algorithm.

Algorithm 16Reeb-RandomAlg(K, f) Input:

A simplicial 2-complexKand a vertex function f :V(K)→R Output:

The augmented Reeb graph of the PL-function induced by f

1: LetV ={v1, . . . ,vnv}be a random permutation of vertices inV

2: SetK0 =Kand f0= f

3: fori=1 tonvdo

4: Collapse the contour of fi−1 : |Ki−1| → R passing through (incident to) vi and obtain complexKi

5: fi :|Ki| →Ris the PL-function onKiinduced from fi−1

6: end for

7: Output the final complexKnv as the augmented Reeb graph

In particular, algorithm Reeb-RandomAlgstarts with function f0 = f defined on the original simplicial complex K0 = K. Take a random permutation of all vertices inV = V(K). At the beginning of the i-th iteration, it maintains a PL-function fi−1 : |Ki−1| → R over a partially

q q

Figure 7.7: The function f is the height function. The contour incident to pointqfor the complex in (a) is collapsed, resulting a new complex in (b); and (c) the collapse of the contour within a single triangle incident to q. (d) An example where this triangle is bordering another triangle.

(e) There are two triangles incident toq that hasqbeing the mid-vertex; and they both need to be processed. The triangle qp1p4 does not have q as mid-vertex, and it is not touched while processingq.

collapsed simplicial complex Ki−1 whose augmented Reeb graph is the same as that of f. It then “collapses" the contour of fi−1 passing through the vertexvi and obtains a new PL-function fi :|Ki| →Rover a further collapsed simplicial complexKi that maintains the augmented Reeb graph.

The key is to implement this “collapse" step (lines 4-5). To see the effect of collapsing the contour incident to a vertex, see Figure 7.7 (a) and (b). To see how is the collapse implemented, first consider the triangle qp1p2 incident to vertexq as in Figure 7.7 (c), and assume thatq is themid-vertexof this triangle, that is, its height value ranks second among the three vertices of the triangle. Intuitively, we need to map each horizontal segment (part of a contour at different height) to the corresponding point along the edgesqp1andqp2. If this triangle incident toqthat we are collapsing has one or more triangles sharing the edgep1p2as shown in Figure 7.7 (d), then for each such incident triangle, we need to process it appropriately. In particular, see one such triangle (p1,p2,r) in Figure 7.7 (d), then, asq0is sent toq, the dotted edgerq0becomes edgerqas shown. Thus, the triangler p1p2is now split into two new trianglesqr p1andqr p2. In this case, it is easy to see that at most one of the new triangles will haveqas the mid-vertex. We collapse this triangle and continue the process until no more triangle withqas the mid-vertex is left (Figure 7.7 (b)). Triangle(s) incident toqbut not havingqas the mid-vertex are not processed, e.g., triangle qp1p4 in Figure 7.7 (e). At this point, the entire contour passing through qis collapsed into a single point, and lines 4-5 of the algorithm are executed.

After processing each vertex as described above, the algorithm Reeb-RandomAlgin the end

computes the final complex Knv in line 7. It is necessarily a simplicial 1-complex because no vertex can be the mid-vertex of any triangle, implying that there is no triangle left. It is easy to see that, by construction,Knvis the augmented Reeb graph w.r.t. f :|K| →R.

Time complexity. For each vertex v, the time complexity of the collapse is proportional to the number of triangles Tv intersected by the contour Cv passing through v. In the worst case, Tv = |nt|, giving rise toO(nvnt) worst case running time for algorithm Reeb-RandomAlg. This worst case time complexity turns out to be tight. However, if one processes the vertices in a random order, then the worst case behavior is unlikely to happen, and the expected running time can be proven to beO(mlognv) = O(mlogm). Essentially, one argues that an original triangle from the input simplicial complex is split onlyO(lognv) = O(logm) expected number of times thus creatingO(logm) expected number of intermediate triangles which takesO(logm) expected time to collapse. The argument is in spirit similar to the analysis of the path length in a randomly built binary search tree [109].

Theorem 7.2. Given a PL-function f : |K| → R defined on a simpicial 2-complex K with m simplices, one can compute the (augmented) Reeb graph in O(mlogm)expected time.

7.2.3 Homology groups of Reeb graphs

Homology groups for a graph can have non-trivial ranks only in dimension zero and one. There-fore, for a Reeb graphRf , we only need to considerH0(Rf) andH1(Rf). In particular, their rank β0(Rf) andβ1(Rf) are simply the number of connected components and the number of indepen-dent loops inRf respectively.

Fact 7.1. For a tame function f :X→R,β0(X)=β0(Rf)andβ1(X)≥β1(Rf).

The equalityβ0(X) =β0(Rf) in the above statement follows from the fact thatRf is the quo-tient spaceX/∼and each equivalent class itself is connected (it is a connected component in some levelset). The relation onβ1can be proven directly, and it is also a by-product of Theorem 7.4 below (combined with Fact 7.2). The above statement also implies that ifXis simply connected, thenRf is loop-free.

For the case whereX is a 2-manifold, more information about Xcan be recovered from the Reeb graph of a Morse function defined on it.

Theorem 7.3([107]). Let f : X →Rbe a Morse function defined on a connected and compact 2-manifold.

(i) if X is orientable,β1(Rf)=β1(X)/2; and (ii) if X is non-orientable,β1(Rf)≤β1(X)/2.

We now present a result that characterizesH1(Rf) w.r.t. H1(X) in a more precise manner, which also generalizes Theorem 7.3.

Horizontal and vertical homology. Given a continuous function f : X → R, let X=a :=

f−1(a) andXI := f−1(I) denote its levelset and interval set as before fora∈Rand for an open or closed intervalI ⊆Rrespectively. We first define the so-called horizontal and vertical homology groups with respect to f.

Ap-th homology classh∈Hp(X) ishorizontalif there exists a finite set of values{aiR}i∈A, whereAis a finite index set, such thathhas a pre-image under the mapHp(S

i∈AX=ai)→Hp(X) induced by inclusion. The set of horizontal homology classes form a subgroupHp(X) ofHp(X) since the trivial homology class is horizontal, and the addition of any two horizontal homology classes is still horizontal. We call this subgroupHp(X) thehorizontal homology group of X with respect to f. Thevertical homology group of X with respect to f is then defined as:

p(X) :=Hp(X)/Hp(X), the quotient ofHp(X) withHp(X).

The cosetω+Hp(X) for every classω∈Hp(X) provides an equivalence class in ˇHp(X). We call h a vertical homology class ifh+ Hp(X) is not 0 in ˇHp(X). In other words,h < Hp(X). Two homology classesh1andh2arevertically homologousifh1∈h2+Hp(X).

Fact 7.2. By definition,rank (Hp(X))=rank (Hp(X))+rank ( ˇHp(X)).

LetIbe aclosed intervalofR. We define theheight of I=[a,b] to beheight(I)=|b−a|; note that the height could be 0. Given a homology classh ∈Hp(X) and an intervalI, we say thathis supported by I ifh ∈ im (i) where i : Hp(XI) →Hp(X) is the homomorphism induced by the canonical inclusionXI ,→X. In other words,XIcontains ap-cycleγfrom the homology classh.

We define theheight of a homology class h∈Hp(X) to be height(h)= inf

Isupportshheight(I).

Isomorphism betweenHˇ1(X)andH1(Rf). The surjectionΦ:X→Rf(X) induces a chain map Φ#from the 1-dimensional singular chain group ofXto the 1-dimensional singular chain group of Rf(X) which eventually induces a homomorphismΦ :H1(X) →H1(Rf(X)). For the horizontal subgroup H1(X), we have thatΦ(H1(X)) = 0 ∈ H1(Rf(X)). HenceΦ induces a well-defined homomorphism between the quotient groups

Φˇ : ˇH1(X)= H1(X) H1(X)

H1(Rf(X))

H1(Rf(X)) =H1(Rf(X)).

The right equality above follows from thatH1(Rf(X)) = 0, which holds because every level set ofRf(X) consists only of a finite set of disjoint points due to the levelset-tameness of function f : X →R. It turns out that ˇΦis an isomorphism – Intuitively, this is not surprising asΦmaps each contour in the level set to a single point, which in turn collapses every horizontal cycle.

Theorem 7.4. Given a levelset tame function f : X → R, let Φˇ : ˇH1(X) → H1(Rf(X)) be the homomorphism induced by the surjectionΦ : X → Rf(X)as defined above. Then the map Φˇ is an isomorphism. Furthermore, for any vertical homology class h ∈ Hˇ1(X), we have that height(h)=height( ˇΦ(h)).

Persistent homology for f : RfR. We have discussed earlier that the Reeb graph of a levelset tame function f : X → R can be represented by a graph whose edges have monotone function values. Then, the function f : RfR can be treated as a PL-function on the simpli-cial 1-complexRf. This gives rise to the standard setting where a PL-function f is defined on a simplicial 1-complexRf whose persistence is to be computed. We can apply algorithm ZeroP-erDgfrom Section 3.5.3 to compute the 0-th persistence diagram Dgm0(f). For computing one dimensional persistence diagram Dgm1(f), one can modify this algorithm slightly by registering the function values of the edges that create cycles. These are edges that connect vertices in the same component. The function values of these edges are the birth points of the 1-cycles that never die. This algorithm takesO(nlogn+mα(n)) time wheremandnare the number of vertices and edges respectively inRf.

We can also compute the levelset zigzag persistence of f (Section 4.5) using the zigzag per-sistence algorithm in Section 4.3.2. However, taking advantage of the graph structures, one can compute the levelset zigzag persistence for a Reeb graph withnvertices and edges inO(nlogn) time using an algorithm of [5] that takes advantage of mergeable tree data structure [169]. Only the 0-th persistence diagram Dgm0(f) is nontrivial in this case. We can read the zeroth persistence diagram for the standard persistence using Theorem 4.15 from this level set persistence diagram.

Furthermore, for every infinite bar [ai,∞) in the standard one dimensional persistence diagram, we get a pairing (aj,ai) (open-open bar) in the zeroth levelset diagram Dgm0(f).

Reeb graphs can be a useful tool to compute the zeroth levelset zigzag persistence diagram of a function on a topological space. Let f : X → R be a continuous function whose zeroth persistence diagram we want to compute. We already observed that the function f induces a continuous function on the Reeb graphRf. To distinguish the two domains more explicitly, we denote the former function fXand the latter asfR. The following observation helps computing the zeroth levelset zigzag persistence diagram Dgm0(fX) because computationally it is much harder to process a space, say the underlying space of a simplicial complex than only a graph (simplicial 1-complex).

Proposition 7.5. Dgm0(fX) = Dgm0(fR)where the diagrams are for the zeroth levelset zigzag persistence.

The result follows from the following observation. Consider the levelset zigzag filtrationsFX andFRfor the two functions as in sequence (4.15).

FX :X(a0,a2)←-· · ·,→X(ai−1,ai+1)←-X(ai,ai+1) ,→X(ai,ai+2)←-· · ·,→X(an−1,an+1) FR:Rf(a

0,a2)←-· · ·,→Rf(a

i1,ai+1) ←-Rf(a

i,ai+1) ,→Rf(a

i,ai+2) ←-· · ·,→Rf(a

n1,an+1)

Using notation for interval setsXij = X(ai,aj)andRij = Rf(ai,aj), we have the following com-mutative diagram between the 0-th levelset zigzag persistence modules.

H0FX : H0(X00) //H0(X01)oo H0(X11)· · · //H0(Xnn−1)oo H0(Xnn)

H0FR: H0(R00) //H0(R10)oo H0(R11)· · · //H0(Rnn−1)oo H0(Rnn)

All vertical maps are isomorphism because the number of components in Xij is exactly equal to the number of components in the quotient space Rij = Xij/ ∼ which is used to define the Reeb graph. All horizontal maps are induced by inclusions. It follows that every square in the above diagram commutes. Therefore the above two modules are isomorphic.

Im Dokument Computational Topology for Data Analysis (Seite 187-197)