Linear-Time Algorithm for Trees

5.2 Results for Trees

5.2.4 Linear-Time Algorithm for Trees

This section discusses the algorithm behind Theorem 1.8, a linear-time algorithm that computes, in a forest, an exact cut with the properties stated in Theorem 5.12.

Theorem 5.14 (algorithmic version of Theorem 5.12, Theorem 1.8 restated).

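For every forest G on n vertices, an m-cut (B, W) of width

\[
e_G(B, W) \le \frac{1}{2}\,\Delta(G) \left( \left( \log_2 \frac{1}{\operatorname{diam}^*(G)} \right)^{2} + 7 \log_2 \frac{1}{\operatorname{diam}^*(G)} + 6 \right)
\]

can be computed in O(n) time.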

Remark 5.15.

The previous theorem also implies that a cut with the properties in Theorem 5.6 can be computed in O(n) time, as the constructions of the cut in Theorem 5.6 and Theorem 5.12 are identical. Furthermore, it is not hard to adapt the algorithm to compute a cut with the properties in Theorem 5.1 or Lemma 5.3.

Consider the algorithm contained in the previous theorem. When the input forest is not connected, it is not sufficient to work with one component, as we did, for example, in the proof of Lemma 4.1. Indeed, consider a forest G that is composed of ℓ ≥ 2 components T_1, T_2, ..., T_ℓ. The construction from Sections 5.2.1-5.2.3 works with the relative diameter of G. Denote by n the number of vertices of G and let n_h be the number of vertices of T_h for all h ∈ [ℓ]. Note that

\[
\operatorname{diam}^*(G) = \frac{1}{n} \sum_{h \in [\ell]} n_h \operatorname{diam}^*(T_h)
\]

and, hence, there is an h ∈ [ℓ] with diam^∗(T_h) ≥ diam^∗(G). However, it might not be possible to distribute the vertex sets of the other components to the sets B and W of an exact cut (B, W) in G such that (B, W) cuts only edges within T_h. For a concrete example, consider the forest G on 50 vertices that is composed of a perfect ternary tree T_1 on 40 vertices, i.e., a perfect ternary tree of height three, and a path T_2 on 10 vertices, see Figure 5.9. Then, diam^∗(G) = (1/50)(7 + 10) = 0.34, diam^∗(T_1) = 7/40 = 0.175, and diam^∗(T_2) = 10/10 = 1. However, there is no bisection in G that cuts only edges within T_2, as T_1 contains more than half of the vertices of G.
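The numbers in this example can be checked with exact rational arithmetic. The following is only an illustrative sketch: the helper name `relative_diameter_of_forest` is hypothetical, and it assumes, as in the example, that diam^∗(T_h) is the number of vertices on a longest path of T_h divided by n_h.

```python
from fractions import Fraction

def relative_diameter_of_forest(components):
    """components: list of pairs (n_h, number of vertices on a longest path of T_h)."""
    n = sum(n_h for n_h, _ in components)
    # diam*(G) = (1/n) * sum over h of n_h * diam*(T_h), where diam*(T_h) = p_h / n_h
    return sum(Fraction(n_h) * Fraction(p_h, n_h) for n_h, p_h in components) / n

ternary_tree = (40, 7)   # perfect ternary tree of height three: 40 vertices, longest path has 7
path = (10, 10)          # path on 10 vertices
diam_G = relative_diameter_of_forest([ternary_tree, path])

print(diam_G, float(diam_G))              # 17/50 0.34
print(ternary_tree[0] > (40 + 10) // 2)   # True: T_1 holds more than half of the vertices
```

The component with the large relative diameter (the path) is too small to host a bisection on its own, which is the obstruction described above.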

When proving Theorem 5.14, we will follow the construction presented in Section 5.2.2 and Section 5.2.3, where only trees were considered due to Lemma 5.2. The reason for doing so is that the notation simplifies when the considered forest is connected. However, the method would also work for a disconnected forest when considering a collection of paths that contains one longest path for each component of the forest instead of one longest path in the given tree. In order to use the simpler notation and to closely follow the construction, the following algorithmic version of Lemma 5.2 is derived.

Lemma 5.16 (algorithmic version of Lemma 5.2).

For every forest G on n vertices and for every m ∈ [n], the following can be computed in O(n) time:

a) if ∆(G) ≥ 3, a tree T on n vertices with G ⊆ T, ∆(T) = ∆(G), and diam^∗(T) = diam^∗(G), or

b) if ∆(G) ≤ 2, an m-cut of width at most ∆(G) in G.

To prove Part a) of the previous lemma, one needs to compute a longest path in each component of the considered forest. Dijkstra described the procedure presented in Algorithm 5.2 to compute a longest path in a tree in linear time, see also [Bul+02]. There, for two vertices v, w of the input tree, dist(v, w) denotes the length of the unique v,w-path in the input tree. A proof of correctness for Algorithm 5.2 is presented in this section, as this procedure is generalized later in Section 5.3.3 to compute a heaviest path in a tree decomposition. Observe that the brute-force approach of computing the distance between every pair of vertices takes quadratic time, even in the simple case of an unweighted tree.

Algorithm 5.2: Computes a longest path in a tree.

Input: tree T on n vertices.

Output: a longest path P ⊆ T as an ordered list.

1 If n = 1 then
2     Return P = T.
3 Endif
4 Root T at an arbitrary vertex r;
5 Find a leaf s of T with dist(r, s) ≥ dist(r, s′) for all leaves s′ of T;
6 Root T at s;
7 Find a leaf t of T with dist(s, t) ≥ dist(s, t′) for all leaves t′ of T;
8 Return the unique s,t-path in T.
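A minimal Python sketch of Algorithm 5.2 may help to make the two traversals concrete. It assumes the tree is given as a dictionary `adj` mapping each vertex to a list of its neighbors; the function names `farthest_from` and `longest_path` are chosen here for illustration and do not appear in the text. Rooting the tree is realized implicitly by a breadth-first traversal that records parents.

```python
from collections import deque

def farthest_from(adj, root):
    """Breadth-first traversal from root.

    Returns the vertex removed last from the queue (a vertex at maximum
    distance from root) together with the parent map of the traversal.
    """
    parent = {root: None}
    queue = deque([root])
    last = root
    while queue:
        last = queue.popleft()
        for w in adj[last]:
            if w not in parent:
                parent[w] = last
                queue.append(w)
    return last, parent

def longest_path(adj):
    """Returns the vertices of a longest path of the tree as an ordered list."""
    r = next(iter(adj))                   # Line 4: arbitrary root r
    if len(adj) == 1:                     # Lines 1-3
        return [r]
    s, _ = farthest_from(adj, r)          # Line 5
    t, parent = farthest_from(adj, s)     # Lines 6-7: tree is now rooted at s
    path = [t]                            # Line 8: walk from t up to the root s
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    path.reverse()                        # ordered from s to t
    return path

# Small example: a path 1-2-4-5 with an extra leaf 3 attached to vertex 2.
tree = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2, 5], 5: [4]}
print(len(longest_path(tree)))            # 4
```

Each traversal touches every edge a constant number of times, which matches the linear running time claimed for the algorithm.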

Lemma 5.17.

Algorithm 5.2 computes a longest path in the input tree T in O(n) time, where n denotes the number of vertices of T.

Proof. Let T = (V, E) be an arbitrary tree on n vertices. First, it is shown that, when applied to T, Algorithm 5.2 computes a longest path in T. If n = 1, then T itself is a path and clearly Algorithm 5.2 returns a longest path. So, from now on, assume that n ≥ 2. Let P be a longest path in T and denote by x and y the two leaves of P. Note that x and y must be leaves of T as well, as otherwise the path P could be extended. Let r, s, and t be the vertices used in Algorithm 5.2 when applied to the tree T and let Q be the unique r,s-path in T. For any three vertices u, v, w ∈ V, if v is on the unique u,w-path in T, then dist(u, w) = dist(u, v) + dist(v, w), and otherwise dist(u, w) < dist(u, v) + dist(v, w).

For a contradiction, assume that P and Q have no common vertex. Let z be the unique vertex in P with dist(r, z) ≤ dist(r, v) for all v ∈ V(P). Furthermore, let z′ be the first vertex on the path from z to r that is on Q, and note that z′ ∉ V(P). See Figure 5.10 a) for a visualization. The choice of s in Line 5 implies that dist(r, s) ≥ dist(r, y). As z′ is on Q as well as on the unique r,y-path in T, it follows that dist(z′, s) ≥ dist(z′, y). The unique path from x to s in T uses the vertex z′. Therefore,

dist(x, s) = dist(x, z′) + dist(z′, s) ≥ dist(x, z′) + dist(z′, y) > dist(x, y),

where the last inequality is strict because z′ is not on P. Hence, dist(x, s) > dist(x, y), and this contradicts that P is a longest path. Consequently, P and Q have at least one common vertex.

Figure 5.10: Proof of Lemma 5.17. a) Notation if P and Q do not have a common vertex. b) Notation if P and Q have a common vertex.

Let z and z′ be the first and the last vertex in Q, respectively, that are in P when traversing Q from r to s. Without loss of generality, we may assume that the path P, when traversed from x to y, uses first the vertex z and then the vertex z′. Therefore, z′ is on the unique r,y-path in T. See also Figure 5.10 b) for a visualization. The choice of s in Line 5 implies that dist(r, s) ≥ dist(r, y). As z′ is on Q and on the unique r,y-path in T, it follows that dist(z′, s) ≥ dist(z′, y). Now, z′ being on the unique x,s-path in T implies that

dist(x, s) ≥ dist(x, z′) + dist(z′, s) ≥ dist(x, z′) + dist(z′, y) ≥ dist(x, y).

Consequently, the unique x,s-path in T is a longest path in T. So, there is a longest path in T that starts in s, and the choice of t in Line 7 implies that the unique s,t-path in T is a longest path in T.

To complete the proof, the running time of Algorithm 5.2 is estimated. By Lemma 2.33, Line 4 and Line 6 each take O(n) time. To find a leaf that is furthest away from the root of a tree, the algorithm can use the vertex discovered last by a breadth-first traversal started at the root. Therefore, Line 5 and Line 7 each take O(n) time. A list of the vertices on the unique s,t-path in T can be computed by following the path from t up to the root s, which takes at most O(n) time. All in all, each step can be executed in O(n) time and the desired running time is achieved. □

Proof of Lemma 5.16. Recall the proof of Lemma 5.2. The algorithm presented here follows the same construction. Let G be an arbitrary forest on n vertices and let ℓ be the number of components of G. Denote by T_1, ..., T_ℓ the components of G. Furthermore, for every h ∈ [ℓ], let n_h be the number of vertices of T_h. First, the algorithm determines the maximum degree of G by traversing its adjacency lists, which takes O(‖G‖) = O(n) time. Afterwards, it determines the number ℓ as well as the numbers n_h for every h ∈ [ℓ], which takes O(‖G‖) = O(n) time according to Lemma 2.25.

a) Assume that ∆(G) ≥ 3. If ℓ = 1, the algorithm returns T = G. Otherwise, for every h ∈ [ℓ], the algorithm computes a longest path P_h ⊆ T_h and its ends x_h and y_h. For each h ∈ [ℓ], this takes O(n_h) time by Lemma 5.17 and, hence, O(n) time for all components together. Since, for every h ∈ [ℓ−1], the edge {y_h, x_{h+1}} can be added to G in constant time, the overall running time is O(n).

b) Assume that ∆(G) ≤ 2 and recall that T_h is a path for every h ∈ [ℓ]. The set B will be returned as an unordered list L_B of vertices. Traversing the paths successively and adding the vertices to the list L_B until it contains exactly m vertices gives the desired cut in O(n) time. □

Recall that the construction used to prove Lemma 5.7 in Section 5.2.2 uses a P-labeling, where P is a longest path in the input tree. Using the general assumption that the input tree T satisfies V(T) = [n] for some integer n, a P-labeling can be stored in two integer arrays A_L and A_V of length n. More precisely, for v ∈ V(T), the entry A_L[v] is the label of the vertex v and, for ℓ ∈ [n], the entry A_V[ℓ] is the vertex of T that received label ℓ. The next lemma says that such arrays can be computed in linear time.

Lemma 5.18.

For every tree T on n vertices and every path P ⊆ T, a P-labeling of the vertices of T and the path-vertex for every vertex x ∈ V(T) can be computed in O(n) time.

Proof. Let T be an arbitrary tree on n vertices and let P be a path in T. Denote by x_0 and y_0 the ends of P. If P is given as a graph and not as a list, the algorithm first determines a list (v_0 = x_0, v_1, ..., v_ℓ = y_0) of the vertices on P in the order in which they appear on P, which takes O(n) time. Then, for each h ∈ [ℓ], the algorithm traverses the adjacency list of v_h and moves v_{h−1} to the beginning of the adjacency list of v_h, which takes at most O(Σ_{h∈[ℓ]} deg(v_h)) = O(n) time. Using these reordered adjacency lists, the algorithm traverses T with a depth-first search starting at y_0. Observe that the vertices in P turn gray in the order v_ℓ = y_0, v_{ℓ−1}, ..., v_0 = x_0 and turn black in the order v_0 = x_0, v_1, ..., v_ℓ = y_0. During the depth-first traversal, the algorithm labels each vertex when it turns black. While doing so, the path-vertices are computed by keeping track of the vertex in P that is the last one that turned gray and is not yet black. To keep track of such a vertex, the algorithm uses a stack, where it pushes each vertex of P when it turns gray and pops the top vertex when it turns black. The extra time needed at each vertex is constant and, therefore, the entire procedure takes O(n) time. □
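The procedure in this proof can be sketched in Python as follows. The sketch follows only what is described here (reordered adjacency lists, labels assigned when a vertex turns black, a stack of gray path vertices); the full definition of a P-labeling is given earlier in the source and is not reproduced. The function name `p_labeling` and the use of dictionaries in place of the arrays A_L and A_V are assumptions made for illustration.

```python
def p_labeling(adj, path):
    """adj: dict vertex -> list of neighbors of a tree; path: list v_0, ..., v_l.

    Returns (label, vertex_of, path_vertex), where label plays the role of A_L
    and vertex_of the role of A_V.
    """
    on_path = set(path)
    order = {v: list(nb) for v, nb in adj.items()}
    # Move the predecessor v_{h-1} to the front of the adjacency list of v_h.
    for prev, cur in zip(path, path[1:]):
        order[cur].remove(prev)
        order[cur].insert(0, prev)

    label, vertex_of, path_vertex = {}, {}, {}
    next_label = 1
    start = path[-1]                        # depth-first search starts at y_0
    visited = {start}                       # vertices that are gray or black
    gray_path = [start]                     # stack of gray vertices of P
    path_vertex[start] = start
    stack = [(start, iter(order[start]))]
    while stack:
        v, neighbors = stack[-1]
        child = next((w for w in neighbors if w not in visited), None)
        if child is None:                   # v turns black: assign its label
            stack.pop()
            label[v] = next_label
            vertex_of[next_label] = v
            next_label += 1
            if v in on_path:
                gray_path.pop()
        else:                               # child turns gray
            visited.add(child)
            if child in on_path:
                gray_path.append(child)
            path_vertex[child] = gray_path[-1]
            stack.append((child, iter(order[child])))
    return label, vertex_of, path_vertex

# Path 1-2-3 with an extra leaf 4 attached to vertex 2.
tree = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
label, vertex_of, path_vertex = p_labeling(tree, [1, 2, 3])
print(label)          # {1: 1, 4: 2, 2: 3, 3: 4}
print(path_vertex)    # {3: 3, 2: 2, 1: 1, 4: 2}
```

The traversal is written iteratively so that long paths do not exceed Python's recursion limit; each adjacency list is scanned once, in line with the O(n) bound.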

Lemma 5.19 (algorithmic version of Lemma 5.13).

For every forest G on n vertices and every m ∈ [n], a cut (B, W, Z) in G satisfying one of the following options

1) Z = ∅, |B| = m, and e_G(B, W, Z) ≤ 2, or

2) Z ≠ ∅, |Z| ≤ n/2, |B| ≤ m ≤ |B| + |Z|, e_G(B, W, Z) ≤ log₂(8/diam^∗(G)) ∆(G), and diam^∗(G[Z]) ≥ 2 diam^∗(G)

can be computed in O(n) time. The sets B, W, and Z are output as unordered lists of vertices.

Proof. Let G be an arbitrary forest on n vertices and fix an m ∈ [n]. We follow the construction presented in the proof of Lemma 5.7 and use the same notation. Recall that the improvement in the bound on the width of the cut in Option 2) presented in Section 5.2.3 is due to a tighter analysis and did not require modifying the construction. First, Lemma 5.16 implies that we may assume that G is a tree, as otherwise there is a cut satisfying Option 1) or there is a tree with the same vertex set, the same maximum degree, and the same relative diameter as G, and both can be computed in O(n) time. The construction of the cut (B, W, Z) with the desired properties is summarized in Algorithm 5.3. It uses the following additional definitions:

\[
g^b(v) := \frac{|T'_v| + |H_v^b|}{|P_v^b|} \ \text{for every b-special } v \in V_P
\qquad\text{and}\qquad
g^f(v) := \frac{|T'_v| + |H_v^f|}{|P_v^f|} \ \text{for every f-special } v \in V_P,
\]

as well as g^b(v) := ∞ for all v ∈ V_P that are not b-special and g^f(v) := ∞ for all v ∈ V_P that are not f-special. Furthermore, in Line 10, a vertex v ∈ V_P is called a doubling vertex if v satisfies one of the following:

Algorithm 5.3: Computes a cut (B, W, Z) with the properties in Lemma 5.19.

Input: tree G = (V, E) on n vertices, an integer m ∈ [n].

Output: cut (B, W, Z) with the properties stated in Lemma 5.19.

1 Compute a longest path P = (V_P, E_P) in G and let L_P be a list of the vertices on P;
2 d ← diam^∗(G);
3 Compute a P-labeling of the vertices in G and the path-vertex for every vertex of G. Denote by L(v) the label of the vertex v ∈ V and by L⁻¹(ℓ) the vertex that received label ℓ ∈ [n];
4 If there is a v ∈ V_P with L⁻¹(L(v) + m) ∈ V_P then
5     Let v be a vertex with v ∈ V_P and L⁻¹(L(v) + m) ∈ V_P;
6     B ← {w : L(w) is between L(v) + 1 and L(v) + m}, Z ← ∅;
7 Else
8     For all v ∈ V_P, compute p^b(v) := |P_v^b|, p^f(v) := |P_v^f|, h^b(v) := |H_v^b|, and h^f(v) := |H_v^f|;
9     For all v ∈ V_P, compute g^b(v) and g^f(v);
10    Determine a doubling vertex v ∈ V_P;
11    If g^b(v) ≤ g^f(v) then
12        Z ← H_v^b ∪ P_v^b;
13        Determine the vertex v^b_ℓ and let ṽ be the vertex before v on P;
14        If v^b_ℓ = ṽ then B_1 ← ∅ else B_1 ← {x ∈ V : L(x) is between L(v^b_ℓ) + 1 and L(ṽ)};
15    Else
16        Z ← H_v^f ∪ P_v^f;
17        Determine the vertex v^f;
18        B_1 ← {x ∈ V : L(x) is between L(v) and L(v^f), x ≠ v^f};
19    Endif
20    c ← 2 − 1/(1 − d), m̃ ← m − |B_1|;
21    Let (B_2, W_2) be a c-approximate m̃-cut in G[T′_v] with e_{G[T′_v]}(B_2, W_2) ≤ (log₂(1/d) + 1) ∆(G);
22    B ← B_1 ∪ B_2;
23 Endif
24 W ← V \ (B ∪ Z);
25 Return (B, W, Z);

a) v is b-special and g^b(v) ≤ g^b(w) as well as g^b(v) ≤ g^f(w) for all w ∈ V_P, or

b) v is f-special and g^f(v) ≤ g^b(w) as well as g^f(v) ≤ g^f(w) for all w ∈ V_P.

Proposition 5.11 implies that a b-special or an f-special vertex on P exists. Thus, if a doubling vertex v ∈ V_P is determined in Line 10, then v is b-special or f-special. Moreover, Proposition 5.11 implies that, if the doubling vertex v determined in Line 10 satisfies g^b(v) ≤ g^f(v), then v is b-special and satisfies g^b(v) ≤ 1/d − 1, which means that Case 2a) from the proof of Lemma 5.7 applies. Otherwise, the doubling vertex v determined in Line 10 is f-special and satisfies g^f(v) ≤ 1/d − 1, which means that Case 2b) from the proof of Lemma 5.7 applies.

In the construction in the proof of Lemma 5.7, vertices were identified with their labels. In the implementation, the vertices of G are not renamed. From now on, L(v) is used to refer to the label of a vertex v ∈ V and L⁻¹(ℓ) is used to refer to the vertex that received label ℓ ∈ [n]. For example, for a vertex v ∈ V, the vertex whose label comes m steps after the label of v in the numeration is L⁻¹(L(v) + m). The definitions of N_m and N_m⁻¹ are adjusted to receive and return vertices instead of labels, i.e., for each v ∈ V, let N_m(v) = L⁻¹(L(v) + m) and N_m⁻¹(v) = L⁻¹(L(v) − m).

So, all that is left to do is an analysis of the running time of Algorithm 5.3. We will show that each line can be implemented to run in O(n) time, which gives a total running time of O(n). As diam^∗(G) = (1/n)|V_P|, and by Lemma 5.17 and Lemma 5.18, Lines 1-3 take O(n) time. The algorithm stores the P-labeling in two arrays as discussed before Lemma 5.18. Therefore, the algorithm can determine the label of a vertex v in constant time and also, when given a label, it can determine the corresponding vertex in constant time. To execute Line 4, i.e., to check whether Case 1 applies, the algorithm needs to check if there is a vertex v ∈ V_P with N_m(v) ∈ V_P. As a vertex v ∈ V lies in V_P if and only if its path-vertex is v itself, the algorithm can check in constant time whether a vertex v ∈ V lies in V_P. So, by traversing the list L_P, the algorithm can execute Line 4 in O(|V_P|) = O(n) time. Moreover, if executed, this also yields the vertex v that is chosen in Line 5. The sets in Line 6 can be read off the P-labeling in O(n) time.
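The constant-time lookups and the check in Line 4 can be sketched as follows. This is a hedged illustration: the dictionaries `label`, `vertex_of`, and `path_vertex` stand in for the arrays A_L and A_V and the stored path-vertices, the function names are hypothetical, and it is an assumption of this sketch that N_m(v) is treated as undefined (returned as `None`) when L(v) + m is not a label in [n].

```python
def n_m(v, m, label, vertex_of):
    """N_m(v) = L^{-1}(L(v) + m); None if that label does not exist (assumption)."""
    return vertex_of.get(label[v] + m)

def find_case1_vertex(path_list, m, label, vertex_of, path_vertex):
    """Line 4: returns some v on P with N_m(v) on P, or None if Case 1 does not apply."""
    for v in path_list:
        w = n_m(v, m, label, vertex_of)
        # A vertex lies on P if and only if it is its own path-vertex.
        if w is not None and path_vertex[w] == w:
            return v
    return None

def case1_cut(v, m, label, vertex_of):
    """Line 6: B consists of the vertices with labels L(v)+1, ..., L(v)+m."""
    return [vertex_of[l] for l in range(label[v] + 1, label[v] + m + 1)]

# Example labeling of a tree with path p0, p1, p2, p3, leaf a at p1, leaves b, c, d, e at p2.
label = {"p0": 1, "a": 2, "p1": 3, "b": 4, "c": 5, "d": 6, "e": 7, "p2": 8, "p3": 9}
vertex_of = {l: v for v, l in label.items()}
path_vertex = {"p0": "p0", "a": "p1", "p1": "p1", "b": "p2", "c": "p2",
               "d": "p2", "e": "p2", "p2": "p2", "p3": "p3"}
path_list = ["p0", "p1", "p2", "p3"]

v = find_case1_vertex(path_list, 2, label, vertex_of, path_vertex)
print(v, case1_cut(v, 2, label, vertex_of))                           # p0 ['a', 'p1']
print(find_case1_vertex(path_list, 4, label, vertex_of, path_vertex)) # None
```

For m = 2 the set B = {a, p1} is separated from the rest by the two path edges at p1, in line with the width bound of Option 1); for m = 4 no such vertex exists, so the algorithm would continue with Case 2.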

From now on, assume that Case 1 does not apply, so Lines 8-22 are executed. To compute the numbers in Line 8, note that a vertex x ∈ V_P lies in P_v^b if and only if the path-vertex of N_m(x) is v. Indeed, P_v^b = N_m⁻¹(T′_v) ∩ V_P = N_m⁻¹(T′_v ∪ {v}) ∩ V_P because Case 2 applies when Line 8 is executed.

First, the algorithm sets p^b(v) = 0 for all v ∈ V_P. Then, it traverses all vertices x in the list L_P and increases p^b(v) by one for the path-vertex v of N_m(x). While doing so, for each v ∈ V_P, the algorithm keeps track of the vertices v̂ ∈ T′_v with the smallest and the largest label such that there is an x ∈ V_P with N_m(x) = v̂. For each v ∈ V_P, if such vertices, say v̂_s and v̂_ℓ with L(v̂_s) ≤ L(v̂_ℓ), exist, then N_m⁻¹(v̂_s) = v^b and N_m⁻¹(v̂_ℓ) = v^b_ℓ and h^b(v) = L(v̂_ℓ) − L(v̂_s) + 1 − p^b(v), since each vertex between v^b and v^b_ℓ lies in H_v^b ∪ P_v^b. For each v ∈ V_P, if such vertices do not exist, then v is not b-special and the algorithm sets h^b(v) = 0. Using the same ideas, the values p^f(v) and h^f(v) for every v ∈ V_P can be computed. All in all, the computation of the values in Line 8 takes O(|V_P|) = O(n) time. Since v ∈ V_P is b-special if and only if p^b(v) ≠ 0 and v ∈ V_P is f-special if and only if p^f(v) ≠ 0, the algorithm can now determine in constant time whether a vertex v ∈ V_P is b-special or f-special. Therefore, Line 9 and Line 10 together take O(|V_P|) = O(n) time. From now on, denote by v the vertex determined in Line 10.
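The counting step for p^b and h^b described in this paragraph might look as follows in Python. It is a sketch under assumptions: the dictionaries `label`, `vertex_of`, and `path_vertex` are stand-ins for the stored P-labeling, the function name `backward_counts` is hypothetical, vertices x for which L(x) + m is not a label are skipped (an assumption of this sketch), and the input is assumed to be in Case 2, so N_m(x) never lies on P.

```python
def backward_counts(path_list, m, label, vertex_of, path_vertex):
    """Computes p^b(v) and h^b(v) for all v on P with one pass over L_P."""
    p_b = {v: 0 for v in path_list}
    smallest, largest = {}, {}          # extreme labels of hit vertices in T'_v
    for x in path_list:
        target = label[x] + m
        if target not in vertex_of:     # N_m(x) undefined (assumption: skip x)
            continue
        v = path_vertex[vertex_of[target]]
        p_b[v] += 1
        smallest[v] = min(smallest.get(v, target), target)
        largest[v] = max(largest.get(v, target), target)
    # h^b(v) = L(v_hat_l) - L(v_hat_s) + 1 - p^b(v); 0 if v is not b-special.
    h_b = {v: (largest[v] - smallest[v] + 1 - p_b[v]) if v in smallest else 0
           for v in path_list}
    return p_b, h_b

# Path p0, p1, p2, p3 with leaf a at p1 and leaves b, c, d, e at p2; m = 4.
label = {"p0": 1, "a": 2, "p1": 3, "b": 4, "c": 5, "d": 6, "e": 7, "p2": 8, "p3": 9}
vertex_of = {l: v for v, l in label.items()}
path_vertex = {"p0": "p0", "a": "p1", "p1": "p1", "b": "p2", "c": "p2",
               "d": "p2", "e": "p2", "p2": "p2", "p3": "p3"}
p_b, h_b = backward_counts(["p0", "p1", "p2", "p3"], 4, label, vertex_of, path_vertex)
print(p_b["p2"], h_b["p2"])             # 2 1
```

In the example, p0 and p1 are mapped into T′_{p2}, so p^b(p2) = 2, and the only vertex with a label between those of p0 and p1 that is not on P is a, so h^b(p2) = 1. A vertex v is b-special exactly when the computed p^b(v) is nonzero.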

Assume that Lines 12-14 are executed. Then, the vertex v is b-special as argued above. Using the vertices v̂_s and v̂_ℓ, which were computed when determining p^b(v) and which must exist since v is b-special, the algorithm can compute v^b and v^b_ℓ in constant time. Thus, the set Z in Line 12, which is the set of vertices between v^b and v^b_ℓ, can be read off the P-labeling in O(n) time. The vertex ṽ in Line 13 can be obtained from the list L_P and, hence, Line 13 takes O(n) time. Line 14 can be executed in O(n) time by using the P-labeling. Similarly to Lines 12-14, Lines 16-18 can be executed in O(n) time. Line 20 takes constant time, as the algorithm can keep track of the size of B_1 while computing it. The subgraph G[T′_v] can be computed in O(n) time by a depth-first traversal of G that is started at v, ignores the neighbors of v that are in V_P, and in the end deletes v. According to Lemma 4.5 b), Line 21 then takes O(n + |T′_v|) = O(n) time. As B_1 and B_2 are stored as lists and are disjoint as argued in the construction, Line 22 takes O(1) time. Finally, Line 24 takes O(n) time by Proposition 2.20. □
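The traversal that collects the vertex set T′_v for the subgraph G[T′_v] can be sketched as follows. The function name `off_path_vertices` and the dictionary representation of the tree are assumptions made for illustration; the sketch only returns the vertex list, from which the induced subgraph can be read off the adjacency lists.

```python
def off_path_vertices(adj, v, on_path):
    """Vertices reachable from v without using a neighbor of v on P, excluding v."""
    seen = {v}
    stack = [v]
    result = []
    while stack:
        u = stack.pop()
        for w in adj[u]:
            # Ignore the neighbors of v that lie on P. In a tree, no other
            # vertex of P can be reached once these neighbors are skipped.
            if w in seen or (u == v and w in on_path):
                continue
            seen.add(w)
            result.append(w)            # v itself is never added ("deleted")
            stack.append(w)
    return result

# Path 1-2-3; vertex 2 has an off-path child 4, which has a child 5.
tree = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2, 5], 5: [4]}
print(sorted(off_path_vertices(tree, 2, {1, 2, 3})))   # [4, 5]
print(off_path_vertices(tree, 1, {1, 2, 3}))           # []
```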

This completes the description of the subroutines needed for the algorithm in Theorem5.14.

Proof of Theorem 5.14. Let G be a forest on n vertices and fix an arbitrary m ∈ [n]. Recall that Algorithm 5.1, which was presented in the proof of Theorem 5.12, can be applied to compute an m-cut in G. So, it suffices to analyze the running time of Algorithm 5.1. As in the proof of Theorem 5.12, let s^∗ be the number of executions of the while loop and, for s ∈ [s^∗], denote by n_s the number of vertices of the graph G after the s-th execution of the while loop. Define n_0 = n. Fix an s ∈ [s^∗], and consider the s-th execution of the while loop. Then, the application of Lemma 5.13 in Line 3 takes O(n_{s−1}) time by Lemma 5.19, and a list of the vertices in B̃ as well as a list of the vertices in Z̃ are computed. Furthermore, the union in Line 4 is disjoint by (ii) from the proof of Theorem 5.12 and, hence, takes constant time when the set B is stored as a list. With the list of the vertices in Z̃, the update of G in Line 5 takes O(n_{s−1}) time by Corollary 2.23. Therefore, the s-th execution of the while loop takes O(n_{s−1}) time. Using (iv) from

In this section, the inequality relating the minimum bisection width and the diameter of a tree in Theorem 5.6 and its improved version in Theorem 5.12 are extended to tree-like graphs. The aim is to prove Theorem 1.9, which was introduced in Section 1.2.3. There, a graph G = (V, E) on n vertices and a tree decomposition (T, X) with X = (X^i)_{i∈V(T)} of G are considered. In the general case, we do not work with the relative diameter of a graph but with the parameter r(T, X). For a path P ⊆ T, the weight of P with respect to X is defined as w_X(P) := ⋃ [...] that the relative weight of a path is at most one and r(T, X) ≤ 1 holds for every tree decomposition (T, X), which is similar to diam^∗(G̃) ≤ 1 for every forest G̃. Furthermore, if (T, X) is a path decomposition, then r(T, X) = 1, which is similar to diam^∗(G̃) = 1 if every component of G̃ is a path. The next lemma points out further relations between the relative weight of a heaviest path in a tree decomposition and the relative diameter of a forest.

Lemma 5.20.

Every forest G on n vertices allows a tree decomposition (T, X) of width at most 1 and size O(n) that satisfies r(T, X) ≥ diam^∗(G).

From the document: On the Minimum Bisection Problem in Tree-Like and Planar Graphs (pages 175-181)