Graph Theory - Optimization Challenges of the Future Federated Internet

(a) Simple graph

(b) Directed graph

Figure 2.1: A simple graph and a directed graph.

2.3 Graph Theory

In this section, we review the basics of graph theory and related algorithms required in later chapters of this work. We follow the definitions from [176].

Definition 2.3.1 (Graph). A graph G = (V, E) with n = |V| vertices and m = |E| edges consists of a vertex set V(G) = {v₁, . . . , v_n} and an edge set E(G) = {e₁, . . . , e_m}, where each edge consists of two (possibly equal) vertices called its endpoints. If e = {u, v} ∈ E(G), then u and v are adjacent. A loop is an edge whose endpoints are equal. Parallel edges (also called multiple edges) are edges that have the same pair of endpoints. A simple graph is a graph having no loops or parallel edges.

We will use V and E as shorthand for V(G) and E(G) when G is clear from the context. We will use the word graph to denote simple graphs unless stated otherwise.

Definition 2.3.2 (Directed Graph). A directed graph G = (V, A) consists of a vertex set V(G) and an arc set A(G), where each arc is an ordered pair of vertices. If a = (u, v) ∈ A(G), u is the head and v the tail of the arc. The choice of head and tail gives an arc a direction, from head to tail. A simple directed graph is a directed graph in which each ordered pair of vertices occurs at most once as an arc.

We will use A as shorthand for A(G) when G is clear from the context.

Figure 2.1 shows an example of a simple graph and a directed graph. Both graphs have the vertex (node) set V = {1, 2, 3, 4}. The simple (undirected) graph 2.1a has edge set E = {{1,2}, {1,3}, {2,4}, {2,3}, {3,4}}. The directed graph 2.1b has arc set A = {(1,2), (1,3), (2,3), (3,2), (2,4), (3,4)}.

Definition 2.3.3 (Dense and Sparse Graphs). A graph G is considered to be dense if m ∝ n²; it is sparse if m ∝ n.

Definition 2.3.4 (Source and Target of an Arc). Given an arc a = (u, v) ∈ A, s(a) denotes the source (head) of the arc, i.e., s(a) = u, while t(a) denotes the target (tail) of the arc, i.e., t(a) = v.

Definition 2.3.5 (Subgraph). A subgraph of a graph G is a graph H such that V(H) ⊆ V(G) and E(H) ⊆ E(G).

Definition 2.3.6 (Degree). The degree of vertex v of a simple graph G, written δ(v), is the number of edges containing v.

Definition 2.3.7 (Incident Edges). The incident edges of vertex v of a simple graph G are the edges containing v.

Definition 2.3.8 (In-Degree). The in-degree of a vertex v of a directed graph G, written δ_v⁻, is the number of arcs with v as target.

Definition 2.3.9 (Out-Degree). The out-degree of a vertex v of a directed graph G, written δ_v⁺, is the number of arcs with v as source.

Definition 2.3.10 (Shadow). The shadow of a directed graph G is an undirected graph S with the same vertex set. The edge set of S is chosen such that adjacent vertices in G are adjacent in S and vice versa.

Definition 2.3.11 (Reversal Graph). The reversal graph G_R of a directed graph G contains all vertices of G, and (u, v) ∈ A(G_R) if and only if (v, u) ∈ A(G).

The degree of node 2 of the simple graph in Figure 2.1a is 3, i.e., δ(2) = 3. The incident edges of node 1 of this graph are {1,2} and {1,3}. The in-degree of node 3 of the directed graph in Figure 2.1b is 2, i.e., δ₃⁻ = 2. The out-degree of node 4 is 0, i.e., δ₄⁺ = 0. Graph 2.1a is the shadow of graph 2.1b.
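The values in this example can be checked with a few lines of Python. The following is a minimal sketch, assuming the graphs of Figure 2.1 are stored as plain Python sets; the function names are illustrative and not part of the original text.

```python
# Graphs from Figure 2.1, stored as plain Python sets (an illustrative choice).
V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (1, 3), (2, 4), (2, 3), (3, 4)]}  # Figure 2.1a
A = {(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (3, 4)}                  # Figure 2.1b


def degree(v, edges):
    """Number of edges containing v (Definition 2.3.6)."""
    return sum(1 for e in edges if v in e)


def incident_edges(v, edges):
    """Edges containing v (Definition 2.3.7)."""
    return {e for e in edges if v in e}


def in_degree(v, arcs):
    """Number of arcs with v as target (Definition 2.3.8)."""
    return sum(1 for (_, target) in arcs if target == v)


def out_degree(v, arcs):
    """Number of arcs with v as source (Definition 2.3.9)."""
    return sum(1 for (source, _) in arcs if source == v)


def shadow(arcs):
    """Edge set of the shadow: forget the direction of every arc."""
    return {frozenset(a) for a in arcs}


def reversal(arcs):
    """Arc set of the reversal graph: flip every arc."""
    return {(v, u) for (u, v) in arcs}


print(degree(2, E), in_degree(3, A), out_degree(4, A))
print(shadow(A) == E)
```

Storing undirected edges as frozensets makes {1,2} and {2,1} the same edge, which is why the shadow of the arc set can be compared directly with E.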

Definition 2.3.12 (Path). A path p of length k is a sequence v₀, e₁, v₁, e₂, . . . , e_k, v_k of vertices and edges such that e_i = {v_{i−1}, v_i} for all i ∈ [1, k]. A path with no repeated vertices is called simple. The source or start of p, s(p), is v₀; the target or end, t(p), is v_k. The nodes of the path p are N(p) = {v₀, . . . , v_k}; the edges of the path are E(p) = {e₁, . . . , e_k}.

Definition 2.3.13 ((Simple) Cycle). A path p of length at least 2 is called a cycle if s(p) = t(p). If p is simple (with the exception of its source and target node), the cycle is called simple.

Definition 2.3.14 (Weighted Path). Given a path p in graph G and a function w : E(G) → ℝ, the weight of p is ∑_{e ∈ E(p)} w(e). If a weight function for edges is available, the length of a path refers to its weight instead of the number of its edges.

Definition 2.3.15 (Node and Edge Disjoint Path). Two paths p₁, p₂ are node disjoint if N(p₁) ∩ N(p₂) = ∅. They are edge disjoint if E(p₁) ∩ E(p₂) = ∅.

The definitions of a path for directed graphs and arc disjointness are analogous. Let p₁ = 1, (1,2), 2, (2,4), 4 in graph 2.1b. Then the length of p₁ is 2, s(p₁) = 1, t(p₁) = 4, N(p₁) = {1, 2, 4} and A(p₁) = {(1,2), (2,4)}. Let p₂ = 2, (2,3), 3 and p₃ = 3. Then p₁ and p₂ are arc disjoint but not node disjoint, while p₁ and p₃ are arc and node disjoint.
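The path example can be reproduced as follows. This is a sketch only: paths are stored as alternating Python lists of vertices and arcs, and the weight function `w` is a made-up example that does not appear in the original text.

```python
def path_nodes(path):
    """N(p): vertices sit at the even positions of the alternating sequence."""
    return set(path[0::2])


def path_arcs(path):
    """A(p): arcs sit at the odd positions of the alternating sequence."""
    return set(path[1::2])


def path_length(path):
    """Number of arcs on the path."""
    return len(path) // 2


def is_valid_path(path):
    """Check that every arc a_i equals (v_{i-1}, v_i)."""
    return all(path[i] == (path[i - 1], path[i + 1])
               for i in range(1, len(path), 2))


def path_weight(path, w):
    """Weight of the path: sum of w over the arc set A(p) (Definition 2.3.14)."""
    return sum(w[a] for a in path_arcs(path))


p1 = [1, (1, 2), 2, (2, 4), 4]
p2 = [2, (2, 3), 3]
p3 = [3]

# Hypothetical arc weights, chosen only for illustration.
w = {(1, 2): 2, (2, 4): 3, (2, 3): 1}

print(path_length(p1), path_nodes(p1), path_arcs(p1))
print(path_arcs(p1) & path_arcs(p2))    # empty set: arc disjoint
print(path_nodes(p1) & path_nodes(p2))  # shared node: not node disjoint
print(path_weight(p1, w))
```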

Figure 2.2: A tree.

Definition 2.3.16 (Connected Graph). A graph G is connected if there is a path between every pair of vertices from V(G).

Definition 2.3.17 (Connected Components). The connected components of a graph G are its maximal connected subgraphs. A connected graph has one connected component.

Definition 2.3.18 (Tree). A graph G is a tree if it is connected and the path between each pair of vertices is unique. For trees, n = m + 1. The root of a tree is a designated node of V(G) that is often used as the starting point for algorithms on trees.

Figure 2.2 shows an example tree. Node 1 is the root of the tree. Nodes 4–9 are the leaves. Nodes 2 and 3 are intermediate nodes and the children of 1. The parent of 4 is 2. Nodes 7–9 are siblings. Node 1 is an ancestor of node 7. The height (or depth) of the tree is 2, which is the length of the longest path from the root to one of the leaves.

Definition 2.3.19 (Depth-First Search). Depth-first search is a traversal order of the nodes of a tree. Starting at the root, we select one of its children, then one of the children of that child, and so on, until we have reached a leaf. Then we go back in the direction of the root (back-tracking). We stop at the first node that still has unexplored children and continue with one of those. The search is finished when all children of the root have been explored.

Depth-first search applied to the tree in Figure 2.2 could visit nodes in the following order (not showing nodes visited during back-tracking): 1, 3, 8, 9, 7, 2, 4, 6, 5.
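A recursive sketch of this traversal is given below. The tree is stored as a dictionary of child lists. The parent of each node follows from the description of Figure 2.2 and the visiting order above, while the order in which children are listed is an assumption chosen so that the sketch reproduces exactly that order.

```python
# Tree of Figure 2.2 as child lists. The ordering of the children is an
# assumption made to reproduce the visiting order given in the text.
children = {
    1: [3, 2],
    3: [8, 9, 7],
    2: [4, 6, 5],
}


def depth_first_order(node, children, order=None):
    """Return the nodes in the order in which depth-first search first visits them."""
    if order is None:
        order = []
    order.append(node)
    for child in children.get(node, []):
        # Returning from this call corresponds to back-tracking to `node`.
        depth_first_order(child, children, order)
    return order


print(depth_first_order(1, children))
```

Listing the children in a different order yields a different, equally valid depth-first order.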

Definition 2.3.20 (Biconnected Graph). A graph G is biconnected if G remains connected after the removal of any single vertex of V(G) or any single edge of E(G).

Definition 2.3.21 (Articulation Point). Vertex v of graph G is an articulation point if its removal increases the number of connected components.

Definition 2.3.22 (Bridge). Edge e of graph G is a bridge if its removal increases the number of connected components.
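Definitions 2.3.21 and 2.3.22 can be tested directly by removing one vertex or edge at a time and recounting the connected components. The sketch below does exactly that; it is a brute-force illustration of the definitions, not an efficient algorithm (linear-time methods based on depth-first search exist). The example graph is Figure 2.1a extended by a hypothetical vertex 5 attached to vertex 4.

```python
def count_components(vertices, edges):
    """Count connected components with an iterative graph search."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for start in vertices:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x] - seen:
                seen.add(y)
                stack.append(y)
    return count


def articulation_points(vertices, edges):
    """Vertices whose removal increases the number of connected components."""
    base = count_components(vertices, edges)
    result = set()
    for v in vertices:
        rest_vertices = [x for x in vertices if x != v]
        rest_edges = [e for e in edges if v not in e]
        if count_components(rest_vertices, rest_edges) > base:
            result.add(v)
    return result


def bridges(vertices, edges):
    """Edges whose removal increases the number of connected components."""
    base = count_components(vertices, edges)
    return {e for e in edges
            if count_components(vertices, [f for f in edges if f != e]) > base}


# Figure 2.1a plus a hypothetical pendant vertex 5 attached to vertex 4.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (1, 3), (2, 4), (2, 3), (3, 4), (4, 5)]

print(articulation_points(V, E))
print(bridges(V, E))
```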

Definition 2.3.23 (Block). A block of a graph G is a maximal connected subgraph of G that has no articulation points.

If a block has more than two vertices, then it is biconnected. If it has two vertices, the edge connecting them has to be a bridge. Two blocks in the same graph share at most one vertex; hence the blocks of a graph partition its edge set, i.e., every edge belongs to exactly one block. A shared vertex has to be an articulation point, and every articulation point belongs to at least two blocks.

Definition 2.3.24 (Block Tree). The block tree B is built from a connected graph G by adding all articulation points of G to B, and one vertex for every block of G. Vertices v₁, v₂ of B are connected if v₁ represents an articulation point of G which belongs to the block represented by v₂.

Definition 2.3.25 (Weakly Connected Graph). A directed graph is weakly connected, if its shadow is connected.

Definition 2.3.26 (Strongly Connected Graph). A directed graph is strongly connected if there is a path in both directions between every pair of vertices.

Definition 2.3.27 (Strongly Connected Components). The strongly connected components of a directed graph are its maximal strongly connected subgraphs.

The strongly connected components of a graph G can be calculated in O(m + n) time by using Tarjan’s algorithm [166], which is based on depth-first search.
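For illustration, a compact recursive sketch of a Tarjan-style computation of strongly connected components is shown below. It follows the usual textbook formulation with discovery indices and low-link values and is not taken from [166] verbatim; the recursion makes it suitable only for small graphs unless Python's recursion limit is raised.

```python
def strongly_connected_components(vertices, arcs):
    """Return the strongly connected components as a list of vertex sets."""
    adj = {v: [] for v in vertices}
    for u, v in arcs:
        adj[u].append(v)

    index = {}   # discovery number of every visited vertex
    low = {}     # smallest discovery number reachable via the DFS subtree
    stack, on_stack = [], set()
    components = []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in adj[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the first visited vertex of its component: pop the component.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in vertices:
        if v not in index:
            visit(v)
    return components


# Directed graph of Figure 2.1b.
V = [1, 2, 3, 4]
A = [(1, 2), (1, 3), (2, 3), (3, 2), (2, 4), (3, 4)]
print(strongly_connected_components(V, A))
```

In the graph of Figure 2.1b, only nodes 2 and 3 lie on a common directed cycle, so they form the only component with more than one vertex.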

Definition 2.3.28 (Strong Articulation Point). A vertex is a strong articulation point if its removal increases the number of strongly connected components of a directed graph.

Definition 2.3.29 (Strong Bridge). An arc is a strong bridge if its removal increases the number of strongly connected components of a directed graph.

Definition 2.3.30 (Flowgraph). A flowgraph G(s) = (V, A, s) is a directed graph with a start vertex s in V such that every vertex in V is reachable from s.

Definition 2.3.31 (Dominator). Given a flowgraph G(s), vertex u is a dominator of vertex v if all paths from s to v include u. The trivial dominators of u are s and u. D(s) is the set of non-trivial dominators in G(s).

Definition 2.3.32 (Immediate Dominator). Given a flowgraph G(s), vertex u ≠ v is an immediate dominator of v if u is a dominator of v and every other non-trivial dominator of v also dominates u. The immediate dominator of a vertex is unique.

Definition 2.3.33 (Dominator Tree). The dominator tree DT(s) of a flowgraph G(s) contains all vertices of G. There is an arc from a vertex u to a vertex v in DT(s) if u is the immediate dominator of v. DT(s) is a tree rooted at s; the dominators of a vertex in G(s) are all its ancestors in DT(s).
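The dominator definitions can be made concrete with a simple iterative computation of dominator sets. The sketch below repeatedly intersects the dominator sets of all predecessors until nothing changes. It is a straightforward fixed-point method chosen for brevity, not one of the near-linear algorithms discussed in Section 2.3.1, and the example flowgraph is hypothetical. The input is assumed to be a flowgraph, i.e., every vertex is reachable from s.

```python
def dominator_sets(vertices, arcs, s):
    """Return dom[v], the set of all dominators of v (including s and v)."""
    preds = {v: set() for v in vertices}
    for u, v in arcs:
        preds[v].add(u)
    dom = {v: set(vertices) for v in vertices}
    dom[s] = {s}
    changed = True
    while changed:
        changed = False
        for v in vertices:
            if v == s:
                continue
            # v is dominated by itself and by whatever dominates all predecessors.
            new = {v} | set.intersection(*(dom[u] for u in preds[v]))
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom


def immediate_dominators(dom, s):
    """Parent of every vertex v != s in the dominator tree DT(s)."""
    idom = {}
    for v, dominators in dom.items():
        if v == s:
            continue
        strict = dominators - {v}
        # The immediate dominator is dominated by all other strict dominators.
        idom[v] = next(d for d in strict if dom[d] == strict)
    return idom


# Hypothetical flowgraph with start vertex 1.
V = [1, 2, 3, 4, 5]
A = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 2)]

dom = dominator_sets(V, A, 1)
idom = immediate_dominators(dom, 1)
nontrivial = {d for v in V for d in dom[v] if d not in (1, v)}
print(dom)
print(idom)
print(nontrivial)
```

In this example every path from vertex 1 to vertices 3, 4 and 5 passes through vertex 2, so vertex 2 is the only non-trivial dominator and the parent of 3, 4 and 5 in the dominator tree.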

Definition 2.3.34 (Planar Graph). A graph G is called planar if it can be drawn in the plane without edge crossings.

Definition 2.3.35 (Diameter of a Graph). The diameter of a graph is the length of the longest shortest path between any pair of vertices.
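For an unweighted, connected graph, the diameter can be computed by running a breadth-first search from every vertex and taking the largest distance found. The sketch below assumes an adjacency-list representation and uses the simple graph of Figure 2.1a as input; both function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Number of edges on a shortest path from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path over all pairs; assumes the graph is connected."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


# Adjacency lists of the simple graph in Figure 2.1a.
adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
print(diameter(adj))
```

Nodes 1 and 4 are the only non-adjacent pair in Figure 2.1a, and they are two edges apart.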

Definition 2.3.36 (Small World Graph). A graph is a small world graph if the average degree of its nodes is small, but the graph nevertheless has a small diameter: the diameter grows proportionally to the logarithm of the number of vertices.

2.3.1 Dominators

Efficiently calculating dominators and the dominator tree was an open problem for a long time. In 1979, Lengauer and Tarjan [115] presented an algorithm solving this problem in O(mα(m, n)), where α(m, n) is the extremely slow-growing functional inverse of the Ackermann function. Truly linear-time algorithms have been proposed by Harel [77], Alstrup [4] and Buchsbaum [22]. These algorithms either turned out to be wrong or far too complicated for a practical implementation. Georgiadis et al. [63] were able to present an implementable algorithm for finding dominators in O(m + n) in 2004, and Buchsbaum et al. [23] were able to correct their algorithm in 2005. The best source for actually implementing a linear-time dominator algorithm seems to be the work of Buchsbaum, Georgiadis, Tarjan et al. [21] from 2008. For an easily implementable algorithm for dominators in O(n²), see the work of Cooper et al. [34].

2.3.2 Strong Articulation Points

The advances made with algorithms for finding the dominators in a flowgraph enabled Italiano et al. [94, 95] to formulate an O(m + n) algorithm for finding all strong articulation points in a directed strongly connected graph G. We will just present the main ideas here and refer to the referenced work for more details and proofs.

The first step of the algorithm is to determine for an arbitrary node s whether it is a strong articulation point. This is done by removing s from G and checking whether the remainder is still strongly connected, which can be done in O(m + n). In the second step, we calculate the non-trivial dominators in G(s) and in its reversal G_R(s), which is also in O(m + n). These dominators (possibly together with s, depending on the outcome of the first step) give all strong articulation points of G.

To see why this is so, consider the following argument. It is clear that every non-trivial dominator has to be a strong articulation point, since crossing a dominator is the only way to reach the nodes it dominates. Removing the dominator means that the dominated nodes cannot be reached any more, which increases the number of strongly connected components, the defining property of strong articulation points. Therefore, we only need to be certain that we do not miss any strong articulation points, i.e., every strong articulation point other than s has to be a dominator in either G(s) or G_R(s). Assume there is a strong articulation point a ≠ s and a node b that would be in a different strongly connected component than s if a were removed. In G, there have to be paths from s to b and from b to s, since G is strongly connected. In at least one of the two directions, every path has to cross a; otherwise, s and b would remain in the same strongly connected component after removing a, which would violate the assumption. If every path from s to b crosses a, then a is a dominator in G(s). If every path from b to s crosses a, then a is a dominator in G_R(s). Therefore, it is not possible to miss a strong articulation point by using the outlined algorithm. Basically the same method can be used to find all strong bridges.
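The two steps of the outlined method can be sketched as follows. To keep the example short, the non-trivial dominators are found by brute force (remove a vertex and test reachability from s) as a stand-in for a linear-time dominator algorithm, so this sketch runs in O(n(m + n)) rather than O(m + n). The function names and the example graph are illustrative assumptions, and the input is assumed to be strongly connected.

```python
def reachable(vertices, arcs, s):
    """Set of vertices reachable from s."""
    adj = {v: [] for v in vertices}
    for u, v in arcs:
        adj[u].append(v)
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def without(vertices, arcs, x):
    """Copy of the graph with vertex x and all arcs touching x removed."""
    return ([v for v in vertices if v != x],
            [(u, v) for u, v in arcs if x not in (u, v)])


def strongly_connected(vertices, arcs):
    """True if every vertex reaches, and is reached from, the first vertex."""
    if not vertices:
        return True
    s = vertices[0]
    everything = set(vertices)
    reversed_arcs = [(v, u) for u, v in arcs]
    return (reachable(vertices, arcs, s) == everything
            and reachable(vertices, reversed_arcs, s) == everything)


def nontrivial_dominators(vertices, arcs, s):
    """Brute-force stand-in for a linear-time dominator algorithm."""
    result = set()
    for d in vertices:
        if d == s:
            continue
        rest_vertices, rest_arcs = without(vertices, arcs, d)
        # d dominates some other vertex if that vertex is cut off from s.
        if reachable(rest_vertices, rest_arcs, s) != set(rest_vertices):
            result.add(d)
    return result


def strong_articulation_points(vertices, arcs):
    """Strong articulation points of a strongly connected directed graph."""
    s = vertices[0]  # arbitrary start vertex
    result = set()
    # Step 1: test s itself by removing it.
    if not strongly_connected(*without(vertices, arcs, s)):
        result.add(s)
    # Step 2: non-trivial dominators in G(s) and in the reversal G_R(s).
    reversed_arcs = [(v, u) for u, v in arcs]
    result |= nontrivial_dominators(vertices, arcs, s)
    result |= nontrivial_dominators(vertices, reversed_arcs, s)
    return result


# Hypothetical strongly connected graph: a cycle 1 -> 2 -> 3 -> 1 with 3 <-> 4.
V = [1, 2, 3, 4]
A = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 3)]
print(sorted(strong_articulation_points(V, A)))
```

In this example, vertex 1 is found in the first step, vertices 2 and 3 are non-trivial dominators in G(1), and vertex 3 is also a non-trivial dominator in the reversal. Vertex 4 is not a strong articulation point because the cycle through 1, 2 and 3 survives its removal.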

2.3.3 All Pair Shortest Path

The All Pair Shortest Path Problem is defined as follows:

Definition 2.3.37 (All Pair Shortest Path Problem). Given a graph G and a function w : E(G) → ℝ, determine for each pair of vertices of G a shortest path between them.

There are two well-known algorithms for solving this problem. Both allow negative edge weights, but no cycles of negative length. The first algorithm is Johnson’s algorithm [97], solving the problem in O(n² log(n) + nm) by essentially calculating for each vertex in G the shortest paths to all other vertices. The alternative is the Floyd-Warshall algorithm [56], requiring a run-time of O(n³). The modern implementation of this algorithm is essentially a series of n − 1 matrix multiplications [93].

Based on the run-time complexities, Johnson’s algorithm is the fastest choice for sparse graphs, while the Floyd-Warshall algorithm has an advantage for dense graphs. In this work, we deal with very sparse graphs, so Johnson’s algorithm is used to solve the All Pair Shortest Path Problem.
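As an illustration of the problem statement, the following sketch implements the classic triple-loop formulation of the Floyd-Warshall algorithm. It is shown because it fits in a few lines; it is not the Johnson-based approach used in this work. The vertex numbering 0..n−1 and the example arcs are assumptions made for the sketch, and only the path lengths (not the paths themselves) are returned.

```python
import math


def floyd_warshall(n, weighted_arcs):
    """Shortest path lengths between all pairs of the vertices 0..n-1.

    weighted_arcs is a list of (source, target, weight) triples. Negative
    weights are allowed as long as there is no cycle of negative length.
    """
    dist = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in weighted_arcs:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel arc
    # Allow vertex k as an intermediate stop, one k at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical directed graph with one negative arc and no negative cycle.
arcs = [(0, 1, 4), (0, 2, 1), (2, 1, -2), (1, 3, 3)]
dist = floyd_warshall(4, arcs)
print(dist[0])
```

A negative entry dist[v][v] after the run would indicate a cycle of negative length through v, in which case the shortest path lengths are not well defined.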
