4.7 An Advanced Application: Tarjan’s Algorithm

4.7.2 Prerequisites for the Correctness Proof

To establish the correctness theorem about Tarjan’s algorithm, we have to develop some basic concepts beforehand.

As laid out earlier in Section 4.4, the framework is meant to be extended easily, even with general properties of depth-first search, without having to modify the original theories⁹. For this reason, the following parts are modeled as being part of DFS_invar; that is, any other DFS-based algorithm only needs to import the respective theory to gain access to those properties.

Root of an SCC

The first such concept is the root of an SCC, i.e., the node of an SCC with the highest position in the search tree (or, equivalently, the lowest discovery time within the SCC). This is expressed as all discovered nodes of the SCC being reachable from the root in the search tree:

⁹ Of course, exceptions apply for any properties that need additional (general) information. For those, either an extension to param_DFS and DFS_invar needs to be created, or the original must be extended.

definition (in DFS_invar) scc_root where
  scc_root s v scc ⟷ is_scc scc
    ∧ v ∈ scc
    ∧ v ∈ dom (discovered s)
    ∧ scc ∩ dom (discovered s) ⊆ (tree s)* `` {v}

Of course, this entails the existence of a path in the search tree from the root of the SCC to any discovered node x of the SCC, provided that x and the root are not identical:

lemma (in DFS_invar) scc_root_scc_tree_trancl:
  ⟦scc_root s v scc; x ∈ scc; x ∈ dom (δ s); x ≠ v⟧
  ⟹ (v, x) ∈ (tree s)⁺

It can also be shown that a root is unique:

lemma (in DFS_invar) scc_root_unique_root:
  ⟦scc_root s v scc; scc_root s v' scc⟧
  ⟹ v = v'

Utilizing the knowledge about the search tree, we can eventually show that a node of an SCC is the root iff it has the minimum discovery time of that SCC.

lemma (in DFS_invar) scc_root_iff_Min_disc:
  ⟦is_scc scc; r ∈ scc; r ∈ dom (discovered s)⟧
  ⟹ scc_root s r scc ⟷ δ s r = Min {δ s v | v ∈ scc ∩ dom (discovered s)}

This is an important fact, and a future building block for the proof of our correctness theorem: it allows us to deduce the root of an SCC from the set of discovery times of that SCC.
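The characterization of the root by its minimal discovery time can be checked on a concrete run. The following is a minimal Python sketch, not part of the Isabelle formalization: the example graph, the hand-written list of its SCCs, and all function names are assumptions made for illustration, and only the final state of a complete search is inspected.

```python
# Hypothetical example graph (adjacency lists) and its SCCs, written out by hand.
GRAPH = {0: [1, 5], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: [3]}
SCCS = [{0, 1, 2}, {3, 4}, {5}]


def dfs(graph, start):
    """Run a recursive DFS; return discovery times and the set of tree edges."""
    disc, tree = {}, set()

    def visit(v):
        disc[v] = len(disc)  # discovery time = number of nodes seen so far
        for w in graph[v]:
            if w not in disc:
                tree.add((v, w))
                visit(w)

    visit(start)
    return disc, tree


def tree_star(tree, v):
    """All nodes reachable from v via tree edges, including v itself."""
    seen, todo = {v}, [v]
    while todo:
        u = todo.pop()
        for a, b in tree:
            if a == u and b not in seen:
                seen.add(b)
                todo.append(b)
    return seen


def scc_root(disc, tree, v, scc):
    """Mirror of scc_root; scc is assumed to be a genuine SCC of the graph."""
    return v in scc and v in disc and (scc & set(disc)) <= tree_star(tree, v)


def min_disc(disc, scc):
    """Minimum discovery time among the discovered nodes of scc."""
    return min(disc[v] for v in scc & set(disc))


DISC, TREE = dfs(GRAPH, 0)
ROOTS = [r for scc in SCCS for r in scc if scc_root(DISC, TREE, r, scc)]
```

On this graph, every SCC has exactly one node satisfying `scc_root`, and it is the node whose discovery time equals `min_disc` of its SCC.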

Another important property is that, during the search, the root of an SCC does not change once it has been determined. The following lemma proves that, given our state s, the root remains stable in any possible future state s'. The assumptions model the “possible future”, i.e., that s' is not built differently from s. Naturally, the root r must be discovered in s:

lemma (in DFS_invar) scc_root_transfer:
  assumes r ∈ dom (discovered s)
  assumes future:
    DFS_invar G param s'
    dom (discovered s) ⊆ dom (discovered s')
    ∀x ∈ dom (discovered s). δ s x = δ s' x
    ∀x ∈ dom (discovered s') − dom (discovered s). δ s' x ≥ counter s
    tree s ⊆ tree s'
  shows scc_root s r scc ⟷ scc_root s' r scc
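The stability of the root can be observed by comparing snapshots taken during one search. The sketch below is an illustration in Python under stated assumptions: the graph, its hand-written SCCs, and the function names are hypothetical, a state is reduced to the pair of discovery times and tree edges, and a snapshot is taken after each discovery. Later snapshots of the same run satisfy the `future` assumptions by construction.

```python
# Hypothetical example graph (adjacency lists) and its SCCs, written out by hand.
GRAPH = {0: [1, 5], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: [3]}
SCCS = [{0, 1, 2}, {3, 4}, {5}]


def dfs_snapshots(graph, start):
    """DFS that stores a copy of (discovery times, tree edges) after each discovery."""
    disc, tree, snapshots = {}, set(), []

    def visit(v, parent):
        disc[v] = len(disc)
        if parent is not None:
            tree.add((parent, v))
        snapshots.append((dict(disc), set(tree)))
        for w in graph[v]:
            if w not in disc:
                visit(w, v)

    visit(start, None)
    return snapshots


def tree_star(tree, v):
    """All nodes reachable from v via tree edges, including v itself."""
    seen, todo = {v}, [v]
    while todo:
        u = todo.pop()
        for a, b in tree:
            if a == u and b not in seen:
                seen.add(b)
                todo.append(b)
    return seen


def scc_root(state, v, scc):
    """Mirror of scc_root on a snapshot; scc is assumed to be a genuine SCC."""
    disc, tree = state
    return v in scc and v in disc and (scc & set(disc)) <= tree_star(tree, v)


def root_is_stable(snapshots, sccs):
    """Check scc_root s r scc <-> scc_root s' r scc for every earlier s and later s'."""
    for i, s in enumerate(snapshots):
        for s2 in snapshots[i:]:
            for scc in sccs:
                for r in s[0]:  # r must already be discovered in the earlier state
                    if scc_root(s, r, scc) != scc_root(s2, r, scc):
                        return False
    return True


SNAPSHOTS = dfs_snapshots(GRAPH, 0)
```

Node 0 is the root of `{0, 1, 2}` in the very first snapshot, when it is the only discovered node, and it keeps that role in every later snapshot.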

Lowlink

The second concept to introduce before the correctness proof is a formalization of the lowlink that is used in Tarjan’s algorithm. While the lowlink in the main algorithm is a simple map, it lacks any semantics. Therefore, we are going to define an expressive version of lowlink that can be used to specify which lowlink value any node has at any point in time during the exploration.

For this, we develop the concept of a lowlink_path¹⁰:

definition (in DFS_invar) lowlink_path where
  lowlink_path s v p w ≡ path E v p w ∧ p ≠ []
    ∧ (last p, w) ∈ cross_edges s ∪ back_edges s
    ∧ (length p > 1 ⟶
         p!1 ∈ dom (finished s)
       ∧ (∀k < length p − 1. (p!k, p!Suc k) ∈ tree s))

By this definition, a lowlink_path is a (non-empty) path along the search tree, except for the final edge, which is either a cross or a back edge. It denotes those paths that are inherently necessary to build non-trivial SCCs: every non-trivial SCC needs a cross or back edge, for otherwise there is no cycle.

We can then collect the set of nodes reachable via such paths, resulting in the lowlink_set:

definition (in DFS_invar) lowlink_set where
  lowlink_set s v ≡ {w ∈ dom (discovered s).
      v = w
    ∨ (v,w) ∈ E⁺ ∧ (w,v) ∈ E⁺ ∧ (∃p. lowlink_path s v p w)}

Here, lowlink_set s v denotes the set of possible candidates for the root of the SCC of v, given the current search state s. This time, we also include trivial one-node SCCs through the additional condition v = w.

Finally, we define the property LowLink s v to be the minimum discovery time of all such possible candidates:

definition (in DFS_invar) LowLink s v ≡ Min (δ s ` lowlink_set s v)

From the basic understanding of lowlink, it follows that it cannot point further down the tree, thus:

lemma (in DFS_invar) LowLink_le_disc:
  v ∈ dom (discovered s) ⟹ LowLink s v ≤ δ s v
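The definitions of lowlink_path, lowlink_set and LowLink, together with the bound LowLink s v ≤ δ s v, can be explored on a small example. The following Python sketch is only an illustration under stated assumptions: the graph and all function names are hypothetical, the state is the final state of a complete search (so every node is finished), and every non-tree edge met during the search is recorded as a back or cross edge.

```python
# Hypothetical example graph given as adjacency lists.
GRAPH = {0: [1, 5], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: [3]}


def dfs(graph, start):
    """Full DFS; returns discovery times, finished nodes, tree edges and the
    remaining edges."""
    disc, finished, tree, other = {}, set(), set(), set()

    def visit(v):
        disc[v] = len(disc)
        for w in graph[v]:
            if w not in disc:
                tree.add((v, w))
                visit(w)
            else:
                # Assumption: every non-tree edge counts as a back or cross edge.
                other.add((v, w))
        finished.add(v)

    visit(start)
    return disc, finished, tree, other


def reach_plus(graph, v):
    """Nodes reachable from v in one or more steps (the image of {v} under E+)."""
    seen, todo = set(), list(graph[v])
    while todo:
        u = todo.pop()
        if u not in seen:
            seen.add(u)
            todo.extend(graph[u])
    return seen


def lowlink_set(graph, state, v):
    """Mirror of lowlink_set for a discovered node v."""
    disc, finished, tree, other = state
    # Nodes at which a lowlink_path from v may take its final edge: v itself,
    # or any tree descendant reached through an already finished child of v.
    sources = {v}
    todo = [c for a, c in tree if a == v and c in finished]
    while todo:
        u = todo.pop()
        sources.add(u)
        todo.extend(c for a, c in tree if a == u)
    result = {v}
    for u, w in other:  # targets of recorded edges are always discovered
        if u in sources and w in reach_plus(graph, v) and v in reach_plus(graph, w):
            result.add(w)
    return result


def low_link(graph, state, v):
    """Minimum discovery time over lowlink_set."""
    disc = state[0]
    return min(disc[w] for w in lowlink_set(graph, state, v))


STATE = dfs(GRAPH, 0)
LOWLINK = {v: low_link(GRAPH, STATE, v) for v in GRAPH}
```

In this run, nodes 1 and 2 obtain the value 0 through the back edge (2, 0), node 4 obtains 3 through the back edge (4, 3), and the cross edge (5, 3) contributes nothing because 3 cannot reach 5.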

A further intuition about lowlink is that whenever LowLink s v = δ s v, then v is the root of its SCC. Of course, this does not hold at all times: initially, when no successors of v have been discovered, the equality holds trivially, while the implication does not.

We can show this intuition in its own lemma, where the assumptions reflect the state shortly before a node is popped from the stack, or a state in which it has already been popped. This is not incidental.

lemma (in DFS_invar) LowLink_eq_disc_iff_scc_root:
  v ∈ dom (finished s) ∨ (stack s ≠ [] ∧ v = hd (stack s) ∧ pending s `` {v} = {})
  ⟹ LowLink s v = δ s v ⟷ scc_root s v (scc_of E v)

¹⁰ The formalization of paths is again the same as introduced for automata (Section 3.2) and as used earlier for Nested DFS (Section 4.6.2). That is, the predicate path E v p w denotes a path from v to w in E, where p contains all the nodes visited except for the final node. Thus w is not contained in p, provided we do not visit it twice.

The proof of this lemma is fairly straightforward in the ← direction, using the fact that LowLink s v ≤ δ s v. The → direction, on the other hand, is more involved: we need to show that every node of the SCC is reachable from v via a path in the search tree. This proof then makes heavy use of (consequences of) the Parenthesis Theorem, which allows us to obtain paths through the tree from timing information, for instance:

lemma (in DFS_invar) parenthesis_impl_tree_path:
  assumes v ∈ dom (finished s) and w ∈ dom (finished s)
    and δ s v < δ s w and ϕ s v > ϕ s w
  shows (v,w) ∈ (tree s)⁺
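The step from timing information to a tree path can be illustrated on a concrete run. The following Python sketch is a hedged illustration, not the formalized proof: the graph and function names are hypothetical, and it is assumed that discovery times (δ) and finish times (ϕ) are drawn from one shared counter.

```python
# Hypothetical example graph given as adjacency lists.
GRAPH = {0: [1, 5], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: [3]}


def dfs_times(graph, start):
    """DFS with one shared counter for discovery (delta) and finish (phi) times."""
    delta, phi, tree = {}, {}, set()
    counter = 0

    def visit(v):
        nonlocal counter
        delta[v] = counter
        counter += 1
        for w in graph[v]:
            if w not in delta:
                tree.add((v, w))
                visit(w)
        phi[v] = counter
        counter += 1

    visit(start)
    return delta, phi, tree


def tree_plus(tree, v):
    """Nodes reachable from v via one or more tree edges."""
    seen, todo = set(), [v]
    while todo:
        u = todo.pop()
        for a, b in tree:
            if a == u and b not in seen:
                seen.add(b)
                todo.append(b)
    return seen


def nested_pairs(delta, phi):
    """Pairs (v, w) of finished nodes with delta v < delta w and phi v > phi w."""
    return {
        (v, w)
        for v in phi
        for w in phi
        if delta[v] < delta[w] and phi[v] > phi[w]
    }


DELTA, PHI, TREE = dfs_times(GRAPH, 0)
```

For every pair returned by `nested_pairs`, the second node is a proper tree descendant of the first. The pair (1, 5) is not among them: node 5 is discovered after node 1 but also finishes after it, and indeed there is no tree path from 1 to 5.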

A final important lemma is a transfer lemma, i.e., one showing that, under certain conditions, the LowLink value of a node does not change when developing a state further:

lemma (in DFS_invar) LowLink_eqI:
  assumes DFS_invar G param s'
  assumes discovered s ⊆ discovered s'
  assumes lowlink_set s w ⊆ lowlink_set s' w
    and lowlink_set s' w ⊆ lowlink_set s w ∪ X
    and w ∈ dom (discovered s)
    and ⋀x. ⟦x ∈ X; x ∈ lowlink_set s' w⟧ ⟹ δ s' x ≥ LowLink s w
  shows LowLink s w = LowLink s' w

In the document CAVA – A Verified Model Checker (pages 85-88)