Generation of Counterexamples for Model Checking of Markov Decision Processes

Husain Aljazzar and Stefan Leue

Department of Computer and Information Science University of Konstanz

D-78457 Konstanz, Germany

{Husain.Aljazzar, Stefan.Leue}@uni-konstanz.de

Abstract—The practical usefulness of a model checker as a debugging tool relies on its ability to provide diagnostic information, sometimes also referred to as a counterexample. Current stochastic model checkers do not provide such diagnostic information. In this paper we address this problem for Markov Decision Processes. First, we define the notion of counterexamples in this context. Then, we discuss three methods to generate informative counterexamples which are helpful in system debugging. We point out the advantages and drawbacks of each method. We also experimentally compare all three methods. Our experiments demonstrate the conditions under which each method is appropriate to be used.

Keywords—Stochastic Model Checking; Markov Decision Processes; Counterexamples; k-Shortest-Paths Search; Directed Search; K*

I. INTRODUCTION

The success of model checking [1], [2] as a debugging tool lies in the diagnostic information, also referred to as counterexamples, that it returns to the user when a fault is detected. Due to the numerical nature of the algorithms used in stochastic model checking [3], [4], [5], [6], [7], stochastic model checkers such as PRISM [8] and MRMC [9] cannot provide such diagnostic information. In precursory work [10], [11] we have addressed the generation of counterexamples for the violation of timed probabilistic reachability properties for discrete- and continuous-time Markov chains (DTMCs and CTMCs) using heuristic-guided, on-the-fly explicit state space search. That approach cannot be directly applied to Markov Decision Processes (MDPs).

An MDP is a discrete-time probabilistic model similar to a DTMC. However, it possesses both probabilistic and non-deterministic transitions: each transition of an MDP consists of a non-deterministic choice of actions and, after an action has been selected, a discrete-time probabilistic transition to some successor state. The semantics of an MDP depends on an assumed scheduler which resolves the non-deterministic choices [12]. Consequently, the probability mass of executions also depends on the chosen scheduler. Intuitively, a scheduler turns an MDP into a DTMC by removing the non-determinism. This insight motivates a meta approach for providing counterexamples for MDPs. For a given MDP and a property one computes an adequate, i.e., a maximizing, scheduler by a policy iteration procedure which originates from dynamic programming, such as proposed in [7], [13].

The MDP together with the determined scheduler represents a DTMC for which we can generate a counterexample using one of the existing approaches. The first method (A) is to apply the counterexample generation method presented in [14], which is based on k-shortest-paths (KSP) search. In Method A we just replace the recursive enumeration algorithm (REA) [15], which is used in [14], with the more efficient lazy variant of Eppstein's algorithm [16]. The original version of Eppstein's algorithm is presented in [17].

We propose a new method (B) which is an improvement of Method A. Method B uses our directed KSP search algorithm K* [18] instead of the lazy variant of Eppstein's algorithm. The computation of an adequate scheduler for a very large MDP, as well as computing the induced DTMC, can be very expensive. Hence, we propose a third method (C) which can be applied to very large MDPs under tight memory constraints. This method performs the K* search directly on the MDP. It avoids generating and processing the whole state space of the MDP.

The contributions of this work are:

Defining the notion of a counterexample for an MDP, relative to a property specified in PCTL logic [4], [7], taking schedulers into account.

Devising Methods B and C for generating counterexamples.

Implementing and experimentally comparing all three methods regarding their performance in different settings using a set of case studies.

As will become clear in this paper, counterexamples in the context of stochastic model checking can be very complex. Presenting and analysing such complex counterexamples are big challenges. We emphasize that this work does not discuss solutions to these issues. In [19] we presented an approach using interactive visualization techniques to support the human user in this task.

Related Work. First results regarding counterexamples for MDPs are discussed in [20]. Method C presented in this paper is an improvement of that work. In [21] a variant of Method A was proposed where the counterexample generation method proposed by [14], [22] is applied to the obtained DTMC. A recent proposal [23] addresses the problem of generating and analysing counterexamples for MDPs against LTL formulae. It follows the idea of computing a scheduler and generating a counterexample for the induced DTMC, following the method from [21] and Method A mentioned above. However, it applies KSP search to the graph of strongly connected components (SCCs) of the derived DTMC. It eliminates cyclic paths, which are a big challenge for KSP search algorithms. However, this has a negative side effect: once an SCC is touched by a selected path, the complete SCC will be added to the counterexample. Notice that a single SCC may be very large and could even comprise the whole model, in which case the counterexample would be useless for debugging. In [23], the authors also present a method to facilitate the analysis of complex counterexamples based on partitioning the counterexample such that each partition conveys a similar set of diagnostic information.

[Publ. in: Proceedings of the 6th International Conference on the Quantitative Evaluation of SysTems (QEST '09), IEEE Computer Society Press, 2009. 978-0-7695-3808-2/09 $25.00 © 2009 IEEE. DOI 10.1109/QEST.2009.10. Konstanzer Online-Publikations-System (KOPS).]

In [11] we proposed a highly scalable method based on eXtended Best-First search (XBF) to generate counterexamples for DTMCs and CTMCs. It could be adopted as a further method for generating counterexamples for MDPs in two different ways. First, we could apply XBF, instead of KSP search, to the DTMC derived by computing the scheduler. The counterexample is delivered in the form of a diagnostic subgraph. The probability of the diagnostic subgraph is computed using a model checker. Second, we could apply XBF directly to the MDP in order to select a diagnostic subgraph. The non-determinism is removed from the selected diagnostic subgraph by the model checker. We expect both methods, in particular the second one, to be very efficient. In both cases, the counterexample is provided as a subgraph (or sub-DTMC), whereas KSP-based approaches provide a list of offending paths. The form of the provided counterexample determines the subsequent analysis techniques applied to it for the purpose of debugging. Therefore the application context of these methods differs from the context of the three methods we consider here. Thus, we do not consider them in this paper and defer their investigation to future research.

Structure of the Paper. In Section II we introduce the formal foundations of MDPs. We define the notion of a counterexample in Section III. The three methods for generating counterexamples are presented in Section IV.

Section V contains our experimental evaluation of these methods. We conclude and discuss further research goals in Section VI.

II. MARKOV DECISION PROCESSES

Markov decision processes (MDPs) allow the modelling of both non-deterministic and probabilistic system behaviour. They are very useful when modelling the asynchronous, concurrent composition of probabilistic models, such as the ones given by discrete-time Markov chains (DTMCs) [24], [12]. An MDP is formally defined as follows:

Definition 1: A labelled Markov decision process (MDP) M is a tuple (S, ŝ, A, L), where S is a finite set of states, ŝ ∈ S is the initial state, A : S → 2^Distr(S) is a transition function, and L : S → 2^AP is a labelling function.

The transition function A maps each state s to a non-empty finite subset of Distr(S), which is the set of all probability distributions over S. In other words, for a state s, an element α of A(s) is a function α : S → [0,1] such that Σ_{s′∈S} α(s′) = 1. We call the elements of A(s) the actions of s. A transition leaving an arbitrary state s begins with a non-deterministic choice between the actions available in A(s). After an action α is chosen, a probabilistic choice is made between all possible successors, i.e., the states s′ for which α(s′) is not zero.
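As an illustration, the structure in Definition 1 can be represented directly in code. The following minimal Python sketch is our own (class and field names are not from the paper); it also checks that every action is indeed a probability distribution:

```python
# Minimal sketch of an MDP as in Definition 1 (illustrative, not the
# paper's implementation). A maps each state to a non-empty set of
# actions; each action is a probability distribution over successors.
from typing import Dict, Hashable

State = Hashable
Distribution = Dict[State, float]       # successor state -> probability

class MDP:
    def __init__(self, states, initial, transitions, labels):
        self.S = set(states)            # finite set of states S
        self.initial = initial          # initial state s_hat
        self.A = transitions            # state -> {action: Distribution}
        self.L = labels                 # state -> set of atomic propositions
        # sanity check: A(s) is non-empty and each action sums to 1
        for s, actions in transitions.items():
            assert actions, f"A({s}) must be non-empty"
            for alpha in actions.values():
                assert abs(sum(alpha.values()) - 1.0) < 1e-9

# Toy example: in s0 a non-deterministic choice between actions 'a'
# and 'b', each followed by a probabilistic transition.
m = MDP(
    states={"s0", "s1"},
    initial="s0",
    transitions={
        "s0": {"a": {"s0": 0.9, "s1": 0.1}, "b": {"s1": 1.0}},
        "s1": {"a": {"s1": 1.0}},
    },
    labels={"s0": set(), "s1": {"crash"}},
)
```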

Figure 2 depicts an MDP which was obtained as the result of the asynchronous concurrent composition of two instances of the DTMC in Figure 1, using the composition semantics proposed in [12]. The non-deterministic transitions represent the decision which of the concurrent processes executes the next step. We assume that the composed system crashes if one of the concurrent processes crashes. Hence, any global state of the composed MDP is labelled with the atomic proposition crash if one of the processes is at its local state s2. Note that for enhanced readability we do not display the labelling function in Figure 2.

A path is a concrete execution of the system and corresponds to a sequence of state transitions. Since we consider reactive systems, paths are assumed to be infinite. However, sometimes we need to refer to finite path prefixes. Let M = (S, ŝ, A, L) be an MDP; then a path through M is formally defined as follows:

Definition 2: An (infinite) path in the MDP M is an infinite sequence s0 --α0--> s1 --α1--> ... with αi ∈ A(si) and αi(si+1) > 0 for all i ≥ 0. A finite path is a finite prefix of an infinite path.

Figure 1: DTMC
Figure 2: MDP


For a finite or an infinite path π = s0 --α0--> s1 --α1--> ..., l(π) denotes the length of π, determined by the number of states touched along π. Note that for an infinite path π, l(π) is equal to ∞. For a natural number k such that 0 ≤ k < l(π), π[k] refers to the k-th state of π, namely sk, and π^(k) denotes the k-th prefix of π, i.e., the prefix of π consisting of the first k transitions, namely s0 --α0--> ... --α_{k−1}--> sk. For 0 ≤ k < l(π) − 1, A_π(k) is the k-th action in π, namely αk. For a finite path π, last(π) denotes the last state of π. Paths^M denotes the set of all infinite paths and Paths^M_fin denotes the set of all finite paths in M. For any state s, Paths^M(s) and Paths^M_fin(s) refer to the sets of infinite or finite paths which start at s. When M is clear from the context we usually omit the corresponding superscript.

A. Schedulers And Probability Measures

The non-deterministic choices in an MDP are made by a scheduler (also called policy or adversary) [12]. A scheduler constrains the set of allowed executions of an MDP by selecting an action based on the execution history of the system. Formally, for an MDP M = (S, ŝ, A, L), a scheduler d is a function mapping every finite path π in M onto an action d(π) ∈ A(last(π)). Based on results from [25], [26] we can focus on deterministic schedulers, which deterministically select an action. Such schedulers induce the maximal and minimal probability measures which are of interest when model checking MDPs, as we shall show later.

Paths which are allowed under a scheduler d are called valid under d, as captured by the following definition.

Definition 3: A finite or an infinite path π in an MDP is valid under a given scheduler d iff for all 0 ≤ k < l(π) − 1 it holds that A_π(k) = d(π^(k)) and A_π(k)(s_{k+1}) > 0. Otherwise, we say that π is invalid under d.

We refer to the set of all infinite paths which are valid under a scheduler d as Paths_d.
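The validity condition of Definition 3 can be sketched as a direct check over a finite path. The path and scheduler representations below are illustrative assumptions, not the paper's code:

```python
# Sketch of the validity check of Definition 3. Assumed representation:
# a finite path is an alternating list [s0, a0, s1, a1, ..., sk];
# `transitions` maps state -> {action: {successor: probability}};
# a deterministic scheduler maps a path prefix (ending in a state)
# to an action name.
def is_valid(path, scheduler, transitions):
    """True iff every action on `path` is the scheduler's choice for
    the corresponding prefix and has positive probability."""
    for k in range((len(path) - 1) // 2):
        s, alpha, s_next = path[2*k], path[2*k + 1], path[2*k + 2]
        if scheduler(path[:2*k + 1]) != alpha:    # A_pi(k) = d(pi^(k))
            return False
        if transitions[s][alpha].get(s_next, 0.0) <= 0.0:
            return False                          # A_pi(k)(s_{k+1}) > 0
    return True

trans = {"s0": {"a": {"s0": 0.9, "s1": 0.1}, "b": {"s1": 1.0}},
         "s1": {"a": {"s1": 1.0}}}
d = lambda prefix: "a"        # a memoryless scheduler that always picks 'a'
print(is_valid(["s0", "a", "s1"], d, trans))   # True
print(is_valid(["s0", "b", "s1"], d, trans))   # False: d would choose 'a'
```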

A scheduler d resolves the non-determinism of an MDP M. By doing so it transforms M into a DTMC for which the probability of paths is measurable (cf. [12] for a detailed discussion of this transformation). This transformation induces a probability measure Pr_d over the paths of the MDP. The underlying σ-algebra is formed by the cylinder sets which are induced by finite paths in M. Each finite path s0 --α0--> ... --α_{k−1}--> sk induces a cylinder set cyl_d(s0 --α0--> ... --α_{k−1}--> sk) = {π ∈ Paths_d | π^(k) = s0 --α0--> ... --α_{k−1}--> sk}. For any π ∈ Paths_fin, the probability of the cylinder set cyl_d(π) is defined as follows:

Pr_d(cyl_d(π)) = Pr_d(cyl_d(π^(0))) · ∏_{i=1}^{l(π)−1} A_π(i−1)(π[i]),   if π is valid under d,
Pr_d(cyl_d(π)) = 0,   otherwise,

where Pr_d(cyl_d(π^(0))) = 1 if π^(0) = ŝ, or 0 otherwise. Consequently, the cylinder set induced by any finite path π possesses two possible probabilities. The first is 0, for all schedulers under which π is invalid. The second is

γ(π) = γ(π^(0)) · ∏_{i=1}^{l(π)−1} A_π(i−1)(π[i]),   (1)

with γ(π^(0)) = 1 if π^(0) = ŝ, or 0 otherwise, for all schedulers under which π is valid. The function γ will be useful later when we introduce our approach to the generation of counterexamples.
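As an illustration, γ can be computed in a single pass over a finite path. The data representation below (alternating state/action list, dict-of-dicts transition function) is an assumption for this sketch:

```python
# Sketch of the path weight gamma from Eq. (1); the representation is
# an illustrative assumption, not the paper's implementation.
def gamma(path, transitions, initial):
    """Probability of `path` under any scheduler for which it is valid."""
    if path[0] != initial:
        return 0.0                  # gamma is 0 unless pi starts at s_hat
    p = 1.0
    for k in range((len(path) - 1) // 2):
        s, alpha, s_next = path[2*k], path[2*k + 1], path[2*k + 2]
        p *= transitions[s][alpha][s_next]   # multiply A_pi(k)(s_{k+1})
    return p

trans = {"s0": {"a": {"s0": 0.9, "s1": 0.1}}, "s1": {"a": {"s1": 1.0}}}
print(gamma(["s0", "a", "s0", "a", "s1"], trans, "s0"))  # product 0.9 * 0.1
```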

B. Model Checking of MDPs

The logic PCTL offers operators which allow reasoning about the occurrence probability and time of system states and events of interest in an MDP [4]. A stochastic model checker can then be used to verify a PCTL formula on the given MDP. The interesting PCTL formulae are those of the form P_⊲⊳p(ϕ), where ⊲⊳ is a comparison operator out of {<, ≤, >, ≥} and p ∈ [0,1] is a probability bound. Such a formula asserts that the probability to satisfy ϕ fulfils the comparison ⊲⊳ p. The sub-formula ϕ is an Until formula like (ϑ U φ) or a time-bounded Until formula like (ϑ U^{≤t} φ), for two state formulae ϑ and φ and t ∈ N.

The satisfaction of P_⊲⊳p(ϕ) depends on the probability mass of the set of all paths satisfying ϕ. The probability of paths in MDPs is only defined in association with a given scheduler. A formula P_⊲⊳p(ϕ) is satisfied on an MDP M iff for any scheduler d it holds that Pr_d(ϕ) ⊲⊳ p, where Pr_d(ϕ) refers to the probability, under d, of the set of all paths satisfying ϕ. Notice that this probability is measurable for any possible PCTL path formula ϕ [3].

This means we have to consider the extreme schedulers, namely maximizing and minimizing schedulers. Let Pr_max(ϕ) and Pr_min(ϕ) refer to the maximal and minimal probability of ϕ, which means Pr_max(ϕ) = max{Pr_d(ϕ) | d ∈ D} and Pr_min(ϕ) = min{Pr_d(ϕ) | d ∈ D}, where D is the set of all schedulers. It is trivial to deduce that M ⊭ P≤p(ϕ) ⇔ Pr_max(ϕ) > p. Similar implications can easily be made for the other probability bounds <, ≥ and >.

III. NOTION OF COUNTEREXAMPLES

In this section we define the notion of a counterexample for an MDP against a PCTL formula P_⊲⊳p(ϕ). In case ⊲⊳ is < or ≤ we call the property upper-bounded; otherwise we call it lower-bounded. We define the notion of a counterexample for upper-bounded properties. We will demonstrate in Section IV-D how to deal with lower-bounded properties.

The validity of ϕ, as being an Until formula, can be proved by finite paths starting at ŝ [14]. We refer to such finite paths as diagnostic paths.

Definition 4: A diagnostic path of P_⊲⊳p(ϕ) is a finite path π ∈ Paths_fin(ŝ) with π |= ϕ, where |= is the usual satisfaction relation from PCTL.

Let X be a set of diagnostic paths. The probability Pr_d(⋃_{π∈X} cyl_d(π)), or short Pr_d(X), is measurable for any scheduler d since X consists of finite paths [14]. In the case that Pr_d(X) violates the upper-bound constraint ⊲⊳ p for some scheduler d, then X is a proof of the violation of P_⊲⊳p(ϕ). We call such a set a counterexample of P_⊲⊳p(ϕ). Let P̂r(X) be the maximal probability of X, i.e.,

P̂r(X) = sup{Pr_d(X) | d ∈ D}.

It is obvious that X is a counterexample if and only if P̂r(X) violates the upper-bound constraint ⊲⊳ p.

Definition 5: Let M be an MDP violating an upper-bounded formula P_⊲⊳p(ϕ). A counterexample of P_⊲⊳p(ϕ) is a set X of diagnostic paths in M such that P̂r(X) ⊲⊳ p does not hold.

In the case of a non-strict upper-bound (≤ p) property it is ensured that a counterexample exists which consists of a finite number of diagnostic paths. This is not guaranteed in the case of strict upper-bound (< p) properties. The approach presented in our paper aims at providing finite counterexamples. Hence, we approximate the case of a strict upper-bound < p by a non-strict upper-bound ≤ (p − ε), for a given arbitrarily small ε > 0. From a practical point of view such an approximation is very adequate in order to support debugging, and we hence focus on generating finite counterexamples for formulae of the form P≤p(ϕ) in the remainder of this paper.

IV. GENERATING COUNTEREXAMPLES

Let M be an MDP which refutes the property P≤p(ϕ). Due to Definition 5, a counterexample is a set X of diagnostic paths with P̂r(X) > p. The task of selecting such a set can be represented as a k-Shortest-Paths (KSP) problem. This is the problem of finding k shortest paths, for an arbitrary natural number k, from a start node to a target node in a weighted directed graph. Notice that an MDP and also a DTMC can be viewed as a state transition graph, which is a weighted directed graph defined as follows. The nodes of the graph represent the states and the edges represent the transitions of the MDP or DTMC. The weight of an edge (s, s′) is equal to the transition probability. The start node of the search is the initial state ŝ. We get the KSP algorithm to select diagnostic paths using the following construction. The path formula ϕ is of the form (φ1 U φ2) or (φ1 U^{≤t} φ2). Since any finite path ending in a state violating φ1 cannot be completed to a diagnostic path, such states are turned into sinks by ignoring all their outgoing transitions. This ensures that any finite path ending in a state satisfying φ2 is a diagnostic path. W.l.o.g., we can assume that there is a unique state satisfying φ2, which represents a target node g. Otherwise, we add an absorbing state g as a target and replace all outgoing transitions of each state satisfying φ2 by a single transition to g with the weight 1.0. Any ŝ-g path then represents exactly one diagnostic path, modulo the last transition, which can easily be removed. In order to select the most probable paths instead of the shortest ones, we define the weight of a path as the product, instead of the sum, of the weights of its edges. Further, we inverse the meaning of "shortest" and search for paths with maximal instead of minimal weights. Note that the definition of transition and path weights that we use here differs from the approach presented in [14], [22]. Our approach is possible since the weight product forms a cost algebra [27].
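To illustrate this reformulation, the single most probable ŝ-g path can be found with a Dijkstra-style search that maximizes the product of edge probabilities; this is sound because all weights lie in [0, 1], so the product never increases along a path. The sketch below is our own illustration, not the K* or Eppstein implementation used in the paper:

```python
import heapq

# Sketch: most probable source-target path in a state transition graph
# whose edge weights are transition probabilities. Greedy expansion by
# maximal product is sound since products of values in [0,1] are
# monotonically non-increasing along a path.
def most_probable_path(graph, source, target):
    # graph: state -> list of (successor, probability) edges
    best = {source: 1.0}
    parent = {source: None}
    heap = [(-1.0, source)]             # max-heap via negated products
    while heap:
        neg_p, s = heapq.heappop(heap)
        p = -neg_p
        if s == target:                 # reconstruct the path
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return p, path[::-1]
        if p < best.get(s, 0.0):
            continue                    # stale heap entry
        for s_next, prob in graph.get(s, []):
            q = p * prob
            if q > best.get(s_next, 0.0):
                best[s_next] = q
                parent[s_next] = s
                heapq.heappush(heap, (-q, s_next))
    return 0.0, None                    # target unreachable

g = {"s": [("a", 0.5), ("b", 0.4)], "a": [("g", 0.2)], "b": [("g", 0.9)]}
print(most_probable_path(g, "s", "g"))  # path via 'b': 0.4 * 0.9 beats 0.5 * 0.2
```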

We can generate a counterexample for the MDP by applying a KSP search algorithm to its state transition graph. The KSP search algorithm will find the most probable diagnostic paths. We only consider KSP algorithms which do not require the number k to be specified in advance. Examples of such algorithms are Eppstein's algorithm [17], the recursive enumeration algorithm (REA) [15] and K* [18]. These algorithms enumerate diagnostic paths one by one until the user decides to stop. The diagnostic paths are delivered starting with the most probable ones. Notice that diagnostic paths carrying high probabilities represent the executions which most significantly contribute to the property violation. At any time of the search let R denote the set of already found diagnostic paths. The set R is a counterexample once P̂r(R) > p. However, the computation of P̂r(R) is not trivial. Note that in general P̂r(R) is not equal to the sum of the individual probabilities, more precisely the γ-values, of the paths in R. This is because it is not guaranteed that a scheduler exists under which all paths in R are valid. Also, R is not very convenient for debugging since it contains a lot of "noise". We recall that P̂r(R) = Pr_d(R) for some scheduler d. All elements of R which are invalid under d do not contribute to P̂r(R). Hence, they should not be included in the counterexample. The goal is to provide a counterexample which is "noise free" and to state its probability. In the following three sections we present three methods to achieve this goal.

A. Method A

The idea of this method is to eliminate the non-determinism before a counterexample is generated. Hence, a maximizing scheduler d is determined using a policy iteration procedure as proposed in [7], [13]. This scheduler turns M into a DTMC M_d by removing the non-determinism. A counterexample of P≤p(ϕ) on the DTMC M_d is generated using a method which is similar to the one presented in [14], [22]. The main idea of the approach from [14], [22] is to apply KSP search to the state transition graph of the obtained DTMC. In [14], [22] the REA algorithm [15] is recommended. However, we use the lazy version of Eppstein's algorithm [16] since it has a better asymptotic complexity and has also been proven to perform better in practice.

Algorithm 1 illustrates the algorithmic structure of this method. The result is a set X of diagnostic paths with Pr_d(X) > p. Note that all paths from X exist in the original MDP M and that Pr_d(X) > p implies P̂r(X) > p. Thus, X is also a counterexample of M. Further, all paths from X are valid under the scheduler d, which means that X is free of "noise".

Algorithm 1: Method A
Input: An MDP M and a property P≤p(ϕ)
1  Compute a maximizing scheduler d for M and P≤p(ϕ). Let M_d be the DTMC induced by d.
2  Run Eppstein's algorithm on the state transition graph of M_d.
3  X ← {}
4  while Pr_d(X) ≤ p do
5      Ask Eppstein's algorithm for the next probable diagnostic path π.
6      Add π to X
7  return X as a counterexample.
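The enumeration loop of Algorithm 1 can be sketched with the KSP algorithm abstracted as a generator; `ksp_enumerator` below is a hypothetical stand-in for the lazy variant of Eppstein's algorithm, assumed to yield (probability, path) pairs in order of decreasing probability:

```python
# Sketch of the accumulation loop of Algorithm 1 (illustrative; the
# KSP enumerator is mocked, not Eppstein's actual algorithm).
def method_a(ksp_enumerator, p_bound):
    X, total = [], 0.0
    for prob, path in ksp_enumerator:
        X.append(path)
        # In the induced DTMC the enumerated paths are cylinder-set
        # disjoint, so Pr_d(X) is the sum of the path probabilities.
        total += prob
        if total > p_bound:
            return X, total             # X is a counterexample
    return None, total                  # bound never exceeded

# Mocked enumerator with three diagnostic paths:
paths = iter([(0.05, ["s0", "s2"]),
              (0.03, ["s0", "s1", "s2"]),
              (0.02, ["s0", "s0", "s2"])])
print(method_a(paths, 0.07))            # stops after the first two paths
```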

B. Method B

We propose a new method B which is an improvement of Method A. It has the same algorithmic structure as Method A, cf. Algorithm 1. Method B computes the DTMC induced by a maximizing scheduler. It then generates a counterexample by running a KSP algorithm on the derived DTMC. The difference compared to Method A is that we use K* instead of Eppstein's algorithm, see Algorithm 1, Line 2. K* is a directed algorithm for solving the KSP problem [18]. For a proper understanding of this paper it is fully sufficient to consider K* as an adaptation of the lazy variant of Eppstein's algorithm [16], [17]. The main feature of K* is that it works on-the-fly and can also be guided using heuristic functions. In other words, unlike Eppstein's algorithm, K* does not need to process the entire state transition graph to provide a solution.

C. Method C

Both Methods A and B compute a maximizing DTMC in advance, which requires processing the entire state space of the MDP. This can be very expensive for large models. Method C avoids this effort. It directly applies K* to the state transition graph of the MDP. All diagnostic paths discovered by K* are stored in the set R. In order to provide a "noise-free" counterexample we filter out all "noisy" paths from R. We explain below how to do this.

Definition 6: Let d be a maximizing scheduler of R, i.e., Pr_d(R) = P̂r(R). The set of all diagnostic paths from R which are valid under d is called a maximum of R.

We can easily prove for a maximum X ⊂ R that P̂r(R) = P̂r(X) = Σ_{π∈X} γ(π). Consequently, if R is a counterexample, then each maximum of R is a "noise-free" counterexample. This leads to the strategy illustrated in Algorithm 2.

Now, we need a method to compute a maximum X of R and to compute its probability P̂r(X), which can be used at Line 6 of Algorithm 2.

Algorithm 2: Strategy for generating counterexamples according to Method C
Input: An MDP M and a property P≤p(φ)
1  R ← {}
2  X ← {}
3  while P̂r(X) ≤ p do
4      Ask K* for the next probable diagnostic path π.
5      Add π to R
6      Compute a maximum of R and assign it to X.
7  return X as a counterexample.

Computing a Maximum: In computing a maximum of R we have to take the following issues into account. First, R grows with each new diagnostic path found by K*. As we see in Algorithm 2, we have to (re-)compute a maximum of R and compute its probability whenever R grows. Hence, our technique should be designed as an online algorithm which efficiently updates the maximum computed so far when new diagnostic paths are added to R. Second, we have to compute a maximum without knowing a concrete maximizing scheduler for R. To this end we first need to introduce the notion of (scheduler) compatible paths.

Definition 7: A set of paths C is (scheduler) compatible iff there is a scheduler d such that all paths in C are valid under d.

The relevance of this notion arises from the insight that each scheduler compatible subset X of R with a maximal probability is a maximum of R.

Recall that a scheduler is a deterministic function mapping a finite path of the MDP onto an action. Thus, for diagnostic paths, the first branching (after a common prefix) is decisive for scheduler compatibility. Diagnostic paths cannot be compatible when they branch in a state, i.e., when they choose different actions after an identical prefix. On the other hand, it is always possible to define a scheduler that allows a set of diagnostic paths which branch in actions, i.e., which move to different successor states after the same action. Thus, diagnostic paths which branch in an action are compatible.
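For a pair of paths, this first-branching criterion can be sketched directly; the alternating state/action list representation is an assumption for illustration:

```python
# Sketch (assumed representation [s0, a0, s1, a1, ..., sk]): two
# diagnostic paths are scheduler-compatible iff their first divergence
# is a probabilistic branch (different successor states after the same
# action), not a non-deterministic one (different actions chosen in
# the same state under an identical history).
def compatible(p1, p2):
    for i, (x, y) in enumerate(zip(p1, p2)):
        if x != y:
            # even index: states differ -> the paths branch in an action
            # odd index:  actions differ -> the paths branch in a state
            return i % 2 == 0
    return True   # one path is a prefix of the other

print(compatible(["s0", "a", "s1"], ["s0", "a", "s2"]))  # True:  branch in action 'a'
print(compatible(["s0", "a", "s1"], ["s0", "b", "s1"]))  # False: branch in state s0
```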

Consequently, checking for scheduler compatibility can be accomplished as follows. We implement the set R as an AND/OR tree. Each diagnostic path is stored in R by mapping its states and actions to nodes. We map states to OR nodes and actions to AND nodes.

Algorithm 3 illustrates the steps of adding diagnostic paths to R and shows how scheduler compatible subsets are identified and their probabilities are computed. The steps up to Line 4 add a diagnostic path π to R. Notice that the construction ensures that R is an AND/OR tree rooted at the OR node ŝ. The remaining steps, beginning at Line 5, perform the identification of scheduler compatible subsets of R and the computation of their probabilities in an online fashion upon adding π to R. The AND/OR tree structure makes all scheduler compatible subsets of R available. Our algorithm compares these subsets in terms of probability and size at Line 12 and selects, using the reference MAX, the action which leads to the scheduler compatible subset with maximal probability and minimal size. The aim of minimizing the size is to compute practically useful counterexamples carrying a maximal probability with a minimal number of diagnostic paths.

Algorithm 3: Adding diagnostic paths to the AND/OR tree implementing R
Input: A newly found diagnostic path π = s0 --α0--> ... --α_{k−1}--> sk.
1  Interpret π as an alternating sequence of OR and AND nodes Sπ = ⟨s0, α0, ..., α_{k−1}, sk⟩.
2  if R is empty then add all nodes and the edges {(s0, α0), (α0, s1), ..., (α_{k−1}, sk)} to R and go to Line 5.
3  Determine, starting from the root ŝ, the longest prefix of Sπ which is already contained in R, i.e., the longest prefix which Sπ shares with any path in R.
4  Attach the remainder of Sπ to the last node of the prefix as a new subgraph of R. During the adding, Sπ is not folded.
5  In a bottom-up manner, assign to each node on the newly added path two marks, a probability mark Mp and an integer mark Mi, as follows:
6  Assign to the leaf, i.e., the node sk, the marks Mp = γ(π) and Mi = 1.
7  foreach inner node n on Sπ do
8      if n is an AND node then
9          Add the Mp marks assigned to the children of n and assign the sum to n as a probability mark Mp.
10         Add the Mi marks assigned to the children of n and assign the sum to n as an Mi mark.
11     else // n is an OR node
12         Select the child n′ of n with the maximal Mp mark. If n has more than one child whose Mp mark is maximal, then select the one with the minimal Mi mark.
13         Mark n with the same Mp and Mi marks as n′.
14         Attach to n a reference MAX referring to n′.

Note that a global processing of the entire AND/OR tree is never required. The computational effort of Algorithm 3 is limited to traversing the tree only along the newly inserted path, which is important for the performance of our method and its adequacy as an online technique.
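For illustration, the mark computation can be sketched non-incrementally: the paper's Algorithm 3 updates the marks online along the newly inserted path, whereas the sketch below recomputes the whole tree for clarity. Paths are given by their node sequence below the root ŝ, paired with their γ-values; treating a path that ends at an OR node as compatible with any action choice is our simplifying assumption:

```python
# Non-incremental sketch of the Mp/Mi marks of Algorithm 3.
# `suffixes` is a list of (remaining_node_sequence, gamma) pairs below
# the current node; nodes at even depth are OR (state) nodes, at odd
# depth AND (action) nodes. Returns (Mp, Mi).
def marks(suffixes, is_or=True):
    ended = [g for path, g in suffixes if not path]      # paths ending here
    groups = {}
    for path, g in suffixes:
        if path:
            groups.setdefault(path[0], []).append((path[1:], g))
    child = {n: marks(sub, not is_or) for n, sub in groups.items()}
    if is_or and child:
        # OR node: follow the MAX reference -- the child with maximal
        # Mp, ties broken by minimal Mi.
        mp, mi = max(child.values(), key=lambda c: (c[0], -c[1]))
        return mp + sum(ended), mi + len(ended)
    # AND node (or leaf): sum the marks of all children.
    mp = sum(c[0] for c in child.values()) + sum(ended)
    mi = sum(c[1] for c in child.values()) + len(ended)
    return mp, mi

# Three diagnostic paths branching at the root: two via action 'a'
# (compatible probabilistic branch), one via action 'b'.
R = [(["a", "s1"], 0.05), (["a", "s2"], 0.03), (["b", "s1"], 0.06)]
print(marks(R))   # the subtree of action 'a' wins: Mp ~ 0.08 from 2 paths
```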

From now on, let X ⊆ R be the set of diagnostic paths corresponding to those paths in the tree, from the root to the leaves, which follow only AND edges and MAX references. Our algorithm ensures that X is a smallest maximum of R. This means X is a maximum of R with a minimal number of diagnostic paths. So X is the most informative counterexample which can be extracted from R. In other words, X is an optimal counterexample over R. It carries a maximal probability with minimal size. However, it is possible to find a counterexample X′ consisting of elements from R together with other diagnostic paths which have not yet been delivered by K*. The set X′ might carry a higher probability and might contain fewer diagnostic paths than X. However, the probability of the diagnostic paths from X′ which are not elements of R may be very low, such that they are insignificant for debugging. X has the advantage that it includes only diagnostic paths from R, which are the most probable and crucial ones. In our experiments we did not observe any penalty in the counterexample quality compared to Methods A and B. Note that A and B provide optimal counterexamples.

In addition to computing the smallest maximum of R, the AND/OR tree includes an explicit representation of the scheduler corresponding to the computed maximum. The MAX references represent the scheduler's decision for each non-deterministic choice. This scheduler can be useful in debugging since it indicates a set of extreme choices which lead to the property violation.

Example: Let φ be the path formula (true U^{≤3} crash). The MDP from Figure 2 violates the property P≤0.09(φ) because Pr_max(φ) = 0.095 > 0.09. We use our method to generate a counterexample. K* searches the state space for the most probable diagnostic paths, which are then added to R, implemented as an AND/OR tree. The obtained AND/OR tree is illustrated in Figure 3. Some nodes of interest are additionally labelled with the marks used in Algorithm 3. We see that the root is marked with 0.095 and 3, which means that the found maximum (highlighted in the figure by bold lines) has the probability 0.095 and consists of 3 diagnostic paths. Since 0.095 > 0.09, the maximum that was found is a counterexample. Out of a total of 14 paths stored in the tree, our counterexample consists of only 3 compatible paths. This counterexample facilitates gaining insight into why the property is violated. For instance, it shows that a large portion of the total probability mass of the counterexample of 0.095, namely 0.05, is caused by the immediate crash of the second component after it becomes busy. Another major contributor is the crash of the first component after the loop (s0, s0) --β1--> (s0, s0), with a mass of 0.025. The remainder is related to the loop (s0, s1) --β3--> (s0, s1), which increases the likelihood of a crash after staying busy for a while. To debug the system, meaning to make it immune against violations of the property P≤0.09(true U^{≤3} crash), one has to reduce the probability of the paths making up the counterexample. The effective approach would be to redesign the system such that the transition probability from busy to crash is reduced. If this were not sufficient, then one has to lessen the impact of the loops at the ready and busy states in terms of the likelihood to crash.

Figure 3: The AND/OR tree from the example in Section IV-C

D. Lower-Bounded Properties

A given MDP M refutes a lower-bounded property P_⊲⊳p(ϕ) (⊲⊳ is either > or ≥) iff Pr_min(ϕ) does not reach/exceed the specified lower-bound p. Thus, in both Methods A and B a minimizing scheduler d instead of a maximizing one is computed. The DTMC M_d induced by the minimizing scheduler refutes P_⊲⊳p(ϕ) too. Then, a counterexample is generated for the DTMC M_d. In order to provide a proof that a DTMC refutes a lower-bounded property we have to show that the probability of all diagnostic paths in the model does not reach/exceed the specified lower-bound. In contrast to the upper-bounded case, any strict subset of the diagnostic paths does fulfil this requirement, so no such subset constitutes a proof. Thus, a lower-bounded property is reformulated into an equivalent upper-bounded one as shown in [14]. This reformulation requires the computation of the bottom strongly connected components (BSCCs) of the DTMC.

In Method C we have to perform preparations similar to those in Methods A and B. We can reformulate P⊲⊳p(ϕ) directly as an upper-bounded property on the MDP. We then need to compute the maximal end components of the MDP [13], which are the counterpart of BSCCs in a DTMC. For this computation the entire state space has to be generated.

This weakens the advantage of the on-the-fly feature of Method C. This result is similar to the observation that directed methods are more successful for safety than for liveness properties [28]. Notice that lower-bounded properties are the probabilistic counterpart of liveness properties.

V. EXPERIMENTAL EVALUATION

In the experiments we compare the performance of all three methods, investigating which method is adequate under which conditions. We report the experimental results for two significant case studies, both available from the PRISM web page². We implemented the three methods in Java. All measurements were made on a machine with an Intel Pentium 4 CPU (3.2 GHz)³ and 2 GB RAM. Java code was executed by the Java runtime environment JDK 1.6.

A. IPv4 Zeroconf Protocol

This case study is a model of a dynamic configuration protocol for IPv4 addresses. We are interested in the probability that a host picks an IP address already in use. We configured the model with the parameters NO_RESET, N = 1000 (number of hosts) and loss = 0.1 (probability of message loss). In order to study the scalability of the methods, we considered three variants of the model with different sizes. For this purpose, we set the parameter K (the number of probes to send) to K = 2, K = 4 and K = 8. We list the essential characteristics of these model variants in Table I. Next to the number of states and transitions of the MDP, the column labelled "Probability" gives the probability which PRISM computed for the property for each model variant. The columns "MC Time" and "MC Memory" record the runtime (in seconds) and memory (in KB) needed for building and checking the models using PRISM.

² http://www.prismmodelchecker.org/

³ The CPU has two cores, but we do not exploit them because our implementation does not employ parallelism.

Table I: IP – Properties of the MDPs

Table II: IP – Computing maximizing DTMCs

Both Methods A and B need to generate the DTMC induced by a maximizing scheduler. Table II shows for each model variant the size of the computed DTMC and the effort required by PRISM to generate it. Column "Gen. Time" indicates the generation time (in seconds) and column "Memory" indicates the memory space (in KB) needed to store the DTMC. Each DTMC has the same number of states as the original MDP. Resolving the non-determinism reduces the number of transitions in each variant of this model approximately by half. This is a clear sign of a high degree of non-determinism in the original MDP.
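Inducing the DTMC Md from the MDP amounts to keeping, in every state, only the probability distribution of the action chosen by the (memoryless) scheduler. A minimal sketch under an assumed nested-dictionary encoding of the MDP:

```python
def induce_dtmc(mdp, scheduler):
    """Induce the DTMC M_d from an MDP and a memoryless scheduler.

    mdp:       {state: {action: {successor: probability}}}
    scheduler: {state: chosen action}
    """
    return {s: dict(actions[scheduler[s]]) for s, actions in mdp.items()}

# Toy MDP with a non-deterministic choice between actions a and b at s0
mdp = {
    "s0": {"a": {"s1": 1.0}, "b": {"s0": 0.5, "s1": 0.5}},
    "s1": {"a": {"s1": 1.0}},
}
dtmc = induce_dtmc(mdp, {"s0": "b", "s1": "a"})
print(dtmc["s0"])  # {'s0': 0.5, 's1': 0.5}
```

The induced DTMC keeps every state but only one outgoing distribution per state, which is why resolving the non-determinism shrinks the transition count while the state count stays the same.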

We ran the three methods A, B and C on all three variants of the model and recorded the results in Table III. We report the runtime (in seconds) and memory consumption (in KB) of generating counterexamples for probability bounds of 10%, 40%, 80%, 95% and 100% (see column "P" in Tables IIIa and IIIb) of the total probability given in Table I. Table cells of the form "> x", for some value x, mean that the corresponding method failed to provide a counterexample for the corresponding probability bound although it ran for x seconds or consumed x KB of memory.

For all three methods, the larger the model, the more effort is required to generate counterexamples.

Table III: IP – Counterexample generation ((a) Runtime [sec], (b) Memory [KB])

We notice that the computational cost of both Methods A and B is dominated by the cost of computing the DTMC. For example, to generate a counterexample with 10% of the total probability for the case with K = 2, Method A needed 19.076 seconds and 7 797.9 KB. Most of that effort, precisely 16.150 seconds and 7 757.6 KB, was needed for computing and storing the DTMC. We observe this effect for both Methods A and B and for all model variants. Method C needs a very long time, especially for large K values and large probability bounds. It even failed in the case with K = 8 to provide a counterexample with 80% of the total probability after running for 144 885 seconds (i.e., 40 hours). However, it requires significantly less memory than Methods A and B. In most cases C required approximately half of the memory required by A or B. The economical memory consumption of Method C is due to its on-the-fly nature. The long runtime of C in this case study can be explained by the high degree of non-determinism of the models, which has two consequences. First, the DTMC derived by resolving the non-determinism contains about half of the transitions of the original MDP. This implies that the reachable portion of the DTMC is very small compared to the original MDP. Method C, on the other hand, operates directly on the MDP. The portion K* had to explore was much bigger than the reachable part of the DTMC. Consider, for example, the case with K = 8. Method A explores 13 053 states and 22 662 transitions (according to figures which, due to space limitations, are not reported in the tables). This is the size of the reachable part of the DTMC, since Eppstein's algorithm has to explore the entire reachable graph before it delivers a solution. Method C explored 379 791 states and 1 286 235 transitions, a large number compared to the 13 053 states and 22 662 transitions constituting the reachable part of the DTMC. Second, the MDP contains many more diagnostic paths than the DTMC. Notice that an MDP models more behaviour than the corresponding DTMC since it also represents non-determinism. However, only few of these paths are scheduler compatible. Consequently, Method C has to find and process many diagnostic paths to achieve a noticeable increase in the counterexample probability. For instance, for K = 8, Method C delivered 68 202 diagnostic paths to provide a counterexample with 60% of the total probability. A lot of time is spent processing this huge number of paths while only 349 of them form the counterexample.
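The compatibility filter that discards most of those delivered paths can be sketched as follows. For memoryless schedulers, a set of diagnostic paths is compatible iff no state is resolved to two different actions across the paths. The (state, chosen action) pair encoding below is an assumed representation for illustration only.

```python
def compatible(paths):
    """Check whether a set of diagnostic paths can be produced by a
    single memoryless scheduler, i.e. no state is resolved to two
    different actions. Each path is a list of (state, action) pairs."""
    choice = {}
    for path in paths:
        for state, action in path:
            if choice.setdefault(state, action) != action:
                return False  # conflicting scheduler choices at `state`
    return True

# Two paths agreeing on the action chosen at s0 are compatible ...
p1 = [("s0", "alpha"), ("s1", "beta")]
p2 = [("s0", "alpha"), ("s2", "gamma")]
print(compatible([p1, p2]))  # True
# ... while a third path choosing a different action at s0 is not.
p3 = [("s0", "delta")]
print(compatible([p1, p2, p3]))  # False
```

Only paths passing such a check may be accumulated into the same counterexample, which is why Method C processes far more paths than it finally keeps.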

B. Bounded Retransmission Protocol

This case study is based on the bounded retransmission protocol (BRP) [29], a variant of the alternating bit protocol.

The BRP protocol sends a file in a number of chunks, but allows only a bounded number of retransmissions of each chunk. The model is represented as an MDP. The parameter N denotes the number of chunks and MAX the maximum allowed number of retransmissions of each chunk. In our experiments we set MAX to 5. We scaled the size of the state space by varying the value of N. We list the essential characteristics of the considered variants of the model in Table IV. All tables in this section have the same structure as in the previous case study.

Table V shows for each model variant the size of the DTMC and the effort PRISM required to generate it. We can see that in each model variant the DTMC still contains about 98% of the transitions. This is evidence for a low degree of non-determinism. The results of running all methods are reported in Table VI. Notice that the probability bounds listed here are very low (cf. column "P" in Tables VIa and VIb). We chose them because all three methods failed to provide counterexamples for large probability bounds in realistic time. This phenomenon can be explained as follows. Single diagnostic paths in this model have very low individual probability. For instance, in the case with N = 3 200, Method A selects 5 588 diagnostic paths whose accumulated probability is only 1% of the total probability. That means numerous diagnostic paths must be found and processed to achieve a significant counterexample probability. We expect XBF-based methods, which we do not consider in this paper, to have a clear advantage in such situations.

Table IV: BRP – Properties of the MDPs

Table V: BRP – Computing maximizing DTMCs

Table VI: BRP – Counterexample generation ((a) Runtime [sec], (b) Memory [KB])

Methods A and B have comparable performance when N = 640. However, the larger the models are, the more inefficient Method A becomes. Method C outperforms A and B in most cases with respect to both runtime and memory consumption. This holds particularly for the large models with N = 3 200 and N = 6 400. We notice that the advantage of C regarding memory consumption is bigger for larger probability bounds. This is a result of using AND/OR trees. The number of found diagnostic paths is very large for large probability bounds. Hence, storing them compactly in an AND/OR tree pays off.

C. Summary of Experimental Results

We summarize our experimental conclusions as follows.

Method B outperforms A in most cases, in particular for large models. Thus, we recommend using Method B instead of A. For large MDPs, Method C offers a significant advantage with respect to memory consumption. It also outperforms A and B with respect to runtime, in particular for small probability bounds. For models with a very high degree of non-determinism, Method C can take a very long time to produce counterexamples for large probability bounds. However, it has good runtime behaviour if the degree of non-determinism of the model is not very high. Hence, we recommend Method C if the model is large with a low degree of non-determinism. If the model is small or highly non-deterministic, C can be used for small probability bounds. B seems to be the more promising method for small models or models with a high degree of non-determinism, in particular for large probability bounds. We summarize these conclusions in Table VII.

VI. CONCLUSION

We first defined the notion of a counterexample for verifying MDPs against PCTL properties. We then presented three methods, called A, B and C, for computing informative counterexamples based on k-shortest-paths search. All three methods optimize the selected counterexamples in terms of their probability and the number of included paths. Methods A and B compute an adequate scheduler for a given MDP and property. They generate a counterexample for the DTMC induced by this scheduler. Method C generates a counterexample directly for the MDP on-the-fly and uses an AND/OR tree structure to accommodate the influence of schedulers. Method B outperforms Method A with respect to computational cost in most cases. Method C, due to its on-the-fly nature, shows promise for large models. The algorithmic core of Methods B and C is the K* algorithm, a directed algorithm for solving the k-shortest-paths problem. In particular, it works on-the-fly and can be guided by heuristics.
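The reduction underlying this use of shortest-paths search maps each transition probability p to the additive weight −log p, so that a most probable path becomes a shortest path. The following sketch illustrates only this weight transformation with plain Dijkstra search on a toy DTMC; K* additionally enumerates the k best paths on-the-fly and admits heuristic guidance, which is not shown here.

```python
import heapq
import math

def most_probable_path(dtmc, start, targets):
    """Dijkstra on weights -log(p): a shortest path under these
    weights is a most probable path in the DTMC.

    dtmc: {state: {successor: probability}}
    """
    dist, prev, seen = {start: 0.0}, {}, set()
    heap = [(0.0, start)]
    while heap:
        d, s = heapq.heappop(heap)
        if s in seen:
            continue
        seen.add(s)
        if s in targets:  # reconstruct the path and its probability
            path = [s]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return list(reversed(path)), math.exp(-d)
        for t, p in dtmc.get(s, {}).items():
            nd = d - math.log(p)  # -log turns products into sums
            if nd < dist.get(t, float("inf")):
                dist[t], prev[t] = nd, s
                heapq.heappush(heap, (nd, t))
    return None, 0.0

# Toy DTMC: path s0->s1->crash has probability 0.18,
# path s0->s2->crash only 0.1.
dtmc = {"s0": {"s1": 0.9, "s2": 0.1},
        "s1": {"crash": 0.2, "s0": 0.8},
        "s2": {"crash": 1.0}}
path, prob = most_probable_path(dtmc, "s0", {"crash"})
print(path, round(prob, 3))  # ['s0', 's1', 'crash'] 0.18
```

Since −log p is non-negative for p in (0, 1], Dijkstra's optimality conditions hold and the search remains admissible.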

We experimentally compared the performance of all three methods. Our experimental evaluation shows which method is adequate under which conditions. Future work includes an adaptation of XBF to MDPs.

Table VII: Which method under which conditions

Acknowledgement. This work was partially supported by DFG Research Training Group GK-1042 “Explorative Analysis and Visualization of Large Information Spaces”.

REFERENCES

[1] E. M. Clarke, O. Grumberg, and D. A. Peled, Model Checking (3rd ed.). The MIT Press, 2001.

[2] C. Baier and J.-P. Katoen, Principles of Model Checking. The MIT Press, 2008.

[3] M. Y. Vardi, "Automatic verification of probabilistic concurrent finite-state programs," in 26th Annual Symposium on Foundations of Computer Science (FOCS 1985). IEEE, 1985, pp. 327–338.

[4] H. Hansson and B. Jonsson, "A logic for reasoning about time and reliability," Formal Aspects of Computing, vol. 6, no. 5, pp. 512–535, 1994.

[5] C. Courcoubetis and M. Yannakakis, "The complexity of probabilistic verification," J. ACM, vol. 42, no. 4, pp. 857–907, 1995.

[6] A. Aziz, V. Singhal, and F. Balarin, "It usually works: The temporal logic of stochastic systems," in CAV 1995, ser. LNCS vol. 939. Springer, 1995, pp. 155–165.

[7] A. Bianco and L. de Alfaro, "Model checking of probabilistic and nondeterministic systems," in FSTTCS 1995, ser. LNCS vol. 1026. Springer, 1995, pp. 499–513.

[8] A. Hinton, M. Kwiatkowska, G. Norman, and D. Parker, "PRISM: A tool for automatic verification of probabilistic systems," in TACAS '06, ser. LNCS vol. 3920. Springer, 2006, pp. 441–444.

[9] J.-P. Katoen, M. Khattri, and I. S. Zapreev, "A Markov Reward Model Checker," in QEST '05. IEEE Computer Society, 2005, pp. 243–244.

[10] H. Aljazzar, H. Hermanns, and S. Leue, "Counterexamples for timed probabilistic reachability," in FORMATS '05, ser. LNCS vol. 3829. Springer, 2005, pp. 177–195.

[11] H. Aljazzar and S. Leue, "Extended directed search for probabilistic timed reachability," in FORMATS '06, ser. LNCS vol. 4202. Springer, 2006, pp. 33–51.

[12] C. Baier and M. Z. Kwiatkowska, "Model checking for a probabilistic branching time logic with fairness," Distributed Computing, vol. 11, no. 3, pp. 125–155, 1998.

[13] L. de Alfaro, "Formal verification of probabilistic systems," Ph.D. dissertation, Stanford University, 1997.

[14] T. Han and J.-P. Katoen, "Counterexamples in probabilistic model checking," in TACAS '07, 2007.

[15] V. M. Jiménez and A. Marzal, "Computing the k shortest paths: A new algorithm and an experimental comparison," in Algorithm Engineering, 1999, pp. 15–29.

[16] ——, "A lazy version of Eppstein's k shortest paths algorithm," in WEA '03, ser. LNCS vol. 2647. Springer, 2003, pp. 179–190.

[17] D. Eppstein, "Finding the k shortest paths," SIAM J. Computing, vol. 28, no. 2, pp. 652–673, 1998. [Online]. Available: http://dx.doi.org/10.1137/S0097539795290477

[18] H. Aljazzar and S. Leue, "K*: A directed on-the-fly algorithm for finding the k shortest paths," Univ. of Konstanz, Germany, Tech. Rep. soft-08-03, March 2008, submitted for publication. [Online]. Available: http://www.inf.uni-konstanz.de/soft/research/publications/pdf/soft-08-03.pdf

[19] ——, "Debugging of dependability models using interactive visualization of counterexamples," in QEST '08. IEEE Computer Society Press, 2008.

[20] ——, "Counterexamples for model checking of Markov decision processes," Univ. of Konstanz, Germany, Tech. Rep. soft-08-01, December 2007. [Online]. Available: http://www.ub.uni-konstanz.de/kops/volltexte/2008/4530/

[21] H. Hermanns, B. Wachter, and L. Zhang, "Probabilistic CEGAR," in CAV 2008, ser. LNCS vol. 5123. Springer, 2008, pp. 162–175.

[22] T. Han, J.-P. Katoen, and B. Damman, "Counterexample generation in probabilistic model checking," IEEE Trans. Softw. Eng., vol. 35, no. 2, pp. 241–257, 2009.

[23] M. E. Andrés, P. R. D'Argenio, and P. van Rossum, "Significant diagnostic counterexamples in probabilistic model checking," CoRR, vol. abs/0806.1139, 2008.

[24] W. J. Stewart, Introduction to the Numerical Solution of Markov Chains. Princeton, NJ, USA: Princeton University Press, 1994.

[25] S. Hart, M. Sharir, and A. Pnueli, "Termination of probabilistic concurrent programs," ACM Trans. Program. Lang. Syst., vol. 5, no. 3, pp. 356–380, 1983.

[26] R. Segala and N. A. Lynch, "Probabilistic simulations for probabilistic processes," in CONCUR '94, ser. LNCS vol. 836. Springer, 1994, pp. 481–496.

[27] S. Edelkamp, S. Jabbar, and A. Lluch-Lafuente, "Cost-algebraic heuristic search," in AAAI '05, 2005, pp. 1362–1367.

[28] S. Edelkamp, S. Leue, and A. Lluch-Lafuente, "Directed explicit-state model checking in the validation of communication protocols," International Journal on Software Tools for Technology Transfer (STTT), vol. 5, no. 2–3, pp. 247–267, 2004.

[29] L. Helmink, M. Sellink, and F. Vaandrager, "Proof-checking a data link protocol," in TYPES '93, ser. LNCS vol. 806, H. Barendregt and T. Nipkow, Eds. Springer, 1994, pp. 127–165.
