

7.2 Identifying Message Types

7.2.3 An Optimal Message Type Identification Method

Before we introduce our message type identification method, we first review the executability semantics of message sending and receiving statements in the Promela language.

A message sending statement is executable if the receiving buffer is not full. However, since we interpret buffers in Promela models as having unbounded capacity, a message sending statement is never blocked. In contrast, a message receiving statement may be blocked for two reasons. First, a statement that tries to receive a message from an empty buffer is not executable. Second, a message receiving statement is blocked if it tries to receive a kind of message that is not available in the respective buffer. More precisely, a message receiving statement is not executable if the top message in the respective buffer is not in its coverage set. As an example, consider the message receiving statement ch?5,x where x is a boolean variable. This statement contains the constant message field 5. It is not executable if the top message in the buffer ch is $\langle ch, 4, \mathit{true} \rangle$. The statement is executable if the top message contains 5 as its first field. In contrast, the boolean value in the second field has no effect on the executability of the receiving statement.
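The executability rule for receiving statements can be illustrated with a minimal Python sketch. It is not Promela and not the author's tooling: the names `ANY` and `is_executable` are hypothetical, a buffer is modelled as a list whose first element is the top message, and a receive pattern is a tuple in which `ANY` stands for a variable field.

```python
# Hypothetical sentinel standing for a variable field such as x in ch?5,x.
ANY = object()

def is_executable(pattern, buffer):
    """Return True iff a receive with this pattern could fire on the buffer.

    The receive is blocked if the buffer is empty, or if the top message
    disagrees with the pattern on some constant field.
    """
    if not buffer:
        return False  # first reason: empty buffer
    top = buffer[0]
    if len(top) != len(pattern):
        return False
    # second reason: every constant field must equal the corresponding field
    return all(p is ANY or p == v for p, v in zip(pattern, top))

receive = (5, ANY)  # models ch?5,x
print(is_executable(receive, [(4, True)]))   # blocked: first field is 4
print(is_executable(receive, [(5, False)]))  # executable: second field is irrelevant
print(is_executable(receive, []))            # blocked: empty buffer
```

Only the top message is inspected, which mirrors the statement above that later messages in the buffer do not affect executability.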

Based on the above argument, we know that a Promela model discriminates messages in two ways. First, the model discriminates messages in different message buffers. Second, the constant message fields in the message receiving statements of the model discriminate the messages stored in the same buffer. This discrimination is the intuition behind the message type identification method that we propose: we only need to distinguish two messages if they are distinguished by the model. In the following we explain our method in detail, and then prove that it results in an optimal message type definition, in the sense that further refining the message type definition cannot improve the precision of the buffer boundedness test. The proposed method applies only to models in which there are no buffer assignments. An adaptation of the method with respect to buffer assignments is discussed in Section 7.4.

Given a Promela model, we first collect all the message receiving statements that contain constant message fields. We denote this collection of message receiving statements as the set $R$. For each buffer $ch$ in the model, we identify the types of messages exchanged in $ch$ in the following way. We use $M_{ch}$ to denote the set of all messages that can be stored in $ch$ at runtime. Let $s_1, \dots, s_k \in R$ be all the statements receiving messages from $ch$. In practice we observed that in most cases the coverage sets $M_{s_1}, \dots, M_{s_k}$ of these statements are pairwise disjoint. If this is the case, then we can directly identify the set of message types for $ch$ as $\{M_{s_1}, \dots, M_{s_k}, M_{ch} - \bigcup_{i=1}^{k} M_{s_i}\}$. In particular, if there is no statement in $R$ that receives messages from $ch$, then we identify only one message type for $ch$, namely $M_{ch}$. As an example, for the Promela model in Listing 7.3, we collect all the receiving statements with constant message fields as follows:

$s_1$: toClient?answerA

$s_2$: toConsultant?askB

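Therefore, we identify two message types for the buffer toClient:

$M_{s_1} = \{\langle \mathit{toClient}, \mathit{answerA} \rangle\}$, $M_{toClient} - M_{s_1} = \{\langle \mathit{toClient}, \mathit{answerB} \rangle\}$,

two message types for the buffer toConsultant:

$M_{s_2} = \{\langle \mathit{toConsultant}, \mathit{askB} \rangle\}$, $M_{toConsultant} - M_{s_2} = \{\langle \mathit{toConsultant}, \mathit{askA} \rangle\}$,

and only one message type for the buffer log: $M_{log} = \{\langle \mathit{log}, \mathit{record} \rangle\}$.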
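The disjoint case can be sketched in Python as follows. This is an illustration under assumptions, not the author's implementation: messages are modelled as tuples of field values without the buffer name, $M_{ch}$ is given as an explicit finite set, `ANY` marks a variable field, and the helper names are hypothetical. Dropping an empty remainder set is also an assumption of this sketch.

```python
ANY = object()  # hypothetical sentinel for a variable field

def coverage(pattern, all_messages):
    """Coverage set of a receive pattern: messages agreeing on all constant fields."""
    return frozenset(
        m for m in all_messages
        if len(m) == len(pattern)
        and all(p is ANY or p == v for p, v in zip(pattern, m))
    )

def identify_types_disjoint(all_messages, patterns):
    """Message types for one buffer when the coverage sets are pairwise disjoint."""
    covers = [coverage(p, all_messages) for p in patterns]
    for i in range(len(covers)):
        for j in range(i + 1, len(covers)):
            if covers[i] & covers[j]:
                raise ValueError("coverage sets overlap; disjoint case does not apply")
    rest = frozenset(all_messages).difference(*covers)
    # Assumption of this sketch: an empty remainder is not reported as a type.
    return covers + ([rest] if rest else [])

to_client = {("answerA",), ("answerB",)}
print(identify_types_disjoint(to_client, [("answerA",)]))  # two types
print(identify_types_disjoint({("record",)}, []))          # one type: all of M_log
```

With no receiving statement that has constant fields, the function returns the whole message set as a single type, matching the treatment of the buffer log above.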

The resulting message type definition is the same as the first message type definition used in Example 7.1, which results in a precise buffer boundedness analysis. While in this example the coverage sets of the collected message receiving statements for each buffer are pairwise disjoint, this cannot be assumed in general, as the following example illustrates.

Example 7.2. Consider a Promela model in which there is a buffer a = [10] of {int, bool}. Suppose that there are two message receiving statements with constant message fields that receive messages from a:

$t_1$: a?5,b

$t_2$: a?x,true

where b is a boolean variable and x is an integer variable. Obviously $M_{t_1}$ and $M_{t_2}$ overlap; e.g., the message $\langle a, 5, \mathit{true} \rangle$ belongs to both sets. In this case, if we identified $M_{t_1}$ and $M_{t_2}$ as two message types, they would violate the disjointness property in Definition 7.2.

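In order to maintain the disjointness of the identified message types, we identify message types for a buffer in the following way: Given $R$, $ch$, $s_1, \dots, s_k$ as defined previously, the coverage sets $M_{s_1}, \dots, M_{s_k}$ induce a partition of the set $M_{ch}$. More precisely, the partition consists of the following pairwise disjoint subsets of $M_{ch}$, where $\overline{M_{s_i}}$ denotes the complement $M_{ch} - M_{s_i}$:

$$\overline{M_{s_1}} \cap \overline{M_{s_2}} \cap \dots \cap \overline{M_{s_k}}$$

$$M_{s_1} \cap \overline{M_{s_2}} \cap \dots \cap \overline{M_{s_k}}$$

$$\overline{M_{s_1}} \cap M_{s_2} \cap \dots \cap \overline{M_{s_k}}$$

$$M_{s_1} \cap M_{s_2} \cap \dots \cap \overline{M_{s_k}}$$

$$\vdots$$

$$M_{s_1} \cap M_{s_2} \cap \dots \cap M_{s_k}$$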

Each of the above subsets corresponds to one identified message type for the buffer $ch$. In the above example, we identify the following four message types for the buffer a: $\overline{M_{t_1}} \cap \overline{M_{t_2}}$, which contains all the messages whose first field is not 5 and whose second field is false; $M_{t_1} \cap \overline{M_{t_2}}$, the messages whose first field is 5 and whose second field is false; $\overline{M_{t_1}} \cap M_{t_2}$, the messages whose first field is not 5 and whose second field is true; and $M_{t_1} \cap M_{t_2}$, the messages whose first field is 5 and whose second field is true.
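The partition can be computed by grouping messages according to which coverage sets contain them. The following Python sketch assumes a finite stand-in for the message set of buffer a (integers restricted to 4, 5 and 6, an arbitrary choice for illustration); `ANY`, `matches` and `partition_types` are hypothetical names, not part of the original method description.

```python
from itertools import product

ANY = object()  # hypothetical sentinel for a variable field

def matches(pattern, msg):
    """True iff msg agrees with the pattern on every constant field."""
    return all(p is ANY or p == v for p, v in zip(pattern, msg))

def partition_types(all_messages, patterns):
    """Group messages by membership signature.

    The signature of a message is a tuple of booleans, one per receive
    pattern; entry i is True iff the message lies in the i-th coverage set.
    Each distinct signature corresponds to one intersection of coverage
    sets and complements, i.e. to one message type.
    """
    groups = {}
    for msg in all_messages:
        signature = tuple(matches(p, msg) for p in patterns)
        groups.setdefault(signature, set()).add(msg)
    return groups

# Finite stand-in for M_a: first field in {4, 5, 6}, second field boolean.
m_a = set(product([4, 5, 6], [False, True]))
t1 = (5, ANY)     # a?5,b
t2 = (ANY, True)  # a?x,true
for signature, msgs in sorted(partition_types(m_a, [t1, t2]).items()):
    print(signature, sorted(msgs))
```

Signatures that no message realises simply do not appear, so the result never contains empty types; whether empty intersections count as types is not stated in the text and is left open here.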

Now we argue formally that the above method always constructs an optimal message type definition, such that further refining it does not improve the accuracy of the buffer boundedness test.

Proposition 7.2. Given a Promela model, let $D$ be the message type definition identified by the method described previously. Let $D'$ be a message type definition such that $D' \rhd_1 D$. Let $S$ and $S'$ be the independent cycle systems constructed based on $D$ and $D'$, respectively. If $S'$ is bounded, then $S$ is also bounded.

Proof. Let $S = (n, C, \mathit{eff})$ and $S' = (n', C', \mathit{eff}')$. Without loss of generality, we assume that $D'$ refines the message type $m$ in $D$ with the message types $m_1, \dots, m_k$. Therefore, we have that $n' = n - 1 + k$. For convenience, we reorder the effect vector components of both independent cycle systems such that (1) for any $i$ with $1 \le i \le n-1$, the $i$-th effect vector components in both systems correspond to the same message type; (2) the $n$-th components in $S$ correspond to the message type $m$; and (3) for any $i$ with $1 \le i \le k$, the $(n-1+i)$-th components in $S'$ correspond to the message type $m_i$.

Let $t$ be a local transition in some cycle in $C$. Suppose that $t$ corresponds to a message receiving statement $s$ in the original Promela code, which may receive one message of the type $m$ from the buffer $ch$. Then we have that $M_s \cap m \neq \emptyset$. We show that in fact $m \subseteq M_s$ by checking the following two possible cases: (1) If $s$ contains no constant message fields, then $m \subseteq M_{ch} = M_s$ follows. (2) Otherwise, $s$ must be one of the message receiving statements used to identify message types for $ch$. Recall how $M_{ch}$ is partitioned to identify message types in the optimal identification method: each identified type is an intersection in which $M_s$ occurs either itself or complemented, so $m$ is either contained in $M_s$ or disjoint from it. Since $M_s \cap m \neq \emptyset$, we obtain $m \subseteq M_s$. Because $m \subseteq M_s$ and $m_i \subset m$ for each message type $m_i$, we have that $m_i \subset M_s$. This implies that $s$ may receive messages of any of the message types $m_1, \dots, m_k$.

Furthermore, we assume without loss of generality that there are $p$ cycles in $C$: $c_1, \dots, c_p$. Similar to the argument in the proof of Proposition 7.1, any cycle $c_i$ in $C$ corresponds to a set of cycles in $C'$: $c_{i1}, \dots, c_{iq_i}$, as the result of the refinement of the message type $m$. We also have the same properties as the ones stated in Inequalities (7.1) and (7.2) for each cycle $c_i$ and any of its corresponding cycles in $C'$.

Besides, for any cycle $c_i \in C$, we can show that there exists one of its corresponding cycles $c_{il} \in C'$ that satisfies the following property: If $\mathit{eff}(c_i)_n \ge 0$ ($\mathit{eff}(c_i)_n \le 0$, respectively), then $\mathit{eff}'(c_{il})_j \ge 0$ ($\mathit{eff}'(c_{il})_j \le 0$, respectively) holds for each $n \le j \le n'$. Moreover, if $\mathit{eff}(c_i)_n \neq 0$, then there exists at least one component $\mathit{eff}'(c_{il})_j$ ($n \le j \le n'$) such that $\mathit{eff}'(c_{il})_j > 0$ ($\mathit{eff}'(c_{il})_j < 0$, respectively). We only prove here the case $\mathit{eff}(c_i)_n > 0$. The proof for the other cases can be derived similarly.

Without loss of generality, we assume that $c_i$ has $a$ message sending statements $s_1, \dots, s_a$ that send messages of the type $m$, and $b$ message receiving statements $r_1, \dots, r_b$ that receive messages of the type $m$. Because $\mathit{eff}(c_i)_n > 0$, we have that $a > b$. We select the first $b$ message sending statements $s_1, \dots, s_b$. Among all the cycles in $C'$ that correspond to $c_i$, we can find a cycle $c_{il}$ that satisfies the following property of the message passing effects of $s_1, \dots, s_b$ and $r_1, \dots, r_b$: If any $s_u$ ($1 \le u \le b$) sends a message of some type $m' \subset m$, then $r_u$ receives a message of the same type $m'$. This is possible because $m' \subset m \subseteq M_{r_u}$, so $r_u$ may receive a message of the type $m'$. In this way, $s_u$ cancels the negative effect of $r_u$, which consumes one message of the type $m'$. Therefore, all the negative effects of $r_1, \dots, r_b$ can be compensated by $s_1, \dots, s_b$, which implies that $\mathit{eff}'(c_{il})$ has no negative component corresponding to any of the message types $m_1, \dots, m_k$. Furthermore, since $a > b$, there remain some message sending statements $s_{b+1}, \dots, s_a$ that make some components $\mathit{eff}'(c_{il})_j$ ($n \le j \le n'$) positive.
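The pairing argument can be checked on small instances with the following Python sketch. It is only an illustration: subtypes $m_1, \dots, m_k$ are encoded as indices 0 to k-1, the function name `refined_effect` is hypothetical, and the function computes only the components of the refined effect vector that belong to the refined types.

```python
def refined_effect(send_subtypes, num_receives, k):
    """Effect on the k refined components when receive r_u mirrors send s_u.

    send_subtypes[u] is the subtype index sent by the (u+1)-th send; there
    are a = len(send_subtypes) sends and b = num_receives receives, with
    a >= b. Each of the first b sends is cancelled by its paired receive.
    """
    if num_receives > len(send_subtypes):
        raise ValueError("sketch covers only the case a >= b")
    effect = [0] * k
    for u, subtype in enumerate(send_subtypes):
        effect[subtype] += 1      # the send produces one message
        if u < num_receives:
            effect[subtype] -= 1  # the paired receive consumes the same subtype
    return effect

# a = 4 sends, b = 2 receives, k = 3 subtypes
print(refined_effect([0, 2, 1, 2], 2, 3))  # [0, 1, 1]
```

No component is negative, and because there are more sends than receives in this instance, at least one component is positive, as claimed in the proof.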

We now prove the proposition based on the above discussion. By contradiction, we assume that $S'$ is bounded while $S$ is unbounded. Since $S$ is unbounded, the sufficient and necessary condition stated in Proposition 5.4 implies a solution to the following ILP problem:

$$\mathit{eff}(c_1)_j \cdot x_1 + \dots + \mathit{eff}(c_p)_j \cdot x_p \ge 0 \quad \text{for each } 1 \le j \le n \qquad (7.17)$$

We show in the following that from this solution we can construct a solution to the ILP problem representing the sufficient and necessary condition of the unboundedness of $S'$, as shown in Inequalities (7.19–7.20), thereby contradicting the assumption. We construct a solution to the ILP problem (7.19–7.20) from the solution $x_i = v_i$ as follows. Let $C^{>} \subseteq C$ contain all cycles $c_i$ such that $v_i > 0$ and $\mathit{eff}(c_i)_n > 0$, let $C^{0} \subseteq C$ contain all cycles $c_i$ such that $v_i > 0$ and $\mathit{eff}(c_i)_n = 0$, and let $C^{<} \subseteq C$ contain all cycles $c_i$ such that $v_i > 0$ and $\mathit{eff}(c_i)_n < 0$. In the following we describe three different selection procedures that select corresponding cycles in $C'$ for the cycles in these three subsets of $C$, respectively. The selected cycles will receive positive coefficients in the constructed ILP solution.
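The split of the cycles into the three subsets can be sketched in Python as below. The function name and the list-based encoding (one entry per cycle, holding the $n$-th effect component and the solution value $v_i$) are assumptions made for illustration only.

```python
def classify_cycles(effects_n, solution):
    """Split cycle indices by the sign of their n-th effect component.

    effects_n[i] is the n-th effect component of cycle i and solution[i]
    is its value v_i in the ILP solution. Cycles with v_i = 0 are ignored.
    """
    greater, zero, less = [], [], []
    for i, (effect, v) in enumerate(zip(effects_n, solution)):
        if v <= 0:
            continue  # cycle does not take part in the solution
        if effect > 0:
            greater.append(i)
        elif effect == 0:
            zero.append(i)
        else:
            less.append(i)
    return greater, zero, less

# four cycles; the third one has v_i = 0 and is dropped
print(classify_cycles([2, 0, -1, -3], [1, 2, 0, 1]))  # ([0], [1], [3])
```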

Selecting cycles for $C^{>}$. For each cycle $c_i \in C^{>}$, we nondeterministically select a cycle $c_{il} \in C'$ that corresponds to $c_i$, as long as $\mathit{eff}'(c_{il})_j \ge 0$ holds for each $n \le j \le n'$. We use $\mathit{sel}(c_i)$ to denote the selected cycle for each $c_i \in C^{>}$. We calculate the summary effect $E^{>}$ of all selected cycles from $C'$, assuming that each selected cycle is executed as many times as its corresponding cycle in $C^{>}$, as follows.

$$E^{>} = \sum_{c_i \in C^{>}} \mathit{eff}'(\mathit{sel}(c_i)) \cdot v_i \qquad (7.21)$$

We can easily see that $(E^{>})_j \ge 0$ holds for each $n \le j \le n'$, and that there exists at least one $j$ ($n \le j \le n'$) such that $(E^{>})_j > 0$. Furthermore, following the property stated in Inequality (7.2), we have that

n sel(ci) to denote selected cycles.

Selecting cycles for $C^{<}$. The principle of this selection procedure is to guarantee that, for each message type $m_i \in \{m_1, \dots, m_k\}$, the summary effect of the selected cycles from $C'$ does not consume more messages of the type $m_i$ than are generated by $E^{>}$. We use a counter $z(j)$ for each $j$-th component in $E^{>}$ where $n \le j \le n'$, which is initialized to $(E^{>})_j$. During the selection procedure, each counter $z(j)$ records how many messages of the type $m_{j-n+1}$ have not been consumed by already selected cycles. Moreover, we assign a positive number $co(c)$ to each selected cycle $c \in C'$ to denote its respective coefficient in the constructed solution. We also use a counter $co(i)$ for each cycle $c_i \in C^{<}$, which is initialized to $v_i$. During the selection procedure, $co(i)$ gives the upper bound on the coefficient that can be assigned to the next selected cycle corresponding to $c_i$. The selection procedure is described as follows.

We look for the smallest index $j$, where $n \le j \le n'$, such that $(E^{>})_j > 0$. Note that the $j$-th component in $E^{>}$ corresponds to the message type $m_{j-n+1}$. Then, we start the selection with $j$ and the cycle in $C^{<}$ with the smallest subscript. Suppose that this cycle is $c_i$. Then, this cycle consumes $-(\mathit{eff}(c_i)_n \cdot v_i)$ messages of the type $m$ in the solution to the ILP problem (7.17–7.18). The idea is to select some cycles for $c_i$ to consume as many messages of the type $m_{j-n+1}$ as possible. The selection is an iterative procedure. Each iteration is in one of the following three cases:

• If $z(j) \ge -(\mathit{eff}(c_i)_n \cdot co(i))$, then we select the cycle $c_{il}$ such that $\mathit{eff}'(c_{il})_j = \mathit{eff}(c_i)_n$. We set $co(c_{il}) = co(i)$. By now we have finished the selection of cycles corresponding to $c_i$. If there is no other cycle in $C^{<}$ for which we need to select cycles, then the selection procedure terminates. Otherwise, we look for the next cycle to process, i.e., the cycle $c_u \in C^{<}$ whose subscript is the smallest one satisfying $u > i$. Furthermore, if $z(j) = -(\mathit{eff}(c_i)_n \cdot co(i))$, then we have consumed all messages of the type $m_{j-n+1}$. In this case, we continue the selection with $c_u$ and the $w$-th component of $E^{>}$ such that $w$ is the smallest index satisfying $w > j$ and $(E^{>})_w > 0$. Such an index $w$ must exist following Inequality (7.22). Otherwise, if $z(j) > -(\mathit{eff}(c_i)_n \cdot co(i))$, then we reduce $z(j)$ by

the type $m_{j-n+1}$. In this case, we continue the selection with $c_i$ and the $w$-th component of $E^{>}$ such that $w$ is the smallest index satisfying $w > j$ and $(E^{>})_w > 0$. Otherwise, if $z(j) > -\mathit{eff}(c_i)_n \cdot u$, then we reduce $z(j)$ by $-(\mathit{eff}(c_i)_n \cdot u)$ and continue the selection with $c_i$ and the $j$-th component.

• If $z(j) < -\mathit{eff}(c_i)_n$, then we construct an effect vector $e$ as follows. Initially, $e_u = \mathit{eff}(c_i)_u$ for each $1 \le u \le n-1$, and $e_u = 0$ for each $n \le u \le n'$. We add a new counter $z$ and set it to $-\mathit{eff}(c_i)_n$. First, we set $e_j = -z(j)$ and reduce $z$ by $z(j)$. Next, we look for the $u$-th component of $E^{>}$ such that $u$ is the smallest index satisfying $u > j$ and $(E^{>})_u > 0$. Then, we set $e_u = -\min\{z(u), z\}$, so that no more messages are consumed than are still available or still needed. Next, we reduce $z$ by $-e_u$, and also reduce $z(u)$ by $-e_u$. If $z > 0$, then we look for the next smallest index $u'$ such that $u' > u$ and $(E^{>})_{u'} > 0$ and set $e_{u'}$ to the proper value. We repeat this construction until $z = 0$. We can easily see that there is a cycle $c_{il} \in C'$ with $\mathit{eff}'(c_{il}) = e$. We select $c_{il}$ with $co(c_{il}) = 1$. We reduce $co(i)$ by 1. If $co(i) = 0$, we look for the cycle $c_u \in C^{<}$ whose subscript is the smallest one satisfying $u > i$. If no such $c_u$ exists, then the selection procedure terminates. Otherwise, let the $w$-th component be the last component of $E^{>}$ used in the previous construction of $e$. If $z(w) > 0$, then we continue the selection procedure with $c_u$ and the $w$-th component. Otherwise, we continue the selection with $c_u$ and the next smallest index $w'$ such that $w' > w$ and $(E^{>})_{w'} > 0$.
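The construction of the effect vector $e$ in the last case can be sketched in Python as follows. The sketch rests on two assumptions that are not confirmed by the text: the amount taken from each component is the smaller of the remaining counter $z(u)$ and the still-needed amount $z$, and the counters together hold enough messages to cover the demand. The function name and the dictionary encoding of the counters (keyed by the 1-based component index) are hypothetical.

```python
def build_effect_vector(eff_ci, n, n_prime, z, j):
    """Spread the consumption of one execution of c_i over refined components.

    eff_ci  -- effect vector of c_i in S (length n, 1-based in the text)
    z       -- dict: component index u (n <= u <= n_prime) -> remaining count
    j       -- component to start with, where z[j] < -eff_ci[n - 1]
    Returns the vector e (length n_prime) and the last component used.
    The counters in z are updated in place.
    """
    e = list(eff_ci[: n - 1]) + [0] * (n_prime - n + 1)
    remaining = -eff_ci[n - 1]  # messages of type m consumed by c_i
    u, last = j, j
    while remaining > 0:
        if u > n_prime:
            raise ValueError("counters cannot cover the demand")
        take = min(z[u], remaining)  # assumption: never over-consume
        if take > 0:
            e[u - 1] = -take
            z[u] -= take
            remaining -= take
            last = u
        u += 1
    return e, last

counters = {2: 1, 3: 1, 4: 5}
print(build_effect_vector([1, -3], 2, 4, counters, 2))  # ([1, -1, -1, -1], 4)
print(counters)                                         # {2: 0, 3: 0, 4: 4}
```

In the example, the cycle consumes three messages of the unrefined type; one each is taken from the three refined components, and the last counter keeps four messages for later iterations.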

From the selection of cycles for $C^{>}$, $C^{0}$ and $C^{<}$ above, we construct the solution to the ILP problem (7.19–7.20) as follows. For any $c_i \in C^{>} \cup C^{0}$, we set the coefficient of $\mathit{sel}(c_i)$ to $v_i$. For any $c_i \in C^{<}$ and for each cycle $c$ selected for $c_i$, we set its respective coefficient to $co(c)$. For any other $x_{il}$ in the ILP problem, we set its value to 0. We can easily prove that the constructed solution is indeed a solution, following the selection procedures and the properties stated in Inequalities (7.1) and (7.2). We omit this proof here; it can be derived in a similar way as at the end of the proof of Proposition 7.1. $\Box$

The above proposition only states that the buffer boundedness test returns no better result when we refine one message type in the optimal message type definition obtained by our proposed method. However, we can easily see that any finer message type definition would not lead to a more precise boundedness analysis.