
6.3 Model description

6.4.1 Services with fixed execution time

To assess the responsiveness of a service, this analysis computes the probability distribution of the service's total execution time depending on t_S, t_C, t_R, n, \lambda, and p_{cov}. Let X_n be the random variable describing the service execution time and

F_{X_n}(t) = P(X_n \le t)

the probability that the service is completed correctly before or at time t using n checkpoints.

Computing F_{X_n} directly is difficult.² As an intermediate step, compute the runtime distribution of a single block; these blocks are then recombined to derive the runtime distribution of the entire program. Consider first the case p_{cov} = 1 and let Y_n be the random variable describing the execution time of one of the service blocks of length t_S/n. A single block's execution cannot complete before t_S/n + t_C. If the system survives for at least t_S/n + t_C, this block is completed successfully; this case has probability e^{-\lambda(t_S/n + t_C)}. One additional recovery step takes another t_R + t_S/n + t_C time units to complete before it increases the probability of successful execution. Executing exactly one recovery step happens when the system failed during the first execution of this block (with probability 1 - e^{-\lambda(t_S/n + t_C)}) but survived the execution of the recovery step (with probability e^{-\lambda(t_R + t_S/n + t_C)}), and so on for every additional recovery step. Obviously, the probabilities for surviving an ordinary step or a step with an additional recovery step are crucial. The following abbreviations will be used for these probabilities:

p_0 = e^{-\lambda(t_S/n + t_C)},   p_1 = e^{-\lambda(t_R + t_S/n + t_C)}.

²Except for the case of no checkpointing, for which the probability of successfully completing the program at time t_S is e^{-\lambda t_S}, and 0 before this time.

6.4. ANALYSIS

The probability distribution of Y_n is thus

F_{Y_n}(t) = \sum_{i=0}^{\infty} c_i \, H(t - a_i),   a_i = \frac{t_S}{n} + t_C + i \left( t_R + \frac{t_S}{n} + t_C \right),

with c_0 = p_0 and c_i = (1 - p_0)(1 - p_1)^{i-1} p_1 for i > 0, where H denotes the Heaviside step function. This derivation of F_{Y_n} holds only for a Poisson fault process; assuming general fault processes, the occurrence of faults in previous steps must be considered when deriving the distribution.

This distribution is a so-called arithmetic distribution function [55]. Since there are, for any fixed set of parameters, only countably many points in time where Y_n changes its value, it is actually a discrete random variable. The discrete density function of Y_n is denoted by f_{Y_n}(t); it is obtained from F_{Y_n} by replacing H(\cdot) with the function u defined by u(0) = 1, u(t) = 0 if t \neq 0.
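The block coefficients can be evaluated numerically. The following Python sketch uses purely illustrative parameter values (t_S, t_C, t_R, n, and the fault rate are hypothetical choices, not from the text) and writes p0 and p1 for the probabilities of surviving an ordinary step and a recovery step; since every block eventually completes, the c_i must sum to one:

```python
import math

# Hypothetical parameters: service time, checkpoint cost, recovery
# cost, number of checkpoints, and Poisson fault rate.
t_S, t_C, t_R, n, lam = 10.0, 0.5, 1.0, 4, 0.01

p0 = math.exp(-lam * (t_S / n + t_C))        # survive an ordinary step
p1 = math.exp(-lam * (t_R + t_S / n + t_C))  # survive a recovery step

def c(i):
    """Probability that a block needs exactly i recovery steps."""
    if i == 0:
        return p0
    return (1 - p0) * (1 - p1) ** (i - 1) * p1

# The c_i form a probability distribution over the number of recovery
# steps of a block, so their sum converges to 1.
total = sum(c(i) for i in range(200))
print(total)  # ~ 1.0
```

The geometric tail (1 - p_1)^{i-1} makes the truncation at 200 terms harmless for any realistic fault rate.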

From this probability distribution F_{Y_n} it is now possible to compute the distribution for X_n. The total execution time of the service is the sum of the execution times for its n blocks: X_n = \sum_{j=1}^{n} Y_n^{(j)}. For independent random variables that have a density, the density of the sum of the random variables is the convolution of the densities. Therefore, the probability density of X_n is the n-fold convolution of the density of Y_n:

f_{X_n} = f_{Y_n} * \cdots * f_{Y_n}   (n-fold)

(* denotes convolution). Lemma 6.1 is helpful to compute this convolution.

Lemma 6.1 (Convolution). Let f_1(t) = \sum_i b_i \, u(t - a_i) and f_2(t) = \sum_j b'_j \, u(t - a'_j) be discrete density functions. Then their convolution is

f(t) = (f_1 * f_2)(t) = \sum_{i,j} b_i b'_j \, u(t - a_i - a'_j),

and the n-fold convolution of such functions has the analogous form with one sum per factor.

Proof. First prove the special case for the convolution f(t) of two discrete functions f_1(t) and f_2(t): the product f_1(\tau) f_2(t - \tau) is nonzero only if \tau = a_i and t - \tau = a'_j for some i and j, which yields the claimed sum. The case of an n-fold convolution follows easily by induction.
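The lemma can be illustrated with a small sketch (toy values, not from the text): representing a discrete density as a map from time points to probability masses, the convolution multiplies masses pairwise and adds the corresponding time points:

```python
# Minimal sketch of Lemma 6.1: discrete densities are maps from time
# points a_i to masses b_i; the convolution accumulates the products
# b_i * b'_j at the shifted time points a_i + a'_j.
def convolve(f1, f2):
    out = {}
    for a1, b1 in f1.items():
        for a2, b2 in f2.items():
            out[a1 + a2] = out.get(a1 + a2, 0.0) + b1 * b2
    return out

f1 = {1.0: 0.7, 3.0: 0.3}  # mass 0.7 at t=1, mass 0.3 at t=3
f2 = {2.0: 0.6, 4.0: 0.4}
f = convolve(f1, f2)
# f has masses at t = 3, 5, and 7; the total mass stays 1
print(sorted(f.items()))
```

Note that the mass at t = 5 collects two pairs (1+4 and 3+2), exactly as the double sum in the lemma prescribes.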

Using this lemma, f_{X_n} can be written as

f_{X_n}(t) = \sum_{m=0}^{\infty} \hat{c}_m \, u(t - \hat{a}_m),

where m is the number of recovery steps taken and

\hat{a}_m = t_S + n t_C + m \left( t_R + \frac{t_S}{n} + t_C \right),
\hat{c}_m = \sum_{m_1 + \cdots + m_n = m} c_{m_1} \cdots c_{m_n}.   (6.1)

There are only a finite number of cases to write a natural number m as a sum of other natural numbers, and therefore Equation (6.1) for \hat{c}_m is well formed. To give a closed form for \hat{c}_m, the structural difference between c_0 and c_i, i > 0, must be considered (this reflects the fact that it does make a difference whether a normal step or a recovery step fails). The formula for \hat{c}_m can be simplified by distinguishing two cases. Obviously, \hat{c}_0 = c_0^n. For m > 0, exactly k of the summands m_1, \ldots, m_n are positive, 1 \le k \le \min(m, n), while the remaining n - k summands of the sum are 0, and Equation (6.1) can be refined as follows:

\hat{c}_m = \sum_{k=1}^{\min(m,n)} \binom{n}{k} c_0^{n-k} \, psum(m, k) \, (1 - p_0)^k \, p_1^k \, (1 - p_1)^{m-k},   (6.2)

where psum is defined as follows:

Definition 6.2 (psum()). psum(m, k) is the number of possibilities to write m as a sum of exactly k positive natural numbers (where permutations of the sum are considered as different possibilities).³

To simplify psum, the function nnsum is used:

Definition 6.3 (nnsum()). nnsum(m, k) is the number of possibilities to write m as a sum of k nonnegative natural numbers.


These counting functions have the closed forms nnsum(m, k) = \binom{m+k-1}{k-1} and psum(m, k) = nnsum(m - k, k) = \binom{m-1}{k-1}.

Proof. The equation for nnsum can be shown with a simple induction on k: Write nnsum(m, k+1) as \sum_{i=0}^{m} nnsum(m-i, k) (the new summand k+1 can be any value between 0 and m), use the induction hypothesis, apply the addition formula for binomial coefficients iteratively from the right, and note for the last summand that \binom{k-1}{k-1} = \binom{k}{k} = 1. The equation for psum follows by subtracting 1 from each of the k positive summands.

Therefore, Equation (6.2) can be rewritten as

\hat{c}_m = \sum_{k=1}^{\min(m,n)} \binom{n}{k} \binom{m-1}{k-1} c_0^{n-k} \, (1 - p_0)^k \, p_1^k \, (1 - p_1)^{m-k} \quad \text{for } m > 0.
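As a sanity check, the closed form for the coefficients can be compared against evaluating the n-fold convolution of the block density directly. The Python sketch below uses hypothetical parameter values; p0 and p1 denote the probabilities of surviving an ordinary and a recovery step, c(i) the block coefficients, and the series is truncated at M total recovery steps:

```python
import math
from math import comb

t_S, t_C, t_R, n, lam = 10.0, 0.5, 1.0, 4, 0.01  # hypothetical values
p0 = math.exp(-lam * (t_S / n + t_C))
p1 = math.exp(-lam * (t_R + t_S / n + t_C))

M = 12  # truncation: at most M recovery steps in total

def c(i):
    """Coefficient c_i of the single-block density."""
    return p0 if i == 0 else (1 - p0) * (1 - p1) ** (i - 1) * p1

# n-fold convolution of the block density, indexed by the total number
# of recovery steps m (Equation (6.1) evaluated directly).
conv = [c(i) for i in range(M + 1)]
for _ in range(n - 1):
    nxt = [0.0] * (M + 1)
    for a in range(M + 1):
        for b in range(M + 1 - a):
            nxt[a + b] += conv[a] * c(b)
    conv = nxt

# Closed form using psum(m, k) = C(m-1, k-1).
def c_hat(m):
    if m == 0:
        return p0 ** n
    return sum(comb(n, k) * comb(m - 1, k - 1)
               * p0 ** (n - k) * (1 - p0) ** k
               * p1 ** k * (1 - p1) ** (m - k)
               for k in range(1, min(m, n) + 1))

print(all(abs(conv[m] - c_hat(m)) < 1e-12 for m in range(M + 1)))  # True
```

Truncation at M is safe here because the convolution value at index m only depends on block coefficients with indices up to m.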

So far, p_{cov} = 1 has been assumed. For imperfect detection coverage, the fault detection must detect all faults to be able to assume that a result is correct, i.e., it has to work for all m recovery steps that would occur in the case of p_{cov} = 1. For m recovery steps, the probability of this case is equal to p_{cov}^m. The coefficient \hat{c}_m for m recovery steps is hence modified to

\tilde{c}_m = p_{cov}^m \, \hat{c}_m.

Finally, given a deadline d at which the service has to be completed, the probability of successful completion under worst-case runtime assumptions is P(X_n \le d) for any given n. By the derivation of f_{X_n} and the above modification for p_{cov},

P(X_n \le d) = \sum_{m=0}^{\infty} \tilde{c}_m \, H(d - \hat{a}_m)   (6.3)

with \tilde{c}_m and \hat{a}_m as defined above. Equation (6.3) is actually a finite sum, since only executions with a total execution time \hat{a}_m smaller than d contribute to the probability. Therefore,

m_0 = \left\lfloor \frac{d - t_S - n t_C}{t_R + t_S/n + t_C} \right\rfloor   (6.4)

is an upper limit for the number of possible recovery steps for deadline d (m_0 < 0 for d < n(t_S/n + t_C), which corresponds to the impossibility of meeting this deadline). This last observation completes the proof of the main theorem:

Theorem 6.1 (Responsiveness of a checkpointed service with fixed t_S). For a service with a fixed execution time, the probability of meeting its deadline when n checkpoints are used is

P(X_n \le d) = \sum_{m=0}^{m_0} \tilde{c}_m   (6.5)

with \tilde{c}_m as defined above and m_0 as in Equation (6.4).

A closed-form solution of Equation (6.5) would allow determining the optimal n analytically, but such a form is not easy to find. A numerical solution, on the other hand, is quite simple. For such a numerical solution, it is important to note that only a finite number of checkpoint counts n have to be considered: too many checkpoints would imply that even in the fault-free case the deadline d cannot be met, namely if d < t_S + n t_C (by Equation (6.4)). Therefore, it suffices to compute Equation (6.5) only for such n with

1 \le n \le \left\lfloor \frac{d - t_S}{t_C} \right\rfloor

and choose the n that maximizes Equation (6.5); ties can be broken in favor of a smaller n.
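The numerical search over the feasible checkpoint counts can be sketched as follows. All parameter values are hypothetical, and success_prob is an illustrative helper evaluating Equation (6.5) under the closed form derived above, not code from the text:

```python
import math
from math import comb, exp, floor

def success_prob(n, d, t_S, t_C, t_R, lam, p_cov):
    """P(X_n <= d) for a given checkpoint count n (Equation (6.5))."""
    p0 = exp(-lam * (t_S / n + t_C))
    p1 = exp(-lam * (t_R + t_S / n + t_C))
    m0 = floor((d - t_S - n * t_C) / (t_R + t_S / n + t_C))
    if m0 < 0:
        return 0.0  # deadline unreachable even in the fault-free case
    def c_hat(m):
        if m == 0:
            return p0 ** n
        return sum(comb(n, k) * comb(m - 1, k - 1)
                   * p0 ** (n - k) * (1 - p0) ** k
                   * p1 ** k * (1 - p1) ** (m - k)
                   for k in range(1, min(m, n) + 1))
    return sum(p_cov ** m * c_hat(m) for m in range(m0 + 1))

# Hypothetical service: try all feasible 1 <= n <= floor((d - t_S)/t_C)
# and keep the maximizer, breaking ties in favor of a smaller n.
t_S, t_C, t_R, lam, p_cov, d = 10.0, 0.5, 1.0, 0.05, 0.99, 16.0
n_max = math.floor((d - t_S) / t_C)
best = max(range(1, n_max + 1),
           key=lambda n: (success_prob(n, d, t_S, t_C, t_R, lam, p_cov), -n))
print(best, success_prob(best, d, t_S, t_C, t_R, lam, p_cov))
```

The key (probability, -n) implements the tie-breaking rule: among checkpoint counts with equal success probability, the smaller n wins.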