
5.4.4 Solution for m machines and routines with fixed runtimes

Unfortunately, the solution derived in Section 5.4.3 is computationally very expensive. It is difficult to compute the probability distribution in reasonable time with such general assumptions as have been made above.

However, it might be more feasible to do so if the assumptions are more restricted. In this section, fixed runtimes of routines are assumed, i.e., routine $i$ has a precisely given runtime $a_i$ (on a machine of relative speed one).

These restrictions allow a more straightforward solution, which can be generalized to $m$ machines right away. Much of the notation introduced in the previous section will be used here, with obvious generalizations from two machines to $m$ machines where necessary. To avoid some purely technical special cases in the following derivation, we will assume that $n > m$.

The proof will use the notion of a state:

Definition 5.12 (State $q$ of an execution). A state $q = (A, S, R) \in Q$ represents the progress of the execution of $n$ routines on $m$ machines; $Q$ is the set of all possible states. Such a state consists of

– an assignment $A$ similar to Definition 5.1, but each element extended by the absolute time of completion:
  $A = [(\text{machine } j, \text{routine } i, \text{absolute completion time } o), \ldots]$,

– a fault scenario $S$ as in Definition 5.8, extended to $m$ machines,

– a tuple $R$ of length $n$ that represents the state of each individual routine; an element of $R$ can be either waiting, started, or finished.

Note that $S_j$ is the random variable corresponding to the lifetime of machine $j$, while $s_j$ is the number of routines that machine $j$ survives in a given state. Again we need some utility functions to conveniently formulate the definitions and proofs.
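To make the state representation concrete, the following minimal Python sketch shows one possible encoding of such a state and of the initial state used later in Definition 5.19; the class and field names are illustrative assumptions of the sketch, not part of the thesis notation.

from dataclasses import dataclass, field
from typing import List, Tuple

WAITING, STARTED, FINISHED = "waiting", "started", "finished"

@dataclass
class State:
    # A: assignments ordered by absolute completion time, stored as
    #    (machine j, routine i, absolute completion time o) triples
    A: List[Tuple[int, int, float]] = field(default_factory=list)
    # S: fault scenario; S[j-1] is the number of routines machine j survives
    S: Tuple[int, ...] = ()
    # R: per-routine status, each entry one of waiting / started / finished
    R: Tuple[str, ...] = ()

# Initial state q0 for n routines on m machines (cf. Definition 5.19):
n, m = 5, 2
q0 = State(A=[], S=(n,) * m, R=(WAITING,) * n)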

Definition 5.13 (The done predicate). Given a state $q = (A, S, R)$, $\mathit{done}(q)$ holds if and only if every routine is finished, i.e.,

$\mathit{done}(q) \;\Leftrightarrow\; \forall u \in \{1, \ldots, n\} : R(u) = \mathit{finished}.$

Definition 5.14 (The number of operational machines in a state). For a state $q = (A, S, R)$, $\mathit{operational}(S)$ denotes the number of machines that are still operational, i.e., the number of machines that have not failed in the fault scenario $S$.

Definition 5.15 (The next routine to be scheduled). Given a state $q = (A, S, R)$ that is not yet done, the next routine to be scheduled is given by $\mathit{candidate}(R)$, the first routine that still has to be executed according to $R$.

The assignments here are ordered according to absolute completion time. An ordered concatenation operator allows this order to be maintained when assignments are concatenated, and only assignments ordered according to completion time are used in this proof.

Definition 5.16 (Ordered concatenation of assignments). Given an assignment sequence $A$ ordered by completion time, the ordered concatenation of $A$ with a single assignment $(j, i, c)$, written $A \star (j, i, c)$, is the sequence obtained by inserting $(j, i, c)$ into $A$ at the position determined by its completion time $c$, so that the result is again ordered by completion time.
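As an illustration, a minimal Python sketch of this ordered insertion follows; the list-of-tuples encoding and the function name are assumptions of the sketch, not the thesis notation.

import bisect
from typing import List, Tuple

Assignment = Tuple[int, int, float]  # (machine j, routine i, completion time c)

def ordered_concat(A: List[Assignment], a: Assignment) -> List[Assignment]:
    """Insert assignment a into A at the position given by its completion
    time, so that the result stays ordered by completion time."""
    result = list(A)
    pos = bisect.bisect_right([c for (_, _, c) in result], a[2])
    result.insert(pos, a)
    return result

A = [(1, 1, 2.0), (2, 2, 3.5)]
print(ordered_concat(A, (1, 3, 3.0)))  # the new entry lands between the two existing ones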

Determining the first machine that becomes idle in any given state $q$ is slightly more complicated, because faults and the startup phase of the scheduling algorithm have to be taken into account.

Definition 5.17 (Idle information $(j, l, i, o)$). Given a state $q = (A, S, R)$, $j$ (the first machine to become idle), its index $l$ in $A$, the routine $i$ it has been executing, and its absolute completion time $o$ are defined as follows:

If $|\{u : R(u) = \mathit{waiting}\}| > n - m$ (not all machines have been assigned routines in this parallel step, and no routine has actually finished yet),

$j = l = |\{u : R(u) = \mathit{started}\}| + 1$ and $o = i = 0$,

and else (all machines have been assigned routines, and some routine actually finishes),

$l = \mathit{idleindex}(A, S, J)$, and $(j, i, o)$ is the $l$-th element of $A$,

where $\mathit{idleindex}$ yields the position of the first element of $A$ whose machine $j$ has no routine assigned to it later in $A$ and satisfies $S(j) \geq J(j)$. Here $J$ is a tuple of length $m$, representing the number of routines that have been assigned to each machine, and

$(J(k; l))(u) \stackrel{\mathrm{def}}{=} \begin{cases} J(u) & \text{if } u \neq k, \\ l & \text{else.} \end{cases}$


Lemma 5.10 (The idle information $(j, l, i, o)$ is correct). Given a state $q = (A, S, R)$, the idle information $(j, l, i, o)$ as defined in Definition 5.17 correctly describes $j$, the next machine to become idle, its index $l$ in $A$, the routine $i$ that it has executed (if any), and the absolute completion time when this machine becomes idle. $i = 0$ and $o = 0$ indicate that not all machines have been assigned a routine in this parallel step (it corresponds to the start phase of the parallel step).

Proof. While there are more than $n - m$ routines waiting, less than $m$ machines have been assigned a routine. This happens at the beginning of a parallel step, when no routine is finished yet. Hence the next idle machine is given by the number of routines started so far, plus 1.

Otherwise, the first machine to become idle is the machine that appears first in the assignment without any other routines being assigned to it later, where faults have to be taken into account. Faults are represented by the number of routines a machine survives, and the $m$-tuple $J$ counts the number of routines assigned to each machine. If $S(j) \geq J(j)$, machine $j$ has survived all previous routines and the current routine, so if it is the last routine assigned to this machine, then the machine is idle and still working. The first machine for which this holds is the first idle machine (since assignment $A$ is ordered by completion time).

The value of the idleindex function is infinite if and only if all machines fail before they have completed their final routine. Assignments have to ensure that at least one machine survives until the completion of a parallel step.
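The following Python sketch mirrors this case distinction for computing the idle information; the encoding (1-based machine numbers, lists of tuples) and the helper names are assumptions of the sketch rather than the thesis notation.

from typing import List, Optional, Tuple

WAITING, STARTED = "waiting", "started"
Assignment = Tuple[int, int, float]  # (machine j, routine i, completion time o)

def idle_info(A: List[Assignment], S: Tuple[int, ...], R: Tuple[str, ...],
              m: int, n: int) -> Optional[Tuple[int, int, int, float]]:
    """Return (j, l, i, o): first machine j to become idle, its index l in A
    (1-based), the routine i it has executed, and its completion time o."""
    if sum(1 for r in R if r == WAITING) > n - m:
        # Startup phase of a parallel step: some machine has not been assigned
        # a routine yet, so the next unused machine is idle immediately.
        j = l = sum(1 for r in R if r == STARTED) + 1
        return (j, l, 0, 0.0)
    J = [0] * (m + 1)        # J[j]: number of routines assigned to machine j
    last = {}                # machine j -> index of its last assignment in A
    for pos, (j, _, _) in enumerate(A, start=1):
        J[j] += 1
        last[j] = pos
    for pos, (j, i, o) in enumerate(A, start=1):
        # First entry whose machine has no later assignment and survives all
        # routines assigned to it (S(j) >= J(j)) marks the first idle machine.
        if last[j] == pos and S[j - 1] >= J[j]:
            return (j, pos, i, o)
    return None              # all machines failed: idleindex would be infinite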

We can now construct a function that “executes” the scheduling of a routine by mapping a state $q$ to two succeeding states $q_1$ and $q_2$, where $q_1$ represents the normal progress of the computation and $q_2$ represents the event that the machine on which the routine is scheduled fails during the execution of this routine.

Definition 5.18 (Eager scheduling on states). Given a state $q = (A, S, R)$, the function $\mathit{es} : Q \to 2^Q$ is defined as follows. If $\mathit{done}(q)$ holds, $\mathit{es}(q) = \{q\}$. Otherwise, let $(j, l, i, o)$ be the idle information of state $q$ as defined by Definition 5.17, let $R'$ be $R$ with the completed routine $i$ (if any) marked as finished ($l$ is the completion index to be used later), and let $i_{\mathrm{cand}} = \mathit{candidate}(R')$ be the next routine to be scheduled. Then $q_1$ extends $A$ by the assignment $(j, i_{\mathrm{cand}}, o + a_{i_{\mathrm{cand}}}/c_j)$ via the ordered concatenation of Definition 5.16 and marks $i_{\mathrm{cand}}$ as started in $R'$, and if $\mathit{operational}(S) > 1$, $q_2$ is defined in the same way, except that $S$ is additionally marked to indicate that machine $j$ does not survive the execution of $i_{\mathrm{cand}}$; in this case $\mathit{es}(q) = \{q_1, q_2\}$, and $\mathit{es}(q) = \{q_1\}$ otherwise.

The function $\mathit{es}$ generates all possible execution scenarios by either letting a machine survive the assignment of a new routine ($q_1$), or by assuming that it crashes during this assignment ($q_2$). This is proven in the following lemmas.

Lemma 5.11 ($\mathit{es}$ reflects an eager scheduling step). Applying $\mathit{es}$ to a state $q$ yields exactly the states that can result from one scheduling step of eager scheduling in state $q$.

Proof. Consider a state $q = (A, S, R)$. If in this state all routines are done, the algorithm terminates.

If there is at least one routine that is not done, the algorithm selects the first machine that is or becomes idle and assigns this routine to it. The completion time of this new routine is its execution time divided by the relative machine speed, plus the time at which the machine becomes idle (which can be $0$).

For each assignment of a routine, there are two cases (as long as there are at least two operational machines): either the machine survives the execution of this routine, or it does not. In both cases, the assignment is added to $A$, but in the latter case, $S$ is marked to indicate that this machine does not survive the execution of this routine. Note that the number of operational machines is not allowed to fall below 1. Therefore, the idleindex function will never be $\infty$.
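A compact Python sketch of a single $\mathit{es}$ step follows. It only shows the two-branch structure (survive vs. crash); the concrete encodings, in particular how a failure is recorded in $S$ and how the operational machines are counted, are assumptions of the sketch and not taken from the thesis.

from typing import List, Tuple

def es_step(A: List[Tuple[int, int, float]],  # assignments (machine j, routine i, completion o)
            S: List[int],                     # S[j-1]: number of routines machine j survives
            R: List[str],                     # per-routine status
            j: int, o: float, i_cand: int,    # idle machine, time it becomes idle, routine to schedule
            a: List[float], c: List[float],   # routine runtimes a_i and machine speeds c_j
            n: int) -> List[tuple]:
    completion = o + a[i_cand - 1] / c[j - 1]      # runtime scaled by relative machine speed
    # For brevity the new assignment is appended; Definition 5.16 would insert
    # it at the position given by its completion time.
    A1 = A + [(j, i_cand, completion)]             # the assignment is added in both cases
    R1 = list(R); R1[i_cand - 1] = "started"
    q1 = (A1, list(S), R1)                         # case 1: machine j survives this routine
    operational = sum(1 for s in S if s == n)      # machines not yet marked as failing (assumed encoding)
    if operational > 1:
        S2 = list(S)
        S2[j - 1] = sum(1 for (jj, _, _) in A if jj == j)  # survives its earlier routines only
        q2 = (A1, S2, list(R1))                    # case 2: machine j crashes during i_cand
        return [q1, q2]
    return [q1]                                    # the last operational machine must survive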

Definition 5.19 (Set of all executions $Q_{es}$). $Q_{es} = \lim_{k \to \infty} es^k(q_0)$, where the application of $es$ to a set of states is defined per element, $es(Q') = \bigcup_{q \in Q'} es(q)$ for $Q' \in 2^Q$, and the initial state is $q_0 = ([\,], (n, \ldots, n), (\mathit{waiting}, \ldots, \mathit{waiting}))$.

Lemma 5.12 ($Q_{es}$ is finite). $Q_{es}$ is finite, and only a finite number of steps are necessary to generate it, i.e., there is a $k_0$ such that $\forall k \geq k_0 : es^k(q_0) = Q_{es}$.

Proof. We have to show that for any state that is not done, there are only a finite number of successors. Think of the repeated application of $es$ as a tree. Obviously, the tree is of finite degree.

First note that a state for which $\mathit{done}$ holds is a fixpoint of $es$. Suppose there is an infinite path $(q_l)$ in the tree with $q_{l+1} \in es(q_l)$ and $\mathit{done}(q_l) = \mathit{false}$. However, every application of $es$ reduces the number of routines that are not started or not finished by one (since at least one machine must not fail), and this number cannot fall below zero. Contradiction.

Therefore, by König’s Lemma, the tree is finite, and $Q_{es}$ is finite. And since any state $q$ for which $\mathit{done}(q)$ holds is a fixpoint of $es$, a finite number of steps suffice to generate $Q_{es}$.
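The construction of $Q_{es}$ can be sketched as a simple fixpoint iteration in Python; the generic function below takes $es$ as a parameter (for example an implementation along the lines of the earlier sketches), and the toy example at the end is purely illustrative.

from typing import Callable, FrozenSet, Hashable, Set

def generate_Q_es(q0: Hashable, es: Callable[[Hashable], Set[Hashable]]) -> FrozenSet[Hashable]:
    """Iterate es element-wise on sets of states until the set stabilizes,
    i.e. compute the limit of es^k({q0}) after finitely many steps."""
    Q = {q0}
    while True:
        Q_next: Set[Hashable] = set()
        for q in Q:
            Q_next |= set(es(q))   # application of es to a set, per element
        if Q_next == Q:            # fixpoint reached: only done states remain
            return frozenset(Q)
        Q = Q_next

# Toy demonstration with a trivial "es": a state is a counter that counts down.
toy_es = lambda k: {k} if k == 0 else {k - 1}
print(generate_Q_es(3, toy_es))    # frozenset({0})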

Lemma 5.13 ($Q_{es}$ reflects eager scheduling). Every state in $Q_{es}$ corresponds to a possible execution of eager scheduling with the given parameters, and no other executions are possible.

Proof. Follows immediately from Lemma 5.11 and Lemma 5.12: every state in $Q_{es}$ corresponds to an actual execution of eager scheduling, and there are no other executions possible (by Lemma 5.11), since ordered eager scheduling behaves deterministically modulo faults, which are accounted for.

We finally have to compute the probability of any $q \in Q_{es}$ actually happening. Unlike in the proofs in Section 5.4.2 and Section 5.4.3, the only probabilistic element here is the machine faults. The occurring faults are described by the survival parameter $S$ in a state $q = (A, S, R)$. The probability of each machine $j = 1, \ldots, m$ behaving as prescribed by $q$ is given below.

Lemma 5.14 ($p_{f_j}$ is correct). The probability of a state as defined by Definition 5.20 is correct.

Proof. Analogous to Lemma 5.10.

Hence the final theorem of this section can be formulated as follows.

Theorem 5.3 (Runtime distribution with fixed routine execution times). For $n$ routines with fixed execution times $a_i$, $i = 1, \ldots, n$, on a processor of speed $1$, and $m < n$ processors of relative speed $c_j$ and lifetime $S_j$, $j = 1, \ldots, m$, the runtime distribution of the successful completion time $\Pr(Z \leq t)$ of eager scheduling is

$\Pr(Z \leq t) \;=\; \sum_{q = (A, S, R) \in Q_{es}} H\bigl(t - \mathit{ctf}(A, S)\bigr) \prod_{j=1}^{m} p_{f_j},$


where $\mathit{ctf}$ is the completion time under faults as generalized from Definition 5.11.

Proof. By Lemma 5.13, $Q_{es}$ is the set of all possible successful executions of eager scheduling with the given parameters. By Definition 5.20 and Lemma 5.14, the probability of such a state occurring is $\prod_{j=1}^{m} p_{f_j}$ (using the independence assumption of machine failures). The state will be successfully completed before or at time $t$ only if $H(t - \mathit{ctf}(A, S)) \neq 0$.
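A brief Python sketch of the resulting computation is given below. The completion time $\mathit{ctf}$ and the per-machine probabilities $p_{f_j}$ are passed in as placeholder callables, since their concrete definitions (Definitions 5.11 and 5.20) are not repeated here; all names are illustrative assumptions of the sketch.

from math import prod
from typing import Callable, Iterable

def heaviside(x: float) -> float:
    """H(x): 1 for x >= 0, 0 otherwise (used as an indicator below)."""
    return 1.0 if x >= 0 else 0.0

def completion_time_cdf(Q_es: Iterable,                       # all successful executions
                        ctf: Callable[[object], float],        # completion time under faults of a state
                        p_f: Callable[[object, int], float],   # probability that machine j behaves as in q
                        m: int, t: float) -> float:
    """Pr(Z <= t): sum, over all states that complete by time t, of the
    probability that every machine behaves as the state prescribes
    (machine failures are assumed independent)."""
    return sum(
        heaviside(t - ctf(q)) * prod(p_f(q, j) for j in range(1, m + 1))
        for q in Q_es
    )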