

5.4 Aperiodic Requests

5.4.3 Realizing TAFT with the IPE Server

When a capacity is the highest priority entity, a task instance is selected for execution under the capacity in the following order:

1. A pending aperiodic request;

2. The periodic task instance with the shortest deadline;

3. The idle task.

So, if no aperiodic request is pending, the capacity is exchanged with a periodic instance so that it is preserved and can be used when an aperiodic request arrives later.
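This selection order can be sketched as follows (an illustrative simplification in Python; the names Instance and select_under_capacity are ours, not part of the IPE server's specification):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    deadline: float


IDLE = Instance("idle", float("inf"))


def select_under_capacity(aperiodics, periodics):
    """Selection order under a capacity in the IPE server:
    1. a pending aperiodic request,
    2. the periodic instance with the shortest deadline,
    3. the idle task."""
    if aperiodics:
        return min(aperiodics, key=lambda t: t.deadline)
    if periodics:
        return min(periodics, key=lambda t: t.deadline)
    return IDLE
```

When only periodic instances are pending, running one under the capacity realizes the exchange described above: the capacity is preserved for aperiodic requests arriving later.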

The IPE server has been shown to exhibit a performance comparable to that of the EDL server and hence nearly optimal w.r.t. the response times of aperiodic requests (Spuri and Buttazzo 1996). Furthermore, it feasibly schedules each periodic task set with a processor utilization not greater than one. Expressed more formally, this means: for each set T := {τj = (Tj,Cj) | j ∈ 1..n} of periodic tasks, the IPE server feasibly schedules T if and only if

U := ∑_{i=1}^{n} Ci / Ti ≤ 1.

The drawback of the IPE server as compared to the DPE server, or other periodic server approaches, is that it is based on the idle times of an EDL schedule of the periodic task set, which is computed offline. Storing the idle times incurs a certain memory overhead, in particular if the periods of the periodic tasks are not harmonic. The current implementation of TAFT (Becker and Gergeleit 2001, Becker et al. 2003), which is also based on an EDL schedule for a periodic task set (for the exception parts in this case), shows that using an EDL schedule for the periodic tasks is a viable solution. We therefore decided to base our implementation on the IPE server due to the advantages stated above.

• It guarantees timely completion of correct jobs and exception handling for each set of periodic task pairs with a utilization factor not greater than one;

• It achieves nearly optimal performance w.r.t. the response times of the communication tasks;

• It makes use of remaining processor idle times to complete faulty main parts.

5.4.3.1 The Basic Algorithm

We present here an extended version of the IPE server, called TAFT-IPE(basic), which is able to schedule periodic task pairs. TAFT-IPE(basic) works as follows: Whenever an instance Ji,k of task pair τi = (Ti,Ci,Ei) is released, the scheduler adds the main part MPi,k to the pending queue and schedules it the same way as the IPE server schedules a conventional periodic task. Scheduling the main part like a periodic task under the IPE server ensures that aperiodic requests cannot prevent it from getting up to the specified amount of resources (Ci) by its deadline. However, while a conventional task never exceeds its specified resource demand, the main part MPi,k may do so. Therefore, TAFT-IPE(basic) performs an accounting of the actual execution time of the main part. As soon as it is equal to the specified resource demand Ci, the scheduler aborts the main part and continues with the execution of the exception part EPi,k instead. The exception part, too, is scheduled in the same way as the IPE server would schedule a conventional periodic task. Again, scheduling the exception part like a periodic task under the IPE server ensures that aperiodic requests cannot prevent the exception part from getting up to the specified amount of resources (Ei) by its deadline.
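The accounting just described can be sketched as follows (a deliberately simplified model in Python; TaskPair and execute_instance are our illustrative names, and release times, deadlines, and the pending queue are abstracted away):

```python
from dataclasses import dataclass


@dataclass
class TaskPair:
    T: int  # period
    C: int  # specified resource demand of the main part
    E: int  # specified resource demand of the exception part


def execute_instance(pair, actual_main_time):
    """Account the main part's execution against its budget C; as soon as
    the budget is consumed, abort the main part and run the exception part
    instead. Returns the trace of executed parts as (part, time) tuples."""
    ran = min(actual_main_time, pair.C)  # MP never gets more than C units
    trace = [("MP", ran)]
    if actual_main_time > pair.C:        # MP faulty: aborted at C, EP runs
        trace.append(("EP", pair.E))
    return trace
```

For the task pair τ1 = (6,3,1) from the example later in this section, an instance with an actual execution time of 2 completes its main part, whereas one needing 4 time units is cut off after 3 units and its exception part runs for 1 unit.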

In this algorithm, the amount of resources assigned to any job Ji,k is limited by Ci' := Ci + Ei. Therefore, the resources consumed by a task pair τi = (Ti,Ci,Ei) under this algorithm can be represented by a virtual periodic task τ'i with a specified resource demand of Ci' and a period Ti' = Ti. If the IPE server is able to complete each instance of τ'i by its deadline, this means TAFT-IPE(basic) is able to allocate up to Ci + Ei time units to each instance of τi by the deadline of that instance. Therefore, TAFT-IPE(basic) is able to complete MPi,k as long as it has an actual execution time not greater than Ci. If this is not the case, TAFT-IPE(basic) is still able to complete EPi,k by its deadline using the remaining specified resource demand. Therefore, TAFT-IPE(basic) guarantees a timely completion of correct jobs and exception handling for a set of periodic task pairs T := {τj | j ∈ 1..n} whenever the IPE server feasibly schedules the corresponding set T ' := {τj' | j ∈ 1..n} of virtual periodic tasks. Together with the IPE schedulability criterion presented in Clause 5.4.2.3, this implies that TAFT-IPE(basic) achieves a timely completion of correct jobs and exception handling for a set T := {τj | j ∈ 1..n} of periodic task pairs as long as

U := ∑_{i=1}^{n} (Ci + Ei) / Ti ≤ 1.

We do not prove this conclusion here; a proof will be provided for the complete TAFT-IPE algorithm in Sub-Section 5.4.4.
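The criterion amounts to checking that the virtual tasks' utilization does not exceed one; a minimal sketch (the function name is ours):

```python
from math import isclose


def taft_ipe_basic_feasible(pairs):
    """pairs: iterable of (T_i, C_i, E_i) triples of periodic task pairs.
    Returns True iff sum_i (C_i + E_i) / T_i <= 1, i.e. iff the virtual
    periodic tasks tau'_i = (T_i, C_i + E_i) are feasibly schedulable."""
    U = sum((C + E) / T for (T, C, E) in pairs)
    return U <= 1 or isclose(U, 1)
```

For the example task set used later, {(6,3,1), (24,4,1)}, the utilization is 4/6 + 5/24 = 0.875, so the guarantee holds.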

TAFT-IPE(basic) aborts main parts directly after they have consumed their specified resource demand Ci, although there may still be a chance to complete them in time. The current implementation of TAFT exploits remaining processor idle times to try to complete faulty main parts. Since this feature increases the reliability of task execution, we strive to preserve it while extending the supported task model. In the following clause, we show how the basic algorithm can be extended for this purpose.

5.4.3.2 Executing Faulty Main Parts

Our objective in extending TAFT-IPE(basic) is to use remaining processor idle times to complete faulty main parts instead of aborting them directly. According to the TAFT concept, faulty main parts are scheduled on a priority level lower than that of the correct main and exception parts, so as to ensure that the guarantees provided to the latter are not compromised by the faulty main parts; that is, to provide fault containment. Our approach to achieve this is inspired by the following observation: The way aperiodic requests are handled by aperiodic servers is analogous to the lower level of priority of faulty main parts in TAFT: Aperiodic servers try to complete the aperiodic requests, but only as long as the guarantees provided to periodic tasks are not compromised. So, the fact that faulty main parts are scheduled on a lower priority level than correct main parts and exception parts in TAFT can be mapped to the distinction between aperiodic requests and periodic instances in the server approaches. Therefore, the basic idea of the extension is mapping faulty main parts to aperiodic requests.

TAFT-IPE(basic) is extended in the following way: Each time a main part has been executed for Ci time units and is still not completed, the scheduler turns it into an aperiodic request instead of aborting it directly. At the same time, the scheduler adds the exception part to the pending queue. So, a main part’s getting faulty corresponds to the arrival of an aperiodic request, while executing the exception part corresponds to continuing the execution of the virtual periodic task. Correct main parts and exception parts are conceptually scheduled on a higher priority level: For both, guarantees are provided, while the faulty main parts are served on a best-effort basis. In particular, this means that the exception part of the faulty task pair is guaranteed to be completed by its deadline unless the faulty main part is completed. Nevertheless, just as the IPE server defers executing periodic instances to improve the responsiveness of aperiodic requests, TAFT-IPE defers the execution of exception parts to serve the faulty main parts first. So, the scheduler first tries to complete the faulty main parts and executes the exception handling only if required. If the scheduler is able to complete the faulty main part, it removes the exception part from the pending queue. Deferring the execution of the exception parts does not compromise the guaranteed exception handling: As soon as there are no more aperiodic capacities available, the scheduler will stop executing faulty main parts. Therefore, as soon as further delaying the execution of the exception part would result in the exception part’s missing its deadline, the scheduler starts executing it. In this case, the faulty main part is removed from the pending queue; that is, the main part is aborted.
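The queue manipulation described above might be sketched as follows (our illustrative names; the pending queues are modeled as plain sets):

```python
def on_budget_exhausted(task, periodics, aperiodics):
    """The main part has run for C_i units without completing: move it from
    the periodic queue to the aperiodic queue (best-effort service under
    capacities) and make its exception part pending (guaranteed service)."""
    periodics.discard(task)
    aperiodics.add(task)
    periodics.add(("EP", task))


def on_faulty_main_completed(task, periodics, aperiodics):
    """If the faulty main part completes after all, the exception handling
    is no longer needed: remove the exception part from the pending queue."""
    aperiodics.discard(task)
    periodics.discard(("EP", task))
```

The abort case is the mirror image: when the scheduler must start the exception part to meet its deadline, the faulty main part is discarded from the aperiodic queue instead.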

In spite of the analogy between handling faulty main parts in TAFT and handling aperiodic requests in the IPE server, there are also the following differences:

1. In the IPE server, delaying the execution of a periodic instance does not impact its chance of being completed because its WCET is known and it is completed by its deadline in any case. While this is also the case for the exception parts in TAFT-IPE, it is not for the main parts. Here, delaying the execution of a main part means reducing its chance of being completed if it should become faulty.

2. While responsiveness is the main issue for aperiodic requests in the IPE server, it is not per se an issue for the execution of faulty main parts. For them, early execution is only an advantage if it increases the probability of being completed.

Together, points 1 and 2 raise the question of whether it is reasonable to defer the execution of a correct main part in order to execute aperiodic requests. Deferring the exception parts is not a problem since it does not affect their probability of being completed; rather, they are guaranteed to be completed whether deferred or not.

In answering this question, we must distinguish between communication tasks and faulty main parts, which are both scheduled as aperiodic requests. We deem it reasonable to defer the execution of correct main parts in order to execute communication tasks. For one thing, communication tasks have small execution times, so they do not delay the execution of the main parts too much. Furthermore, they have strong response time requirements, which warrant their early execution. Therefore, if a capacity is available, communication tasks have the highest priority in using it.

Things are not as clear regarding the execution of faulty main parts. Deferring correct main parts to execute a faulty one allows the first faulty main part to consume all available capacities. Thus, if one of the deferred main parts becomes faulty as well, there may be no more capacities available, so that it may have to be aborted. Therefore, the first faulty main part would have the best chance to be completed, and furthermore, it would reduce the chances of the following ones. For these reasons, it seems fairer to delay the execution of faulty main parts and execute the correct ones first. Besides increasing the fairness, this keeps the capacities longer in the system, which is of advantage when instances of a communication task arrive. On the other hand, faulty main parts have deadlines. Sooner or later, a faulty main part will be aborted by the corresponding exception part. Always favoring correct main parts may delay the execution of faulty ones until the moment at which the corresponding exception part is started.

Summarizing the discussion above, the following three points can be settled:

• Deferring exception parts is not a problem since they are completed at any rate if required. Therefore, they are never executed under a capacity.

• If a communication task is pending, it is run under the capacity.

• Amongst several faulty main parts, priorities are assigned according to deadlines.

The question remains whether a faulty or a correct main part should be executed under a capacity if both kinds of instances are pending. Two strategies appear to be plausible:

1. Allocate the capacities to the main part with the shortest deadline, whether it is faulty or not; ties are broken in favor of correct main parts. This means that deadlines are the first criterion and the distinction between correct and faulty main parts is the second.

2. Allocate the capacities to a correct main part; among the correct main parts, priorities are assigned based on deadlines. This means that the distinction between faulty and correct main parts is the first criterion and the deadlines are the second.

Both strategies are possible and easily implemented; the first focuses on trying to complete faulty main parts and not wasting capacities, while the latter stresses fairness. We decided to apply the first approach because it works better for tasks having precedence constraints. In a pair of jobs with a precedence constraint, the successor cannot be started before the predecessor is terminated. Therefore, if the predecessor becomes faulty, it is not possible to execute the successor first and then try to complete the faulty predecessor. Furthermore, using deadlines as the first criterion allows using a simple approach to enforce precedence amongst tasks, as we shall explain in Section 5.5.
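The chosen policy, together with the points settled above, can be sketched as follows (an illustrative simplification; the function name and the queue representation as deadline maps are ours):

```python
def allocate_capacity(comms, correct_mains, faulty_mains):
    """Decide which instance runs under an available capacity.
    Each argument maps an instance name to its deadline.
    Policy: communication tasks first; otherwise the main part with the
    shortest deadline, faulty or not, with ties broken in favor of correct
    main parts (strategy 1). Exception parts never run under a capacity."""
    if comms:
        return min(comms, key=comms.get)
    # Sort key: (deadline, 0 for correct / 1 for faulty, name).
    candidates = [(d, 0, n) for n, d in correct_mains.items()]
    candidates += [(d, 1, n) for n, d in faulty_mains.items()]
    return min(candidates)[2] if candidates else None
```

With strategy 2, the sort key would simply put the correct/faulty flag before the deadline.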

5.4.3.3 Formal Description of TAFT-IPE

We now present a formal description of TAFT-IPE. For this purpose, we use an SDL-like syntax enhanced by some common mathematical notions to keep the description simple. At the heart of the description is the procedure schedule, which determines the task instance to be executed (cur_task). It is assumed that when schedule returns, the task referenced by cur_task is dispatched on the CPU. We assume that a set {τi = (Ti,Ci,Ei) | i ∈ 1..n} of periodic task pairs has been accepted and an EDL schedule for this set has been computed offline.

The acceptance criterion for the task pairs will be presented in Sub-Section 5.4.4. Two vectors describe the idle times of the EDL schedule for a single hyper period of length H := lcm({Ti | i ∈ 1..n}), where lcm denotes the least common multiple: the vector e = (e0,e1,…,ep) denotes the starting times of the idle time intervals, whereas the vector ∆ = (∆0,∆1,…,∆p) denotes their lengths, where p is the number of idle times in the hyper period. As EDL schedules are cyclic, these two vectors are sufficient to describe all idle times in the whole schedule. We use the following notions in the formal description:

• ΓS: the capacity of the server; for i ∈ 1..n, Γi is the capacity associated with τi; and C := {ΓS} ∪ {Γi | i ∈ 1..n} is the set of all capacities;

• periodics, comms, aperiodics: the sets of ready instances of correct main and exception parts, communication tasks, and faulty main parts, respectively, waiting to be executed;

• cur_ent: a reference to the current entity (task or capacity);

• cur_task: a reference to the currently running task;

• ed(S): returns the entity with the shortest deadline in a set S of entities. Each capacity Γi > 0 is assigned a deadline di,k during [ri,k,di,k], where ri,k and di,k are the release time and deadline of the kth instance of τi. Ties are broken in the following order: capacity, correct main part, faulty main part, and exception part. When S is empty, the function returns the idle task;

• remi: the remaining allocated execution time of the current instance MPi,k of τi’s main part; it represents the difference between the specified execution time Ci and the processing time already assigned to MPi,k. When remi = 0 and MPi,k is not completed, it has become faulty;

• cur_rem: the remaining execution time of the current entity cur_ent. If cur_ent is a capacity, cur_rem corresponds to the value of that capacity;

• rnext: the next time at which a periodic instance will be released;

• request: a signal arriving in the scheduler whenever an instance of a communication task is released;

• completed: a signal arriving in the scheduler whenever an instance completes;

• accept: the acceptance test for instances of the communication tasks, which will be presented in Sub-Section 5.4.4;

• dispatch_time: the time when the current task was dispatched to the CPU;

• sched_timer: the scheduling timer;

• now: the current clock time.
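As an aside, the hyper period H := lcm({Ti | i ∈ 1..n}) over which the two idle-time vectors must be stored can be computed as follows (a trivial sketch):

```python
from functools import reduce
from math import lcm


def hyper_period(periods):
    """H := lcm of all task periods; the idle-time vectors only need to
    cover one such period because EDL schedules are cyclic."""
    return reduce(lcm, periods)
```

For the example task set with periods 6 and 24 used later, H = 24; non-harmonic periods such as 4, 6, and 10 yield a longer hyper period (60) and hence a larger table of idle times.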

We organized the explanation of the formal description according to the different themes of the algorithm:

Maintaining Capacities. During the start transition, all capacities are initially set to zero (lines 2-3). The server capacity ΓS is replenished whenever an idle time of the underlying EDL schedule starts (lines 13-14). Each such point of time coincides with the release time of a task instance. To replenish the server capacity, the length of the idle time is added to the server capacity. Whenever some instance has been executing under a capacity for ∆t time units, the capacity is reduced by that amount of time during the accounting (lines 62-63). If it was an instance of a main part, the capacity associated with that main part is increased accordingly (lines 64-65). This means that the capacity is exchanged from cur_ent to Γi.
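The capacity accounting and exchange can be sketched as follows (a simplification with our own names; the SDL description in Figures 5-11 and 5-12 remains the authoritative version):

```python
def account(capacities, cur_ent, task_i, dt, is_main_part):
    """Accounting step after an instance ran for dt time units under the
    capacity cur_ent: that capacity shrinks by dt, and if the instance was
    a main part of task pair task_i, the capacity Gamma_i associated with
    that task grows by dt -- the priority exchange."""
    capacities[cur_ent] -= dt
    if is_main_part:
        capacities[task_i] = capacities.get(task_i, 0) + dt
```

In the example schedule discussed later, this is exactly what happens between times 0 and 2: τ1's main part runs for 2 units under ΓS, exhausting it and accruing Γ1 = 2.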

Monitoring the execution of main parts. Whenever a main part is released, the remaining execution time remi is initialized with the resource demand specified for that main part (line 10). The scheduler uses this variable to detect when a main part becomes faulty. All execution times of the main part are subtracted from the remaining execution time during the accounting (lines 67-68). When remi reaches zero, the main part is converted into an aperiodic request and the exception part EPi is added to the pending queue (lines 69-73).

Before a main part is dispatched to the CPU, the scheduling timer is set to ensure that the execution of the main part is interrupted when the remaining execution time has been consumed (lines 42-43 and 48).

Figure 5-11. Formal description of the TAFT-IPE algorithm (part 1)

Figure 5-12. Formal description of the TAFT-IPE algorithm (part 2)

Scheduling. Deciding which task to execute is performed in two steps. First, the set of capacities and periodic instances is considered, and the highest priority entity is chosen as the current entity (lines 34-35). If this entity is a task instance, it is selected for execution (lines 40-44). Otherwise, it is a capacity, and a task has to be selected to run under this capacity (lines 36-39). If a communication task is pending, it is selected. Else, the instance with the shortest deadline is chosen. This means that regarding execution under a capacity, the highest priority is given to the communication tasks. Among correct and faulty main parts with a common deadline, ties are broken in favor of correct ones for fairness reasons.

If a main part and an exception part have the same deadline, ties are broken in favor of the main part in order to increase the probability that the main part can be completed by its deadline. In fact, the exception part will not be executed under a capacity because each exception part is released together with an aperiodic request (the faulty main part) having the same deadline. So, the scheduler defers executing the exception part and tries to complete the main part first. If a faulty main part can be completed, the exception part is removed from the pending queue (lines 27-28). If, however, the scheduler starts executing the exception part, it aborts the main part (lines 45-46).

Reclaiming Resources. To have even more capacities for executing aperiodic requests, we use a mechanism that converts resources that have been allocated for main parts and exception parts, but have not been used, into aperiodic capacities. So, resources of main parts that do not need the full Ci and of exception parts that are not executed can be used to execute aperiodic requests. When a main part is completed, its remaining execution time and the time allocated for the corresponding exception part are transferred to its aperiodic capacity (line 26). This mechanism for reclaiming resources adds nearly no overhead to the scheduler.
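The reclaiming step can be sketched as follows (our illustrative names):

```python
def reclaim_on_main_completion(capacities, task_i, rem_i, E_i):
    """When MP_{i,k} completes, its unused budget rem_i plus the
    now-unneeded exception budget E_i are transferred to the task's
    aperiodic capacity Gamma_i."""
    capacities[task_i] = capacities.get(task_i, 0) + rem_i + E_i
```

For instance, a main part that completes with 1 unit of its budget left turns that unit plus its exception budget into aperiodic capacity, as happens at time 10 in the example below.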

Figure 5-13. Example of a TAFT-IPE schedule

Figure 5-13 shows an example of a TAFT-IPE schedule. The example contains two periodic task pairs τ1 := (6,3,1) and τ2 := (24,4,1) and a communication task R1 := (1,3). From top to bottom, the graphs depict the idle times of the underlying EDL schedule (ωEDL), the capacity of the IPE server (ΓS), and the states and capacities of both tasks. In the example, the execution times of the exception parts and of the communication task are not as much shorter than the execution times of the main parts as we suppose they would be in reality; yet, we decided not to make them too small so that the figure remains easy to comprehend.

The actual required execution times of the periodic task instances are assumed to be C1,1 = 4, C1,2 = 3, C1,3 = 4, C1,4 = 5, and C2,1 = 7. At the start, the initial capacity ΓS = 2 is the highest priority entity, and the task with the shortest deadline (τ1) is selected to run under that capacity, meanwhile accruing a capacity Γ1. At time 2, ΓS is exhausted and Γ1 becomes the highest priority entity. Task τ1 runs under that capacity until time 3, when it becomes faulty and is turned into an aperiodic request. This request is selected for execution under the capacity, for it is the task instance with the shortest deadline. At time 4, it is completed and the time allocated for its exception part is added to Γ1, which is still the highest priority entity. Task τ2 runs under that capacity until time 6 and accrues a capacity Γ2 = 2. At time 6, the server capacity ΓS is replenished, another instance of τ1 is released, and an instance of the communication task arrives. ΓS is the highest priority entity at that time, and according to our policy, the communication task is selected to run under that capacity. So, by time 7, ΓS is exhausted and the communication task is completed. Now, the highest priority entity is τ1, which is executed consequently. When τ1 is completed at time 10, the time allocated for its exception part is added to capacity Γ1, which at once becomes the highest priority entity. The execution of τ2 is continued for 1 time unit under Γ1 so that Γ2 increases to 3. From time 11 to 12, Γ2 is the highest priority entity and τ2 is executed under it until it becomes faulty. At the same time, the next instance of τ1 is released. This instance is now the highest priority entity and is executed until time 15, when it becomes faulty. Its exception part, which is added to the pending queue at that moment, immediately becomes the highest priority entity and is executed accordingly. After the exception part is completed, Γ2 becomes the highest priority entity again. The only pending task instance is the faulty main part of τ2, which is executed under the capacity until time 17. At that time, the next instance of τ1 is released and another aperiodic request arrives. As Γ2 is still the highest priority entity, the aperiodic request is executed under it and completed by time 19. After Γ2 is exhausted, τ1 is selected for execution. At time 22, when it becomes faulty, its exception part is added to the pending queue and immediately executed. After the exception part is completed, the exception part of τ2 is run.

This example illustrates how the mechanism of priority exchange allows preserving the capacities in the system: The last time a new capacity is added is at time 10, when τ1 is completed and the time allocated for the execution of its exception part is turned into a capacity. The capacities are still available from time 16 to 18 to execute the faulty main part of τ1 and at time 18 to execute the communication task. This instance of the communication task is served immediately, even though the last capacity was added at time 10. Actually, the last replenishment of the server capacity occurred even earlier, at time 6.