System Modeling - RELATED WORK - A middleware for cooperating mobile embedded systems

4.4 RELATED WORK

4.4.1 System Modeling

this limitation, because the prediction model accounts for the changes of the observed state during the transmission of the state message (Mock 2003).

2. State prediction is used to relate local states observed at different points of time to the same reference time so that they can be combined into a time coherent global state.

First, as the timing behavior of time-free asynchronous systems is not specified, so that even correct systems may exhibit completely unpredictable timing behavior, it is not possible to design protocols with predictable timing behavior in this model. Second, according to a well-known result of (Fischer et al. 1985), consensus cannot be achieved in such systems if at least one process may be faulty.

Synchronous systems (Cristian 1991, Cristian 1996, Galleni and Powell 1996) can be considered the opposite end of the spectrum. In synchronous systems, service specifications are timed. Communication is assumed to be reliable and timely (also called certain (Cristian and Fetzer 1999)). Obviously, assuming reliable and timely communication is not reasonable in a wireless environment. Weaker synchronous system models have also been proposed, in which the communication service may exhibit a bounded number of omission failures (Verissimo et al. 1991). The upper bound is referred to as the omission degree. In these models, certain communication can be achieved using timing redundancy (Rodrigues and Verissimo 1992, Kopetz and Grünsteidl 1993). In our environment we cannot assume that an upper bound on the number of omission failures holds for all stations at all times. The quality of the communication link between two mobile systems changes with their relative positions and may take any state between disconnection and nearly reliable communication. On the other hand, our model is similar to a synchronous model with bounded omissions in the sense that our protocols achieve certain communication as long as an upper bound holds.
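The relation between the omission degree and timing redundancy can be illustrated by a minimal sketch. It assumes that the bound applies to the consecutive transmissions of a single message, so that sending omission degree + 1 copies suffices. The names `send_with_timing_redundancy`, `make_lossy_channel` and `OMISSION_DEGREE` are hypothetical and not taken from the cited work or from our protocols.

```python
def send_with_timing_redundancy(payload, channel, omission_degree):
    """Transmit the payload omission_degree + 1 times.

    Returns the number of copies that reached the receiver.
    """
    delivered = 0
    for _ in range(omission_degree + 1):
        if channel(payload):
            delivered += 1
    return delivered


def make_lossy_channel(drop_pattern):
    """Hypothetical channel: drop_pattern[i] is True if transmission i is lost."""
    drops = iter(drop_pattern)

    def channel(_payload):
        # Transmissions beyond the end of the pattern are treated as successful.
        return not next(drops, False)

    return channel


OMISSION_DEGREE = 2  # assumed upper bound on omissions per message

# Worst case within the bound: two copies are lost, the third arrives.
within_bound = make_lossy_channel([True, True, False])
copies_within_bound = send_with_timing_redundancy(
    "state", within_bound, OMISSION_DEGREE)

# Bound violated: three losses in a row, so no copy arrives.
bound_violated = make_lossy_channel([True, True, True])
copies_bound_violated = send_with_timing_redundancy(
    "state", bound_violated, OMISSION_DEGREE)
```

As long as the assumed bound holds, at least one copy arrives. Once the bound is violated, as may happen on a wireless link, delivery is no longer certain.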

(Cristian and Schmuck 1995, Cristian 1996, Cristian and Fetzer 1999) presented the timed asynchronous system model, which is situated between the synchronous and the time-free asynchronous system model. In fact, the model was introduced even earlier (Cristian 1989), although it was not explicitly named or distinguished from the time-free asynchronous system model. In this model, processor and communication services have timing specifications. The timing specifications, however, represent likely time bounds that hold most of the time rather than worst-case bounds. Therefore the components may be subject to timing failures. In fact, the possibility of timing failures can be considered one of the distinctive features of this model: whereas time-free systems have no timing specifications and hence no timing failures, synchronous systems have a timing specification but are assumed to meet it. The other distinctive feature of the timed asynchronous system model as compared to the time-free model is the existence of hardware clocks having a bounded drift w.r.t. real time. These clocks can be used to detect timing failures of the components.
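How a clock with bounded drift supports the detection of timing failures can be sketched as follows. The sketch assumes the common drift model in which a local clock measures a real-time interval of length d as a value between (1 - rho) * d and (1 + rho) * d. The function names and the numeric values are illustrative assumptions only.

```python
def real_time_bounds(measured, rho):
    """Bounds on the real-time duration behind a locally measured duration.

    Assumes the drift model (1 - rho) * d <= measured <= (1 + rho) * d.
    """
    return measured / (1 + rho), measured / (1 - rho)


def classify_delay(measured, deadline, rho):
    """Classify a locally measured duration against a real-time deadline."""
    lower, upper = real_time_bounds(measured, rho)
    if upper <= deadline:
        return "timely"          # deadline met even with the worst-case drift
    if lower > deadline:
        return "timing failure"  # deadline missed even with the best-case drift
    return "uncertain"           # the drift prevents a definite verdict


RHO = 1e-4        # assumed maximum drift rate of the hardware clock
DEADLINE = 0.010  # assumed timing specification in seconds

verdict_fast = classify_delay(0.008, DEADLINE, RHO)
verdict_slow = classify_delay(0.012, DEADLINE, RHO)
verdict_edge = classify_delay(0.010, DEADLINE, RHO)
```

A measurement close to the deadline cannot be classified with certainty, so a detector built this way has to choose a conservative interpretation for the uncertain case.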

The timed asynchronous system model, like the time-free asynchronous system model, still describes a system that exhibits an unpredictable timing behavior in general. Timely progress can be guaranteed neither for the computation nor for the communication services.

Yet, timing specifications are assumed to be chosen in such a way that the system alternates between long phases during which services adhere to their timing specifications and only short phases during which they do not. The basic idea regarding service guarantees is to give a conditional guarantee for a timely progress of the services instead of giving no guarantees at all: the protocols guarantee a timely progress whenever the underlying communication and processor services exhibit a “sufficient synchrony”. What “sufficient synchrony” means is described by stability predicates. So, the protocols guarantee a timely progress whenever the system, or a part thereof, is stable for a sufficiently long interval of time.

Safety properties, on the other hand, are guaranteed to be met always, whether the system is stable or not. For example, in the group membership service defined in (Cristian 1996), agreement on group membership is an unconditional property (any two stations joined to the same group agree on its membership), whereas a bounded join delay is a conditional property (two processes connected throughout some interval I are guaranteed to be joined to the same group after I if the system is stable during I). Working with conditional properties has two advantages:

• Even though a general guarantee cannot be given, the protocols provide guarantees for those times at which the system is sufficiently synchronous, rather than giving no guarantees at all. In a well-tuned system, which is stable most of the time, this may be sufficient for many applications.

• The conditional properties can be converted into unconditional ones by strengthening the underlying model without changing the protocols. This is achieved by adding progress assumptions, which essentially require that the system eventually becomes stable for a sufficiently long time. Obviously, this means that the underlying system must be designed or tuned to warrant the progress assumptions.

Since a timed asynchronous system alternates between two phases, it can also be considered as a heterogeneous or asymmetric system. There are intervals of time during which it behaves like a synchronous system and there are intervals of time during which it behaves like an asynchronous system. Which phases prevail depends on the choice of the timing specifications and on the runtime conditions of the system.

The timed asynchronous system model is targeted to match most existing “run-of-the-mill” distributed systems, consisting of workstations without a particular real-time OS connected by networks that exhibit phenomena like congestion and hence a hardly controllable timing behavior. It models a basically asynchronous system, which provides no guarantee that any progress is achieved in time and in which any component may be late. Though generally detectable, timing failures cannot be detected and handled in a timely manner.

Regarding timeliness, a model with stronger guarantees is better suited to match our intended system environment, where real-time scheduling of CPU resources is performed (see Chapter 5) and a contention-free access to a single LAN is provided. This allows guaranteeing a timely progress for at least part of the computational tasks and giving an upper bound on the delay of messages on the network. W.r.t. timeliness we therefore adopt a stronger model, which allows focusing the protocol design on the main source of unpredictability in the wireless LAN: the omission failures.

Using stability predicates and conditional properties is a promising approach to designing protocols for unpredictable environments. It yields protocols that provide safety under a wide range of conditions and progress whenever possible. We adopt this approach in our system model. In particular, we introduced a valid predicate similar to the notion of F-connected introduced in (Cristian and Fetzer 1999) and progress properties conditioned on that predicate. Furthermore, we adopt the idea of making services fail-aware (Fetzer and Cristian 1996). Fail-awareness allows strengthening the conditional properties in the sense that if some station does not fulfill the condition (is not valid in our model) and hence is not able to fulfill its progress properties, it indicates this fact to its user, thus enabling the user to react to this situation. It should be noted, however, that in a timed asynchronous system model, a timely reaction to such an exception indication cannot be ensured.
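The idea of a fail-aware service can be illustrated by a minimal sketch, which is not the interface of our middleware. The class `FailAwareService`, the callable `is_valid` (standing in for the evaluation of the valid predicate) and the callback `on_exception` are hypothetical names introduced for illustration.

```python
class FailAwareService:
    """Delivers messages while the station is valid, signals the user otherwise."""

    def __init__(self, is_valid, on_exception):
        self._is_valid = is_valid          # hypothetical check of the valid predicate
        self._on_exception = on_exception  # user-supplied exception indication
        self.delivered = []

    def deliver(self, message):
        """Return True if delivered, False if the user was notified instead."""
        if self._is_valid():
            self.delivered.append(message)
            return True
        # The progress property cannot be fulfilled: indicate this to the user.
        self._on_exception(message)
        return False


state = {"valid": True}
alerts = []
service = FailAwareService(lambda: state["valid"], alerts.append)

ok_first = service.deliver("m1")   # station is valid: normal delivery
state["valid"] = False             # e.g. the station lost connectivity
ok_second = service.deliver("m2")  # station is not valid: user is informed
```

The user thus learns explicitly that the conditional property does not hold at the moment, instead of having to infer it from missing progress.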

(Verissimo et al. 2000, Casimiro and Verissimo 2001) also present a heterogeneous model, called the Timely Computing Base (TCB) model, which is a continuation of their previous work on the so-called quasi-synchronous system model (Almeida and Verissimo 1996). In this model the system is statically divided into two parts: a payload part and a control part.

The payload part may have any degree of synchronism, whereas the TCB components together constitute a synchronous subsystem. The payload part can be modeled by a timed asynchronous or quasi-synchronous system model. Since the TCB exhibits a synchronous behavior all of the time and allows executing well-defined functions in bounded time, it allows for stronger timeliness properties than the timed asynchronous system model. In particular, it allows detecting and handling timing failures in time. Regarding the local task execution, our system would fit the TCB model, since the timely exception handling of TAFT already includes a similar idea. However, we cannot assume a TCB model for the wireless network. That is, the synchronous control network, which interconnects the components of the TCB, cannot be realized on a wireless LAN. The TCB model suggests realizing the control network as a small-bandwidth dedicated network or by using the highest priority in the payload network. Neither concept can be used to realize a synchronous network on a wireless LAN. In a wireless network of mobile stations, whether or not a synchronous communication channel can be established between two stations is a dynamic property of the network. Yet, we adopted the idea of distinguishing between the payload (the atomic multicast messages) and a small-bandwidth synchronous channel, which is used to achieve agreement in case an atomic multicast message cannot be delivered at all members. In our system model, the synchronous channel does not provide a guaranteed synchronous service, but a conditional, fail-aware service, which can be implemented in a dynamically changing wireless network.

Finally, we briefly note that there is a lot of further work on partial synchrony models (Dolev and Dwork 1987, Chandra and Toueg 1991, Chandra et al. 1992), which for the most part aims at adding just enough synchrony to an asynchronous system so that consensus can be solved, but not enough to provide a predictable timing behavior.
