Summary - State-Based Real-Time Analysis of Synchronous Data-flow (SDF) Applications on MPSoCs

In this chapter, we elaborated on the model-checking capabilities that enable us to check more complex properties (see claim C2-3) than those analyzed by analytical RT methods (see Sect. 3.1.1.2). The TCTL model-checking statements also allowed us to gain confidence that the instantiated timed-automata templates correctly capture the semantic behavior of a specific SUA, by checking the validity of the SDF semantics and of other properties such as liveness and the correctness of arbitration protocols.

The viability of our state-based RT method (see claim C4 in Chap. 1) was demonstrated by a set of scalability tests, which showed that our method can scale up to 320 actors on a 2-tile platform and up to 96 actors on a 4-tile platform (for the chosen use-case), significantly increasing the number of analyzable actors compared to related work (see claim C4-3).

We have also shown how clustering, when applied, can help reduce the state space and the required analysis time (in the MP3 example, 4.3% less state space and 48.6% less analysis time). In addition, by enabling composability and combining a TDMA-based scheduling of SDFG clusters with our state-based method, the RT analysis of larger SDF applications is enabled: with ten TDMA slots, potentially thousands of actors on a 2-tile platform and hundreds of actors on a 4-tile platform, provided the use-case shows the same scalability behavior as the one in Fig. 7.2.
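The scaling estimate above can be reproduced as simple arithmetic. The following sketch assumes that each TDMA slot hosts one isolated cluster that is analyzed on its own, and that every cluster may be as large as the per-platform limits reported by the scalability tests (320 and 96 actors). The function name and the one-cluster-per-slot assumption are illustrative and are not taken from the thesis.

```python
# Per-cluster limits reported by the scalability tests
# (number of actors analyzable in a single model-checking run).
ACTORS_PER_CLUSTER = {"2-tile": 320, "4-tile": 96}


def potential_actor_count(platform: str, tdma_slots: int) -> int:
    """Rough estimate of analyzable actors, assuming one isolated cluster per TDMA slot."""
    if tdma_slots < 1:
        raise ValueError("at least one TDMA slot is required")
    return ACTORS_PER_CLUSTER[platform] * tdma_slots


for platform in ACTORS_PER_CLUSTER:
    print(platform, potential_actor_count(platform, tdma_slots=10))
```

With ten slots this yields 3200 actors for the 2-tile platform and 960 for the 4-tile platform, which matches the "thousands" and "hundreds" orders of magnitude stated in the text. The estimate only holds if each cluster scales like the use-case of the scalability tests.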

In addition, our approach showed a significant precision improvement (up to 300%) compared with the worst-case bound calculation based on pessimistic analytical upper-bound delays for every shared resource access (see claim C4-2).

Finally, we have demonstrated the applicability of our model-based design flow on an industrial use-case (see claim C4-1): a multi-phase electric motor control algorithm (modeled as SDFA) mapped to Infineon's TriCore hardware platform with both the burst and the single-beat inter-processor communication styles. We have shown that the upper timing bounds estimated by our approach were always a safe over-approximation of the values measured through cycle-accurate simulation, and that our state-based RT analysis was able to detect a timing violation in the case where the burst inter-processor communication style was chosen.
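The two checks described above (the analyzed bound is a safe over-approximation of the measurement, and the bound is compared against the timing requirement) can be sketched as follows. The scenario names and all numbers are placeholders chosen for illustration; they are not results from the thesis, and the data structure is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    analyzed_bound: float  # upper bound obtained from the state-based analysis
    measured: float        # worst value observed in cycle-accurate simulation
    deadline: float        # timing requirement (same time unit throughout)


def is_safe_bound(s: Scenario) -> bool:
    """The analysis is safe if its bound is never below the measured value."""
    return s.analyzed_bound >= s.measured


def misses_deadline(s: Scenario) -> bool:
    """A violation is reported when the guaranteed bound exceeds the deadline."""
    return s.analyzed_bound > s.deadline


# Placeholder values, NOT measurements from the thesis.
scenarios = [
    Scenario("A", analyzed_bound=80.0, measured=72.0, deadline=100.0),
    Scenario("B", analyzed_bound=120.0, measured=110.0, deadline=100.0),
]

unsafe = [s.name for s in scenarios if not is_safe_bound(s)]
violations = [s.name for s in scenarios if misses_deadline(s)]
print("unsafe bounds:", unsafe)
print("deadline violations:", violations)
```

An empty list of unsafe bounds only shows safety for the scenarios that were actually measured; it does not replace the formal argument behind the analysis.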

Conclusion and Outlook

In this thesis, we started from the observation that MPSoCs are emerging due to their performance and power efficiency, and that the real-time analysis of applications with hard real-time requirements on such architectures is not an easy task and requires novel RT analysis methods. The research underlying this thesis addressed the question of whether a state-based RT method is applicable to the RT analysis of multiple applications, restricted to the synchronous data-flow model of computation, when run on MPSoCs with shared communication resources. More centrally, we tried to handle the challenge of choosing an appropriate abstract representation (in timed automata) of the SDFG applications, the MPSoC, and their temporal behaviors and interactions, while still enabling the prediction of tight timing results.

By combining the flexibility of timed automata with the efficiency of SDF graphs, we enabled a state-based RT analysis of multiple hard real-time SDF applications mapped to an MPSoC platform with shared communication resources, considering variable access delays due to contention on communication resources and utilizing different inter-processor communication styles (such as burst and single-beat). This was realized through the implementation and state-space exploration of a set of flexible timed-automata templates capturing the execution-time bounds of SDF actors and their scheduling decisions, the mapping and utilization of MPSoC resources, the access protocols of shared communication resources (including arbitration of various complexities), and local/shared memories. These TA templates are also capable of representing a class of MPSoCs that respect (or can be configured to respect) the constraints imposed in this thesis (see Sect. 4.1).

Since the state-space explosion problem is the main bottleneck facing a system designer who uses state-based RT analysis methods, we examined methods that help reduce the state space of our implemented TA templates. In a first approach, we proposed some optimizations of our TA templates to minimize the state space. In addition, techniques from the literature, such as clustering of actors and extending the MPSoC with extra hardware components that guarantee temporal and spatial isolation of clusters of actors (combined with a TDMA cluster scheduler), were examined and found useful for improving the scalability of our approach, and their application to our system model was described.

The viability of our RT method was demonstrated by a set of scalability tests, which showed that our method scales up to 320 actors on a 2-tile platform and up to 96 actors on a 4-tile platform, significantly increasing the number of analyzable actors compared to related work. We have also shown how clustering, when applied, can help reduce the state space and the required analysis time (in the MP3 example, 4.3% less state space and 48.6% less analysis time). In addition, by enabling composability and combining a TDMA-based cluster scheduling with our state-based method, the RT analysis of even larger SDF applications was enabled (e.g. with ten TDMA slots, potentially thousands of actors on a 2-tile platform and hundreds of actors on a 4-tile platform). Moreover, our method showed a significant reduction in the predicted worst-case response time (an improvement of up to 300%) compared with the worst-case bound calculation known from the literature, which is based on pessimistic analytical upper-bound delays for every shared resource access. In addition, our approach enabled the analysis of more complex properties than those supported by traditional analytical RT methods, such as safety, liveness and reachability properties.

Finally, we have demonstrated the applicability of our suggested model-based design flow by validating the timing requirements of a small industrial use-case: a control algorithm (modeled as SDFA) of a multi-phase electrical motor, mapped to a TriCore-based Aurix hardware platform with different inter-processor communication styles (burst and single-beat IPC). We have shown that the upper timing bounds estimated by our approach were always (for the scenarios tested) a safe over-approximation of the values measured through cycle-accurate simulation, and that our state-based RT analysis was able to detect a timing violation in the case where the burst inter-processor communication style was chosen.

Overall, our approach opened the way for using timed automata and their model-checking features for the RT analysis of SDFGs running on MPSoCs (see Sect. 3.1.2.3). In addition, the feasibility of our proposed RT analysis method was demonstrated for small parallel systems, enabling their use in safety-critical real-time domains (such as avionics) by providing formal guarantees on the absence of timing hazards.

8.1 Discussion

The challenge we faced when developing our state-based RT analysis method was how to choose the right abstraction level of the input model, such that the method scales to systems of adequate size and at the same time still obtains tight timing results. For this purpose, we deliberately made the assumptions and restrictions described in Sect. 4.1 to enable such a state-based RT analysis of SDF applications mapped to an MPSoC. While the applicability of our method was demonstrated in the conducted experiments, there are still some issues that should be discussed. The restrictions made in this thesis are considered to be very realistic for safety-critical domains, e.g. the avionics domain. In these domains, the costs resulting from adopting such restrictions are typically accepted as long as they help in passing the certification procedures imposed by authorities to approve the deployment of the target MPSoC system. Nevertheless, the price to be paid when imposing such restrictions could be critical in other domains.

While SDFGs (see A1) are commonly used for capturing the behavior and implementation of signal-processing applications where infinite streams of signal samples (which can be represented as tokens) are processed [Schaumont, 2013], their expressiveness suffers from control-related limitations.

Some of these limitations were stated in [Schaumont, 2013]. For example, stopping and restarting an SDFG is not possible, since an SDFG can have only two states: running or waiting for input. In addition, reconfiguring an SDFG to (de)activate different parts depending on specific modes is not possible. Moreover, rates that vary depending on run-time conditions are not supported. Modeling exceptions, which might require deactivating some parts of the graph, is not possible either. However, emulating control flow within the SDFG is possible, even though it is not always efficient (cf. [Schaumont, 2013]). In addition, control flow within the functionality of an actor is allowed, which enabled us to translate event-triggered systems in Simulink into SDFGs (refer to Sect. 6.2). We also relaxed some of the limitations of the SDFG MoC in this thesis by making SDFGs sensitive to external periodic events, which allows us to support the RT analysis of periodic control systems.
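The fixed-rate restriction mentioned above is also what makes an SDFG statically analyzable: since every actor produces and consumes a constant number of tokens per firing, a repetition vector can be derived from the balance equations before run time. The following is a minimal sketch of that standard SDF consistency check, not the tooling used in this thesis. It assumes a connected graph given as hypothetical tuples (producer, production rate, consumer, consumption rate).

```python
from fractions import Fraction
from math import lcm  # variadic lcm requires Python 3.9 or newer


def repetition_vector(edges):
    """Solve prod * q[src] == cons * q[dst] for every edge of a connected SDFG.

    Returns the smallest positive integer solution as a dict,
    or None if the rates are inconsistent.
    """
    adjacency = {}
    for src, prod, dst, cons in edges:
        # q[dst] = q[src] * prod / cons, and the inverse for the reverse direction.
        adjacency.setdefault(src, []).append((dst, Fraction(prod, cons)))
        adjacency.setdefault(dst, []).append((src, Fraction(cons, prod)))
    if not adjacency:
        return {}

    start = next(iter(adjacency))
    ratio = {start: Fraction(1)}
    stack = [start]
    while stack:
        actor = stack.pop()
        for neighbour, factor in adjacency[actor]:
            expected = ratio[actor] * factor
            if neighbour not in ratio:
                ratio[neighbour] = expected
                stack.append(neighbour)
            elif ratio[neighbour] != expected:
                return None  # balance equations have no non-trivial solution

    # Scale the rational solution to the smallest integer one.
    scale = lcm(*(r.denominator for r in ratio.values()))
    return {actor: int(r * scale) for actor, r in ratio.items()}


# Hypothetical example graph: A --(2,3)--> B --(1,2)--> C
print(repetition_vector([("A", 2, "B", 3), ("B", 1, "C", 2)]))
```

A rate that changed at run time would invalidate this precomputed vector, which illustrates why data-dependent rates fall outside the SDF model of computation.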

Static allocation of actors (see A1) and static-order, non-preemptive scheduling (see A3), including non-preemptive arbitration (A9) and the lack of support for hardware interrupts (A4), can be very costly in terms of resource utilization, which can lead to expensive and thus non-competitive designs. On the other hand, a variety of dynamic implementations that are reconfigurable (e.g. adopting different allocations or schedules for different situations) depending on dynamic changes in the environment cannot be supported under the above restrictions. The fact that we restrict external events to be periodic (see A2) also decreases the flexibility of our approach in handling some applications in the safety-critical domain, such as those sensitive to sporadic events.

Moreover, constraining the application code to be mapped to the private memory of the corresponding processor (see A6) leads to a limitation concerning the size of private memories, particularly for large applications. Nevertheless, recent research recommendations and current design trends are moving in this direction, where the code of private tasks is stored in private (and growing) memories and only message passing (see A7) is realized via communication resources (especially in the emerging NoC designs). In addition, not using shared caches could also degrade the overall execution time of the application. In the industrial example demonstrated in Sect. 7.4, however, the Aurix local memories were so fast that the execution time measured without caches was even better than that measured when using caches¹. In addition, prohibiting contention on interconnect bridges and I/O devices (see A8, A10) could be too strict for some applications, even though using a dedicated processing element (I/O PE) that is exclusively allowed to communicate with I/O devices seems to be a typical decision in real-life implementations.

Overall, the restrictions imposed in this thesis were deliberately made to obtain a manageable state space. However, most of these restrictions could be relaxed in future work if ongoing research yields more powerful model-checkers. Future model-checkers could benefit greatly from growing computing power and could, for example, utilize many-core processors to explore the state space of a given SUA concurrently instead of using the currently supported single-threaded approach.
