
in the safety-critical domain such as those sensitive to sporadic events.

Moreover, requiring the application code to be mapped to the private memory of the corresponding processor (see A6) makes the size of the private memories a limiting factor, particularly for large applications. Nevertheless, recent research recommendations and current design trends are moving in this direction: task code is stored in private memories of growing size, and only message passing (see A7) is realized via communication resources (especially in emerging NoC designs). In addition, not using shared caches could degrade the overall execution time of the application. In the industrial example demonstrated in Sect. 7.4, however, the Aurix local memories were so fast that the execution time measured without caches was even better than that measured with caches¹. Finally, prohibiting contention on interconnect bridges and I/O devices (see A8, A10) could be too strict for some applications, even though using a dedicated processing element (I/O PE) that is exclusively allowed to communicate with I/O devices seems to be a typical decision in real-life implementations.

Overall, the restrictions imposed in this thesis were deliberately made to obtain a manageable state space. However, most of these restrictions can be relaxed easily in future work if ongoing research yields more powerful model-checkers. Future model-checkers could benefit greatly from growing computing power and could, e.g., utilize many-cores to explore the state space of the given SUA concurrently instead of relying on the currently supported single-threaded approach.

to follow the WCET with the help of a run-time monitor² as suggested in [Nowotsch et al., 2014, Wolf et al., 2012].
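The run-time monitor mentioned above can be illustrated with a small sketch. It is not the monitor of the cited works: the names `MonitorResult` and `monitor_actor` are hypothetical, time is assumed to be an integer number of cycles, and an early finish is assumed to be padded up to the WCET budget, which is one possible reading of the footnote describing the monitor.

```python
from dataclasses import dataclass


@dataclass
class MonitorResult:
    release_time: int  # time at which the actor's result is released
    overrun: bool      # True if the measured time exceeded the WCET budget


def monitor_actor(start: int, measured: int, wcet: int) -> MonitorResult:
    """Hypothetical watchdog: pad early finishes up to the WCET, flag overruns."""
    if measured < 0 or wcet <= 0:
        raise ValueError("measured must be >= 0 and wcet must be > 0")
    if measured <= wcet:
        # Early (or exact) finish: block until the assigned WCET has elapsed.
        return MonitorResult(release_time=start + wcet, overrun=False)
    # Overrun: report it; the caller may grant extra time or raise an error.
    return MonitorResult(release_time=start + measured, overrun=True)
```

Padding early finishes makes every firing appear to take exactly its WCET, so the run-time behavior stays within what the static analysis assumed.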

Another approach would be to optimize the model-checker for the given problem set, e.g. by using a multi-core capable model-checker such as opaal+LTSmin [Dalsgaard et al., 2012] (instead of the current UPPAAL tool) to tackle the state-space explosion problem. Statistical model-checking [David et al., 2011], found to be a good alternative to the exhaustive computation of the WCET on single-processor platforms [Béchennec and Cassez, 2011], could also be used to obtain the probability distribution of the execution times and improve the scalability of our approach.

In general, if the probability of a critical violation of the real-time requirement is adequately low, this would be acceptable even for hard real-time applications.
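To make the idea of a probabilistic guarantee concrete, the following is a plain Monte Carlo sketch, not statistical model-checking as implemented in any tool. It assumes a chain of actors executed sequentially on one processor, with each execution time drawn uniformly between a best-case and a worst-case bound; the function name and the uniform distribution are illustrative assumptions.

```python
import random


def estimate_miss_probability(actor_bounds, deadline, runs=10_000, seed=0):
    """Estimate P(total execution time > deadline) by random simulation.

    actor_bounds: list of (best_case, worst_case) pairs, one per actor.
    Assumption: actors run back to back and times are uniformly distributed.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    misses = 0
    for _ in range(runs):
        total = sum(rng.uniform(lo, hi) for lo, hi in actor_bounds)
        if total > deadline:
            misses += 1
    return misses / runs
```

The estimate only becomes meaningful with enough runs; a real analysis would also report a confidence interval for the returned probability.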

In the case of hierarchical scheduling, a composable TDMA scheduling increases the number of analyzable SDFGs on the same platform compared to other strategies; using a TDMA-arbitrated interconnect can also help.

Possible Architectural Extensions. Even though high-speed private scratchpads of increasing size are emerging in current hardware platforms, their size is still considered a bottleneck, especially for applications with a large memory footprint. In this case, the architecture could be extended with a large off-chip memory with slower access latencies, shared between the processors. To retain predictability under the RT analysis method, this memory should be attached to the current architecture through a TDMA memory controller, allowing a clear temporal separation of accesses to the shared memory. A cache (which is supported by the WCET analyzer) could then buffer instructions between the local and the off-chip memory to achieve better performance while keeping the SUA analyzable by our approach.
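The temporal separation offered by a TDMA memory controller can be illustrated as follows. This is a sketch under stated assumptions, not a model of a specific controller: time is an integer number of cycles, every processor owns exactly one slot of equal length per TDMA period, and an access must complete inside the owner's slot. All names are hypothetical.

```python
def tdma_access_delay(arrival, owner, slot_len, num_slots, access_len):
    """Cycles from request arrival until the access has completed."""
    if not 0 <= owner < num_slots:
        raise ValueError("owner must be a valid slot index")
    if not 0 < access_len <= slot_len:
        raise ValueError("access must fit into one slot")
    period = slot_len * num_slots
    slot_start = owner * slot_len
    offset = arrival % period  # position of the request in the TDMA period
    if offset <= slot_start:
        wait = slot_start - offset           # own slot is still ahead
    elif offset + access_len <= slot_start + slot_len:
        wait = 0                             # inside own slot, access fits
    else:
        wait = period - offset + slot_start  # wait for the next period
    return wait + access_len


def worst_case_delay(owner, slot_len, num_slots, access_len):
    """Maximum delay over all arrival offsets within one period."""
    period = slot_len * num_slots
    return max(
        tdma_access_delay(t, owner, slot_len, num_slots, access_len)
        for t in range(period)
    )
```

With 3 slots of 4 cycles and a 2-cycle access, `worst_case_delay(0, 4, 3, 2)` evaluates to 11. The bound is the same for every slot owner and does not depend on what the other processors do, which is what keeps the access analyzable.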

Additionally, the MPSoC architectural constraints could be relaxed towards other kinds of communication resources such as crossbar switches and NoCs with flexible arbitration. Extending the interconnect model towards Networks-on-Chip (NoCs) should be straightforward, as burst-transfer modes are already supported. Nevertheless, NoCs are only meaningful when a large number of tiles is connected through them.

Furthermore, in order to introduce caches into the current system model, the model would have to consider single tiles loading/writing instructions from/to the shared memory, which requires an instruction abstraction level in order to model the behavior of caches and contention on the interconnects. This in turn will hit the state-space wall, as already noted in Sect. 3.1.2.1.

² A hardware monitor component can be configured (with the WCET) as a watchdog that monitors the execution time of actors. If an actor finishes before its assigned time, the monitor blocks it; if it runs longer than the expected WCET, either additional time is assigned or an error is detected, since the WCET must always be larger than the measured time.

Extending the Scheduling Mechanisms. Extending our model towards preemptive schedulers is difficult, since upon preemption the current state of the actor must be saved and a context switch performed. This is hard to capture in our current model, since we abstract the execution of actors in terms of upper/lower bounds and do not consider the actor at the instruction level of granularity (as done, e.g., in [Lv et al., 2010]).

Typically, scheduling involves three steps, each of which can be performed either at run-time or at compile-time [Sriram and Bhattacharyya, 2000]:

1. Assigning the tasks to processors

2. Determining the order in which tasks may run

3. Setting the start times at which the tasks will be executed

There are many scheduling strategies for SDFGs (see [Sriram and Bhattacharyya, 2000]), ranging from static (fully static, ordered transactions, self-timed) to quasi-static to dynamic (static assignment, fully dynamic) strategies. Future work should explore these options and analyze how they can be supported.
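The three scheduling steps listed above can be sketched for the fully static case, in which all three are fixed at compile-time. The sketch makes simplifying assumptions that are not part of the thesis model: every task runs for exactly its WCET, communication takes no time, and the global `order` lists each task after all of its predecessors. All names are illustrative.

```python
def fully_static_schedule(wcet, preds, assignment, order):
    """Compute start times from a given assignment and ordering.

    wcet:       task -> execution time
    preds:      task -> iterable of predecessor tasks
    assignment: task -> processor (step 1)
    order:      list of tasks; per-processor order is derived from it (step 2)
    returns:    task -> (processor, start, finish) (step 3)
    """
    proc_free = {}  # processor -> time at which it becomes idle
    finish = {}     # task -> finish time
    schedule = {}
    for task in order:
        proc = assignment[task]
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in preds.get(task, ())), default=0)
        start = max(ready, proc_free.get(proc, 0))
        finish[task] = start + wcet[task]
        proc_free[proc] = finish[task]
        schedule[task] = (proc, start, finish[task])
    return schedule
```

A usage example with three tasks on two processors:

```python
schedule = fully_static_schedule(
    wcet={"A": 2, "B": 3, "C": 1},
    preds={"B": ["A"], "C": ["A"]},
    assignment={"A": 0, "B": 0, "C": 1},
    order=["A", "B", "C"],
)
# schedule["C"] is (1, 2, 3): C waits on processor 1 until A has finished.
```

Moving steps to run-time yields the other strategies: a self-timed schedule, for instance, would keep the assignment and ordering but drop the fixed start times.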

In this work, we simplified the general case of Def. 5.4.1 with respect to the granularity at which clusters may be constructed: we assumed that a cluster consists of a number of SDFGs that are independent of the SDFGs mapped to other clusters. In future work, the general case, which permits clustering at the granularity of actors (see Fig. 5.13), should be examined, and its procedures should be analyzed to ensure that such clustering never leads to deadlocks.

The blocking behavior on the shared FIFO should support, besides busy-waiting (considered in this thesis), suspension-based approaches³ to enable a comparison between them when making decisions on the binding and scheduling of SDFGs. In general, modeling preemption requires stopwatch automata (TA with stoppable clocks) in order to stop the execution time of a preempted transaction; these are supported by current versions of UPPAAL, and an over-approximating but efficient reachability analysis⁴ can still be applied to them [Cassez and Larsen, 2000].

³ Suspension-based blocking mechanisms (realized through interrupts) are useful in the case of shared hardware FIFOs, which could notify the blocked actor on the target processor when data are available.

Enabling RT-analysis of more Dynamic Data-flow MoCs. Although SDFGs offer good features for analyzability (e.g. deadlock and bounded-buffer properties are decidable for such models [Lee and Messerschmitt, 1987b]), they lack expressiveness. Future work should consider more expressive extensions of SDFGs, analyze their predictability, and evaluate how far our state-based RT method can handle such systems. One example is the Scenario-Aware Data-Flow (SADF) MoC [Skelin et al., 2015]. This MoC uses a data-flow model to represent a specific scenario, and it uses either a stochastic (Markov chain) approach or a finite state machine to model the order in which scenarios occur. A first sketch of an approach, based on our work, targeting the state-based RT analysis of FSM-SADFGs on MPSoCs with shared memory communication has been accepted for publication [Stemmer et al., 2016].

Another interesting, more dynamic MoC is SysteMoC [Falk et al., 2005], where applications are modeled, similarly to SDFGs, as a graph of atomic actors that communicate through FIFO queues; in contrast to SDFGs, however, the actors' production and consumption rates are variable (which is the property of dynamic data-flow graphs) [Gajski et al., 2009].

Towards Design-space Exploration. Thanks to the clean semantics of the SDF MoC, which make it easy to distinguish the communication from the computation parts of the application, the complexity of the mapping and platform alternatives can be managed compositionally. Flexible mapping to different target platforms is now enabled, and our composable timing analysis method makes it possible to analyze different mappings. Future work should address support for design-space exploration in our model-based design flow (similar to [Büker, 2013, Rosvall and Sander, 2014]), exploring mappings with predictability, performance efficiency and cost as optimization goals. Genetic algorithms could also be used to encode the mapping problem, as in [Stulova et al., 2012].
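A very small sketch of what exploring mappings means is given below. It enumerates every actor-to-processor mapping exhaustively and scores each one by the maximum accumulated WCET per processor. This load metric is only a placeholder assumption standing in for a real timing analysis, and exhaustive enumeration is feasible only for tiny examples, which is why heuristic or genetic search is used in practice. The function name is hypothetical.

```python
from itertools import product


def explore_mappings(wcet, num_procs):
    """Return (mapping, cost) minimizing the maximum per-processor load.

    wcet: actor -> worst-case execution time
    Cost is a placeholder for the result of a timing analysis.
    """
    actors = sorted(wcet)
    best_mapping, best_cost = None, None
    # Each candidate assigns one processor index to every actor.
    for choice in product(range(num_procs), repeat=len(actors)):
        load = [0] * num_procs
        for actor, proc in zip(actors, choice):
            load[proc] += wcet[actor]
        cost = max(load)
        if best_cost is None or cost < best_cost:
            best_mapping, best_cost = dict(zip(actors, choice)), cost
    return best_mapping, best_cost
```

For four actors with WCETs 4, 3, 2 and 1 on two processors, the search returns a cost of 5, with the loads balanced as 4+1 and 3+2. Additional goals such as cost or energy would turn the single score into a multi-objective comparison.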

Additionally, our timed-automata representation could be extended (with the help of Priced Timed Automata, PTA [Behrmann et al., 2005]) to support energy-optimal mapping exploration of power-aware SDFGs on MPSoCs (similar to [Zhu et al., 2014, Zhu et al., 2015]).

⁴ The reachability problem is in general undecidable for stopwatch automata [Suman and Pandya, 2006]; an over-approximating but efficient reachability analysis is described in [Cassez and Larsen, 2000].

[IEC, 2010] (2010). IEC 61508. Functional safety of electrical/electronic/programmable electronic safety-related systems.

[Abel et al., 2013] Abel, A., Benz, F., Doerfert, J., Dörr, B., Hahn, S., Haupenthal, F., Jacobs, M., Moin, A. H., Reineke, J., and Schommer, B. (2013). Impact of resource sharing on performance and performance prediction: A survey. In CONCUR 2013 – Concurrency Theory, pages 25–43. Springer.

[Aeronautical Radio, 1992] Aeronautical Radio, I. (1992). RTCA DO-178B. Software Considerations in Airborne Systems and Equipment Certification.

[Aeronautical Radio, 2003] Aeronautical Radio, I. (2003). ARINC 653: Avionics application software standard interface. Technical report, ARINC, 2551 Riva Road, Annapolis, MD 21401, U.S.A.

[Ahmad et al., 2014] Ahmad, W., de Groote, E., Hölzenspies, P. K., Stoelinga, M. I. A., and van de Pol, J. C. (2014). Resource-constrained optimal scheduling of synchronous dataflow graphs via timed automata. In Proceedings of 14th IEEE International Conference on Application of Concurrency to System Design (ACSD). IEEE.

[Akesson et al., 2010] Akesson, B., Molnos, A., Hansson, A., Angelo, J. A., and Goossens, K. (2010). Composability and Predictability for Independent Application Development, Verification, and Execution. Multiprocessor System-on-Chip: Hardware Design and Tool Integration, page 25.

[Alur et al., 1990] Alur, R., Courcoubetis, C., and Dill, D. (1990). Model-checking for real-time systems. In Logic in Computer Science, 1990. LICS ’90, Proceedings., Fifth Annual IEEE Symposium, pages 414–425.

[Alur and Dill, 1990] Alur, R. and Dill, D. L. (1990). Automata for modeling real-time systems. In Proceedings of the Seventeenth International Colloquium on Automata, Languages and Programming, pages 322–335, New York, NY, USA. Springer-Verlag New York, Inc.

[Alur and Dill, 1994] Alur, R. and Dill, D. L. (1994). A Theory of Timed Automata. Theoretical Computer Science, 126:183–235.
