# PhD thesis

# Accurate analysis of real-time stream processing applications using dataflow models and timed automata

Autonomous cars are an example of a safety-critical Cyber-Physical System (CPS). In such a CPS, there is a very complicated interaction between the continuous-time physical environment and the discrete-time embedded control system of the car using its sensors and actuators. The sensors periodically sample the physical properties of the environment and produce streams of data samples. The actuators consume data from streams and expect to receive this data periodically from the digital control system. In this thesis, we focus on the modem system in autonomous cars, which is used for vehicle-to-vehicle communication. Such a safety-critical embedded system must be analyzed in order to verify whether temporal constraints will be satisfied. During the design of such a system, bounds must therefore be derived for the interval of time it takes for a real-time system to produce an output as a response to an input event.

Nowadays, these embedded systems are often implemented on multiple processors. The software that is running on such a system is split into tasks, that are connected by First-In-First-Out (FIFO) communication buffers. Some of these tasks execute conditionally, depending on the value of the incoming data. The incoming data can put the application in a certain mode, in which only a subset of tasks actually executes. Run-time schedulers determine when a task is allowed to execute on a processor.

The available analysis methods for this type of systems have severe shortcomings. The analysis results of these methods are insufficiently accurate. Furthermore, the run-time of the analysis methods is often unacceptably high. Moreover, most of these methods do not support applications containing cyclic dependencies and dynamic applications containing conditionally executed tasks.

The analysis of applications containing cyclic dependencies naturally fits to dataflow analysis approaches. However, for dynamic applications where tasks are scheduled using budget schedulers, the accuracy of dataflow analysis can be low. We therefore introduce locks that cause additional sequence constraints in these dynamic applications. These locks prevent interference between tasks in different modes, which can reduce the latency for dynamic applications.

To enable the analysis for non-starvation-free schedulers, we also introduce barriers. These non-starvation-free schedulers are a broader class than budget schedulers. The barriers ensure that all inputs of a mode are available before any tasks in that mode can start. The combination of locks and barriers allows for compositional analysis of modes. This means that each mode can be analyzed in isolation, even if these modes belong to a different level in the hierarchy of an application. As a result, existing analysis methods for non-dynamic applications can be used for each mode in isolation.

To obtain more accurate analysis results, we introduce a transformation from dataflow graphs into timed automata. Using these timed automata, accurate analysis can be performed by taking the correlation of firing durations into account of consecutive firings of the same task, but also of different tasks. Current dataflow analysis techniques abstract from this specific type of correlation by means of over-approximation, whereas we can take this correlation into account with timed automata, without over-approximation. Moreover, we enable the analysis of systems with out-of-order communication by annotating tokens with a token index order, instead of only using the token arrival order. Out-of-order communication can be a result of multiple executions of the same tasks in parallel. By consuming tokens in index order, the functional behavior of the graph is orthogonal to the arrival order of the tokens.

In order to accurately analyze the consequences of pre-emptive task scheduling on the temporal behavior of an application, we make use of model checking of timed automata. In this model we include correlation of the executions of different tasks, sharing the same resource. However, modeling task scheduling directly in a timed automata model can make the scheduling problem undecidable. We address this issue by either an over-approximation in the timed automata model, or by allowing the model checker to over-approximate reachability. The run-time of the analysis method is improved by combining model checking of timed automata with dataflow analysis. Moreover, latency is minimized by using this hybrid analysis/optimization method, where we introduce additional sequence constraints to limit the interference between tasks.

*My thesis can be downloaded HERE.*