CE25 - Software sciences and engineering - Multi-purpose communication networks, high-performance infrastructures

FAult-aware timing behaviour for safety-critical multicore SYstems – FASY

Submission summary

Safety-critical embedded industries, such as avionics, automotive, robotics and healthcare, require guarantees of hard real-time behaviour and correct application execution. As applications become more complex, their computational demands scale rapidly, requiring architectures with multiple processing elements. Although multicore architectures can effectively satisfy the needs of best-effort systems, the same cannot be said for critical embedded systems, due to hard-to-predict timing behaviour and increased fault susceptibility.
Hard-to-predict timing behaviour originates from the complex nature of modern systems. Not only application complexity but also hardware complexity has increased. To improve average performance, modern architectures are enhanced with dynamic hardware components, which, however, have variable timing behaviour. Parallel execution of applications on the same platform leads to concurrent accesses to shared architectural resources. These concurrent accesses introduce timing delays (interferences) that strongly affect applications’ timing behaviour. To provide hard real-time guarantees, safe but pessimistic Worst-Case Execution Time (WCET) estimations have to be employed during system design.
Increased fault susceptibility stems from the nature of electronic systems. Reliability threats, such as manufacturing process variation, aging and soft errors, depend on transistor size and are expected to increase significantly as transistors shrink. Soft errors caused by environmental conditions, e.g., high temperature and high-energy electromagnetic radiation, have been considered the most important reliability threats. However, with the ongoing reduction of transistor size, faults will occur even under normal operating conditions, which was not the case with the technology used a decade ago. Due to this unreliable nature of electronic systems, the susceptibility of multicore architectures to reliability threats is inevitable.
However, the majority of existing WCET estimation approaches are fault-unaware: the hardware of the target platform is assumed to be fault-free. As reliability issues become imminent due to technology scaling, such fault-unaware approaches become unsafe. Approaches with timing guarantees apply fault-tolerant techniques to detect, correct or mitigate faults, and extend the fault-free WCET to include the time overhead of the applied fault-tolerant techniques. However, their focus is on the impact of hardware faults on the functional behaviour of applications. Only a few approaches address the impact on the timing behaviour of applications, and they target memory components, usually considering permanent faults in caches. Nonetheless, with the reduction in technology size, faults in the combinational logic and smaller sequential logic of cores can no longer be considered negligible.
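To make the idea of extending the fault-free WCET with fault-tolerance overhead concrete, here is a minimal sketch in Python. It assumes a simple additive model that is not taken from the FASY project: a fixed detection overhead is paid once per run, and each tolerated fault adds one worst-case recovery (for example, a re-execution). The function name, parameters and numbers are all hypothetical.

```python
def fault_aware_wcet(wcet_fault_free, detection_overhead, recovery_time, max_faults):
    """Upper bound on execution time under an assumed additive fault model.

    wcet_fault_free    -- WCET estimate on fault-free hardware (cycles)
    detection_overhead -- cost of the fault-detection mechanism per run (cycles)
    recovery_time      -- worst-case cost of recovering from one fault (cycles)
    max_faults         -- maximum number of faults tolerated per run (assumption)
    """
    if min(wcet_fault_free, detection_overhead, recovery_time, max_faults) < 0:
        raise ValueError("all parameters must be non-negative")
    # Fault-free bound, plus detection cost, plus one recovery per tolerated fault.
    return wcet_fault_free + detection_overhead + max_faults * recovery_time


# Hypothetical numbers: 1000-cycle task, 50-cycle detection, 200-cycle recovery.
bound_no_fault = fault_aware_wcet(1000, 50, 200, 0)
bound_two_faults = fault_aware_wcet(1000, 50, 200, 2)
```

Under this model the bound grows linearly with the number of tolerated faults. It captures only the overhead of the fault-tolerant technique itself, not the way a fault may alter the timing behaviour of the application, which is the gap the text describes.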
The goal of FASY is to tackle the aforementioned limitations, addressing the combined challenge of providing timing guarantees and reliable execution on multicore embedded systems when soft errors occur in cores. FASY will provide the means to analyse both the functional and timing behaviour of applications, perform fault-aware WCET estimation, and design cores with timing guarantees and reliable execution. This will be achieved through novel approaches that consider both reliability and WCET aspects. More precisely, FASY will design a framework that performs realistic and accurate functional and timing architectural vulnerability analysis, including interferences. This framework will be extended with a probabilistic WCET estimation technique to provide fault-aware WCET estimations. The FASY framework will be used to identify the hardware and software parts that have the highest and most frequent impact when faulty. Low-level fault-tolerant mechanisms will be designed to mitigate the impact of faults. FASY will be based on open-source cores, providing flexibility and removing the limitations of COTS platforms.

Project coordination

Angeliki Kritikakou (Institut de Recherche en Informatique et Systèmes Aléatoires)

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.

Partner

IRISA Institut de Recherche en Informatique et Systèmes Aléatoires

ANR funding: 303,321 euros
Start and duration of the scientific project: February 2022 - 42 months
