ASTRID - Accompagnement spécifique des travaux de recherches et d’innovation défense 2022

Interpretable-by-design Fuzzy Policy in Reinforcement Learning – IFP-in-RL

Submission summary

Within the broader field of eXplainable Artificial Intelligence (XAI), the IFP-in-RL project aims to propose a method for automatically constructing the control system of a device, such as a drone, that takes the interpretability constraint into account in its very design. The project is set within the framework of fuzzy rule-based systems, which, since their introduction, have aimed to express knowledge in a linguistic form that is natural for the user and easily understood by a human. Such a knowledge representation is an excellent way to promote human interaction with the computer system and to improve users' understanding of how it works, making the system's behavior transparent and easy to validate. The literature offers different approaches for building or fine-tuning a fuzzy rule base, but they generally share the drawback of not explicitly optimizing interpretability.
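To make the idea of a linguistic rule base concrete, here is a minimal sketch of fuzzy inference for a single drone input. The linguistic terms, their ranges, the rule outputs, and the function names are all hypothetical illustrations; the summary does not specify the project's actual rule format or inference scheme.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (0 outside [left, right])."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for an altitude error in metres.
TERMS = {
    "too_low": (-10.0, -5.0, 0.0),
    "on_target": (-5.0, 0.0, 5.0),
    "too_high": (0.0, 5.0, 10.0),
}

# Hypothetical rules: IF altitude error IS <term> THEN thrust change = <value>.
RULES = [
    ("too_low", 1.0),
    ("on_target", 0.0),
    ("too_high", -1.0),
]


def infer(error):
    """Weighted average of rule outputs, weighted by each rule's firing degree."""
    weights = [triangular(error, *TERMS[term]) for term, _ in RULES]
    total = sum(weights)
    if total == 0.0:
        # No rule fires (error outside every term's range): no thrust change.
        return 0.0
    return sum(w * out for w, (_, out) in zip(weights, RULES)) / total
```

Each rule reads as a plain sentence ("if the drone is too low, increase thrust"), which is the property that makes such a controller easy for a human to inspect.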

This project introduces an innovative methodology for designing such systems, based on a reinforcement learning approach that uses interpretability metrics. The objective is to take the desired interpretability into account, and to optimize it, during learning itself rather than a posteriori, as many current XAI methods do.
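One simple way to picture interpretability being optimized during learning is a return that is penalized by a complexity score of the rule base. The sketch below is only an illustration under that assumption: the complexity proxy, the weight, and the function names are hypothetical and are not the metrics or algorithm of the project, which the summary does not describe.

```python
def rule_base_complexity(rules):
    """Hypothetical interpretability proxy: rule count plus total number of conditions.

    Each rule is a (conditions, action) pair, where conditions is a tuple of terms.
    """
    return len(rules) + sum(len(conditions) for conditions, _ in rules)


def shaped_return(task_return, rules, weight=0.1):
    """Task return minus a complexity penalty (the weight is an assumed trade-off)."""
    return task_return - weight * rule_base_complexity(rules)


# Two hypothetical candidate policies with similar task performance.
compact = [
    (("too_low",), "climb"),
    (("too_high",), "descend"),
]
verbose = [
    (("too_low", "slow"), "climb"),
    (("too_low", "fast"), "climb"),
    (("too_high", "slow"), "descend"),
    (("too_high", "fast"), "descend"),
]

compact_score = shaped_return(10.0, compact)
verbose_score = shaped_return(10.5, verbose)
```

With these assumed numbers, the compact rule base scores higher even though its raw task return is slightly lower, so a learner maximizing the shaped return would prefer the simpler policy.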

To achieve this, the IFP-in-RL project will first carry out a complete study, both theoretical and experimental, of interpretability metrics, covering existing numerical criteria as well as user needs. This will involve proposing a taxonomy of existing metrics and, if necessary, defining new measures to complement them and allow their use in original reinforcement learning algorithms. An original feature of the project is a qualitative assessment, carried out with a human panel, of both the proposed metrics and the rule bases obtained at the end of reinforcement learning.

In application terms, the objective of the IFP-in-RL project is to apply these proposals to piloting a drone that navigates fully autonomously to carry out a mission consisting of flying over points of interest and taking pictures, using data provided by a simulator.

Christophe Marsala (LIP6)

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.

TRT THALES Research and Technology
LIP6 LIP6

ANR funding: 298,721 euros
Beginning and duration of the scientific project: - 30 Months
