CE23 - Artificial intelligence and data science

REPUBLIC: A Quest for RobustnEss, Privacy, and UnBiasedness in AI with Sequential LearnIng under Constraints

Submission summary

Developing responsible AI requires effectively incorporating three fundamental aspects: robustness, privacy, and unbiasedness (fairness).
In the REPUBLIC project, we propose to investigate these aspects for Reinforcement Learning (RL) with the framework of RL under constraints.
The research roadmap has three phases that move through the literature on responsible RL, a unified fundamental framework, and real-life applications.
The first phase aims to unify the eclectic definitions of robustness, privacy, and unbiasedness available in the literature into three unified frameworks of RL under constraints.
This unification is envisaged to invoke a two-player, constraint-breaking and utility-maximising game framework for achieving optimal robust, private, and unbiased performance in RL.
The next phase is developing the fundamental study of RL under static/dynamic, linear/non-linear, and deterministic/probabilistic constraints.
The present analysis of the simplest settings of RL under dynamic, linear constraints shows the limitations of optimistic techniques for efficiency-driven RL, and calls for a careful trade-off between optimism and constraint-dependent pessimism.
The goal of this research is to systematically understand fundamental limits of RL under dynamic, non-linear, and probabilistic constraints, which would be invoked in the first phase of this project.
Following this study, we aim to derive statistical and computational machinery to design optimal algorithms for RL under different types of constraints.
The final phase is to apply the generated knowledge and RL algorithms to real-life applications, such as collaborative drug design and decision making with imperfect climate models.
The first problem requires distributed deployment of the algorithms proposed in the first two phases.
The second problem asks for scaling up the deployments, dealing with imperfect predictors, and quantifying the high uncertainty due to the randomness of the inherent dynamics and model imperfection.
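
To make the trade-off between optimism and constraint-dependent pessimism mentioned above more concrete, here is a minimal sketch in the simplest sequential setting, a multi-armed bandit with one cost constraint. It is an illustration under stated assumptions and not the project's algorithm: the function name `select_arm`, the confidence bonus, the `threshold` parameter, and the existence of a known `safe_arm` are all hypothetical choices made for this example.

```python
import math

def select_arm(reward_means, cost_means, counts, t, threshold, safe_arm=0):
    """Choose an arm by being optimistic about reward and pessimistic about cost.

    Assumptions (illustrative only): rewards and costs lie in [0, 1],
    `safe_arm` is known in advance to satisfy the constraint, and the
    same UCB-style confidence bonus is used for both quantities.
    """
    best_arm, best_ucb = safe_arm, -math.inf
    for a in range(len(counts)):
        if counts[a] == 0:
            bonus = math.inf  # nothing is known about an unplayed arm
        else:
            bonus = math.sqrt(2.0 * math.log(max(t, 2)) / counts[a])
        reward_ucb = reward_means[a] + bonus  # optimism: reward may be this high
        cost_ucb = cost_means[a] + bonus      # pessimism: cost may be this high
        # An arm is eligible only if even its pessimistic cost respects the
        # threshold; the safe arm is always eligible as a fallback.
        if a == safe_arm or cost_ucb <= threshold:
            if reward_ucb > best_ucb:
                best_arm, best_ucb = a, reward_ucb
    return best_arm

# Well-sampled arms: arm 1 has the best reward but violates the constraint,
# so the best feasible arm is selected instead.
choice_many = select_arm([0.2, 0.9, 0.6], [0.1, 0.8, 0.3],
                         [1000, 1000, 1000], t=3000, threshold=0.5)

# Poorly sampled arms: wide confidence intervals make arms 1 and 2 look
# potentially unsafe, so the rule falls back to the safe arm.
choice_few = select_arm([0.2, 0.9, 0.6], [0.1, 0.8, 0.3],
                        [1000, 5, 5], t=1010, threshold=0.5)
```

In this sketch, pure pessimism never tries an arm whose cost is still uncertain, which is one way to see why the two principles have to be balanced rather than applied independently.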

Project coordination

Debabrota Basu (Institut national de la recherche en informatique et automatique)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partner

Institut national de la recherche en informatique et automatique

ANR funding: 270,969 euros
Beginning and duration of the scientific project: January 2023 - 48 Months
