ChairesIA_2019_1 - Research and teaching chairs in Artificial Intelligence - wave 1 of the 2019 edition

DeepCuriosity: Curiosity-driven exploration and curriculum learning in AI with applications to autonomous agents, automated discovery and educational technologies

Developmental Artificial Intelligence

The research vision and program of the DeepCuriosity project aim to develop the foundations of a new scientific approach to autonomous artificial intelligence and lifelong machine learning. While deep reinforcement learning has recently achieved impressive results (e.g. in complex board or video games), it is reaching both conceptual and practical limits. In particular, current AI systems are far from autonomous: they are still task-specific, they require large amounts of data and energy, and they require the intervention of engineers for each new task. To go beyond these limits, the approach proposed in this project is grounded in a decade of interdisciplinary work by P.-Y. Oudeyer and his team at Inria Bordeaux Sud-Ouest, modelling mechanisms of infant learning and development in humans, in particular curiosity-driven learning. This work laid the first building blocks of a new machine learning framework: intrinsically motivated goal exploration processes (IMGEPs). In IMGEPs, machines autonomously learn open repertoires of skills by self-supervised acquisition of world models, sampling their own goals through curiosity-driven, self-organized curriculum learning. These algorithms have been shown to enable real-world robots to efficiently learn repertoires of high-dimensional skills while adapting quickly to changes in the environment, under limited time and energy resources. Fundamental challenges and limits remain to be addressed to scale up these research advances.
This project aims to address the following objectives:

Core AI research objectives:
• We will develop novel self-supervised machine learning algorithms enabling curiosity-driven incremental learning of structured representations of goal/task spaces.
• We will study how IMGEP algorithms can automate curriculum learning driven by learning progress in high-dimensional spaces.
• We will extend IMGEP algorithms to the context of human-robot collaboration through natural language interaction.

Applications objectives:
Recent proofs of concept showed how these general fundamental methods can find applications in the following three societally important fields, which we aim to scale up in DeepCuriosity through dedicated collaborations with application-specific partners:
• Autonomous agents in large open video game worlds and autonomous robot exploration (in collaboration with the video games industry).
• Robotized automated discovery of novel patterns in self-organized bio-printed cell systems, opening ground-breaking health applications for people who need personalized tissue transplants (in collaboration with a bio-printing company).
• Personalized curricula of exercises for human learners in digital educational apps (in collaboration with Académie de Bordeaux and EdTech companies).

Curiosity-driven learning algorithms

Building on more than a decade of work modeling fundamental processes of infant development, ranging from curiosity-driven exploration of sensorimotor skills to socially guided language acquisition [2,13], the Flowers lab has been transposing these computational models (initially designed to better understand human learning) to machine learning, with the aim of building flexible, autonomous, lifelong learning machines. In particular, the Flowers lab laid the foundations of the Intrinsically Motivated Goal Exploration Processes (IMGEP) machine learning framework, in which machines autonomously learn open repertoires of skills by self-supervised acquisition of world models, including structured goal representations [3,5], and by sampling their own goals with self-organized curriculum learning. IMGEPs can leverage population-based [9] or multi-goal deep reinforcement learning [1,4] techniques. Related ideas have recently been explored in the UVFA-HER architecture and other goal-parameterized deep RL algorithms (e.g. Andrychowicz et al., 2017), but with simpler random sampling of goals. Here, the objective is to discover and master a diversity of controllable outcomes and to learn world models, while avoiding spending too much time on goals that are too complicated or even impossible. These algorithms have been shown to enable virtual or physical robots to efficiently learn repertoires of high-dimensional skills while adapting quickly to changes in the environment and to body damage [1,9,4]. Work on IMGEP algorithms has so far focused on curiosity-driven learning of low-level sensorimotor skills in a single agent. We will extend this approach by leveraging natural language as a cognitive tool and by using modular neural networks to enable generalization.
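To make the idea of self-organized curriculum learning concrete, here is a minimal sketch of goal sampling driven by learning progress. It is an illustration under stated assumptions, not the Flowers lab implementation: the class name LearningProgressSampler, the function run_demo, the window size, the epsilon exploration rate and the three toy goal regions (one trivial, one learnable, one impossible) are all hypothetical choices made for this example.

```python
import random
from collections import deque


class LearningProgressSampler:
    """Pick goal regions in proportion to absolute learning progress (illustrative)."""

    def __init__(self, n_regions, window=10, epsilon=0.2, seed=0):
        self.epsilon = epsilon  # share of purely random goal choices (assumption)
        self.rng = random.Random(seed)
        # Recent competence scores (in [0, 1]) for each goal region.
        self.history = [deque(maxlen=2 * window) for _ in range(n_regions)]

    def learning_progress(self, region):
        """Absolute change in mean competence between the older and newer half."""
        h = list(self.history[region])
        if len(h) < 2:
            return 0.0
        half = len(h) // 2
        older, newer = h[:half], h[half:]
        return abs(sum(newer) / len(newer) - sum(older) / len(older))

    def sample_region(self):
        """Sample a region: mostly by learning progress, sometimes at random."""
        n = len(self.history)
        lps = [self.learning_progress(r) for r in range(n)]
        if sum(lps) == 0.0 or self.rng.random() < self.epsilon:
            return self.rng.randrange(n)
        return self.rng.choices(range(n), weights=lps, k=1)[0]

    def update(self, region, competence):
        """Record the competence observed after attempting a goal in this region."""
        self.history[region].append(competence)


def run_demo(steps=600, seed=0):
    """Toy setting: region 0 is trivial, region 1 is learnable, region 2 is impossible."""
    sampler = LearningProgressSampler(n_regions=3, seed=seed)
    practice = [0, 0, 0]
    counts = [0, 0, 0]
    for _ in range(steps):
        region = sampler.sample_region()
        counts[region] += 1
        practice[region] += 1
        if region == 0:
            competence = 1.0  # already mastered: no progress to make
        elif region == 1:
            competence = min(1.0, practice[1] / 300)  # improves with practice
        else:
            competence = 0.0  # impossible: never improves
        sampler.update(region, competence)
    return counts
```

In this toy setting, calling run_demo() returns how often each region was selected. The learnable region attracts most of the samples while its competence is still improving, whereas the trivial and impossible regions are visited only through the random exploration share, which mirrors the stated aim of not spending too much time on goals that are impossible.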

Results

Language-augmented autotelic agents: Developmental machine learning studies how artificial agents can model the way children learn open-ended repertoires of skills. Such agents need to create and represent goals, select which ones to pursue, and learn to achieve them. Recent approaches have considered goal spaces that were either fixed and hand-defined or learned using generative models of states. This limited agents to sampling goals within the distribution of known effects. We argue that the ability to imagine out-of-distribution goals is key to enabling creative discoveries and open-ended learning. Children do so by leveraging the compositionality of language as a tool to imagine descriptions of outcomes they have never experienced before, targeting them as goals during play. We introduced the IMAGINE and LGB architectures, two intrinsically motivated deep reinforcement learning architectures that model this ability. Such imaginative agents, like children, benefit from the guidance of a social peer who provides language descriptions. To take advantage of goal imagination, agents must be able to leverage these descriptions to interpret their imagined out-of-distribution goals. This generalization is made possible by modularity: a decomposition between a learned goal-achievement reward function and a policy, both relying on deep sets, gated attention and object-centered representations. We introduced the Playground environment and studied how this form of goal imagination improves generalization and exploration compared with agents lacking this capacity. In addition, we identified the properties of goal imagination that enable these results and studied the impact of modularity and social interactions.
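The role of compositionality in goal imagination can be illustrated with a deliberately simplified sketch. It is not the mechanism used in the IMAGINE or LGB architectures: the function name imagine_goals, the word-swapping rule and the example descriptions are assumptions made for this illustration. The sketch recombines words from descriptions the agent has already heard into descriptions it has never heard, which can then be targeted as new goals.

```python
def imagine_goals(known):
    """Return new descriptions built by swapping one word between known ones.

    Two known descriptions of equal length are compared position by position.
    Wherever they differ, the word from the second replaces the word in the
    first, and the result is kept only if it was not already known.
    """
    known_set = set(known)
    tokenized = [description.split() for description in known]
    imagined = set()
    for a in tokenized:
        for b in tokenized:
            if len(a) != len(b):
                continue  # only recombine descriptions with the same structure
            for i in range(len(a)):
                if a[i] != b[i]:
                    candidate = " ".join(a[:i] + [b[i]] + a[i + 1:])
                    if candidate not in known_set:
                        imagined.add(candidate)
    return sorted(imagined)
```

For example, from the known descriptions "grasp red cat", "grasp blue chair" and "grow green plant", the function produces unseen descriptions such as "grasp blue cat". A rule this naive also produces descriptions that may be meaningless or unachievable, which is one reason an agent still needs a learned reward function to interpret the goals it imagines.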

Prospects

DeepCuriosity will be key to developing internationally impactful and visible AI research at Flowers, Inria Bordeaux and Région Nouvelle Aquitaine, promoting an interdisciplinary and human-centered approach to AI. It will boost a rich ecosystem of collaborations with other public research and educational institutions, and with companies, addressing key societal issues (AI promoting inclusivity and diversity in EdTech; AI with a limited environmental footprint; health applications through improved automation of tissue bio-printing). It will also develop new training courses covering these advances.

Scientific productions and patents

Colas, C., Karch, T., Lair, N., Dussoux, J. M., Moulin-Frier, C., Dominey, P., & Oudeyer, P. Y. (2020). Language as a Cognitive Tool to Imagine Goals in Curiosity Driven Exploration. Advances in Neural Information Processing Systems (NeurIPS 2020), 33.
Akakzia, A., Colas, C., Oudeyer, P. Y., Chetouani, M., & Sigaud, O. (2021). Grounding Language to Autonomously-Acquired Skills via Goal Generation. ICLR 2021.
Romac, C., Portelas, R., Hofmann, K., & Oudeyer, P. Y. (2021). TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL. ICML 2021. arxiv.org/abs/2103.09815. Website and code: developmentalsystems.org/TeachMyAgent/
Laversanne-Finot, A., Péré, A., & Oudeyer, P. Y. (2021). Intrinsically motivated exploration of learned goal spaces. Frontiers in Neurorobotics, 109. hal.inria.fr/hal-03120618
Etcheverry, M., Moulin-Frier, C., & Oudeyer, P. Y. (2020). Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems. NeurIPS 2020 (34th Conference on Neural Information Processing Systems). arxiv.org/abs/2007.01195
Kovac, G., Laversanne-Finot, A., & Oudeyer, P. Y. (2020). GRIMGEP: learning progress for robust goal sampling in visual deep reinforcement learning. arxiv.org/abs/2008.04388

Submission summary

The research vision and program of the DeepCuriosity project aim to develop the foundations of a new scientific approach to autonomous artificial intelligence and lifelong machine learning. While deep reinforcement learning has recently achieved impressive results (e.g. in complex board or video games), it is reaching both conceptual and practical limits. In particular, current AI systems are far from autonomous: they are still task-specific, they require large amounts of data and energy, and they require the intervention of engineers for each new task. To go beyond these limits, the approach proposed in this project is grounded in a decade of interdisciplinary work by P.-Y. Oudeyer and his team at Inria Bordeaux Sud-Ouest, modelling mechanisms of infant learning and development in humans, in particular curiosity-driven learning. This work laid the first building blocks of a new machine learning framework: intrinsically motivated goal exploration processes (IMGEPs). In IMGEPs, machines autonomously learn open repertoires of skills by self-supervised acquisition of world models, sampling their own goals through curiosity-driven, self-organized curriculum learning. These algorithms have been shown to enable real-world robots to efficiently learn repertoires of high-dimensional skills while adapting quickly to changes in the environment, under limited time and energy resources.

Fundamental challenges and limits remain to be addressed to scale up these research advances, as well as their application in societally important domains. This project aims to address these challenges through the following “core AI research” and “application” objectives:

Core AI research objectives:
• We will develop novel self-supervised machine learning algorithms enabling curiosity-driven incremental learning of structured representations of goal/task spaces (starting from low-level high-dimensional perception).
• We will study how IMGEP algorithms can automate curriculum learning driven by learning progress in high-dimensional spaces, and be applied to a wide diversity of machine and human learners.
• We will extend IMGEP algorithms to the context of human-robot collaboration through natural language interaction by 1) enabling the human to guide a curiosity-driven exploring robot for new tasks/environments using natural language instructions that the robot has learnt to understand; 2) enabling the robot to report what it does and sees using learnt natural language.

Applications objectives:
Recent proofs of concept showed how these general fundamental methods can find applications in the following three societally important fields, which we aim to scale up in DeepCuriosity through dedicated collaborations with application-specific partners:
• Autonomous agents in large open video game worlds and autonomous robot exploration (in collaboration with the video games industry and a French public defense organization).
• Robotized automated discovery of novel patterns in self-organized bio-printed cell systems, opening ground-breaking health applications for people who need personalized tissue transplants (in collaboration with a bio-printing company).
• Personalized curricula of exercises for human learners in digital educational apps (in collaboration with Académie de Bordeaux and EdTech companies).

DeepCuriosity will be key to developing internationally impactful and visible AI research at Flowers, Inria Bordeaux and Région Nouvelle Aquitaine, promoting an interdisciplinary and human-centered approach to AI. It will boost a rich ecosystem of collaborations with other public research and educational institutions, and with companies, addressing key societal issues (AI promoting inclusivity and diversity in EdTech; AI with a limited environmental footprint; health applications through improved automation of tissue bio-printing). It will also develop new training courses covering these advances.

Pierre-Yves Oudeyer (Centre de Recherche Inria Bordeaux - Sud-Ouest)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Inria FLOWERS Centre de Recherche Inria Bordeaux - Sud-Ouest

ANR funding: 599,970 euros
Start and duration of the scientific project: May 2020 - 48 months
