A new Bayes-Duality principle for adaptive, robust, and life-long learning of AI – BDAI
Our goal is to develop new methodologies that enable adaptive, robust, and life-long learning in Artificial Intelligence (AI) systems. Current deep-learning methods are neither adaptive nor robust: new knowledge cannot easily be added to a trained model, and when it is forced in, the old knowledge is quickly forgotten. Given a new dataset, the whole model must be retrained on both the old and the new data, because training only on the new data leads to catastrophic forgetting of the past. All of the data must therefore be available at the same time, creating a dependency on large datasets and models that plagues almost all deep-learning systems. Our main goal is to fix this by developing new learning paradigms that support adaptive and robust systems which learn throughout their lives.
We introduce a new principle for machine learning, which we call the Bayes-Duality principle, or simply "Bayes-Duality". In past work using Bayesian principles, we have shown that a majority of machine-learning algorithms can be seen as natural-gradient methods (presented as a NeurIPS 2019 tutorial). Our new discovery is that natural gradients enable a "dual representation" relating models to their training data. This representation can be used to design adaptive, life-long learning systems: it makes explicit the knowledge currently extracted by a model, allows that knowledge to be adapted to new situations, and guides the collection of new data that complements it. We plan to fully develop the Bayes-Duality theory and apply it to obtain practical adaptive, robust, life-long learning methods for deep networks.
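To make the natural-gradient view concrete, the minimal sketch below runs natural-gradient variational inference with a diagonal-Gaussian posterior on a toy linear-regression problem. It is an illustrative assumption on our part, not the project's actual algorithm: the data, hyper-parameters, and the specific update form are all chosen for exposition.

import numpy as np

# Hypothetical illustration (not the project's method): natural-gradient
# variational inference for a diagonal-Gaussian posterior q(w) = N(mu, 1/prec),
# in the spirit of viewing learning algorithms as natural-gradient updates.
rng = np.random.default_rng(0)

# Toy linear-regression data (assumed for illustration only).
N, D = 100, 3
X = rng.normal(size=(N, D))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=N)

mu = np.zeros(D)      # posterior mean
prec = np.ones(D)     # posterior precision (diagonal)
delta = 1.0           # prior precision
lr = 0.1              # learning rate

for _ in range(200):
    # One Monte-Carlo sample from the current posterior.
    w = mu + rng.normal(size=D) / np.sqrt(prec)

    # Gradient and diagonal curvature of the regularized negative log-likelihood,
    # both built by summing per-example contributions.
    grad = X.T @ (X @ w - y) + delta * w
    curv = np.sum(X * X, axis=0) + delta

    # Natural-gradient step: move the precision toward the curvature
    # and the mean along a preconditioned gradient.
    prec = (1 - lr) * prec + lr * curv
    mu = mu - lr * grad / prec

print("posterior mean ~", np.round(mu, 2))   # should be close to w_true
print("posterior std  ~", np.round(1 / np.sqrt(prec), 3))

The point of the sketch is that the posterior is assembled from per-example gradient and curvature terms; it is this kind of example-wise structure that a dual representation would make explicit.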
Upon success, our proposal will lead to many breakthroughs in the field of AI. We split our work across ten projects; the first three focus on theoretical aspects, while the remaining seven develop practical methods:
(1) A new theory of duality for machine learning
(2) A generalization of convex duality to non-convex problems
(3) New theoretical guarantees on the generalization error of adaptive systems
(4) Knowledge representation in deep learning
(5) Uncertainty-estimation for robust deep learning
(6) Fast adaptation of neural-network architectures
(7) Fast knowledge transfer (Continual Learning and Student-Teacher Learning)
(8) Scalable distributed deep learning (Federated Learning)
(9) Fast knowledge collection in deep learning (Active Learning)
(10) Deep reinforcement learning methods
Both the Japanese and French sides will be involved. Project 1 will focus on the general theory of Bayes-Duality for machine learning and its relationship to other types of dualities in mathematics. Project 2 will specifically focus on convex duality. Project 3 will develop new theoretical guarantees for adaptive systems, focusing on knowledge transfer, meta-learning, and life-long learning. Projects 4 and 5 focus on robustness, while Projects 5-7 focus on adaptation. Project 8 will use similar ideas for scalable distributed training, sharing only the dual representations across computing nodes. Projects 9 and 10 will focus on the collection of new knowledge: Project 9 will develop new active-learning methods to collect and curate datasets, where data collection is driven by comparing dual representations to balance exploration and exploitation. Finally, Project 10 will use this idea to teach models to draw on past knowledge when facing new situations; for example, when playing Atari games, a model can reuse its past strategies to avoid repeating failures.
Project coordination
Julyan Arbel (Centre de Recherche Inria Grenoble - Rhône-Alpes)
The project coordinator is the author of this summary and is responsible for its content. The ANR declines any responsibility for this content.
Partners
RIKEN (RIKEN Center for Advanced Intelligence Project, Approximate Bayesian Inference Team)
INRIA GRA (Centre de Recherche Inria Grenoble - Rhône-Alpes)
ANR grant: 463,903 euros
Beginning and duration of the scientific project: March 2022 (60 months)