BRIDinG thE gAp Between iterative proximaL methods and nEural networks – BRIDGEABLE
Proximal methods have enabled significant advances in large-scale optimization over the last decade. At the same time, deep neural networks (NNs) have led to outstanding achievements in many application domains related to data science. However, the fundamental reasons for their excellent performance are still poorly understood from a mathematical viewpoint. Recently, we have shown that almost all the activation functions used in NN architectures (including the multivariate squashing functions recently introduced for capsule networks) can be identified with proximity operators of convex functions. This finding opens up new perspectives in deep learning by exploiting the tight links between NN structures and iterative proximal algorithms. More precisely, we propose three main research avenues.
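As a small illustration of the identification mentioned above, the sketch below checks numerically that two common scalar activations coincide with proximity operators: ReLU with the proximity operator of the indicator function of [0, +inf), and soft-thresholding with that of lam*|.|. The helper names (`prox_by_grid`, `indicator_nonneg`) and the brute-force grid search are assumptions made for this example only; they are not part of the project.

```python
import math
from typing import Callable


def relu(x: float) -> float:
    """ReLU activation."""
    return max(x, 0.0)


def soft_threshold(x: float, lam: float) -> float:
    """Soft-thresholding with threshold lam >= 0."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def indicator_nonneg(p: float) -> float:
    """Indicator function of [0, +inf): 0 inside the set, +inf outside."""
    return 0.0 if p >= 0.0 else math.inf


def prox_by_grid(f: Callable[[float], float], x: float,
                 lo: float = -5.0, hi: float = 5.0, n: int = 10001) -> float:
    """Hypothetical helper: approximate prox_f(x) = argmin_p f(p) + 0.5*(p - x)**2
    by exhaustive search over a uniform grid on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    candidates = (lo + i * step for i in range(n))
    return min(candidates, key=lambda p: f(p) + 0.5 * (p - x) ** 2)


if __name__ == "__main__":
    for x in (-2.0, -0.3, 0.0, 1.7):
        print(x, relu(x), prox_by_grid(indicator_nonneg, x))
```

The grid search is only accurate up to the grid spacing, so the comparison should be made with a tolerance of that order.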
First, the well-known fragility of neural networks with respect to adversarial perturbations will be investigated. For this purpose, we will use fixed point techniques grounded in the firm nonexpansiveness of these activation operators. Our preliminary results in this direction will be extended by considering more general architectures than basic feedforward ones (e.g. residual networks or GANs). Novel architectures intended to be more robust will also be proposed by mimicking existing proximal methods. Suitable training algorithms will be designed to control the Lipschitz constant of the resulting NNs, a first step towards their certifiability.
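To make the role of the Lipschitz constant concrete, here is a minimal sketch of the classical (and generally loose) upper bound for a bias-free feedforward network: since nonexpansive activations such as ReLU are 1-Lipschitz, the product of the spectral norms of the weight matrices bounds the Lipschitz constant of the whole network. This is a textbook bound, not the tighter estimates the project refers to, and all function names are assumptions for this example.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(A: Matrix, v: Sequence[float]) -> List[float]:
    """Matrix-vector product A @ v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def spectral_norm(A: Matrix, n_iter: int = 500) -> float:
    """Largest singular value of A via power iteration on A^T A.
    Assumes the all-ones start vector is not orthogonal to the
    leading right singular vector."""
    At = [list(col) for col in zip(*A)]
    v = [1.0] * len(A[0])
    for _ in range(n_iter):
        w = matvec(At, matvec(A, v))
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            return 0.0
        v = [t / norm for t in w]
    return math.sqrt(sum(t * t for t in matvec(A, v)))


def lipschitz_upper_bound(weights: Sequence[Matrix]) -> float:
    """Product of layer spectral norms: an upper bound on the Lipschitz
    constant when every activation is nonexpansive."""
    bound = 1.0
    for W in weights:
        bound *= spectral_norm(W)
    return bound


def forward(weights: Sequence[Matrix], x: Sequence[float]) -> List[float]:
    """Bias-free feedforward pass with ReLU after each layer."""
    out = list(x)
    for W in weights:
        out = [max(t, 0.0) for t in matvec(W, out)]
    return out
```

For any two inputs `x` and `xp`, the ratio `math.dist(forward(ws, x), forward(ws, xp)) / math.dist(x, xp)` should not exceed `lipschitz_upper_bound(ws)`; keeping this bound small during training is one simple way to limit sensitivity to input perturbations.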
Second, a new formulation of inverse problems will be proposed, in which standard convex regularizing functions are replaced by a regularization approach based on maximally monotone operators (MMOs). This strategy will be both more general and more flexible, since it will allow data-driven MMOs to be learned in a supervised manner. This will lead to efficient plug-and-play iterative algorithms for solving image restoration or reconstruction problems. In these approaches, denoising steps will be performed by a NN. One of the major benefits of our framework will be to yield clear convergence results for the resulting iterative schemes.
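The plug-and-play idea can be sketched with a forward-backward iteration in which the proximal step is replaced by an arbitrary denoiser. In the sketch below, the linear model `A`, the step size, and the soft-thresholding stand-in for a trained NN denoiser are all assumptions chosen for illustration; they do not reproduce the project's algorithms.

```python
import math
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(A: Matrix, v: Vector) -> Vector:
    """A @ v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def rmatvec(A: Matrix, v: Vector) -> Vector:
    """A^T @ v."""
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft_denoiser(tau: float) -> Callable[[Vector], Vector]:
    """Stand-in denoiser (componentwise soft-thresholding).
    In a real plug-and-play scheme this would be a trained NN."""
    def denoise(x: Vector) -> Vector:
        return [math.copysign(max(abs(t) - tau, 0.0), t) for t in x]
    return denoise


def pnp_forward_backward(A: Matrix, y: Vector,
                         denoiser: Callable[[Vector], Vector],
                         step: float, n_iter: int = 200) -> Vector:
    """Iterate x <- denoiser(x - step * A^T (A x - y)), starting from zero.
    Standard forward-backward theory gives convergence to a fixed point
    (when one exists) if the denoiser is firmly nonexpansive and
    0 < step < 2 / ||A||^2."""
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [a - b for a, b in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        x = denoiser([xi - step * gi for xi, gi in zip(x, grad)])
    return x
```

Passing the identity map as `denoiser` reduces the scheme to plain gradient descent on the least-squares data term, which is a convenient sanity check before plugging in a learned operator.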
Finally, we will investigate deep dictionary learning (DDL) methods. These currently appear as competitive alternatives to NNs. In each step of these methods, a non-smooth cost function is optimized in order to find an optimal representation of the analyzed data in a suitable dictionary. Since this optimization is usually performed by proximal techniques, these methods can be interpreted as the use of a smart nonlinear activation operator. Our purpose will be to clarify the relations between DDL and NNs, both to make DDL techniques more powerful and to better analyze their performance. In addition, strategies will be introduced to increase the versatility of DDL approaches by making them adaptive to incoming data.
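One way to see the "activation operator" interpretation is to write a single proximal gradient step on a sparse coding problem. The formulation below is an illustrative assumption (an l1 penalty with weight lambda, a dictionary D, a signal x, a code z and a step size gamma are all notation introduced here), not the specific cost used in the project:

```latex
% Sparse coding of a signal x in a dictionary D (assumed l1 penalty)
\min_{z}\; \tfrac{1}{2}\,\|x - D z\|_2^2 + \lambda \|z\|_1 ,
\qquad \lambda > 0 .

% One forward-backward (ISTA) step with step size 0 < \gamma < 2/\|D\|^2
z_{k+1}
  = \operatorname{prox}_{\gamma\lambda\|\cdot\|_1}\!\bigl(z_k - \gamma D^\top (D z_k - x)\bigr)
  = \operatorname{soft}_{\gamma\lambda}\!\bigl(W z_k + b\bigr),
\qquad
W = I - \gamma D^\top D, \quad b = \gamma D^\top x .
```

Each iteration thus has the form of a NN layer: an affine map followed by a componentwise nonlinearity, here soft-thresholding, which is itself a proximity operator.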
In terms of methodological outcomes, the project is expected to lead to significant progress in the explainability of NNs and in the design of novel methods for improving their reliability. In terms of practical impact, the developed methods will result in a new generation of techniques for solving problems arising in three application fields: 3D medical imaging (collaboration with GE Healthcare), data analysis for energy and environment issues (collaboration with IFPEN) and multivariate nonlinear modeling of electric motors (collaboration with Schneider Electric).
Project coordination
Jean-Christophe PESQUET (Centre de Vision Numérique)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partner
CVN Centre de Vision Numérique
ANR funding: 484,920 euros
Start date and duration of the scientific project: August 2020, 48 months