Interacting Particle systems for Sampling and Optimization: analysis, algorithms and application to statistical inference – IPSO
Many scientific applications require the calculation of expectations with respect to high-dimensional probability distributions. A widely used approach to this end, known as the Markov chain Monte Carlo (MCMC) method, is to simulate a long trajectory of a stochastic dynamics that samples the target distribution, so that the expectations can be approximated by taking the average over the trajectory. Although the original MCMC method dates back to the 1950s, the development and analysis of sampling methods remain an extremely active area of mathematical research, driven by the ever-increasing amount of data available to scientists, a desire to understand high-dimensional problems, and changes in computer architecture.
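As a minimal illustration of the trajectory-average idea, the sketch below runs a random-walk Metropolis chain, one of the simplest MCMC methods, and averages an observable along it. The target (a two-dimensional standard Gaussian), the step size and the function names are assumptions chosen for illustration only; they are not taken from the project.

```python
import numpy as np

def random_walk_metropolis(log_density, x0, n_steps, step_size, rng):
    """Simulate a random-walk Metropolis chain targeting exp(log_density)."""
    x = np.asarray(x0, dtype=float)
    log_p = log_density(x)
    chain = np.empty((n_steps, x.size))
    for n in range(n_steps):
        proposal = x + step_size * rng.standard_normal(x.size)
        log_p_proposal = log_density(proposal)
        # Accept with probability min(1, pi(proposal) / pi(x)).
        # Only a ratio appears, so the normalizing constant is never needed.
        if np.log(rng.uniform()) < log_p_proposal - log_p:
            x, log_p = proposal, log_p_proposal
        chain[n] = x
    return chain

def log_density(x):
    # Illustrative target: standard Gaussian, known up to a constant.
    return -0.5 * np.sum(x**2)

rng = np.random.default_rng(0)
chain = random_walk_metropolis(log_density, np.zeros(2), 20_000, 1.0, rng)

# Trajectory average approximating E[|x|^2], which equals 2 for this target.
estimate = np.mean(np.sum(chain**2, axis=1))
```

The estimate carries a statistical error that decreases as the trajectory gets longer, which is why long trajectories are needed in practice.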
Many key recent developments in the field are based on the use of interacting particle systems and their analysis at the level of the nonlocal Fokker-Planck equation describing the systems in the limit of infinitely many particles, known as the mean field limit. This approach to numerical algorithms based on interacting particle systems emerged initially from the optimization community and has since brought considerable insight. It has enabled, notably, significant progress towards rigorously proving the long-time convergence of widely used interacting particle methods, including the ensemble Kalman filter and particle swarm optimization.
Improving, implementing and mathematically analyzing sampling and optimization methods based on interacting particle systems are the primary aims of this project. The work we propose to undertake pertains to two particular classes of methods: consensus-based methods inspired by particle swarm optimization, and ensemble Kalman-based methods, which were recently revealed to have a close connection to interacting Langevin diffusions. These methods have proven successful in a variety of applications, including posterior sampling and maximum a posteriori estimation in the context of Bayesian inverse problems, as well as the training of large neural networks.
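To give a concrete picture of a consensus-based method, the sketch below implements a commonly used form of consensus-based optimization: each particle drifts towards a weighted average of the ensemble (the consensus point) and is perturbed by noise proportional to its distance from that point. The specific dynamics, the componentwise noise, the parameter values (`lam`, `sigma`, `alpha`, `dt`) and the quadratic test objective are all assumptions made for illustration, not the variants studied in the project.

```python
import numpy as np

def consensus_point(particles, objective, alpha):
    """Weighted ensemble mean that favors particles with low objective values."""
    values = objective(particles)
    # Subtracting the minimum avoids underflow and leaves the mean unchanged.
    weights = np.exp(-alpha * (values - values.min()))
    return weights @ particles / weights.sum()

def cbo_minimize(objective, particles, n_steps=1000, dt=0.01,
                 lam=1.0, sigma=0.7, alpha=30.0, rng=None):
    """Euler-Maruyama discretization of a basic consensus-based dynamics."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(particles, dtype=float)
    for _ in range(n_steps):
        drift = x - consensus_point(x, objective, alpha)
        noise = rng.standard_normal(x.shape)
        # Relaxation towards the consensus point plus componentwise noise
        # that vanishes as the particles reach consensus.
        x = x - lam * dt * drift + sigma * np.sqrt(dt) * drift * noise
    return consensus_point(x, objective, alpha)

def objective(x):
    # Illustrative objective with minimizer (1, 1); takes an array of shape (J, d).
    return np.sum((x - 1.0) ** 2, axis=1)

rng = np.random.default_rng(0)
initial = rng.uniform(-3.0, 3.0, size=(200, 2))
x_star = cbo_minimize(objective, initial, rng=rng)
```

Note that the method only evaluates the objective and never its gradient, and that the particles interact solely through the consensus point, which is what makes a mean field description natural.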
To illustrate some of the challenges this project addresses, we briefly describe one of the key applications motivating us: Bayesian inverse problems. In the Bayesian approach to inverse problems the posterior distribution is usually known up to a constant factor, but high-dimensional and expensive to evaluate. In addition, it is often the case that its derivatives are difficult or too costly to calculate and that its Hessian has widely separated eigenvalues. Therefore, there is particular interest in developing gradient-free methods which, in order to avoid conditioning issues, satisfy a property known as affine invariance.
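As a hedged illustration of what gradient-free means here, the sketch below performs one step of a basic ensemble Kalman update with perturbed observations. It uses only evaluations of the forward model, replacing derivatives with empirical covariances of the ensemble. The linear forward model, the noise covariance `gamma`, the data `y` and the function name `eki_step` are hypothetical choices made for this example and do not come from the project.

```python
import numpy as np

def eki_step(ensemble, forward, y, gamma, rng):
    """One ensemble Kalman update using forward evaluations only (no gradients)."""
    n_particles = len(ensemble)
    outputs = np.array([forward(x) for x in ensemble])       # shape (J, k)
    dx = ensemble - ensemble.mean(axis=0)
    dg = outputs - outputs.mean(axis=0)
    cov_xg = dx.T @ dg / n_particles                         # shape (d, k)
    cov_gg = dg.T @ dg / n_particles                         # shape (k, k)
    # Each particle sees its own noisy copy of the data.
    perturbed = y + rng.multivariate_normal(np.zeros(len(y)), gamma,
                                            size=n_particles)
    innovation = np.linalg.solve(cov_gg + gamma, (perturbed - outputs).T)
    return ensemble + (cov_xg @ innovation).T

rng = np.random.default_rng(0)
forward = lambda x: x                      # hypothetical linear forward model
y = np.array([1.0, -1.0])                  # hypothetical observed data
gamma = 0.1 * np.eye(2)                    # assumed observation noise covariance
prior_ensemble = rng.standard_normal((2000, 2))   # samples from a N(0, I) prior

updated = eki_step(prior_ensemble, forward, y, gamma, rng)
posterior_mean = updated.mean(axis=0)
```

In this linear Gaussian setting the exact posterior mean is `y / 1.1`, and the ensemble mean after the update approximates it up to sampling error. For nonlinear forward models the update is only an approximation.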
The planned research outlined in this proposal will help lay the theoretical foundations of the emerging field of sampling and optimization using interacting particles and, when combined with modern computer architectures for parallel computing, has the potential to significantly impact applications.
Project coordination
Urbain VAES (Centre de Recherche Inria de Paris)
The author of this summary is the project coordinator, who is responsible for the content of this summary. The ANR declines any responsibility for its contents.
Partnership
Centre de Recherche Inria de Paris
ANR funding: 142,138 euros
Beginning and duration of the scientific project: December 2023, 24 months