DS10 - Défi des autres savoirs 2017

BAyeSian nonparametrics, uncertainty quantifICation and random Structures – BASICS

Submission summary

In contemporary society, statisticians receive data and questions on a daily basis from fields as diverse as genomics, ecology, social sciences and astrophysics. These data are often heterogeneous and high-dimensional. In this context, mathematical statistics has an important role to play: procedures adapted to this new type of data must not only be proposed, but also analysed, validated and compared.

Bayesian nonparametric methods occupy a central role in applied statistics and machine learning. One reason is their flexibility: the statistician chooses a probability distribution, the prior, for the unknown parameters of the model, and this distribution is updated by conditioning on the observed data. In constructing such a prior distribution, the statistician can often take advantage of special data structures. Another reason is the development over the last 20 years of numerous algorithms that compute posterior distributions efficiently, more recently in high-dimensional settings in particular. Validating these methods through mathematical results on convergence and optimality is a key challenge.

Motivated by numerous practical applications, two classes of statistical models are currently developing at a spectacular pace: so-called high-dimensional models, and random graph models. In high-dimensional models, the number of parameters is typically larger than the number of observations. Inference is nevertheless often made possible by exploiting an underlying parsimonious structure. These ideas are key for multiple testing procedures, which play a fundamental role in applications, in particular in genomics for interpreting DNA microarray data. Random graph models are also developing rapidly, motivated by the many practical applications of networks, such as the study of trophic networks in ecology or of social networks. The stochastic blockmodel, for instance, is one of the most frequently encountered models in this field.

The BASICS project intends to propose new methods and new analyses for these central families of models in modern statistics, in particular by using the flexibility of Bayesian nonparametric methods. These methods are already commonplace in recent algorithms for high-dimensional models, in particular in multiple testing, where calibration can be carried out efficiently using empirical Bayes approaches. Yet the analysis of convergence rates and optimality for these methods has been little explored so far. These are key issues, as it is essential to determine which prior distributions lead to optimal estimation, and how to calibrate prior parameters to achieve this goal. The project will focus in particular on certain random structures: first, on multiscale structures arising in the wavelet analysis of signals, for which prior distributions with a tree structure are particularly natural; second, on random graphs, which will be analysed via Bayesian and non-Bayesian methods. Another key idea of the BASICS research programme is uncertainty quantification. Obtaining confidence regions is a key element in the interpretation of statistical results, and Bayesian methods naturally provide such uncertainty quantification via so-called credible sets. The project intends to give conditions that guarantee that such credible sets can indeed be interpreted as confidence regions at the stated level.

Project coordination

Ismaël CASTILLO (Laboratoire de probabilités et modèles aléatoires)

The project coordinator is the author of this summary and is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

LPMA Laboratoire de probabilités et modèles aléatoires

ANR funding: 66,960 euros
Beginning and duration of the scientific project: October 2017 - 48 months
