Statistical Modeling and Inference for unsupervised Learning at LargE-Scale – SMILES
Transform heterogeneous, high-dimensional, and potentially big data into structured knowledge via original latent variable models learned by controlled-complexity algorithms
Large-scale latent variable models
Large-scale data analysis is an inherently multidisciplinary area and is becoming increasingly important in today's society. SMILES is a collaborative fundamental research project that aims to introduce an unsupervised statistical modeling framework and scalable inference algorithms for transforming large-scale data into knowledge. It considers the large-scale context as a whole, with its main issues related to inference from a large volume of very high-dimensional data with complex underlying hidden structures. The key tenet of SMILES is to introduce large-scale regression-based sparse and (non-)parametric models for data representation, and large-scale latent data models for unsupervised data classification. Knowledge extraction will notably consist of automatically retrieving hidden structures: summarizing prototypes, groups, and sparse representations. We consider different data settings, including functional data, multimodal bioacoustic data, and biological data.
- latent variable models for high-dimensional regression, including functional regression and mixture models of functional regressions
- latent variable models for classification and clustering in high-dimensional scenarios
- latent variable models for unsupervised learning and bioacoustics
- theoretical, methodological and computational results on approximation, estimation and model selection capabilities in latent variable models, in particular for mixtures and mixtures of experts
- deep latent variable models and distributed mixtures for clustering
Project coordinator
Mr. Faicel CHAMROUKHI (LABORATOIRE DE MATHÉMATIQUES NICOLAS ORESME)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partners
LMNO LABORATOIRE DE MATHÉMATIQUES NICOLAS ORESME
LMRS LABORATOIRE DE MATHEMATIQUES RAPHAEL SALEM
LIS Laboratoire d'Informatique et Systèmes
MODAL MOdel for Data Analysis and Learning
ANR funding: 338,904 euros
Beginning and duration of the scientific project:
October 2018, 42 months