In many scientific applications, increasingly large datasets are being acquired to describe biological or physical phenomena more accurately. While the dimensionality of the resulting measurements has increased, the number of samples available is often limited by physical or financial constraints. The result is large amounts of complex, high-dimensional data observed on only small numbers of samples.
The question that then arises is: which features in the data are truly informative about some outcome of interest? Answering it amounts to inferring the relationship between each variable and the outcome, conditionally on all other variables. Statistical guarantees on these associations are needed in many fields of data science, where competing models require rigorous statistical assessment. Yet obtaining such guarantees is very hard.
FAST-BIG aims to develop theoretical results and practical estimation procedures that make statistical inference feasible in such hard cases. We will develop the corresponding software and assess novel inference schemes on two applications: genomics and brain imaging.
Mr. Bertrand Thirion (Institut National de Recherche en Informatique et en Automatique)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
UPSud + LMO: Université Paris Sud + Laboratoire de mathématiques d'Orsay
LBBE - CNRS: Biométrie et biologie évolutive
Inria Saclay - Ile-de-France - équipe PA: Institut National de Recherche en Informatique et en Automatique
ANR funding: 442,073 euros
Beginning and duration of the scientific project: February 2018 - 48 Months