CE45 - Mathematics and digital sciences for biology and health

Computational Approaches for Multimodal Data Integration in Biomedicine – CAMUDI

Submission summary

High-throughput technologies are generating a wealth of biological data. These data offer unprecedented opportunities to better understand biological systems in healthy and pathological states, but they also raise major computational challenges. A fundamental challenge is the integration of data from multiple heterogeneous sources, or modalities. Proper integration is critical to reveal fine-grained cellular mechanisms and their pathological deregulation. The CAMUDI project will develop innovative multimodal data integration approaches able to cope with the increasing complexity and diversity of biological data, and adapted to work with small numbers of samples. Currently available methods are data-intensive, yet in many real-world applications, such as the study of rare diseases, the number of available samples is by definition limited.

We will first develop joint dimensionality reduction approaches with transfer learning. Multi-omics datasets available from public omics compendia will be integrated with joint dimensionality reduction, and target datasets composed of a small number of samples will be projected onto the learned latent space. We will then explore multi-layer networks to propose drug repurposing pipelines, implementing and comparing supervised and unsupervised strategies with both direct and embedding-based network approaches. Finally, we will investigate multimodal autoencoders with transfer learning to integrate omics and images. The multimodal autoencoders will be trained on public datasets and fine-tuned on target datasets composed of a small number of samples.
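The first step, learning a joint latent space on a large reference compendium and projecting a small target cohort onto it, can be illustrated with a minimal sketch. It is not the CAMUDI method: it assumes a simple PCA on concatenated, standardized omics blocks as a stand-in for joint dimensionality reduction, and the function names (fit_joint_latent_space, project_onto_latent_space) and the synthetic data are hypothetical.

```python
import numpy as np


def fit_joint_latent_space(blocks, n_components):
    """Learn a shared latent space from several omics blocks.

    Each block is a (samples x features) array measured on the same samples.
    Assumption: concatenated PCA stands in for joint dimensionality reduction.
    """
    means = [b.mean(axis=0) for b in blocks]
    stds = [b.std(axis=0) + 1e-8 for b in blocks]  # avoid division by zero
    # Down-weight each block by its feature count so that no modality dominates.
    weights = [1.0 / np.sqrt(b.shape[1]) for b in blocks]
    scaled = [w * (b - m) / s for b, m, s, w in zip(blocks, means, stds, weights)]
    concatenated = np.hstack(scaled)
    # Rows of vt are orthonormal directions in the concatenated feature space.
    _, _, vt = np.linalg.svd(concatenated, full_matrices=False)
    loadings = vt[:n_components].T
    return {"means": means, "stds": stds, "weights": weights, "loadings": loadings}


def project_onto_latent_space(model, blocks):
    """Project new samples using the scaling and loadings learned on the reference."""
    scaled = [
        w * (b - m) / s
        for b, m, s, w in zip(blocks, model["means"], model["stds"], model["weights"])
    ]
    return np.hstack(scaled) @ model["loadings"]


# Synthetic stand-ins: a large public reference and a small target cohort.
rng = np.random.default_rng(0)
reference = [rng.normal(size=(200, 50)), rng.normal(size=(200, 30))]
target = [rng.normal(size=(5, 50)), rng.normal(size=(5, 30))]

model = fit_joint_latent_space(reference, n_components=3)
target_latent = project_onto_latent_space(model, target)
print(target_latent.shape)
```

The target samples are never used to fit the latent space; they are only scaled with the reference statistics and multiplied by the reference loadings, which is what makes the approach usable with very few samples.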

All the multimodal data integration approaches developed in the CAMUDI project will be implemented as tools available to the community and applied to in-house datasets (omics and live-cell imaging) to study facioscapulohumeral dystrophy (FSHD). FSHD is a rare and highly heterogeneous genetic disease with poorly understood pathophysiological mechanisms. We hypothesise that the analysis and integration of multimodal datasets will be key for the proper molecular diagnosis of FSHD and the stratification of patients, for the identification of causative pathways and biological processes, and for the development of new therapeutic strategies.

Project coordination

Anais BAUDOT (Centre de Génétique Médicale de Marseille (Marseille Medical Genetics))

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

LIS Laboratoire d'Informatique et Systèmes
MMG Centre de Génétique Médicale de Marseille (Marseille Medical Genetics)
I2M Institut de Mathématiques de Marseille

ANR funding: 404,850 euros
Start date and duration of the scientific project: December 2021, 36 months
