CE45 - Mathematics, computer science, automatic control, signal processing to meet the challenges of biology and health

Towards a Deeper knowledge of Proteomes – DeepProt

Submission summary

Next-generation sequencing is gradually revealing the secrets of genes for all organisms, but the complex world of proteins remains largely unknown despite the critical roles proteins play in living organisms. Protein diversity originates either from genetic variations or from alternative splicing, both of which affect the primary amino acid sequence, or from post-translational modifications (PTMs), which alter the chemistry of these complex polymers after synthesis. The diversity of PTMs is thought to be a major factor in explaining the gap between the small number of genes and the complex machinery of organisms.

"Proteomics" refers to the scientific discipline that aims at identifying, quantifying and characterizing proteins on a large scale. The most widely used experimental approach in mass spectrometry, called "bottom-up", generates tens of thousands of experimental spectra in a single one-hour run. Software is essential to extract information from such large volumes of raw data, but at present 40 to 75% of the spectra remain uninterpreted after analysis. Modifications carried by proteins are the most widely accepted hypothesis to explain this low interpretation rate.

By rethinking how large volumes of spectra are compared, two DeepProt partners overcame a first methodological barrier to resolving this issue. The software SpecOMS can compare tens of thousands of experimental spectra to hundreds of thousands of spectra modeled from a protein database, with no filter on their mass, in a few minutes on a standard workstation. Taking advantage of the power of this new approach, DeepProt aims at developing new algorithms to interpret more exhaustively the set of experimental spectra generated by proteomics experiments, even those from complex metaproteomic studies, and then to deduce the list of proteins present in the analyzed mixture with detailed information about their PTMs. To achieve this goal, several challenges need to be addressed: (i) to better identify peptides carrying several modifications, (ii) to improve the inference of proteins and the description of their modifications, (iii) to cope with the scale-up from a classical proteome size to a metaproteome size (more than two orders of magnitude).
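As a rough illustration of why comparing spectra without a mass filter helps with modified peptides, the sketch below counts shared fragment peaks between one experimental spectrum and every candidate peptide, then reads the precursor mass difference as a hint of a modification. It is a generic toy example and not the SpecOMS algorithm: the function names, the three-peptide "database", the 0.02 Da bin width and the simulated +15.99491 Da shift are all assumptions made for illustration.

```python
from collections import defaultdict

# Monoisotopic residue masses in Da (standard values, subset of amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
BIN_WIDTH = 0.02  # hypothetical fragment tolerance, in Da


def fragment_peaks(peptide, mods=None):
    """Return (singly charged b/y ion m/z values, neutral peptide mass).

    `mods` maps a residue index to a mass shift; it is only used here
    to simulate an experimental spectrum of a modified peptide.
    """
    mods = mods or {}
    masses = [RESIDUE_MASS[aa] + mods.get(i, 0.0) for i, aa in enumerate(peptide)]
    total = sum(masses)
    peaks, prefix = [], 0.0
    for m in masses[:-1]:
        prefix += m
        peaks.append(prefix + PROTON)                  # b ion
        peaks.append(total - prefix + WATER + PROTON)  # y ion
    return peaks, total + WATER


def bin_of(mz):
    # Crude binning; a real tool would handle peaks near bin boundaries.
    return int(mz / BIN_WIDTH)


def build_index(peptides):
    """Inverted index: fragment bin -> ids of peptides with a peak there."""
    index = defaultdict(set)
    for pid, pep in enumerate(peptides):
        for mz in fragment_peaks(pep)[0]:
            index[bin_of(mz)].add(pid)
    return index


def shared_peak_counts(exp_peaks, index):
    """Count shared peaks with every peptide, ignoring precursor mass."""
    counts = defaultdict(int)
    for b in {bin_of(mz) for mz in exp_peaks}:
        for pid in index.get(b, ()):
            counts[pid] += 1
    return counts


# Toy database and a simulated spectrum of PEPTLEK with a shifted last residue.
peptides = ["PEPTLEK", "GASPVTLK", "AVELSTR"]
index = build_index(peptides)
exp_peaks, exp_mass = fragment_peaks("PEPTLEK", mods={6: 15.99491})

counts = shared_peak_counts(exp_peaks, index)
best = max(counts, key=counts.get)
delta = exp_mass - fragment_peaks(peptides[best])[1]
print(peptides[best], counts[best], round(delta, 4))
```

In this toy case a precursor mass filter would have rejected the correct candidate, because the masses differ by about 16 Da; without the filter, the unshifted b ions still single it out, and the mass difference points to a possible modification.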

To meet these challenges, DeepProt relies on a multidisciplinary consortium mastering the entire chain of a complex proteomic analysis and will focus on several sets of spectra generated for two major scientific topics: the study of protein modifications induced by food processes (in relation to the allergenicity of food) and the metaproteomics of the gut microbiota (for new therapeutic perspectives). In this way, beyond its algorithmic contributions, DeepProt will enable significant advances in both gut microbiota studies and “foodomics”.

The new models and algorithms developed by DeepProt will be implemented in existing, already used, user-friendly software to manage end-to-end proteomic analyses and to quantify identified proteins. This will provide the means to obtain a more comprehensive view of complex proteomes and their variations, and will thus open up research opportunities in many areas of biology and health.

Project coordination

Hélène ROGNIAUX (Biopolymères, Interactions Assemblages)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.


BIA Biopolymères, Interactions Assemblages
LS2N Laboratoire des Sciences du Numérique de Nantes
GQE Génétique quantitative et Evolution - Le Moulon

ANR funding: 505,661 euros
Beginning and duration of the scientific project: September 2018 - 48 Months
