CE45 - Mathematics and digital sciences for biology and health

New approaches to bridge the gap between genome-scale metabolic networks and untargeted metabolomics – MetClassNet

Network science to address metabolomics challenges

1. Develop a novel computational framework to build multilayer networks that integrate knowledge-based networks with experimental networks.
2. Develop state-of-the-art computational solutions that exploit the multilayer topology to improve metabolism analysis tasks such as gap filling, metabolite identification, identification of new mechanisms, biological interpretation and functional motif search.
3. Generate new biological insight by combining the above approaches. We will use several life science examples and applications to validate and improve the networks.
4. Release the new software solutions and data to the community, following FAIR principles for both data and tools, and provide tutorials and workshops to support adoption of the tools.

The MetClassNet approach. For this project, we hypothesize that combining all these types (layers) of networks in an integrated computational framework (a multilayer connected network) will increase metabolic knowledge and yield new insights into metabolic modulations. In particular, integrating the networks will help with the challenging task of metabolite identification by providing a biological and chemical context (ontologies and GSMN) linked to MS-based networks. GSMNs will also benefit from this integration, since MS and MS/MS data will help fill gaps in the GSMN (e.g. in lipid metabolism).
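
As an illustration only, the multilayer idea can be sketched in a few lines of Python. The class, the layer names ("ms", "ontology", "gsmn") and the node identifiers below are hypothetical and are not taken from the MetClassNet software. The sketch only shows how an unidentified MS feature could inherit GSMN context through a shared chemical class.

```python
from collections import defaultdict


class MultilayerNetwork:
    """Toy multilayer graph: every node is a (layer, node_id) pair."""

    def __init__(self):
        self.adj = defaultdict(set)

    def add_edge(self, layer_a, node_a, layer_b, node_b):
        # Undirected edge; it is intra-layer when layer_a == layer_b,
        # and inter-layer (a "bridge") otherwise.
        u, v = (layer_a, node_a), (layer_b, node_b)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def neighbors(self, layer, node, in_layer=None):
        # Sorted neighbour ids, optionally restricted to one layer.
        found = self.adj.get((layer, node), set())
        return sorted(n for l, n in found if in_layer is None or l == in_layer)

    def context(self, layer, node, via_layer, target_layer):
        # Nodes of target_layer reachable through exactly one via_layer node.
        hits = set()
        for bridge in self.neighbors(layer, node, in_layer=via_layer):
            hits.update(self.neighbors(via_layer, bridge, in_layer=target_layer))
        return sorted(hits)


net = MultilayerNetwork()
# Knowledge layer: two metabolites linked by a reaction in the GSMN (invented example).
net.add_edge("gsmn", "PC(34:1)", "gsmn", "LPC(16:0)")
# Ontology layer: both metabolites belong to the same chemical class.
net.add_edge("gsmn", "PC(34:1)", "ontology", "Glycerophosphocholines")
net.add_edge("gsmn", "LPC(16:0)", "ontology", "Glycerophosphocholines")
# Experimental layer: an unidentified feature with a predicted class
# and a mass-difference edge to another feature.
net.add_edge("ms", "feature_0042", "ontology", "Glycerophosphocholines")
net.add_edge("ms", "feature_0042", "ms", "feature_0107")

# GSMN metabolites sharing a chemical class with the unidentified feature.
candidates = net.context("ms", "feature_0042", "ontology", "gsmn")
print(candidates)
```

Representing nodes as (layer, id) pairs keeps intra-layer and inter-layer edges in one adjacency structure, so a query can cross layers without any special cases.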

Deposition of first C. elegans lipidomics study.
Implemented mzTab-M in MSnbase.
Deployment of a Nextflow pipeline in IARC HPC infrastructure to analyse metabolomics data.
MassBank Backend for Spectra (enables library matching in R for MetClassNet). Enables reading of spectra in MassBank format into R.
MsBackendMassbank package accepted in Bioconductor.
Caching ChemOnt in R package classyfireR.
Coverage Vignette. A vignette that shows with an example how to check the overlap of compound classes represented within an experimental dataset and a GSMN.
MetClassNetR package. Creation of a vignette that uses an example dataset to generate the experimental networks (mass differences, correlations, spectral similarity) and export them to .gml, which will be used as input for MetClassNet.
Development of a metabolite mass-difference list based on chemical reaction rules.
Prototyping of GNPS scoring in R. Implementation of a GNPS-like scoring for spectral similarity networks within R, to be used by MetClassNetR in the future.
Definition of an architecture for the whole system.
Neo4J server installed on de.NBI.
Creation of GML2Cypher, a tool for network integration into the Neo4J database.
Import of ChEBI and ChemOnt, and import of MTBLS1582-derived graphs into Neo4J.
Creation of a tutorial for the exploitation of the consortium’s Neo4J database.
Collection of C. elegans datasets for testing. Several C. elegans metabolomics and lipidomics datasets have been collected and processed in exactly the same way to serve as input for the MetClassNet workflow.
C. elegans dataset analyzed with all experimental networks. An RP-LC-MS dataset on a C. elegans infection model was used to create mass-difference, correlation and spectral similarity networks using MetNet and GNPS. Results from annotation and identification of metabolites using different tools were combined, and the data have been prepared for a first multilayer network.
Analysis of untargeted metabolomics datasets from a liver cancer prospective cohort study (450 matched case/control).
Generation of the first two-layer network (genome scale metabolic network + mass-difference network from an experimental dataset).
Wetlab: Collecting samples for C. elegans developmental stages (for testing MetClassNet).
Upload of Data Management Plan on Opidor following ANR template.
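
The mass-difference networks and the .gml export listed above can be illustrated with a minimal Python sketch. It is not the MetClassNetR or MetNet implementation. The feature table, the tolerance and the three transformations are assumptions chosen for illustration. m/z differences are treated as mass differences, which presumes singly charged ions of the same adduct type.

```python
from itertools import combinations

# Monoisotopic mass shifts in Da for a few common transformations
# (illustrative subset, not the project's reaction-rule list).
TRANSFORMATIONS = {
    "hydrogenation (+H2)": 2.015650,
    "methylation (+CH2)": 14.015650,
    "hydroxylation (+O)": 15.994915,
}

# Hypothetical feature table: feature id -> measured m/z.
FEATURES = {"F1": 180.0634, "F2": 194.0790, "F3": 196.0583, "F4": 250.1000}


def mass_difference_edges(features, transformations, tol=0.002):
    """Return (lighter_id, heavier_id, label) for every feature pair whose
    m/z difference matches a transformation within +/- tol Da."""
    ordered = sorted(features.items(), key=lambda kv: kv[1])
    edges = []
    for (id_a, mz_a), (id_b, mz_b) in combinations(ordered, 2):
        delta = mz_b - mz_a
        for label, shift in transformations.items():
            if abs(delta - shift) <= tol:
                edges.append((id_a, id_b, label))
    return edges


def to_gml(features, edges):
    """Serialise the network as a small undirected GML document."""
    ids = {name: i for i, name in enumerate(features)}
    lines = ["graph [", "  directed 0"]
    for name, mz in features.items():
        lines.append(f'  node [ id {ids[name]} label "{name}" mz {mz} ]')
    for a, b, label in edges:
        lines.append(f'  edge [ source {ids[a]} target {ids[b]} label "{label}" ]')
    lines.append("]")
    return "\n".join(lines)


edges = mass_difference_edges(FEATURES, TRANSFORMATIONS)
gml = to_gml(FEATURES, edges)
print(gml)
```

With these invented values, F1 is linked to F2 by a methylation shift and to F3 by a hydroxylation shift, while F4 stays isolated. A real workflow would need a tolerance suited to the instrument and would have to account for adducts and isotopes.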

Defining algorithms that exploit the multilayer architecture to address challenges in metabolomics.
Application to data gathered by the consortium.

1. L. Salzer, M. Witting. Quo Vadis Caenorhabditis elegans Metabolomics—A Review of Current Methods and Applications to Explore Metabolism in the Nematode. Metabolites. 2021. Open access: doi.org/10.3390/metabo11050284
2. C. Wieder, C. Frainay, N. Poupin, P. Rodríguez-Mier, F. Vinson, J. Cooke, R. P. J. Lai, J. G. Bundy, F. Jourdan, T. Ebbels. Pathway analysis in metabolomics: recommendations for the use of over-representation analysis. PLOS Computational Biology. 2021. Open access: doi.org/10.1371/journal.pcbi.1009105

Metabolism is a key biological process that is modulated in living organisms in response to environmental exposure, genetic variation and diet. Understanding metabolism is essential to improve plant performance and nutritional content, and to understand human health and well-being. The metabolic response can be complex, involving hundreds to thousands of small molecules (metabolites) connected by thousands of biochemical reactions. Together, they constitute a dense network, which in its entirety is often called a Genome Scale Metabolic Network (GSMN). Within this context, metabolomics is a cornerstone approach to experimentally observe changes in the metabolome (the set of metabolites). One of the main analytical platforms for measuring the metabolome is Mass Spectrometry (MS), which is often coupled to separation methods (e.g. Liquid Chromatography, LC-MS). Even though the technology is advancing rapidly, several challenges remain for the widespread adoption of metabolomics. Metabolite identification is one of these challenges. Moreover, experimentally obtained data and in silico generated GSMNs overlap only partially and are generally not studied simultaneously.
In MetClassNet, we hypothesize that these difficulties can be overcome by designing new data structures and algorithms that exploit the connectivity (network) between molecules. This integrative approach will boost the power of data analysis by unifying GSMNs and networks obtained from experimental data. Hence, MetClassNet will propose a new computational framework and novel methods to help tackle the main metabolomics challenges in data analysis and data interpretation. This framework will integrate experimentally derived information and GSMNs by bridging them using direct mapping, ontologies and chemical class information.
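
Bridging by chemical class can be made concrete with a small, hypothetical Python sketch that measures how many compound classes observed in an experimental dataset are also represented in a GSMN. The function name, the identifiers and the class assignments are invented for illustration and do not reproduce any MetClassNet tool.

```python
def class_coverage(dataset_classes, gsmn_classes):
    """Compare the chemical classes found in a dataset with those in a GSMN.

    Both arguments map an identifier (feature or metabolite) to a class name.
    """
    dataset = set(dataset_classes.values())
    gsmn = set(gsmn_classes.values())
    shared = dataset & gsmn
    return {
        "shared": sorted(shared),
        "dataset_only": sorted(dataset - gsmn),
        "gsmn_only": sorted(gsmn - dataset),
        # Fraction of dataset classes that can be bridged to the GSMN.
        "dataset_coverage": len(shared) / len(dataset) if dataset else 0.0,
    }


# Hypothetical class assignments, e.g. as predicted by a classification tool.
dataset_classes = {
    "feature_1": "Fatty acyls",
    "feature_2": "Glycerophospholipids",
    "feature_3": "Flavonoids",
}
gsmn_classes = {
    "M_hdca": "Fatty acyls",
    "M_pc": "Glycerophospholipids",
    "M_glc": "Organooxygen compounds",
}

result = class_coverage(dataset_classes, gsmn_classes)
print(result)
```

Classes that appear only in the dataset point to possible gaps in the GSMN, while classes that appear only in the GSMN point to metabolism that the analytical method did not observe.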
At the end of the project, MetClassNet will offer the community an innovative tool set that makes it possible to go beyond table-based analysis of metabolomics data by integrating (and not just exporting) them into a network system. To this end, MetClassNet will create novel algorithms and tools to mine these networks, increasing our knowledge of the metabolome. The developed framework will also ease the connection between metabolomics and GSMNs, helping to fill the gaps in current databases of metabolic networks. Within the MetClassNet project, we will showcase the benefit of the computational framework for studying metabolic modulations related to ageing, toxicology, cancer and nutrition. Finally, the MetClassNet consortium will give high priority to opening data, protocols and software to the community.

Project coordination

Fabien Jourdan (Institut National de la Recherche Agronomique Centre Toulouse - Occitanie)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partner

IPB Leibniz Institute of Plant Biochemistry / Bioinformatics & Mass Spectrometry
INRA TOXALIM - MEX Institut National de la Recherche Agronomique Centre Toulouse - Occitanie
CIRC Centre International de Recherche sur le Cancer
HMGU Helmholtz Zentrum München, German Research Center for Environmental Health / Research Unit Analytical BioGeoChemistry

ANR funding: 985,818 euros
Beginning and duration of the scientific project: January 2020 - 36 Months
