Bridging the gap between dynamical systems and their partial observation, computational models of biological processes aim at uncovering key mechanisms driving cellular dynamics, and ultimately at predicting their behaviour under unobserved conditions.
Computational models of molecular interaction networks are usually built from data related to the structure of the biological system, such as known interactions, and data related to its dynamics, such as measurements of gene expression or protein activity at different times and/or under different conditions.
However, despite huge advances in experimental technologies, observations of biological processes remain very scarce, whether in terms of temporal resolution, number of observed entities, synchronisation between measurement points, or variety of experimental conditions. Combined with the complex structure of molecular interactions, this leaves the model engineering problem largely under-specified, leading to (too) many potential candidate models.
Boolean Networks (BNs), and logical models in general, are widely adopted for the modelling of signalling pathways and of gene and transcription factor networks, as they do not require exact knowledge of the quantitative parameters of the selected molecular interactions. With BNs, the activity of components is caricatured as “off” and “on”, and their evolution is computed according to logical rules.
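As a minimal sketch of this idea, the following Python snippet encodes a toy BN and computes its evolution. The three components A, B and C, their rules, and the synchronous update mode (all components updated at once) are assumptions chosen for illustration; they are not taken from the project.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rules = Dict[str, Callable[[State], bool]]

# Hypothetical three-component network; names and rules are illustrative only.
RULES: Rules = {
    "A": lambda s: s["A"],                 # A sustains its own activity
    "B": lambda s: s["A"] and not s["C"],  # B is activated by A and inhibited by C
    "C": lambda s: s["B"],                 # C follows B
}


def step(rules: Rules, state: State) -> State:
    """Apply every logical rule to the same current state (synchronous update)."""
    return {name: rule(state) for name, rule in rules.items()}


def trajectory(rules: Rules, state: State, n_steps: int) -> List[State]:
    """Return the initial state followed by n_steps successive updates."""
    states = [state]
    for _ in range(n_steps):
        states.append(step(rules, states[-1]))
    return states


if __name__ == "__main__":
    initial = {"A": True, "B": False, "C": False}
    for s in trajectory(RULES, initial, 4):
        # Print each state as a string of 0/1 in the order A, B, C.
        print("".join(str(int(s[n])) for n in ("A", "B", "C")))
```

With this toy network, the negative feedback from C onto B makes the trajectory starting from A on, B off, C off return to its initial state after four updates.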
However, in practice, biological data still leave open a multitude of candidate BNs. Thus, arbitrary modelling choices have to be made, e.g., by prioritizing certain logics between regulators or by preferring the smallest/largest models, which may introduce biases in subsequent model predictions.
The BNeDiction project aims at providing a general methodology for making predictions from data on system structure and dynamics by means of ensembles of Boolean networks (BNs), an unexplored direction.
By focusing on ensembles of models, we aim at capturing the diversity of admissible models and at reducing biases due to the selection of a single “best” model according to arbitrary criteria. Similarly to random forest approaches, we will build ensembles of BNs representative of the whole multitude of admissible models, and then compute predictions from the ensemble.
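A minimal sketch of what computing a prediction from an ensemble could look like is given below. The two candidate models, the knock-out helper, the synchronous update and the aggregation by fraction of agreeing models are all assumptions made for illustration; the summary does not specify how the project samples ensembles or aggregates their predictions.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rules = Dict[str, Callable[[State], bool]]


def step(rules: Rules, state: State) -> State:
    """Synchronous update: every rule reads the same current state."""
    return {name: rule(state) for name, rule in rules.items()}


def knock_out(rules: Rules, node: str) -> Rules:
    """Return a copy of the model in which `node` is forced to stay off."""
    mutated = dict(rules)
    mutated[node] = lambda s: False
    return mutated


def ensemble_vote(ensemble: List[Rules], state: State, node: str, n_steps: int) -> float:
    """Fraction of models in which `node` is on after n_steps updates."""
    hits = 0
    for rules in ensemble:
        current = state
        for _ in range(n_steps):
            current = step(rules, current)
        hits += int(current[node])
    return hits / len(ensemble)


# Two hypothetical candidate models that differ only in the logic regulating C
# (A AND B versus A OR B); both are assumed compatible with the same data.
ENSEMBLE: List[Rules] = [
    {"A": lambda s: s["A"], "B": lambda s: s["A"], "C": lambda s: s["A"] and s["B"]},
    {"A": lambda s: s["A"], "B": lambda s: s["A"], "C": lambda s: s["A"] or s["B"]},
]

if __name__ == "__main__":
    initial = {"A": True, "B": False, "C": False}
    wild_type = ensemble_vote(ENSEMBLE, initial, "C", 2)
    mutant = ensemble_vote([knock_out(m, "B") for m in ENSEMBLE], initial, "C", 2)
    print(wild_type, mutant)
```

In this toy case both candidate models agree on the wild-type behaviour of C but disagree once B is knocked out, so the ensemble reports a split prediction instead of committing to the choice of a single model.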
Building on recent advances in the symbolic and implicit formal characterization of compatible models using logic programming, the key challenges relate to the sampling of ensembles of diverse models, and to the evaluation and maximization of their predictive power, with a thorough benchmarking of the pipeline.
Ensemble modelling has the potential to improve the robustness of predictions by accounting for potential model variability and uncertainty. We plan to demonstrate the BNeDiction pipeline for the ensemble modelling of the mouse hematopoietic system from single-cell RNA-seq differentiation data available for different mutant conditions. This type of data brings strong dynamical constraints for model inference, and the different mutant conditions can be split into training and testing data to evaluate the predictive power of ensemble modelling.
Overall, the project aims to deliver a convincing methodology for assessing the adequacy of automated logical modelling from experimental data, a key and recurring question at the intersection of artificial intelligence and life sciences.
Monsieur Loïc Paulevé (Laboratoire Bordelais de Recherche en Informatique)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
LaBRI Laboratoire Bordelais de Recherche en Informatique
ANR funding: 250,992 euros
Beginning and duration of the scientific project: February 2021 - 48 Months