CE45 - Mathematics and digital sciences for biology and health

Molecular and evolutionary principles governing enzyme regioselectivity – dEEPEN

Understanding the evolution of enzymes and predicting their function

Predicting enzyme function is a notoriously difficult problem with promising biotechnological applications. Here, we tackle this problem by studying both the molecular mechanisms and the evolutionary properties of a class of enzymes involved in the ubiquinone biosynthesis pathway. To that end, we propose an interdisciplinary approach that combines bioinformatics studies of evolution (comparative genomics), biophysical models of molecular dynamics and biochemical experiments.

Molecular modelling, phylogenomics and coevolution analysis to elucidate the functioning and evolution of the hydroxylases of the ubiquinone biosynthesis pathway

Our project deals with the prediction and design of enzymatic functions. To that end, we study a family of hydroxylases, the flavin monooxygenases (FMOs), which can carry out their hydroxylation reaction at different positions of the aromatic ring of ubiquinone (UQ), a molecule that is key to the respiratory chain. In particular, the UQ-FMO family is characterised by a broad diversity of regioselectivities: its enzymes can hydroxylate one, two or three positions of the UQ aromatic ring.

Our objective is to elucidate the molecular mechanisms responsible for this diversity of regioselectivity by combining molecular modelling, phylogenomics and statistical analyses of the coevolution of protein residues. In addition, using biochemical experiments, we will systematically test the predicted regioselectivities of natural, artificial and ancestral enzymes (enzyme resurrection).

Our analysis of FMO regioselectivities follows a hierarchical framework: we start from a subfamily of UQ-FMOs that is specific to alphaproteobacteria (the UbiLs), then progressively extend the study to the whole set of UQ-FMOs and finally to the whole set of FMOs. More specifically, one of our objectives is to identify the groups of residues responsible for this variation of regioselectivity, which may take the form of so-called “sectors” or of so-called “specificity-determining positions” (SDPs). We will then use this information to refactor enzymatic functions. As a proof of concept of the generality of our method, we will try to modify the regioselectivity of an FMO that is not associated with the UQ biosynthesis pathway.

A. Bioinformatics - evolutionary analyses of enzymes:

A.1. Development of a bioinformatics tool integrating 1) the various methods developed for the analysis of residue coevolution in proteins and 2) the possibility of applying multiple statistical weighting schemes. Application of these analyses to UbiL and other UQ-FMOs.

A.2. Development of a robust and precise scenario for the evolution of UbiL on the basis of a phylogenomics reconstruction of ~3000 sequences available in the NCBI database. Extension to all UQ-FMOs is under way.

B. Biochemistry - experimental tests of enzyme functions:

B.1. We experimentally tested the regioselectivities of more than 130 UQ-FMOs, including 80 UbiLs, using functional complementation assays.

C. Molecular modeling:

C.1. Development of an automated modelling pipeline that takes into account the multi-domain organisation of proteins of the FMO family. The pipeline was used to model 10 UbiLs from 5 organisms that are of interest for evolutionary analysis and whose regioselectivities were evaluated experimentally.

C.2. Molecular dynamics models of UbiL in complex with flavin adenine dinucleotide (FAD, the cofactor of the reaction) and 3 possible substrates. The binding modes of the substrate and of FAD were studied in detail for each enzyme and compared with those of PHBH (the reference structure of class A FMOs).

C.3. Comparison of the molecular flexibility of UbiLs using an approach that combines molecular dynamics and Protein Blocks analysis.

C.4. Prediction of UbiL SDPs based on a comparative analysis of the substrate binding site.
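The idea behind specificity-determining positions in C.4 can be illustrated with a minimal sketch. It is not the project's method: the scoring rule, the function name `sdp_scores` and the toy alignment are assumptions made here for illustration. The sketch scores each alignment column by how conserved it is within each functional group and how different the group consensus residues are between groups; gaps are treated as ordinary symbols.

```python
from collections import Counter


def sdp_scores(groups):
    """Score alignment columns for group-specific conservation.

    groups: dict mapping a group label (for instance a regioselectivity
            class) to a list of aligned sequences of equal length.
    Returns one score per column, in [0, 1]. A score of 1 means every
    group is perfectly conserved and no two groups share a consensus residue.
    """
    lengths = {len(s) for seqs in groups.values() for s in seqs}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()

    scores = []
    for col in range(n_cols):
        consensus, within = [], []
        for seqs in groups.values():
            residue, count = Counter(s[col] for s in seqs).most_common(1)[0]
            consensus.append(residue)
            within.append(count / len(seqs))  # conservation inside the group
        # Fraction of group pairs whose consensus residues differ.
        pairs = [(a, b) for k, a in enumerate(consensus) for b in consensus[k + 1:]]
        between = sum(a != b for a, b in pairs) / len(pairs) if pairs else 0.0
        scores.append(min(within) * between)
    return scores


# Toy alignment (hypothetical sequences and group labels).
toy = {"group_1": ["ACDF", "ACDF"], "group_2": ["ACEF", "ACEF"]}
print(sdp_scores(toy))
```

In the toy alignment only the third column separates the two groups, so it is the only column with a non-zero score. A real analysis would also need to account for phylogenetic relatedness between sequences, which this sketch ignores.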

Experimentally, we have established an exhaustive map of the functional variations (variations of regioselectivity) of a class of enzymes (UbiL) within a class of bacteria that is approximately 2 billion years old (alphaproteobacteria). Combined with a detailed phylogenomics analysis and a residue coevolution analysis of UbiL, these data give us precise knowledge of the events linked to the emergence and evolution of an enzymatic function (here, the hydroxylation of UQ). They also suggest a precise scenario for a part of the species tree that remains controversial within the phylogenetic community.

Using molecular modelling, we have identified six important residues that interact with the substrate (a UQ intermediate). Remarkably, five of these six residues belong to a sector identified by our analysis of residue coevolution. These results thus establish a bridge between the mechanistic details of regioselectivity and the evolutionary traces of regioselectivity variation that are observable in the sequences.
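One way to gauge how surprising such an overlap is would be a hypergeometric tail probability: the chance that at least five of six residues drawn at random from the protein fall inside the sector. The sketch below is an illustration only. The summary gives neither the protein length nor the sector size, so the values passed in the usage line are hypothetical placeholders, and the function name `overlap_pvalue` is an assumption.

```python
from math import comb


def overlap_pvalue(n_positions, sector_size, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected positions are drawn without
    replacement from n_positions, of which sector_size are in the sector."""
    if not 0 <= sector_size <= n_positions:
        raise ValueError("sector_size must lie between 0 and n_positions")
    if not 0 <= n_selected <= n_positions:
        raise ValueError("n_selected must lie between 0 and n_positions")
    total = comb(n_positions, n_selected)
    upper = min(sector_size, n_selected)
    tail = sum(
        comb(sector_size, k) * comb(n_positions - sector_size, n_selected - k)
        for k in range(max(n_overlap, 0), upper + 1)
    )
    return tail / total


# Hypothetical sizes: a 400-residue protein and a 40-residue sector.
p = overlap_pvalue(n_positions=400, sector_size=40, n_selected=6, n_overlap=5)
print(f"tail probability under random placement: {p:.2e}")
```

The test assumes that all positions are equally likely to contact the substrate, which is a simplification: residues near the active site are not a random sample of the protein.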

Concerning the evolutionary scenario of UQ-FMOs: publication of an article that integrates phylogenomics analyses and experimental results in order to propose the most likely evolutionary history of regioselectivity in UQ-FMOs since the UQ-producing ancestor of Proteobacteria.
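To make the notion of a "most likely evolutionary history" of a trait concrete, here is a minimal sketch of the bottom-up pass of Fitch parsimony, a classical way to infer candidate ancestral states on a fixed tree. The summary does not say which reconstruction method the project uses, so this is a stand-in; the tree, the leaf names and the state labels are hypothetical.

```python
def fitch(tree, leaf_states):
    """Bottom-up pass of Fitch parsimony on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping each leaf name to its observed state.
    Returns (set of candidate states at the root, minimal number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no change is needed here.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-species tree with hypothetical regioselectivity labels.
tree = (("sp_a", "sp_b"), ("sp_c", "sp_d"))
states = {"sp_a": "three", "sp_b": "three", "sp_c": "three", "sp_d": "one"}
print(fitch(tree, states))
```

Parsimony ignores branch lengths, gene duplications and losses; it is shown here only because it is short enough to read in full.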

Concerning coevolution analysis methods: setting up of a modular open-source software package that combines the different available coevolution metrics with the different available residue-weighting procedures. The goal is to enable a systematic comparison of the results obtained with the different methods, with a first application to FMOs.
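The modular design described above, in which metrics and weighting procedures can be swapped independently, might be sketched as follows. This is not the project's software: all function names are hypothetical, mutual information stands in for "a coevolution metric", and identity-based sequence reweighting (a common choice in coevolution analyses) stands in for "a weighting procedure", since the summary does not specify either.

```python
from collections import defaultdict
from math import log


def uniform_weights(msa):
    """Every sequence counts equally."""
    return [1.0] * len(msa)


def identity_weights(msa, threshold=0.8):
    """Down-weight redundant sequences: w = 1 / (number of sequences that are
    at least `threshold` identical to this one, itself included)."""
    length = len(msa[0])
    weights = []
    for s in msa:
        neighbours = sum(
            sum(a == b for a, b in zip(s, t)) / length >= threshold for t in msa
        )
        weights.append(1.0 / neighbours)
    return weights


def mutual_information(msa, weights, i, j):
    """Weighted mutual information (in nats) between columns i and j."""
    total = sum(weights)
    p_i, p_j, p_ij = defaultdict(float), defaultdict(float), defaultdict(float)
    for s, w in zip(msa, weights):
        p_i[s[i]] += w / total
        p_j[s[j]] += w / total
        p_ij[(s[i], s[j])] += w / total
    return sum(p * log(p / (p_i[a] * p_j[b])) for (a, b), p in p_ij.items())


def coevolution_scores(msa, metric=mutual_information, weighting=uniform_weights):
    """Apply any metric under any weighting scheme to all column pairs."""
    weights = weighting(msa)
    n_cols = len(msa[0])
    return {
        (i, j): metric(msa, weights, i, j)
        for i in range(n_cols)
        for j in range(i + 1, n_cols)
    }


# Toy alignment (hypothetical): columns 0 and 1 vary together.
msa = ["AC", "AC", "GT", "GT"]
print(coevolution_scores(msa))
print(coevolution_scores(msa, weighting=identity_weights))
```

Because `metric` and `weighting` are plain function arguments, comparing methods systematically amounts to looping over the available combinations.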

Regarding molecular modeling:
- Modeling of reaction intermediates with UbiL.
- Study of the regioselectivity of Ubi by integrating molecular modeling data and sequence analysis (publication in preparation)
- Study of the role of molecular flexibility and dynamics in the recognition of both substrates and FAD in class A FMOs.

Teppa E., Launay R., de Brevern A., Junier I., Abby S., Pierrel F., Esque J., André I. Structural and sequence investigations of regioselectivity and substrate specificity in a family of enzymes: the case study of ubiquinone biosynthesis hydroxylases. Journées du GDR Bioinformatique Moléculaire, GT MASIM, November 25-26, 2021, Lyon.

Launay R., Teppa E., Martins C., Abby S., Pierrel F., André I., Esque J. Characterization and functional comprehension of an enzymatic assembly: the ubiquinone metabolon from Escherichia coli. Journées du GDR Bioinformatique Moléculaire, GT MASIM, November 25-26, 2021, Lyon.

Teppa E., Launay R., Esque J., André I. Enzyme specificity and regioselectivity in flavin-containing monooxygenases. Seminars of the A2B2C, April 7, 2021.

« A complex scenario of gene duplications and losses explains the evolution of hydroxylases’ regio-selectivity within the ubiquinone biosynthetic pathway ». Groupe Français de Bioénergétique, September 21-24, 2021.

The exploitation of evolutionary information, and more particularly of residue coevolution, has revolutionized protein structure prediction. Adapting the methods derived from these analyses to the prediction and design of enzymatic functions remains an open problem. Enzymatic functions indeed depend on the internal dynamics of proteins, which are difficult to model and to study experimentally. In this project, we tackle this problem by studying in detail the capacity of certain hydroxylases of the flavin-containing monooxygenase (FMO) protein family to carry out their reaction at different positions of the aromatic ring of ubiquinone (UQ), a molecule key to the production of cellular energy. More precisely, the UQ biosynthesis pathway involves three hydroxylation reactions occurring on three carbons of the UQ aromatic ring. Partner 1 has shown that different proteobacterial species use different combinations of (UQ-)FMOs to hydroxylate these three positions: some bacteria use a single enzyme able to hydroxylate all three positions, while other bacteria use three distinct enzymes that each hydroxylate a single position. The UQ-FMO family is thus characterized by a broad diversity of regioselectivities, with enzymes capable of hydroxylating one, two or three positions of the UQ aromatic ring. In this context, our objective is to develop a methodology that combines molecular modeling (Partner 2) and evolutionary information on enzymes (Partner 1) to elucidate the molecular mechanisms underlying this diversity of regioselectivities.

Our preliminary results show that in alphaproteobacteria, a family of homologous enzymes named UbiL displays the entire diversity of UQ-FMO regioselectivities, with UbiLs hydroxylating one, two or three positions in different organisms. In addition, an analysis of amino acid coevolution suggests that this diversity is controlled by a sector, i.e., a network of coevolving residues that are connected in 3D space (here forming a cavity around the active site). In this context, our first objective is to decipher the molecular mechanisms responsible for the variations of UbiL regioselectivity. To this end, we will combine molecular modeling (Partner 2), phylogenomics (Partner 1) and statistical analyses of amino acid coevolution (Partner 1). Moreover, the predicted regioselectivities of natural, artificial and ancestral enzymes (we will resurrect the latter) will be systematically tested using biochemistry experiments (Partner 1). Next, we will apply our methodology to the full set of UQ-FMOs (~1000 sequences) in order to trace the evolution of the mechanisms associated with the hydroxylation steps of the UQ pathway. Finally, we will analyze the entire family of FMOs (~18000 sequences) in order to recapitulate the evolution of this protein family by identifying both commonalities and differences between metabolic pathways. In particular, our objective is to identify the sector(s) responsible for the variations of regioselectivity of UQ-FMOs and, more generally, of FMOs, and to use this information to refactor enzymatic functions. In this regard, as a proof of concept of the generality of our methodology, we will aim to modify the regioselectivity of an FMO unrelated to the UQ pathway.

Altogether, this interdisciplinary project aims to integrate molecular modeling (from 3D modeling of enzymes to the analysis of the internal dynamics of the enzyme in interaction with its substrate) and evolutionary information (from the reconstruction of the evolutionary history of metabolic pathways to the coevolution of amino acids) for enzymes whose function can be tested at the bench using biochemistry experiments. The goal is to improve our understanding of the functioning and evolution of enzymes, and to propose novel principles for enzyme design.

Project coordination

Ivan JUNIER (Techniques de l'Ingénierie Médicale et de la Complexité - Informatique, Mathématiques et Applications, Grenoble)

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.


TIMC-IMAG Techniques de l'Ingénierie Médicale et de la Complexité - Informatique, Mathématiques et Applications, Grenoble

ANR funding: 447,191 euros
Start and duration of the scientific project: February 2020, 42 months
