Thanks to major technical and financial efforts, high-quality X-ray diffraction datasets at ultra-high resolution are now available for macromolecules (mainly proteins), and their number will increase in the coming years. At present, no method can fully exploit the wealth of information contained in these new datasets. To overcome this drawback, new quantum mechanics-based techniques are needed to refine protein structures successfully.
The main goal of the project is to propose a new set of fast computational tools for routinely refining crystal structures and electron densities of macromolecules (particularly proteins) when sub-atomic (d = 1.0 Å) or ultra-high (d = 0.7 Å) resolution X-ray diffraction data are available. This responds to the increasing availability of highly intense synchrotron sources and to the establishment of new X-ray free electron laser facilities, which will deliver X-ray diffraction data of ever higher resolution and quality for macromolecules in the very near future. At the moment, the development of fast and efficient routine methods and software is not keeping up with the impressive technological and experimental advances in macromolecular crystallography. As a result, the information contained in the collected X-ray diffraction data is not always fully and efficiently exploited to provide detailed and definitive descriptions of the large systems under study.
To fill this gap, this project aims to develop a new set of computationally advantageous quantum mechanical techniques for routine and successful refinements of macromolecular crystal structures. The methodological improvements of this project will provide details of protein structures and electron densities at an unprecedented level of accuracy. This will shed further light on the modes of action of many biological molecules, with direct repercussions on the improvement or development of efficient therapies for human diseases. In fact, the new computational strategies could be used as routine techniques to provide protein structures and electron densities that constitute more realistic targets for a more rational design of new drugs.
To achieve the main goals of the project, two methods are mainly used: the Hirshfeld Atom Refinement (HAR) technique and the approach of the Extremely Localized Molecular Orbitals (ELMOs).
The Hirshfeld Atom Refinement method is an emerging refinement technique of quantum crystallography that relies heavily on tailor-made quantum chemistry calculations. Recent investigations conducted on a large number of crystal structures of organic molecules have shown that HAR is able to locate hydrogen atoms very precisely and accurately, providing bond lengths that agree with results obtained from the refinement of neutron diffraction data, mostly within a single standard deviation. Nevertheless, the application of HAR to macromolecules is hampered by the fact that the computational cost of the technique increases significantly with the size of the system under study, because a quantum chemical calculation is necessary at each step of the process. The research program proposed in this project aims to overcome this drawback by extending the applicability of HAR to macromolecules and, in particular, to proteins.
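The partitioning idea behind HAR can be illustrated with a deliberately simplified sketch. In the Hirshfeld (stockholder) scheme, each atom receives a share of the molecular electron density proportional to the weight of its spherical free-atom density in the sum of all free-atom densities, and HAR uses the resulting aspherical atoms to compute the scattering factors. The code below is only a one-dimensional toy under stated assumptions: the Gaussian "free atom" densities, the function names `hirshfeld_weights` and `partition_density`, and the model density are all hypothetical and are not taken from any HAR program.

```python
import math

def hirshfeld_weights(x, centers, widths):
    """Stockholder weights w_A(x) = rho0_A(x) / sum_B rho0_B(x).

    Assumption: each spherical free-atom density rho0_A is a simple
    Gaussian centred on the atom (a toy model, not real atomic data).
    """
    free = [math.exp(-((x - c) / s) ** 2) for c, s in zip(centers, widths)]
    total = sum(free)
    return [f / total for f in free]

def partition_density(rho_mol, grid, centers, widths):
    """Split a density sampled on `grid` into one piece per atom."""
    atoms = [[] for _ in centers]
    for x, rho in zip(grid, rho_mol):
        for a, w in enumerate(hirshfeld_weights(x, centers, widths)):
            atoms[a].append(w * rho)
    return atoms

# Toy "diatomic molecule" on a 1D grid from -3 to 5 (arbitrary units).
grid = [-3.0 + 0.1 * i for i in range(81)]
centers = [0.0, 1.5]
widths = [1.0, 0.7]

# Hypothetical molecular density: two atomic peaks plus a small
# bonding contribution between the atoms.
rho_mol = [
    math.exp(-x ** 2)
    + math.exp(-((x - 1.5) / 0.7) ** 2)
    + 0.2 * math.exp(-((x - 0.75) / 0.4) ** 2)
    for x in grid
]

atoms = partition_density(rho_mol, grid, centers, widths)

# The atomic pieces add up to the molecular density at every grid point.
max_error = max(
    abs(a0 + a1 - rho) for a0, a1, rho in zip(atoms[0], atoms[1], rho_mol)
)
print(max_error)
```

Because the weights sum to one at every point, no density is lost or double-counted, and the bonding contribution is shared between the two atoms. In an actual refinement the molecular density comes from a quantum chemical calculation that is repeated at each step, which is the cost the project sets out to reduce.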
To accomplish this task, we aim to couple HAR with quantum mechanical methods based on Extremely Localized Molecular Orbitals. ELMOs are molecular orbitals strictly localized on small molecular units (e.g., atoms, bonds or functional groups) and, due to their extreme localization, they can be easily and unambiguously assigned to small molecular fragments. For this reason, ELMOs can also be easily transferred from molecule to molecule to instantaneously reconstruct wavefunctions and electron densities of very large systems.
In this project, the reliable and convenient transferability of the ELMOs is exploited to develop new computationally advantageous and/or multi-scale embedding techniques of quantum chemistry that can be easily coupled with HAR to perform the refinement of protein crystallographic structures.
In particular, the transferability of the ELMOs was exploited i) to construct libraries of extremely localized molecular orbitals, which allow instantaneous reconstructions of approximate wavefunctions and electron densities of macromolecules, and ii) to develop the new multi-scale embedding approach QM/ELMO, where the most important part of the system under study is treated at a fully quantum mechanical level while the rest is described through transferred and frozen extremely localized molecular orbitals.
These are the main results obtained during the project:
1) Libraries of Extremely Localized Molecular Orbitals. The recently constructed libraries of Extremely Localized Molecular Orbitals are useful new tools to rapidly obtain approximate wavefunctions and electron densities of biological molecules (for example, polypeptides and proteins). Beyond structural refinements, the new databanks can also find applications in other fields of physical chemistry, such as the detection of non-covalent interactions in biosystems through the recently developed NCI-ELMO and IGM-ELMO techniques, which could be very useful in the rational design of new drugs, in supramolecular chemistry and in crystal engineering.
2) QM/ELMO method. This is a new multi-scale embedding technique that allows fully quantum mechanical studies of macromolecules at a reduced computational cost. The novel approach makes it possible to treat crucial regions of very large systems (e.g., active sites of proteins) through traditional quantum chemistry methods for ground and excited states, while the remaining parts of the molecules are described by means of frozen extremely localized molecular orbitals (ELMOs) previously transferred from the recently constructed libraries. The new multi-scale QM/ELMO approach can also be considered a step forward compared to the popular Quantum Mechanics/Molecular Mechanics (QM/MM) techniques, with the ELMO databanks playing the role of quantum mechanical force fields.
3) HAR-ELMO technique: quantum crystallographic refinements of proteins. This new refinement technique combines the precision and accuracy of the Hirshfeld Atom Refinement (HAR) with the speed of the ELMO libraries, with the final goal of extending the applicability of HAR to macromolecules. The newly developed strategy made it possible to perform the first ever (fast) quantum crystallographic refinements of polypeptides and proteins and, in all the investigated cases, the results showed that the new HAR-ELMO approach provides accurate structural parameters for all atoms, including hydrogen atoms. The new HAR-ELMO technique opened a new line of research in protein crystallography.
4) HAR-QM/ELMO approaches. The Hirshfeld atom refinement was also interfaced with the multi-scale embedding approach QM/ELMO with two different goals: in one case, to perform increasingly accurate refinements of crucial regions in macromolecules (e.g., active sites in proteins); in the other, to improve the performance of HAR by describing the chemical environment at a fully quantum mechanical level in the quantum chemical calculations underlying each refinement.
5) XC-QM/ELMO strategy. This method refines the wavefunction for a subset of electrons in a molecular system. It took the form of the X-ray constrained spin-coupled technique, where a large number of molecular orbitals are kept frozen, while only a few are refined against X-ray diffraction data.
The ideas developed within this project will mainly have a direct and significant impact on fundamental science. Starting from sub-atomic or ultra-high resolution X-ray diffraction data collected for crystals of macromolecules, the new and efficient computational tools proposed in this project will make it possible to extract structural details at an unprecedented level of insight. In particular, the long-standing problem of accurately determining hydrogen-atom positions in macromolecular crystallography will be solved, so that the protonation states of crucial protein residues can be determined with significantly reduced uncertainty. The application of the newly developed strategies will therefore provide detailed information that will be vital both to elucidate protein functions and to gain increasingly detailed insights into basic biological and biochemical mechanisms and into disease pathways.
Furthermore, since the new methods will provide unprecedented information on the modes of action of biological molecules, the research conducted in this project will also have direct and important implications for society. Gaining more insight into the functions of biomolecules will have important repercussions on the fine-tuning of efficient therapies for human diseases. In particular, since the new refinement tools will provide structural and electronic information on proteins at a level that was not imaginable only 4 or 5 years ago, it will be possible to access protein structures and charge distributions that constitute more realistic targets and starting points for designing new and more efficient drugs. In other words, the new computational methods developed within the present project could become routine and efficient tools for a more rational drug design, to be used by researchers working both in academia and in pharmaceutical companies.
The results of the project were regularly published in international peer-reviewed journals (24 papers, plus 3 in preparation, and 2 book chapters) and presented to the scientific community at international and national conferences (22 oral communications and 6 posters) and in 6 seminars. Some programs were also developed with the intention of releasing them free of charge at the end of the project. One Ph.D. thesis was also closely connected to the research carried out in this project.
Most representative publications:
* B. Meyer, A. Genoni, Libraries of Extremely Localized Molecular Orbitals. 3. Construction and Preliminary Assessment of the New Databanks, J. Phys. Chem. A 122, 8965-8981, 2018.
* D. Arias-Olivares, E. K. Wieduwilt, J. Contreras-García, A. Genoni, NCI-ELMO: a New Method to Quickly and Accurately Detect Non-Covalent Interactions in Biosystems, J. Chem. Theory Comput. 15, 6456-6470, 2019.
* E. K. Wieduwilt, J.-C. Boisson, G. Terraneo, E. Hénon, A. Genoni, A Step toward the Quantification of Non-Covalent Interactions in Large Biological Systems: The Independent Gradient Model-Extremely Localized Molecular Orbital Approach, J. Chem. Inf. Model. 61, 795-809, 2021.
* G. Macetti, A. Genoni, Quantum Mechanics/Extremely Localized Molecular Orbital Method: a Fully Quantum Mechanical Embedding Approach for Macromolecules, J. Phys. Chem. A 123, 9420-9428, 2019.
* G. Macetti, E. K. Wieduwilt, X. Assfeld, A. Genoni, Localized Molecular Orbital-Based Embedding Scheme for Correlated Methods, J. Chem. Theory Comput. 16, 3578-3596, 2020.
* G. Macetti, A. Genoni, Quantum Mechanics / Extremely Localized Molecular Orbital Embedding Strategy for Excited States: Coupling to Time-Dependent Density Functional Theory and Equation-of-Motion Coupled Cluster. J. Chem. Theory Comput. 16, 7490-7506, 2020.
* G. Macetti, E. K. Wieduwilt, A. Genoni, QM/ELMO: A Multi-Purpose Fully Quantum Mechanical Embedding Scheme Based on Extremely Localized Molecular Orbitals, J. Phys. Chem. A 125, 2709-2726, 2021.
* L. A. Malaspina, E. K. Wieduwilt, J. Bergmann, F. Kleemiss, B. Meyer, M. F. Ruiz-López, R. Pal, E. Hupf, J. Beckmann, R. O. Piltz, A. J. Edwards, S. Grabowsky, A. Genoni, Fast and Accurate Quantum Crystallography: from Small to Large, from Light to Heavy, J. Phys. Chem. Lett. 10, 6973-6982, 2019.
* E. K. Wieduwilt, G. Macetti, A. Genoni, Climbing Jacob’s Ladder of Structural Refinement: Introduction of a Localized Molecular Orbital-Based Embedding for Accurate X-ray Determinations of Hydrogen Atom Positions. J. Phys. Chem. Lett. 12, 463-471, 2021.
* A. Genoni, D. Franchini, S. Pieraccini, M. Sironi, X-ray Constrained Spin-Coupled Wavefunction: a New Tool to Extract Chemical Information from X-ray Diffraction Data, Chem. Eur. J. 24, 15507-15511, 2018.
For more information, please visit the website alessandrogenoni.weebly.com
Owing to large investments in the construction of more and more facilities producing intense synchrotron radiation and X-ray free electron lasers, sub-atomic and high-resolution X-ray diffraction datasets for macromolecules have started to appear, and their number will increase significantly within a few years. The information content of these experimental datasets will be important to obtain structural and electron density details of biological molecules and, consequently, to gain fundamental insights into their functions. Nevertheless, the impressive experimental and technological advances have not been matched by an equally substantial and solid development of computational methods and software able to fully exploit the wealth of information contained in the high-resolution X-ray datasets. The research program presented in this proposal aims to fill this gap. In particular, our goal is to devise and implement a new set of fast and efficient tools based on quantum mechanics that will be routinely used to refine structures and electron densities of proteins.
To achieve this goal, we mainly aim to extend the applicability of the Hirshfeld Atom Refinement (HAR) to macromolecules. HAR is an emerging technique able to locate the positions of hydrogen atoms with the same precision and accuracy obtained from neutron diffraction measurements. This holds true even when X-ray diffraction data at resolutions as low as 0.8 Å are used. Protein refinements therefore come within reach of HAR. Nevertheless, since a quantum mechanical calculation is necessary at each step of the refinement, the current version of HAR is too computationally expensive to be applied directly to proteins. Our aim in this project will be to overcome this drawback by coupling HAR with novel linear-scaling and multi-scale strategies based on the reliable transferability of the Extremely Localized Molecular Orbitals (ELMOs), which are molecular orbitals strictly localized on small molecular subunits (e.g., atoms, bonds or functional groups). After being computed on suitable model molecules and stored in proper libraries, the ELMOs can be regarded as elementary electronic LEGO building blocks that allow the instantaneous reconstruction of wave functions and electron densities of large systems.
To extend the applicability of HAR, the preliminary steps will be i) the construction of universal ELMO databanks covering all the possible functional groups of the twenty natural amino acids in all their possible protonation states and ii) the development of an original multi-scale QM/ELMO strategy, through which fundamental regions of a macromolecule will be treated at a high quantum chemical level of theory, while the rest of the system will be described by means of frozen (transferred) ELMOs. Afterwards, in the crucial part of the project, the ELMO libraries and the QM/ELMO technique will be coupled to HAR. Finally, the QM/ELMO method will also be extended to the framework of the Jayatilaka X-ray constrained wave function approach to further refine wave functions and electron densities of important protein regions by properly taking into account ultra-high resolution X-ray diffraction data.
The methodological improvements proposed in this project will provide details of protein structures and electron densities at an unprecedented level of accuracy. This will shed further light on the modes of action of many biological molecules, with direct repercussions on the improvement or development of efficient therapies for human diseases. In fact, the new computational tools could be used as routine techniques to provide protein structures and electron densities that constitute more realistic targets for a more rational design of new drugs.
Mr. Alessandro Genoni (Structure et Réactivité des Systèmes Moléculaires Complexes)
The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.
SRSMC Structure et Réactivité des Systèmes Moléculaires Complexes
ANR funding: 163,221 euros
Beginning and duration of the scientific project: January 2018, 36 months