DS0401 - 2016

In vivo G-Quadruplexes Topologies and Identification of Proteins Partners – G4-TopIPro

Design of constrained G4 DNA and identification of partner proteins

We propose to use G-quadruplex systems constrained in a single topology to identify and characterize the proteins that interact with a well-defined G-quadruplex topology. We also propose to use this concept to generate antibodies specific to a given G-quadruplex topology, in order to determine which G-quadruplex topologies are adopted in vivo.

The aim of the project is to design constrained G-quadruplex DNA systems in order to overcome the polymorphism problems inherent to G4 DNA.

G-quadruplex (G4) DNA forms when guanine-rich nucleic acids assemble into four-stranded structures through the association of guanines into tetrads, held together by Hoogsteen-type hydrogen bonds, and the stacking of those tetrads. Numerous studies demonstrate the important biological role of G4s, which are now considered a target of interest for drug design.
A major characteristic of G4s is their polymorphic nature: they can adopt different conformations in which the strands are parallel or antiparallel, with different types of loops (lateral, diagonal or propeller) of varying lengths. This structural polymorphism, together with the dynamic equilibrium between the single-stranded form and the G4 structure, is a serious bottleneck for studying how biological components recognize G4s, and thus limits structure-activity relationship studies of the proteins that interact with them.
Our objectives are to overcome these problems by using constrained G4 DNA to identify and characterize proteins that interact with a stable, well-defined G4 topology, and to produce antibodies specific to a given topology.

We have developed a biomolecular system based on a peptide scaffold onto which the G4-forming oligonucleotide sequences are anchored. This strategy makes it possible to constrain the G4 in a single, predetermined topology, thus preventing the polymorphism inherent to G4 DNA. These systems are then attached, via a biotin anchored on the peptide template, to streptavidin-functionalized magnetic beads, allowing the "fishing" technique to be used to capture proteins from a cell extract. The captured proteins are then identified by proteomic analysis. Controls are also used to discriminate between proteins selective for the G4 itself and those selective for a component of the G4 (e.g. a loop). The proof of concept of this strategy was established using proteins known to interact with G4 DNA, visualized on gel by Western blot.

The synthesis of more sophisticated biomolecular structures with a single topology has been carried out using successive chemoselective ligations, which proved to be mutually compatible. Protein capture on the beads, followed by proteomic analysis, made it possible to demonstrate a degree of selectivity of various proteins for G4s in the parallel topology versus G4s in the antiparallel topology. In addition, we have identified new protein complexes that interact with G4s, including the NELF (Negative Elongation Factor) complex, which controls the "pausing" of RNA Polymerase II during transcription.

The various ligations developed will be exploited for the development of other tetrameric DNA systems, in particular G4 DNAs carrying internal loops.
The different systems developed are used to study interactions with putative ligand molecules. In this context, we are collaborating with the Institut Curie in Orsay (M.-P. TEULADE-FICHOU team) to study G4 ligands developed by that team. In addition, these constrained G4 systems enabled us to initiate an international collaboration with the group of B. ELIAS at the University of Louvain-la-Neuve, in which we are studying the interaction of G4s with photoactivatable metal complexes of ruthenium and iridium. We were thus able to demonstrate a good affinity of certain complexes for G4 DNA. These molecules have also shown good photo-cytotoxicity and are currently being used in studies on small animal models.
Another interesting prospect on the chemistry side concerns the design of tetramolecular DNA of the i-motif type. We have demonstrated that the constraint exerted by the cyclodecapeptide template stabilizes this particular motif at physiological pH. This system can thus be exploited to search for selective i-motif ligands, and also to identify proteins that interact with i-motif DNA; the most innovative point is that these studies can be carried out at physiological pH.
At the biological level, in addition to identifying factors known to interact directly with G4s or to be associated with their biology, we have also identified new complexes whose cellular impact through their action on G4s remains to be studied. We are currently validating a number of these factors in cells using CRISPR/Cas9 approaches.

A first publication (Chem. Eur. J. 2017, 23, 5602) describes the use of three successive chemical ligations to build a G4 system derived from the HIV genome. A second publication (Org. Biomol. Chem., 2020, 18, 6394) describes the use of four successive ligations to synthesize tetrameric DNA. This work was the subject of Alexandre Devaux's thesis, defended on July 1, 2020 at the University of Grenoble Alpes. Another publication (under revision at Scientific Reports) describes the use of constrained systems for the capture of proteins; it has been deposited on bioRxiv (https://doi.org/10.1101/2021.04.06.438633). Finally, Angélique Pipier's thesis on this project was defended on October 15, 2020 at the University of Toulouse 3.

The double-helical structure of DNA, in which two antiparallel strands are held together through canonical A/T and G/C base pairing, was established over half a century ago. However, beyond double-helical structures, the past decades have brought accumulating evidence of the existence and biological relevance of four-stranded nucleic acid structures, including G-quadruplexes and i-motifs. G-quadruplex nucleic acids (G4-DNA and G4-RNA) are formed from G-rich sequences through stacking of tetrads of Hoogsteen hydrogen-bonded guanines connected by various loop-forming sequences, and are stabilized by physiologically abundant K+ and Na+ cations. Sequencing and bioinformatics analyses of the human genome indicate that it contains as many as 700,000 sequences with the potential to form stable G-quadruplex structures. G4-DNA-forming sequences are found at telomeres, where they are involved in the regulation of telomerase, a reverse transcriptase that is upregulated in 80% of human tumours, and in the promoter regions of a large number of genes, including the proto-oncogenes c-MYC, c-KIT, and KRAS. G4-RNAs have also been reported to have biological functions: they are involved in processes such as translation regulation, pre-mRNA processing, mRNA targeting and telomere maintenance. More recently, intermolecular hybrid DNA:RNA quadruplex structures (HQ) have also attracted increasing interest, as they appear to be involved in transcription regulation. It has been reported that certain pathologies or chronic diseases due to cell dysfunction might involve G-quadruplexes. G4 formation has been linked to genetic disorders (diabetes, fragile X disorder, Bloom syndrome), age-related degenerative illness (ALS, FTD), cancer (telomere, MYC, Kit, BCL-2) and viral infection (EBV, HIV, HPV), and these motifs are now considered an emerging class of drug targets.
A major characteristic of G-quadruplex nucleic acid structures is their intrinsic polymorphism: depending on the length, sequence, medium and cation concentration, intramolecular G-quadruplexes show distinct structural topologies in which the strands are in parallel or antiparallel conformations, with the co-existence of different types of loops (lateral, diagonal or propeller) of variable lengths. This structural polymorphism is a serious bottleneck for studies of G-quadruplex recognition by biological components and impedes global studies of structure-activity relationships of G4-interacting proteins. Furthermore, in vitro structural studies cannot be fully transposed to the complex intracellular medium, and thus the topologies adopted by G-quadruplexes in vivo remain largely unknown.
In this context, the main objectives of this proposal are the design and synthesis of a panel of constrained G-quadruplex topologies of structural and biological relevance, using a template-assisted strategy (task 2). These constrained systems are original tools that will be used to identify and characterize proteins interacting with a single, well-defined G-quadruplex topology. This will provide a database of hits that could help the scientific community better understand how cellular proteins interact with, modulate and are regulated by G-quadruplex structures (task 3). In addition, we will attempt to investigate the topology of G-quadruplexes in vivo by producing antibodies specific to a given G-quadruplex topology (task 4).

Project coordination

Eric DEFRANCQ (Département de Chimie Moléculaire)

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.

Partnership

DCM UMR 5250 Département de Chimie Moléculaire
MNHN-U1154-CNRS UMR7196 Laboratoire Structure et Instabilité des Génomes
IPBS Institut de Pharmacologie et Biologie Structurale

ANR funding: 494,586 euros
Beginning and duration of the scientific project: September 2016 - 36 Months
