Single-Molecule Analysis of Human Genome Replication – SMAHGR
DNA replication is one of the most vital biological processes. The faithful duplication of a cell’s genome is critical to preserve cell identity, genome integrity, and cell cycle progression. Defects in DNA replication are associated with developmental diseases, aging and cancer. Our goal is to depict a complete replication landscape of the human genome at the single-molecule (SM) level. We will use nanopore sequencing, artificial intelligence (AI) and kinetic modeling tools recently developed and validated in our consortium, as well as CRISPR-Cas9 genome editing technology, to precisely quantify the rates of replication initiation, fork progression and replication termination across the whole human genome in multiple cell lines, and to decipher the genetic and epigenetic determinants of these processes.
During origin licensing, in the G1 phase of the cell cycle, the MCM replicative helicase is loaded in an inactive form around double-stranded DNA. Origin activation (firing) can take place at different times throughout S phase, through the binding of factors that activate the replicative helicase and recruit DNA polymerases and accessory factors for processive DNA synthesis. Only a fraction of the loaded MCMs lead to productive initiation, while the rest are inactivated by passive replication from active origins. The variable origin firing time and fork progression rate result in a distinct temporal order in which genome segments replicate, termed the spatiotemporal program of DNA replication. This program is developmentally regulated, is correlated with transcription, chromatin structure and genome evolution, and can be altered in disease. Cell population techniques previously identified thousands of initiation and termination zones (IZs and TZs) in human cell lines. However, these techniques only reveal average tendencies and mask cell-to-cell heterogeneity.
In contrast, SM techniques can reveal both common and rare initiation and termination events and can measure the progression rate of single replication forks. We recently developed a high-resolution, nanopore-sequencing-based SM technique (FORK-seq) that allowed us to determine the position and orientation of hundreds of thousands of replication forks in the yeast S. cerevisiae. We rediscovered the known replication origins and termini of yeast, observed dispersive initiation and termination events previously missed by cell population techniques, and provided the first genome-wide map of individual fork speeds. Our working model for the human genome postulates that replication initiates within master IZs detected by cell population techniques, then propagates by dispersed initiation that is only detectable by SM analysis. We will use FORK-seq in human cells to test this scenario and to explore the cell-to-cell and locus-to-locus variability in replication initiation, fork progression and termination.
Using kinetic modeling and AI techniques, we have been able to extract from cell population replication profiles an initiation probability landscape (IPLS) that, when inserted into our replication simulation framework, predicts such profiles almost exactly. The IPLS therefore contains all the information about the replication program at high resolution. We will refine this IPLS with FORK-seq data, and use AI to predict it from DNA sequence and epigenetic profiles alone. Using explainable AI tools, we will estimate at each locus the activating or repressing contribution of proximal and distal (epi)genetic signals, and thereby understand how multiple types of signals combine to produce the IPLS. Similar tools will be used to elucidate the determinants of the fork progression landscape. CRISPR-Cas9 genome editing will be used to confirm the role of the candidate determinants thus identified.
This work will represent a major contribution to the scientific understanding of a fundamental biological process that has resisted decades of investigation and that has numerous medical, technological and societal implications.
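To make the idea of an initiation probability landscape more concrete, the following is a minimal toy sketch, not the consortium's simulation framework. It assumes a genome discretized into bins, a per-bin, per-time-step firing probability (`init_prob`), and forks that advance a fixed number of bins per step in both directions until they meet another fork or a chromosome end. The function names, the discretization and all parameters are hypothetical and chosen only for illustration.

```python
import random


def simulate_replication(init_prob, fork_speed=1, rng=None, max_steps=100_000):
    """Simulate one molecule; return the time step at which each bin replicates.

    init_prob[i] is the (assumed) probability that unreplicated bin i fires
    during one time step. Forks move fork_speed bins per step.
    """
    if rng is None:
        rng = random.Random()
    n = len(init_prob)
    rep_time = [None] * n
    replicated = [False] * n
    forks = []  # list of (position, direction) with direction in {-1, +1}
    done = 0
    t = 0
    while done < n:
        t += 1
        if t > max_steps:
            raise RuntimeError("genome not fully replicated; is init_prob all zero?")
        # Fork progression: a fork stops at a chromosome end or when it
        # reaches already replicated DNA (termination / passive replication).
        surviving = []
        for pos, direction in forks:
            alive = True
            for _ in range(fork_speed):
                pos += direction
                if pos < 0 or pos >= n or replicated[pos]:
                    alive = False
                    break
                replicated[pos] = True
                rep_time[pos] = t
                done += 1
            if alive:
                surviving.append((pos, direction))
        forks = surviving
        # Stochastic initiation: only unreplicated bins can still fire.
        for i in range(n):
            if not replicated[i] and rng.random() < init_prob[i]:
                replicated[i] = True
                rep_time[i] = t
                done += 1
                forks.append((i, -1))
                forks.append((i, +1))
    return rep_time


def mean_replication_timing(init_prob, n_molecules=200, seed=0):
    """Average single-molecule replication times into a population-like profile."""
    rng = random.Random(seed)
    totals = [0.0] * len(init_prob)
    for _ in range(n_molecules):
        for i, t in enumerate(simulate_replication(init_prob, rng=rng)):
            totals[i] += t
    return [total / n_molecules for total in totals]
```

In this toy model, a landscape with one strong peak gives nearly identical molecules, whereas a broad, low landscape gives initiation sites that differ from molecule to molecule while the averaged profile stays smooth. This is the kind of heterogeneity that population averages mask and single-molecule data can expose.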
Project coordination
Olivier HYRIEN (Institut de biologie de l'Ecole Normale Supérieure)
The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.
Partners
LPENSL LABORATOIRE DE PHYSIQUE DE L'ENS DE LYON
IBENS Institut de biologie de l'Ecole Normale Supérieure
ANR funding: 664,889 euros
Beginning and duration of the scientific project: October 2023 - 48 months