Combining next generation SequencIng and droplet-based Microfluidics for the high throughput statistical analysis of Bio-molecule ADaptability – SIMBAD
We aim to identify the genotypic parameters that control protein evolvability, i.e. the potential of proteins to adapt to varying selection pressure by optimizing their function and/or acquiring new functions through Darwinian evolution. Measuring the evolvability of a protein of interest requires measuring, as a function of time and under many known selection pressures, the fitness of the proteins derived from it by Darwinian evolution. To this end, we will trace millions of evolutionary trajectories of proteins (using an enzyme as a model) subjected to controlled selection pressure, by analyzing protein genotype/phenotype mapping datasets from high-throughput in vitro evolution experiments using microfluidic systems. Parameters controlling the adaptation potential of proteins (the starting points of evolutionary trajectories) will be extracted with statistical methods. Our strategy is based on an original combination of high-throughput droplet-based microfluidics and next-generation DNA sequencing (NGS) technologies via a novel barcoding scheme, allowing genotype-phenotype linkage to be determined at an unprecedented scale of millions of proteins in a single, massively parallel, one-day experiment at each evolution cycle. Our highly interdisciplinary project involves evolutionary biology, droplet microfluidics, organic chemistry, DNA barcoding and sequencing, and statistical physics.
Directed evolution experiments allow precise control of the selection pressure, tuning of the mutation rate and monitoring over arbitrarily long evolutionary time scales, overcoming many of the limitations of studying natural adaptation processes. The recent development of next-generation DNA sequencing technologies also allows large populations of mutants to be analysed in parallel, making it possible to obtain a large-scale statistical picture of molecular adaptation.
Here we propose to analyze, at an unprecedented resolution and scale, millions of trajectories of evolving bio-molecules under well-controlled selection pressure. We will couple high-throughput directed evolution using droplet-based microfluidics with next-generation DNA sequencing to study the evolution of a model protein, the enzyme SGAP (Streptomyces griseus aminopeptidase). We will evolve SGAP, which is naturally an efficient leucine aminopeptidase, to become an efficient valine or glycine aminopeptidase, or an efficient phosphodiesterase. A library of 10^6 SGAP enzyme variants will be evolved under successive selection pressures by selecting for arbitrary levels of aminopeptidase or phosphodiesterase activity, measured by our custom-developed assays, which rely on the design and synthesis of new substrates. Using droplet-based microfluidics, the individual phenotypes of 10^6 variants will be measured in less than an hour and the variants sorted according to their phenotype. After sorting, each variant’s sequence will be tagged with a DNA barcode encoding its phenotype, the round of selection and the selection conditions. Next-generation sequencing will be used to map the genotypes of 10^6 enzyme variants to their respective phenotypes at every step, in a single run. We will perform a statistical analysis of the sequencing data to reconstruct millions of evolutionary trajectories of these variants under successive controlled selection pressures. We will then extract the parameters that determine the adaptive potential of bio-molecules under varying selection pressure.
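To illustrate the barcoding idea described above, here is a minimal sketch of how a phenotype, a selection round and a selection condition could be packed into a short DNA barcode and read back after sequencing. The summary does not specify the project's actual barcoding scheme; the base-4 encoding, the field names and the field lengths below are all assumptions made purely for illustration.

```python
BASES = "ACGT"  # each nucleotide is treated as one base-4 digit (assumed encoding)

# Hypothetical field widths in nucleotides; not taken from the project.
FIELD_LENGTHS = {"phenotype_bin": 3, "round": 2, "condition": 2}


def encode_field(value: int, length: int) -> str:
    """Write a non-negative integer as a fixed-length base-4 DNA string."""
    if not 0 <= value < 4 ** length:
        raise ValueError(f"{value} does not fit in {length} nucleotides")
    digits = []
    for _ in range(length):
        value, remainder = divmod(value, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))  # most significant digit first


def decode_field(seq: str) -> int:
    """Read a base-4 DNA string back into an integer."""
    value = 0
    for base in seq:
        value = value * 4 + BASES.index(base)
    return value


def make_barcode(phenotype_bin: int, round_: int, condition: int) -> str:
    """Concatenate the three encoded fields into one barcode."""
    values = {"phenotype_bin": phenotype_bin, "round": round_, "condition": condition}
    return "".join(encode_field(values[name], n) for name, n in FIELD_LENGTHS.items())


def parse_barcode(barcode: str) -> dict:
    """Split a barcode into its fields and decode each one."""
    if len(barcode) != sum(FIELD_LENGTHS.values()):
        raise ValueError("unexpected barcode length")
    fields, start = {}, 0
    for name, n in FIELD_LENGTHS.items():
        fields[name] = decode_field(barcode[start:start + n])
        start += n
    return fields


if __name__ == "__main__":
    barcode = make_barcode(phenotype_bin=5, round_=2, condition=1)
    print(barcode, parse_barcode(barcode))
```

In such a scheme, every sequencing read would carry both the variant's gene sequence and its barcode, so grouping reads by gene sequence and ordering them by the decoded round would give one phenotype record per selection round for each variant. A real barcode design would also need error tolerance against sequencing mistakes, which this sketch leaves out.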
Our work will provide a general framework for biological adaptation to selection pressures at the molecular level, yielding a deeper understanding of how new protein functions evolve and how protein activity is optimized. The expected results and proposed strategy have applications with a broad potential impact, for instance in the optimization of industrial enzymes and therapeutic proteins, and in the design of anti-viral antibodies and vaccines.
Project coordination
Clément Nizak (Chimie-Biologie-Innovation ESPCI ParisTech-CNRS UMR8231)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partners
Broad Technology Labs, Broad Institute of MIT and Harvard
CNRS-DR PARIS B CIRB-UMR 7241 COLLEGE DE FRANCE
CNRS UMR8231 Chimie-Biologie-Innovation ESPCI ParisTech-CNRS UMR8231
CNRS UMR5588 Laboratoire Interdisciplinaire de Physique, UMR5588 CNRS-Université de Grenoble 1
ANR funding: 424,350 euros
Beginning and duration of the scientific project:
December 2014 - 36 months