Blanc SVSE 6 - Life, health and ecosystem sciences: Genomics, functional genomics, bioinformatics, systems biology 2010

Unravelling Genome Biology through a unique combination of comparative Genomics, functional Genomics and population Genomics (3G) – GB-3G

The biology of genomes

Study of cell biology and genome evolution through a combined approach of comparative genomics, functional genomics and population genomics applied to a poorly studied yeast clade: the Lachancea

Understand genome biology through genomics

Basic research in biology has developed around model organisms that represent only a few of the countless living species. Most model organisms are evolutionarily highly divergent from one another, and these large distances make it impossible to follow in detail the mechanisms that govern the evolution of basic cellular processes. Thanks to recent sequencing technologies, the entire DNA information of cells is now accessible, which allows us to turn virtually any species into a model organism and thus to shed light on the mechanisms by which major cellular processes evolve. This is what we have undertaken here with the study of a group of yeasts, the Lachancea clade, which we selected because of its remarkable evolutionary characteristics. In the first instance, this work will have only fundamental consequences. However, it is important to note that the mechanisms responsible for the evolution of genomes are equivalent to those at the origin of numerous genetic diseases and of the transformation of a normal cell into a cancer cell. Our work is therefore potentially important for human health.

The capacity to sequence DNA (the carrier of genetic information) has increased dramatically in recent years, giving access to a wealth of information that researchers never imagined they could reach in such a short time. In the early 1990s, it took a few weeks to sequence a DNA molecule of a few thousand bases (the basic constituents of the "alphabet" of DNA). Currently, a single sequencing reaction makes it possible to reconstitute in a few days the sequence of several complete yeast genomes, each of which measures more than 10 million bases. It is through this type of approach that we will address questions about the mechanisms of genome evolution and major cellular processes, as well as the relationship between the base composition of DNA and the traits expressed by cells. DNA sequencing is only one of the approaches used in this project. Other approaches are also being developed, such as molecular genetics and functional genomics, together with bioinformatic methods for efficiently processing the huge amount of data that these global approaches produce.

The first results that we obtained are essential to achieving our goals on the different axes of the project. First, we sequenced, assembled and annotated the genomes of seven Lachancea species, which provides the full repertoire of genes existing in a complete eukaryotic clade. This will be crucial for understanding genome biological processes such as DNA replication (i.e. passing from one copy to two copies of DNA before cell division) and recombination (mixing of genetic information between generations), and also for studying the dynamics of chromosome evolution, including the reconstruction of ancestral genomes. Second, we identified and assembled the mitochondrial genomes of five species, which sheds light on the evolution of these genomes. Third, we determined the replication profile of two related Lachancea species. This represents a major advance in understanding the evolution of replication in a group of species different from classical model organisms. Fourth, we have made significant progress in analyzing the relationship between genome composition (genotype) and visible characteristics (phenotype). We have sequenced the genomes of 28 different strains of the most emblematic species of this group, Lachancea kluyveri, and analyzed their phenotypes in about 60 environmental conditions. Finally, we have also built genetic tools for the analysis of meiotic recombination in L. kluyveri.

This is a pioneering project in terms of functional genomic exploration of a group of species centered around a reference that is not a model organism. We expect this project to be the first of many, because it seems clear that the biological processes explored through the study of model organisms represent only a small part of biological diversity.

The complete sequencing of seven yeast genomes has allowed us to develop a very powerful automated pipeline that performs complete genome annotation in a few minutes. This annotation tool was developed as a BiopackR Amadea (ISoft) pipeline. The sequence data allowed us to determine the structure, polymorphism and evolution of mitochondrial genomes in Lachancea, and these results have been published in G3. The analysis of DNA copy number changes during the cell cycle in L. kluyveri has allowed us to determine the spatio-temporal program of replication, and this work has been submitted for publication. In total, this project has given rise to 14 communications at national and international scientific conferences and 6 original publications in peer-reviewed journals.
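As an illustration of how copy number changes can be turned into a replication profile, here is a minimal sketch. It assumes one common approach, which the summary does not confirm as the project's actual method: read counts per genomic window from replicating (S-phase) cells are divided by counts from non-replicating (G1) cells after normalizing each sample by its total. Windows with a higher ratio are interpreted as replicating earlier. The function name and the toy counts are hypothetical.

```python
def relative_copy_number(s_counts, g1_counts):
    """Per-window ratio of S-phase to G1 read counts.

    Assumption: both lists hold read counts for the same genomic windows,
    in the same order. Each sample is normalized by its total read count
    so that sequencing depth does not bias the ratio. Windows with no G1
    reads are returned as None because the ratio is undefined there.
    """
    s_total = sum(s_counts)
    g1_total = sum(g1_counts)
    if s_total == 0 or g1_total == 0:
        raise ValueError("both samples must contain at least one read")

    ratios = []
    for s, g in zip(s_counts, g1_counts):
        if g == 0:
            ratios.append(None)
        else:
            ratios.append((s / s_total) / (g / g1_total))
    return ratios


# Toy data (hypothetical): four windows along one chromosome.
s_phase = [200, 150, 100, 150]
g1 = [100, 100, 100, 100]
ratios = relative_copy_number(s_phase, g1)

# The window with the highest ratio is the candidate earliest-replicating one.
earliest = max(range(len(ratios)), key=lambda i: ratios[i] or 0.0)
```

In this toy example the first window has the highest ratio and would be the candidate early-replicating region. Real data would additionally need smoothing and filtering of repetitive or poorly mapped windows.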

The GB-3G project relies on a unique combination of comparative, functional and population genomics approaches to structurally and functionally describe an entire eukaryotic clade both at the inter- and intra-specific levels. Through the study of a monophyletic group of yeasts, the Lachancea clade, the long-term objective of this project is to unravel the molecular mechanisms driving the evolution of major genomic processes.

Thanks to the development of genomics, and more specifically of deep sequencing technologies, it is now possible to bypass classical model organisms and focus on other species whose unique features will help tackle many fundamental questions. This is the case with the Lachancea clade, which contains 8 described species. These species are called protoploids because they diverged from the extensively studied Saccharomyces species prior to the ancestral whole genome duplication (WGD) that shaped the genomes of the latter. While sharing the same experimental flexibility as the Saccharomyces yeasts, the Lachancea clade presents specific phylogenetic and genomic characteristics that make it a much better choice for deciphering the evolution of major cellular programs.

First, the Lachancea species constitute an excellent model for the study of protoploid vs post-WGD yeast biology because they are experimentally tractable and undergo complete sexual cycles. Second, the Lachancea clade offers a good compromise between a relatively broad evolutionary range and a relatively good conservation of genome organization. The combination of these two properties is rather unique in yeasts: outside the Saccharomyces genus, where all species are poorly diverged and have almost collinear genomes, a considerable level of genome reorganization has occurred between all other species. These properties are crucial for answering several seminal questions addressed in this proposal, since the evolutionary range within the Lachancea clade offers sufficient signal to identify major evolutionary changes, at both the structural and the functional levels, but is limited enough to identify the molecular bases of these changes. Finally, the genome of Lachancea kluyveri exhibits an unusual compositional heterogeneity, with a 1 Mb region that is considerably more GC-rich than the rest of the genome. The origin of this composition bias is not understood. We believe that this L. kluyveri-specific genomic feature provides a unique opportunity to decipher the mechanisms of nucleotide composition evolution in eukaryotic genomes.
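A compositional heterogeneity of this kind can be located by scanning a sequence with fixed-size windows and comparing each window's GC fraction with the genome-wide value. The sketch below is only an illustration under stated assumptions: the function names, the window size and the 0.10 excess threshold are hypothetical choices, not parameters taken from the project.

```python
def gc_fraction(seq):
    """Fraction of G and C among the A, C, G and T bases of seq.

    Ambiguous bases such as N are ignored; returns NaN if no
    unambiguous base is present.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else float("nan")


def windowed_gc(seq, window, step):
    """Return (start, gc_fraction) for every full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def gc_rich_windows(seq, window, step, excess=0.10):
    """Windows whose GC fraction exceeds the whole-sequence GC fraction
    by at least `excess` (an arbitrary, hypothetical threshold)."""
    overall = gc_fraction(seq)
    return [
        (start, gc)
        for start, gc in windowed_gc(seq, window, step)
        if gc - overall >= excess
    ]


# Toy sequence (hypothetical): an AT-rich background with a GC-rich block.
toy = "AT" * 50 + "GC" * 50 + "AT" * 50
rich = gc_rich_windows(toy, window=50, step=50)
```

On the toy sequence, the two windows covering the central block are reported. On a real chromosome, windows of several kilobases would be more appropriate for delimiting a region on the order of 1 Mb.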

We propose to generate both an inter-specific and an intra-specific genomic encyclopedia of the Lachancea clade by providing, first, the genome sequences of the 5 species not yet sequenced and of a misclassified strain potentially representing a ninth species of the clade and, second, the genome sequences of 31 natural isolates of L. kluyveri. These data will be invaluable for achieving major conceptual advances in our understanding of the evolution of the DNA replication and meiotic recombination programs, the history of chromosomal rearrangements, the forces driving genomic nucleotide composition, and the genetic bases of fitness diversity and reproductive isolation in eukaryotes. To achieve these goals, we propose to develop a unique combination of comparative Genomics, functional Genomics and population Genomics that we call "3G". A common computational environment, relying on recent advances in end-user oriented technologies for data and tool integration, will be set up to handle and exploit as efficiently as possible the large amount of data generated by the 3G combination.

Project coordination

Gilles FISCHER (UNIVERSITE PARIS VI [PIERRE ET MARIE CURIE])

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

Université de Strasbourg UNIVERSITE DE STRASBOURG
UPR3081 CNRS CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE - DELEGATION REGIONALE PROVENCE
INRA INSTITUT NATIONAL DE LA RECHERCHE AGRONOMIQUE - CENTRE DE RECHERCHE DE JOUY-EN-JOSAS
UPMC-CNRS UNIVERSITE PARIS VI [PIERRE ET MARIE CURIE]
ISoft ISOFT

ANR funding: 1,278,657 euros
Beginning and duration of the scientific project: - 48 Months
