BLANC - Blanc 2007

Definition of Evolutionary Histories for the Human Proteome – EvolHHuPro

Submission summary

The goal of our project is to define a complete set of evolutionary histories (cascades of phylogenetic events) for the human proteome and to analyse them at the genome scale. The genetic information encoded in the genome sequence contains the blueprint for the potential development and activity of an organism. This information can only be fully comprehended in the light of the evolutionary events (duplication, loss, recombination, mutation...) acting on the genome, which are reflected in changes in the sequence, structure and function of the gene products (nucleic acids and proteins) and, ultimately, in the biological complexity of the organism. The recent availability of the complete genome sequences of a large number of model organisms means that we can now begin to understand the mechanisms involved in the evolution of the genome and their consequences for the study of biological systems. This is illustrated by the evolutionary analyses and phylogenetic inferences that play an important role in most functional genomics studies, e.g. of promoters ('phylogenetic footprinting'), of interactomes (notion of 'interologs', based on the presence and degree of conservation of counterparts of interacting proteins), and in comparisons of transcriptomes or proteomes (notion of phylogenetic proximity and co-regulation/co-expression).
At the same time, theoretical advances in information representation and management have revolutionised the way experimental information is collected, stored and exploited. Ontologies, such as Gene Ontology (GO) or Sequence Ontology (SO), provide a formal representation of the data for automatic, high-throughput data parsing by computers. These ontologies are being exploited in the new information management systems to allow large-scale data mining, pattern discovery and knowledge inference.
Unfortunately, the vast number and complexity of the events shaping eukaryotic genomes mean that a complete understanding of evolution at the genomic level is not currently feasible. At the lowest level, point mutations affect individual nucleotides. At a higher level, large chromosomal segments undergo duplication, lateral transfer, inversion, transposition, deletion and insertion. Ultimately, whole genomes are involved in processes of hybridization, polyploidization and endosymbiosis, often leading to rapid speciation.
We propose to characterise and study the evolutionary histories of the human proteome, defined as the cascade of genetic events that occurred during the evolution of the vertebrate genomes, and their impact on the protein-coding regions (extensions, insertions, deletions...) of the human genome. This ambitious objective is now possible thanks to the emergence of formal descriptions of biological data and to recent developments by the project members in accurate phylogenetic reconstruction and genome analysis (Partner 1: Figenix platform) and in automated, reliable and exploitable protein sequence alignments (Partner 1 & 2: TCOFFEE, PipeAlign, MAO, MACSIMS...). These methodologies will be combined into a multi-agent expert system for the construction of evolutionary histories. A new ontology will be developed to facilitate the automatic definition of the important genetic events shaping a single protein and their potential causalities at the genome level. In a subsequent step, the evolutionary histories of the complete human proteome will be reconstructed, followed by the classification and functional analysis of the human proteins sharing common evolutionary histories. An analysis at the genomic level will then be carried out for a selected number of the proteins identified in the functional analysis. ...
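
As an illustration only, the idea of an evolutionary history as a cascade of genetic events, and of classifying proteins that share the same history, can be sketched in a few lines of Python. Everything in this sketch is an assumption made for the example: the event vocabulary, the `Event` record, the lineage labels, the protein identifiers and the `group_by_history` helper are hypothetical and are not taken from the project's ontology or from the Figenix platform.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    """Hypothetical event vocabulary, loosely based on the events named in the summary."""
    DUPLICATION = "duplication"
    LOSS = "loss"
    RECOMBINATION = "recombination"
    INSERTION = "insertion"
    DELETION = "deletion"


@dataclass(frozen=True)
class Event:
    kind: EventType
    lineage: str  # assumed label for the branch on which the event is inferred


def group_by_history(histories):
    """Group protein identifiers whose ordered event cascades are identical."""
    groups = defaultdict(list)
    for protein, events in histories.items():
        # The tuple of frozen Event records serves as a hashable key for the cascade.
        groups[tuple(events)].append(protein)
    return [sorted(members) for members in groups.values()]


# Made-up example data: two proteins share a cascade, one does not.
histories = {
    "protein_A": [Event(EventType.DUPLICATION, "vertebrates"),
                  Event(EventType.INSERTION, "mammals")],
    "protein_B": [Event(EventType.DUPLICATION, "vertebrates"),
                  Event(EventType.INSERTION, "mammals")],
    "protein_C": [Event(EventType.LOSS, "primates")],
}

print(group_by_history(histories))
```

Exact matching of cascades is the simplest possible grouping criterion; a real classification would more plausibly rely on a similarity measure between histories, which this sketch does not attempt.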

Project coordination

Université

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

CENTRE EUROPEEN DE RECHERCHE EN BIOLOGIE ET EN MEDECINE

ANR funding: 350,000 euros
Start date and duration of the scientific project: - 36 months
