DS04 - Life, health and well-being 2017

An evolutionary systems approach to understand long non-coding RNA functionality – LncEvoSys

An evolutionary systems approach to understand lncRNA functionality.

Eukaryotic genomes are pervasively transcribed, giving rise to thousands of long non-coding RNAs (lncRNAs) that do not produce proteins. The biological functions of these RNAs remain largely unknown. Here, we propose to use an evolutionary systems approach to investigate long non-coding RNA functionality, by assessing the natural selection pressures that act on various aspects of lncRNA biology.

Evaluating lncRNA functionality with an evolutionary systems approach

1) The first objective of this project is to develop an integrative lncRNA annotation methodology, to provide their precise genomic localization and gene structure, as well as their overlap and/or proximity to other functional genomic elements. We will thus provide a valuable resource for further evolutionary and functional lncRNA studies, answering a critical need in the lncRNA community.

2) The second objective of the proposal is to establish a sensitive and reliable approach to reconstruct homologous lncRNA families, which is a requirement for comprehensive evolutionary analyses. We propose to incorporate new parameters in the search for lncRNA homologues, such as the conservation of the genomic neighbourhood and of secondary lncRNA structures.

3) The third objective of this project is to develop a comprehensive evolutionary framework for evaluating lncRNA functionality. Specific evolutionary tests will be designed for several recent hypotheses related to lncRNA biological functions. In particular, we propose to implement new approaches to test for selective pressures at the level of lncRNA expression, which will be able to identify cases where transcription is required, without strong functional constraints for producing a specific RNA molecule. To test the recently proposed role of lncRNAs in establishing and/or reinforcing long-range chromatin interactions, we plan to evaluate the selective constraint on maintaining physical distances between lncRNA loci and interacting genomic elements, by analysing the frequencies of genomic rearrangements that disrupt these contacts.

4) The final objective of this proposal is to investigate the processes that drive the evolutionary emergence of lncRNAs. Due to their rapid pace of evolution, lncRNAs are a pertinent model to study new gene origination. By establishing high-confidence homologous lncRNA families (step 2), we will be able to robustly identify gains and losses of lncRNAs in vertebrate lineages.
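As an illustration of how gains and losses of a lncRNA family could be placed on a species tree, here is a minimal sketch in Python. It assumes a simple Dollo-style parsimony rule (a family is gained once, at the most recent common ancestor of the species that carry it, and can only be lost afterwards). The toy tree, the node names and the function names are hypothetical, and this is not necessarily the method used in the project.

```python
# Hypothetical toy species tree, stored as child -> parent.
PARENT = {
    "human": "primates_rodents",
    "mouse": "primates_rodents",
    "primates_rodents": "amniotes",
    "chicken": "amniotes",
    "amniotes": "vertebrates",
    "zebrafish": "vertebrates",
}


def ancestors(node):
    """Return the path from a node up to the root, starting with the node itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def infer_gain_and_losses(present_species):
    """Place one gain and the implied losses under a Dollo-style assumption."""
    present = set(present_species)
    if not present:
        raise ValueError("the family must be present in at least one species")
    paths = [ancestors(species) for species in present]
    # Gain: most recent common ancestor of all species carrying the family.
    shared = set(paths[0]).intersection(*paths[1:])
    gain = next(node for node in paths[0] if node in shared)
    # Nodes inferred to carry the family: every node between a carrier and the gain node.
    carrying = set()
    for path in paths:
        carrying.update(path[: path.index(gain) + 1])
    # Losses: branches that leave a carrying node and lead to a non-carrying node.
    losses = sorted(
        child
        for child, parent in PARENT.items()
        if parent in carrying and child not in carrying
    )
    return gain, losses


gain, losses = infer_gain_and_losses(["human", "chicken"])
print(gain, losses)
```

In this toy case, a family detected in human and chicken but not in mouse is placed as a gain on the amniote branch followed by a loss on the mouse branch. In practice, such inferences depend strongly on the quality of the homologous families, which is why reliable family reconstruction comes first.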

The computational methodology includes developing and implementing tools for efficient analysis of large quantities of “omics” data, designing an integrative pipeline for functional lncRNA annotation, and establishing sensitive approaches for the identification of homologous lncRNAs across distant species. We will create public databases and web interfaces to make the results of the project readily accessible to the entire research community. The extensive computational resources needed for this project are available at the computing centres of the LBBE-PRABI and of the IN2P3.

As of July 1st, 2019, we have progressed considerably in the development of an integrative approach for lncRNA functional annotation. Part of this approach was presented in a recently submitted publication:

F. Darbellay, A. Necsulea (2019) Comparative transcriptomics analyses across species, organs and developmental stages reveal functionally constrained lncRNAs. Molecular Biology and Evolution, in revision.

In parallel, we have annotated lncRNAs that are active in the human liver and analyzed their patterns of expression and splicing variation in human populations, using a valuable resource of human liver transcriptome generated by one partner (M. Heim, University of Basel). A publication presenting these results is in preparation.

We trust that our project will contribute to the advancement of the lncRNA biology field, which is now at a crossroads with respect to our understanding of lncRNA functionality. The field of lncRNA research has rapidly expanded in the last few years, thanks to the development of high-throughput transcriptome survey technologies, which have enabled lncRNA discovery at a genome-wide level. However, despite the methodological advances in lncRNA detection and quantification, our understanding of their biological functions is still very limited. There is an ongoing debate between two hypotheses: one posits that (most) lncRNAs represent transcriptional “noise” without functional relevance, and the other that (most) lncRNAs have essential biological roles. It has become critical to precisely evaluate the extent of lncRNA functionality, in the continuum between purely non-functional “noise” and essential molecules. The proposed research project aims to address this important challenge, by evaluating lncRNA functionality through the lens of evolution.

As of July 1st, 2019, two articles were submitted for publication:

1. F. Darbellay, A. Necsulea (2019) Comparative transcriptomics analyses across species, organs and developmental stages reveal functionally constrained lncRNAs. Molecular Biology and Evolution, in revision.

Manuscript available on bioRxiv: www.biorxiv.org/content/10.1101/607200v2

2. A. Necsulea (2019) Phylogenomics and genome annotation. Invited book chapter for “Phylogenetics in the genomic era”, under review. Editors: Nicolas Galtier, Céline Scornavacca.

It was recently discovered that eukaryotic genomes are pervasively transcribed, giving rise to thousands of RNA molecules that do not encode proteins. Non-coding RNAs comprise distinct functional categories, including essential molecules, but also transcripts with yet uncertain biological roles. Among these, long non-coding RNAs (lncRNAs, defined as transcripts at least 200 nucleotides long, lacking protein-coding potential) are particularly abundant, with a recent study reporting ~60,000 lncRNA loci in the human genome. In contrast with the wealth of knowledge on protein functions, the roles of lncRNAs are still largely unknown, as only a small fraction of lncRNAs have been experimentally characterized.
Regulatory functions have been reported for increasing numbers of lncRNAs, suggesting that these non-coding transcripts may contribute to the diverse mechanisms that control gene activity in eukaryotes. Until recently, most investigations supported a direct function for the lncRNA molecules, for example as a scaffold between chromatin-modifying protein complexes and target genomic regions. However, at present we are witnessing a paradigm shift in lncRNA biology, with increasing evidence that for many lncRNA loci the biological function resides in the act of transcription or in the presence of additional functional elements at the locus rather than in the lncRNA product. Thus, lncRNA functionality remains unresolved.
Here, we aim to evaluate lncRNA functionality with an evolutionary approach. First, we plan to develop an integrative lncRNA annotation methodology, to provide the precise genomic localization and gene structure of high-confidence lncRNA loci, as well as their overlap and/or proximity to other functional genomic elements (including other genes, regulatory elements, anchor points for long-range chromatin interactions, etc.), across multiple vertebrate species.
Second, we aim to establish a sensitive and reliable approach to reconstruct homologous lncRNA families, which is needed for comprehensive evolutionary analyses. An important technical barrier is the rapid evolution of lncRNA primary sequences, which hampers the detection of distant homologues. To address this issue, we will incorporate new parameters in the search for lncRNA homologues, such as the conservation of the genomic neighbourhood and of secondary lncRNA structures, in addition to using powerful methods to detect primary sequence homology. We plan to use phylogenetic approaches to determine lncRNA paralogous copies and to infer high-confidence orthologous genes across vertebrate species. A public database will be constructed to facilitate access to the resulting homologous gene families and orthology/paralogy relationships.
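To make the idea of genomic neighbourhood conservation concrete, here is a minimal sketch in Python. It assumes that each lncRNA locus is described by the set of protein-coding genes that flank it, expressed as ortholog-group identifiers, and that neighbourhoods are compared with a Jaccard index. The identifiers, the function names and the 0.5 threshold are hypothetical choices for illustration and are not taken from the project.

```python
def neighbourhood_similarity(neighbours_a, neighbours_b):
    """Jaccard similarity between two sets of flanking ortholog-group IDs."""
    a, b = set(neighbours_a), set(neighbours_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_homologues(query_neighbours, target_loci, min_score=0.5):
    """Rank target loci by how much genomic neighbourhood they share with the query."""
    scored = [
        (locus_id, neighbourhood_similarity(query_neighbours, neighbours))
        for locus_id, neighbours in target_loci.items()
    ]
    kept = [(locus_id, score) for locus_id, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy data: flanking protein-coding genes, given as hypothetical ortholog-group IDs.
human_lnc = ["OG1", "OG2", "OG3", "OG4"]
mouse_loci = {
    "mouse_lnc_1": ["OG1", "OG2", "OG3", "OG9"],
    "mouse_lnc_2": ["OG7", "OG8"],
}

hits = candidate_homologues(human_lnc, mouse_loci)
print(hits)
```

A score of this kind would only be one line of evidence. It would need to be combined with primary sequence similarity and secondary structure conservation before calling two loci homologous.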
Third, we aim to develop a comprehensive evolutionary framework for evaluating lncRNA functionality. We will develop specific evolutionary tests for several recent hypotheses related to lncRNA biological functions. Thus, we propose to test for selective pressures at the level of lncRNA expression. To evaluate the role of lncRNAs in establishing and/or reinforcing long-range chromatin interactions, we plan to assess the selective constraint on maintaining physical distances between lncRNA loci and interacting genomic elements. Moreover, we will combine computational and experimental methods to functionally test a lncRNA candidate selected based on its evolutionary properties. Finally, we plan to investigate the processes that drive the evolutionary emergence of lncRNAs. This part of the project uses lncRNAs as a model system to address a fundamental question in the field of molecular evolution, namely the origin of new genes.
Our scientific consortium brings together computational evolutionary biologists, molecular geneticists and biomedical scientists. We are thus well placed to address lncRNA functionality with an evolutionary systems approach, with potential long-term biomedical applications.

Project coordination

Anamaria Necsulea (Laboratoire de Biométrie et Biologie Evolutive)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partnership

LBBE - CNRS Laboratoire de Biométrie et Biologie Evolutive

ANR funding: 241,341 euros
Beginning and duration of the scientific project: December 2017 - 36 Months
