Project reference: 10-BINF-0002
Scientific and technical lead (RST): Pascale COSSART
Coordinating institution: Institut Pasteur
Project region: Île-de-France
Discipline: 5 - Bio Med
ANR funding: 1,270,000 euros
Investment covering the period from October 2011 to October 2016
As described in the original proposal, one main goal of the BACNET project was to develop novel bioinformatics methodologies for the analysis of transcriptomic data from studies of bacterial organisms. Transcriptomic studies can be based on gene expression arrays, tiling arrays and deep sequencing approaches, and integrating the information gleaned from these various approaches is a challenge. Furthermore, transcriptomic studies are providing unprecedented information about the underlying organization of bacterial genomes and may identify numerous novel elements, including small ORFs and ncRNAs. More sophisticated bioinformatics tools are needed to characterize the roles of these elements in microbial physiology. To address these issues, the consortium intended to use a large bank of gene expression, tiling array and deep sequencing data generated from studies of the Gram-positive bacterial pathogen Listeria monocytogenes.
From the beginning, the project was organized around three workpackages. Workpackage 1 was designed to improve the means for integrating, visualizing and comparing various sources of transcriptomic data through the development of a data platform and a novel graphical user interface. Workpackage 2 was intended to provide novel tools to characterize and functionally classify small ORFs identified by transcriptomic studies. Finally, workpackage 3 was intended to identify and characterize gene regulatory networks using novel network inference algorithms, incorporating information on the potential contributions of novel ncRNAs and small peptides.
Workpackage 1 - Development of a transcriptomic data analysis platform: We created a Java platform named BACNET, with genome and heatmap viewers adapted to multiple types of transcriptomic and proteomic datasets. This platform supports the development of “omics” desktop applications and websites for microbes and small eukaryotes. Using the BACNET platform, we created a public website named Listeriomics, designed for biologists to browse and analyze Listeria “omics” datasets. We integrated all 80 published complete Listeria genomes, all 350 transcriptomics datasets and 25 proteomics datasets. Many tools for “omics” data analysis are available on this website: (1) an interactive genome viewer to display gene expression arrays, tiling arrays and sequencing datasets along with proteomics and genomics datasets; (2) an expression and protein atlas that connects every gene, small RNA, antisense RNA or protein with the most relevant “omics” data; (3) a specific tool to explore protein conservation through the Listeria phylogenomic tree; and (4) a co-expression network tool to discover potential new regulations. The manuscript describing this website has been submitted for publication in a peer-reviewed journal. The website has been used in many projects in the laboratory of partner 1 to decipher possible regulation of genes and small non-coding RNAs in Listeria species. The publication describing the BACNET platform is in preparation, and development of the platform will continue.
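To illustrate the idea behind the co-expression network tool listed as item (4), here is a minimal sketch in Python. It is not the Listeriomics implementation: the use of Pearson correlation, the 0.9 threshold, the function names and the expression values are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def coexpression_edges(ncrna_profiles, mrna_profiles, threshold=0.9):
    """Return (ncRNA, mRNA, r) edges whose |r| reaches the threshold.

    The threshold is an assumed cut-off, not a value from the project.
    """
    edges = []
    for nc_name, nc_values in ncrna_profiles.items():
        for gene_name, gene_values in mrna_profiles.items():
            r = pearson(nc_values, gene_values)
            if abs(r) >= threshold:
                edges.append((nc_name, gene_name, round(r, 3)))
    return edges


# Hypothetical expression values across five conditions (made-up numbers).
ncrna_profiles = {"sRNA_A": [1.0, 2.0, 3.0, 4.0, 5.0]}
mrna_profiles = {
    "gene_1": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with sRNA_A
    "gene_2": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as sRNA_A rises
    "gene_3": [1.0, 3.0, 1.0, 3.0, 1.0],   # unrelated pattern
}
edges = coexpression_edges(ncrna_profiles, mrna_profiles)
for edge in edges:
    print(edge)
```

In this toy example, gene_1 and gene_2 are linked to sRNA_A (positive and negative correlation respectively), while gene_3 is not. A strong negative correlation is kept because it may hint at repression, which is one reason to threshold on the absolute value.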
Workpackage 2 - Analysis of small ORFs: A high-throughput mass spectrometry approach was used to map all translation initiation sites (TIS) of the Listeria monocytogenes EGD-e strain after growth in rich media under three different conditions. We identified more than 1,400 TIS mapped on the Listeria genome. We detected a number of deviations from the current genome annotation and discovered an unexpectedly large number of internal TIS leading to alternative long and short polypeptides from a single transcript. Moreover, our proteogenomics analysis led to the discovery of six previously non-annotated miniproteins, one of which, Prli42, is conserved in Firmicutes. Prli42 is membrane-anchored and interacts with orthologues of Bacillus stressosome components. In Gram-positive bacteria, the stressosome, a cytoplasmic stress response complex, relays external cues and activates the sigma B regulon, a set of genes involved in the stress response. The stressosome is structurally well characterized in Bacillus, but how it senses stress remains elusive. Analysis of a series of Prli42 mutants demonstrated that Prli42 is important for sigma B activation, for bacterial growth following oxidative stress and for survival in macrophages. We reconstituted the Listeria stressosome in vitro and visualized its supramolecular structure by electron microscopy. Together, our N-terminomic approach unveiled the first translatome of Listeria and identified Prli42 as a long-sought link between stress and the stressosome. The manuscript describing these latter results has been published in the journal Nature Microbiology.
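The comparison between experimentally mapped TIS and the existing annotation can be sketched as follows. This is only an illustration of the classification step, not the pipeline used in the project: the `Gene` type, the `classify_tis` function, the category labels and the coordinates are all hypothetical, and the mass spectrometry processing that produces the TIS positions is not covered.

```python
from typing import List, NamedTuple, Optional, Tuple


class Gene(NamedTuple):
    """Annotated ORF with 1-based inclusive coordinates (start <= end)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def annotated_start(self) -> int:
        # First base of the start codon, which depends on the strand.
        return self.start if self.strand == "+" else self.end


def classify_tis(position: int, strand: str,
                 genes: List[Gene]) -> Tuple[str, Optional[str]]:
    """Classify one mapped TIS relative to the annotation (assumed labels)."""
    same_strand = [g for g in genes if g.strand == strand]
    # First pass: does the TIS confirm an annotated start?
    for gene in same_strand:
        if position == gene.annotated_start:
            return ("annotated", gene.name)
    # Second pass: does it fall inside an annotated ORF?
    for gene in same_strand:
        if gene.start <= position <= gene.end:
            offset = abs(position - gene.annotated_start)
            frame = "in-frame" if offset % 3 == 0 else "out-of-frame"
            return (f"internal {frame}", gene.name)
    # Otherwise: candidate N-terminal extension or novel small ORF.
    return ("unannotated", None)


# Hypothetical annotation and TIS positions (made-up coordinates).
genes = [Gene("geneA", 100, 399, "+"), Gene("geneB", 500, 799, "-")]
for pos, strand in [(100, "+"), (130, "+"), (131, "+"), (769, "-"), (450, "+")]:
    print(pos, strand, classify_tis(pos, strand, genes))
```

An in-frame internal TIS would yield an N-terminally shortened form of the annotated protein, whereas an "unannotated" TIS is the kind of signal that could point to a previously non-annotated miniprotein and would need further validation.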
Workpackage 3 - Regulatory network modeling: Partner 1 modeled the ncRNA-mRNA co-expression network and integrated it into the Listeriomics website. Partners 2 and 3 focused on two types of RNA-mediated regulation: (i) RNA:protein interactions and (ii) non-coding RNA:mRNA interactions.
For the first type of regulation, a large database, NBench, was first created to integrate various properties of DNA-protein and RNA-protein complexes. Using this database, we developed RBscore, software that predicts RNA-/DNA-binding residues in proteins and visualizes the prediction scores and features on protein structures. The scoring scheme of RBscore directly links feature values to nucleic acid binding probabilities and illustrates the nucleic acid binding energy funnel on the protein surface. To avoid biases arising from the choice of dataset, binding site definition and assessment metric, we compared RBscore with 18 web servers and 3 stand-alone programs on 41 datasets, which demonstrated the high and stable accuracy of RBscore. This comprehensive comparison led us to develop the benchmark database NBench. NBench and RBscore have been published in the peer-reviewed literature.
For the second type of regulation, an analysis platform named sRNA-TaBac, which integrates all available RNA target prediction software, has been created and is currently used to predict new non-coding RNA:mRNA interactions in Listeria. We are also studying the evolution of small RNAs in Listeria species to detect possible candidates linked to virulence. Two manuscripts are in preparation, one describing sRNA-TaBac and one describing the small RNA evolution analysis.
The author of this summary is the project coordinator, who is responsible for its content. The ANR therefore declines any responsibility for its content.