Restriction of SARS-CoV2 replication by non-coding and unreferenced genes in human cells – DARK-COVID
Our hypothesis is that human cells express unique sets of non-coding and unreferenced genes that exhibit potent inhibitory activities against SARS-CoV2. Our objectives are to identify and characterize these genes in human cells by combining an original approach that compares the total transcriptome changes in virally infected and neighboring uninfected cells with exhaustive and reference-free transcriptome analyses, loss-of-function screening strategies and mechanistic studies.
Objectives: define novel RNAs involved in SARS-CoV-2 regulation
The DARK-COVID project is built around 3 main breakthroughs:
- The first whole-transcriptome analysis performed on human cells sorted as positive or negative for SARS-CoV-2 antigens, to gain insight into the host response to SARS-CoV-2 (WP1)
- The first exhaustive description of misregulated lncRNAs and unreferenced RNAs in SARS-CoV-2-infected human cells, to identify specific biomarkers (WP2)
- The first functional characterization of novel genes that modulate SARS-CoV-2 replication in human cells, to reveal novel therapeutic targets for future personalized treatment (WP3)
Transcriptome
bioinformatics
genetic engineering
Transcriptomes (WP1)
Human cells were infected with SARS-CoV-2 and then sorted by infection status. Virus-free cells, infected cells and non-infected cells were collected in 6 replicates each. Total RNA and polyA+ RNA were extracted. Viral RNA and ribosomal RNA were depleted, and 500 ng of RNA was sequenced on a HiSeq instrument using an Illumina TruSeq protocol. 20 million unique reads mapping to the human genome (GENCODE v32) were produced.
Dataset analysis and lncRNA annotation (WP2)
Uniquely aligned reads were counted using HTSeq-count, and differential expression analysis was performed using DESeq2 to evaluate misregulation of all annotated genes. Overall, the entire transcriptome was affected by the presence of the virus. Comparison between processed RNA (polyA+) and total RNA shows extreme deregulation of splicing and increased read-through, in agreement with previously published observations. Several Gene Ontology categories were found to be entirely misregulated. As expected, the interferon group of genes was 10-fold upregulated in infected cells. Strikingly, adjacent but non-infected cells show no sign of stress-induced gene expression, as if the infected cells were unable to send stress-response signals.
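To make the "10-fold" figure concrete, the sketch below shows how a fold change relates to the log2 fold change usually reported by differential expression tools. It is not the DESeq2 procedure used in the project (DESeq2 is an R package with its own normalization and statistical model). The function name `log2_fold_change` and all counts are hypothetical and serve only as an illustration.

```python
import math
from statistics import mean


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean normalized counts; the pseudocount avoids log(0)."""
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


# Hypothetical normalized counts for one gene, 6 replicates per condition.
infected = [990, 1010, 1000, 1005, 995, 1000]
mock = [100, 99, 101, 100, 98, 102]

lfc = log2_fold_change(infected, mock, pseudocount=0.0)
fold = 2 ** lfc  # back-transform to a linear fold change
print(f"log2FC = {lfc:.2f}, fold change = {fold:.1f}")
```

With these toy values the means are 1000 and 100, so a 10-fold increase corresponds to a log2 fold change of about 3.32.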
To annotate novel transcripts, Scallop (v0.10.5) was run on all replicates of the total RNA-seq samples. The assemblies were merged, and only contigs with no same-strand overlap with the GENCODE (v32) annotation were kept. This yielded 5907 new transcripts, of which 2523 were significantly expressed in total RNA-seq and 745 in polyA+ RNA-seq (>=10 reads in all replicates in at least 1 condition).
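The two filters described above (no same-strand overlap with the annotation, and at least 10 reads in all replicates of at least one condition) can be sketched as follows. This is a minimal illustration, not the project's actual pipeline. The `Interval` type, the function names, the half-open coordinate convention and the toy data are all assumptions made for the example.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive
    strand: str  # "+" or "-"


def overlaps_same_strand(a: Interval, b: Interval) -> bool:
    """True if a and b share at least one base on the same chromosome and strand."""
    return (a.chrom == b.chrom and a.strand == b.strand
            and a.start < b.end and b.start < a.end)


def is_unreferenced(transcript: Interval, annotation: List[Interval]) -> bool:
    """Keep a contig only if it overlaps no annotated feature on its own strand."""
    return not any(overlaps_same_strand(transcript, gene) for gene in annotation)


def is_expressed(counts: Dict[str, List[int]], min_reads: int = 10) -> bool:
    """True if every replicate of at least one condition reaches min_reads."""
    return any(len(reps) > 0 and all(n >= min_reads for n in reps)
               for reps in counts.values())


# Toy data, for illustration only.
annotation = [Interval("chr1", 100, 200, "+")]
antisense = Interval("chr1", 150, 250, "-")  # opposite strand: kept
sense = Interval("chr1", 150, 250, "+")      # same strand: discarded
toy_counts = {"infected": [12, 15, 10, 11, 30, 10], "mock": [0, 2, 1, 0, 0, 3]}

print(is_unreferenced(antisense, annotation), is_unreferenced(sense, annotation))
print(is_expressed(toy_counts))
```

The linear scan over the annotation is only suitable for small examples; a genome-wide run would normally use an interval index or a dedicated tool.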
The SARS-CoV-2-specific lncRNAs, or cov19lncRNAs, were checked for evidence of a transcription start site (TSS), defined as the presence of a CAGE peak from the FANTOM database within 1000 bases of the transcript start site, and classified into 2 classes: (1) weak TSS: 176 (78 in total and 36 in polyA+) and (2) strong TSS: 141 (74 in total and 42 in polyA+). These 2 classes of up- and down-regulated transcripts (Figure 1) are now ready for systematic screening of functional significance.
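The CAGE-based TSS check can be sketched as a simple window test. The code below is an illustration under stated assumptions, not the method used in the project. It interprets "within 1000 bases" as plus or minus 1000 bases around the 5' end, it requires the peak to lie on the same strand, and it represents peaks as hypothetical `(chrom, position, strand)` tuples. The criteria separating weak from strong TSS are not given in this summary, so they are not modeled.

```python
from typing import Iterable, Tuple

Peak = Tuple[str, int, str]  # (chromosome, peak position, strand), hypothetical format


def transcript_tss(start: int, end: int, strand: str) -> int:
    """Return the 5' end of a transcript: start on '+', end on '-'."""
    return start if strand == "+" else end


def has_cage_support(chrom: str, tss: int, strand: str,
                     cage_peaks: Iterable[Peak], window: int = 1000) -> bool:
    """True if any same-strand CAGE peak lies within `window` bases of the TSS."""
    return any(p_chrom == chrom and p_strand == strand and abs(p_pos - tss) <= window
               for p_chrom, p_pos, p_strand in cage_peaks)


# Toy peaks, for illustration only.
peaks = [("chr1", 10_500, "+"), ("chr1", 52_000, "-")]

tss_plus = transcript_tss(10_000, 15_000, "+")    # 5' end at 10000
tss_minus = transcript_tss(40_000, 50_000, "-")   # 5' end at 50000

print(has_cage_support("chr1", tss_plus, "+", peaks))   # peak 500 bases away
print(has_cage_support("chr1", tss_minus, "-", peaks))  # peak 2000 bases away
```

Taking the 5' end per strand matters here: for a minus-strand transcript the TSS is the highest genomic coordinate, so testing the lowest one would check the wrong end.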
Work package 1 is now complete
Validation and inactivation of the lncRNA candidates (WP2)
Several lncRNAs were selected to validate their expression changes by RT-qPCR, using several clones to obtain a statistically significant evaluation of their expression levels in the different sets of infected, non-infected and mock cells.
Preliminary attempts to inactivate one lncRNA of interest indicate that pooled siRNA oligos are the most efficient approach for rapid, specific depletion of the target lncRNA. Experiments are under way to determine whether viral replication is misregulated upon loss of function.
Work package 2 is 80% complete.
Phenotype screening (WP3)
At this stage, no phenotype has yet been validated upon loss of function.
Work package 3 is 0% complete.
The most striking result we observed is that neighboring non-infected cells, proximal to the infected cells, do not show any sign of gene deregulation, in particular of the interferon genes that are usually upregulated when cells sense viral infection. These cells display an “asymptomatic” behavior, comparable to infected organisms that show no signs of infection. As shown in previous publications, export pathways are downregulated in infected cells, which reduces intercellular stress signaling. This is the first report of an absence of response from adjacent cells.
The SARS-CoV-2 pandemic has reached unprecedented levels while vaccine candidates are still being actively tested for release in 2021. Unfortunately, little is known about the mechanisms of the host cellular response to SARS-CoV-2 infection and replication. The host transcriptome constitutes the first level of response, enabling rapid activation of antiviral pathways. However, 2 main limitations restrict the correct interpretation of the data. First, genome-wide investigations of host–pathogen interactions are often obscured by analyses of mixed populations of infected and uninfected cells. Transcriptome signals are then drowned in background noise, making it impossible to efficiently and exhaustively portray the full variation of the host transcripts. Second, the most recent SARS-CoV-2 transcriptomic and functional studies have been conducted on the coding genome and polyadenylated RNA, ignoring entire regions of the host transcriptome (the dark side of the genome), which include, among others, long non-coding RNAs (lncRNAs) and unreferenced (Unref) RNAs. LncRNAs are of specific interest since they are now acknowledged to play fundamental roles in cellular identity, development and disease progression through epigenetic or post-transcriptional control of mRNA expression. Our hypothesis is that human cells express unique sets of non-coding and unreferenced genes that exhibit potent inhibitory activities against SARS-CoV-2. Our objectives are to identify and characterize these genes by combining a whole-transcriptome approach with genetic engineering of infected human cells. Our original strategy compares the total transcriptome changes in virally infected and neighboring uninfected cells by applying single cell-like sorting of infected cells. Using unique reference-free as well as state-of-the-art comparative transcriptomic strategies, we will identify unreferenced genes and lncRNAs whose expression is regulated upon SARS-CoV-2 infection.
Genetic loss-of-function and cellular perturbation screening strategies will identify the unreferenced RNAs and lncRNAs that play a role in viral replication. Finally, preliminary molecular characterizations will provide early insights into their mechanism of action. The DARK-COVID project fits within the Physiopathogénie de la maladie theme (Axe 2) of the RA-COVID-19 call by responding to 3 critical aspects: development of cellular models (sorting of a human lung cell line overexpressing the virus receptor ACE2), rapid cellular immune response, and novel therapeutic targets such as unreferenced and long non-coding RNAs. DARK-COVID is a multidisciplinary project involving human virology, state-of-the-art transcriptomic studies and original bioinformatics tools. It is a pioneering project in the field of antiviral non-coding and unreferenced genes. The consortium is composed of N. Jouvenet’s team at Institut Pasteur, expert in antiviral immunity, host/virus interactions and SARS-CoV-2 manipulation, and A. Morillon’s team at Institut Curie, a leader in lncRNA annotation and functional studies in disease progression. The 2 teams have collaborated intensively since the beginning of the pandemic and have already collected promising preliminary results that need to be further developed.
Project coordination
Antonin MORILLON (Institut Curie)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partnership
IC Institut Curie
IP Institut Pasteur
ANR funding: 136,998 euros
Beginning and duration of the scientific project:
January 2021 - 12 months