Literature detail

Computational Inference of Selection Underlying the Evolution of the Novel Coronavirus, Severe Acute Respiratory Syndrome Coronavirus 2.

Rachele Cagliani1 Diego Forni2 Mario Clerici3,4 Manuela Sironi2
Affiliations 4 institutions
  1. Scientific Institute IRCCS E. MEDEA, Bioinformatics, Bosisio Parini, Italy.
  2. Scientific Institute IRCCS E. MEDEA, Bioinformatics, Bosisio Parini, Italy.
  3. Department of Physiopathology and Transplantation, University of Milan, Milan, Italy.
  4. Don C. Gnocchi Foundation ONLUS, IRCCS, Milan, Italy.
PMID 32238584 2020 J Virol eng epublish

Article

Publication summary

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that recently emerged in China is thought to have a bat origin, as its closest known relative (BatCoV RaTG13) was described previously in horseshoe bats. We analyzed the selective events that accompanied the divergence of SARS-CoV-2 from BatCoV RaTG13. To this end, we applied a population genetics-phylogenetics approach, which leverages within-population variation and divergence from an outgroup. Results indicated that most sites in the viral open reading frames (ORFs) evolved under conditions of strong to moderate purifying selection. The most highly constrained sequences corresponded to some nonstructural proteins (nsps) and to the M protein. Conversely, nsp1 and accessory ORFs, particularly ORF8, had a nonnegligible proportion of codons evolving under conditions of very weak purifying selection or close to selective neutrality. Overall, limited evidence of positive selection was detected. The 6 bona fide positively selected sites were located in the N protein, in ORF8, and in nsp1. A signal of positive selection was also detected in the receptor-binding motif (RBM) of the spike protein but most likely resulted from a recombination event that involved the BatCoV RaTG13 sequence. In line with previous data, we suggest that the common ancestor of SARS-CoV-2 and BatCoV RaTG13 encoded/encodes an RBM similar to that observed in SARS-CoV-2 itself and in some pangolin viruses. It is presently unknown whether the common ancestor still exists and, if so, which animals it infects. Our data, however, indicate that divergence of SARS-CoV-2 from BatCoV RaTG13 was accompanied by limited episodes of positive selection, suggesting that the common ancestor of the two viruses was poised for human infection.

IMPORTANCE Coronaviruses are dangerous zoonotic pathogens; in the last 2 decades, three coronaviruses have crossed the species barrier and caused human epidemics. One of these is the recently emerged SARS-CoV-2. We investigated how, since its divergence from a closely related bat virus, natural selection shaped the genome of SARS-CoV-2. We found that distinct coding regions in the SARS-CoV-2 genome evolved under conditions of different degrees of constraint and are consequently more or less prone to tolerate amino acid substitutions. In practical terms, the level of constraint provides indications about which proteins/protein regions are better suited as possible targets for the development of antivirals or vaccines. We also detected limited signals of positive selection in three viral ORFs. However, we warn that, in the absence of knowledge about the chain of events that determined the human spillover, these signals should not be necessarily interpreted as evidence of an adaptation to our species.
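The summary describes an approach that combines within-population variation with divergence from an outgroup. As a rough illustration of that general idea only, the sketch below tallies codons that are polymorphic within an ingroup sample versus fixed differences from an outgroup, split into synonymous and nonsynonymous changes. It is a simplified, McDonald-Kreitman-style count and not the method used in the article; the function name `classify_codon_sites` and the toy sequences are hypothetical, and the sequences are assumed to be aligned, in frame, and uppercase.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_sites(ingroup, outgroup):
    """Count polymorphic and fixed-divergent codons, split by syn/nonsyn.

    ingroup  -- list of aligned, in-frame coding sequences (hypothetical sample)
    outgroup -- one aligned coding sequence of the same length
    """
    counts = {"poly_syn": 0, "poly_nonsyn": 0, "div_syn": 0, "div_nonsyn": 0}
    for i in range(len(outgroup) // 3):
        window = slice(3 * i, 3 * i + 3)
        in_codons = {seq[window] for seq in ingroup}
        out_codon = outgroup[window]
        # Skip codons containing gaps or ambiguous bases.
        if any(c not in CODON_TABLE for c in in_codons | {out_codon}):
            continue
        in_aas = {CODON_TABLE[c] for c in in_codons}
        if len(in_codons) > 1:
            # Variation segregating within the ingroup sample.
            key = "poly_nonsyn" if len(in_aas) > 1 else "poly_syn"
        elif out_codon not in in_codons:
            # Ingroup is monomorphic but differs from the outgroup.
            key = "div_nonsyn" if CODON_TABLE[out_codon] not in in_aas else "div_syn"
        else:
            continue  # identical everywhere: uninformative
        counts[key] += 1
    return counts


# Toy, made-up sequences (not real viral data).
toy_ingroup = ["ATGGCTAAA", "ATGGCCAAA"]
toy_outgroup = "ATGGCTAGA"
print(classify_codon_sites(toy_ingroup, toy_outgroup))
```

An excess of nonsynonymous fixed differences relative to nonsynonymous polymorphism is the kind of contrast such approaches use as a hint of positive selection, whereas a deficit of nonsynonymous changes overall points to purifying selection. Real analyses rely on explicit statistical models rather than raw counts.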

N protein Nsp1 ORF8 positive selection SARS-CoV-2 spike protein viral evolution Evolution, Molecular Selection, Genetic Amino Acid Sequence Animals Betacoronavirus Chiroptera Coronavirus Infections COVID-19 Genome, Viral Humans Models, Molecular

Structured evidence records

Evidence records

3 total
1 record
Extraction confidence 0.95
Key finding

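A population genetics-phylogenetics analysis revealed limited positive selection during the divergence of SARS-CoV-2 from BatCoV RaTG13: the bona fide positively selected sites were located in the N protein, ORF8, and nsp1, whereas the signal in the spike receptor-binding motif most likely resulted from recombination.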

Virus
Location
Not specified
Supporting text

We analyzed the selective events that accompanied the divergence of SARS-CoV-2 from BatCoV RaTG13. To this end, we applied a population genetics-phylogenetics approach... Results indicated that most sites in the viral open reading frames evolved under purifying selection, with several positively selected sites in the N protein, ORF8, and nsp1, and a recombination signal in the spike receptor-binding motif.

Genes or proteins
N protein; ORF8; nsp1; spike protein
Analysis methods
population genetics-phylogenetics approach; selection analysis; phylogenetic analysis
1 record
Extraction confidence 0.90
Key finding

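Six bona fide positively selected sites were detected in the SARS-CoV-2 N protein, ORF8, and nsp1 during divergence from BatCoV RaTG13, consistent with limited molecular adaptation; an additional signal in the spike receptor-binding motif most likely resulted from recombination rather than selection.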

Virus
Host
Not specified
Location
Not specified
Supporting text

Limited evidence of positive selection was detected. The 6 bona fide positively selected sites were located in the N protein, in ORF8, and in nsp1. A signal of positive selection was also detected in the receptor-binding motif (RBM) of the spike protein but most likely resulted from a recombination event that involved the BatCoV RaTG13 sequence.

Genes or proteins
N protein; ORF8; nsp1; spike protein
Mechanism types
positive_selection; molecular_adaptation
1 record
Extraction confidence 0.85
Key finding

A recombination event involving the BatCoV RaTG13 sequence most likely accounts for the positive selection signal detected in the receptor-binding motif of the spike protein.

Host
Not specified
Location
Not specified
Supporting text

A signal of positive selection was also detected in the receptor-binding motif (RBM) of the spike protein but most likely resulted from a recombination event that involved the BatCoV RaTG13 sequence.

Event type
recombination
Genes or segments
spike; receptor-binding motif