Literature detail

Synonymous mutations and the molecular evolution of SARS-CoV-2 origins.

Hongru Wang (1), Lenore Pipes (1), Rasmus Nielsen (1,2,3)
Affiliations 3 institutions
  1. Department of Integrative Biology, UC Berkeley, Berkeley, CA 94707, USA.
  2. Department of Statistics, UC Berkeley, Berkeley, CA 94707, USA.
  3. GLOBE institute, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark.
PMID 33500788; 2021; Virus Evol; eng; epublish

Article

Publication summary

Human severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is most closely related, by average genetic distance, to two coronaviruses isolated from bats, RaTG13 and RmYN02. However, there is a segment of high amino acid similarity between human SARS-CoV-2 and a pangolin-isolated strain, GD410721, in the receptor-binding domain (RBD) of the spike protein, a pattern that can be caused either by recombination or by convergent amino acid evolution driven by natural selection. We perform a detailed analysis of the synonymous divergence, which is less likely to be affected by selection than amino acid divergence, between human SARS-CoV-2 and related strains. We show that the synonymous divergence between the bat-derived viruses and SARS-CoV-2 is larger than between GD410721 and SARS-CoV-2 in the RBD, providing strong additional support for the recombination hypothesis. However, the synonymous divergence between the pangolin strain and SARS-CoV-2 is also relatively high, which is not consistent with a recent recombination between them; instead, it suggests a recombination into RaTG13. We also find a 14-fold increase in the dN/dS ratio from the lineage leading to SARS-CoV-2 to the strains of the current pandemic, suggesting that the vast majority of nonsynonymous mutations currently segregating within the human strains have a negative impact on viral fitness. Finally, we estimate that the time to the most recent common ancestor of SARS-CoV-2 and RaTG13 or RmYN02 based on synonymous divergence is 51.71 years (95% CI, 28.11-75.31) and 37.02 years (95% CI, 18.19-55.85), respectively.
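The two divergence-time intervals reported in the summary can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it tests whether each reported 95% CI is symmetric around its point estimate and, under the assumption of a normal-approximation interval (z = 1.96, which the summary does not state), backs out the implied standard error. The function name and the z value are assumptions introduced here, not part of the publication.

```python
# Reported TMRCA point estimates (years) and 95% CI bounds, copied from the summary.
estimates = {
    "RaTG13": (51.71, 28.11, 75.31),
    "RmYN02": (37.02, 18.19, 55.85),
}

Z_95 = 1.96  # assumption: interval built as estimate +/- 1.96 * SE


def implied_standard_error(point, lower, upper, z=Z_95):
    """Return (midpoint, half_width, implied_se) for a reported interval.

    `point` is unused in the arithmetic; it is accepted so the reported
    triple can be passed through unchanged and compared by the caller.
    """
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return midpoint, half_width, half_width / z


summary = {name: implied_standard_error(*vals) for name, vals in estimates.items()}

for name, (midpoint, half_width, se) in summary.items():
    point = estimates[name][0]
    print(
        f"{name}: point={point:.2f}, CI midpoint={midpoint:.2f}, "
        f"half-width={half_width:.2f}, implied SE={se:.2f}"
    )
```

Both intervals are centered on their point estimates, with half-widths of 23.60 and 18.83 years. This is consistent with, but does not prove, a symmetric normal-style interval.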

Keywords: molecular evolution; SARS-CoV-2; synonymous mutations

Structured evidence records

Evidence records

2 total
1 record
Extraction confidence 0.95
Key finding

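Comparative genomic analysis of SARS-CoV-2, the bat-derived viruses RaTG13 and RmYN02, and the pangolin strain GD410721 showed that, in the receptor-binding domain, synonymous divergence from SARS-CoV-2 is larger for the bat-derived viruses than for GD410721, a pattern that supports recombination rather than convergent amino acid evolution.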

Virus
Location
Not specified
Supporting text

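We perform a detailed analysis of the synonymous divergence, which is less likely to be affected by selection than amino acid divergence, between human SARS-CoV-2 and related strains. We show that the synonymous divergence between the bat-derived viruses and SARS-CoV-2 is larger than between GD410721 and SARS-CoV-2 in the RBD, providing strong additional support for the recombination hypothesis.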

Genes or proteins
spike; receptor-binding domain
Analysis methods
synonymous divergence analysis; dN/dS ratio estimation; comparative genomic analysis
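To make the distinction between synonymous and nonsynonymous change concrete, the following is a minimal sketch that classifies codon differences between two aligned coding sequences using the standard genetic code. It is a raw count only: it does not normalize by the number of synonymous and nonsynonymous sites or correct for multiple substitutions, so it is not a dN/dS estimator and not the method used in the publication. The function name and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count differing codons that keep (synonymous) or change
    (nonsynonymous) the encoded amino acid.

    Assumes both sequences are aligned, in frame, gap-free, and contain
    only the unambiguous bases A, C, G, T.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy sequences, not taken from any real genome.
seq_a = "ATGCTTGGAAAA"  # Met Leu Gly Lys
seq_b = "ATGCTCGGTAGA"  # Met Leu Gly Arg
print(count_codon_differences(seq_a, seq_b))  # (2, 1)
```

In the toy pair, CTT to CTC and GGA to GGT leave the amino acid unchanged, while AAA to AGA replaces lysine with arginine. A real analysis would use an established codon-substitution model rather than raw counts.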
1 record
Extraction confidence 0.92
Key finding

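Comparative genomic analyses support recombination in the receptor-binding domain of the spike gene. However, the relatively high synonymous divergence between pangolin coronavirus GD410721 and SARS-CoV-2 is not consistent with a recent recombination between them and instead suggests a recombination into bat coronavirus RaTG13.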

Host
Not specified
Location
Not specified
Supporting text

There is a segment of high amino acid similarity between human SARS-CoV-2 and a pangolin-isolated strain, GD410721, in the receptor-binding domain (RBD) of the spike protein, a pattern that can be caused by either recombination or by convergent amino acid evolution... the synonymous divergence between the bat-derived viruses and SARS-CoV-2 is larger than between GD410721 and SARS-CoV-2 in the RBD, providing strong additional support for the recombination hypothesis... the data suggest a recombination into RaTG13.

Event type
recombination
Genes or segments
receptor-binding domain; spike