Literature detail

The origin and underlying driving forces of the SARS-CoV-2 outbreak.

Shu-Miaw Chaw1 Jui-Hung Tai1,2 Shi-Lun Chen3 Chia-Hung Hsieh4 Sui-Yuan Chang5 Shiou-Hwei Yeh6 Wei-Shiung Yang2 Pei-Jer Chen2 Hurng-Yi Wang7,8
Affiliations 8 institutions
  1. Biodiversity Research Center, Academia Sinica, Taipei, Taiwan.
  2. Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan.
  3. Department of Life Science, National Taiwan Normal University, Taipei, Taiwan.
  4. Department of Forestry and Nature Conservation, Chinese Culture University, Taipei, Taiwan.
  5. Department of Clinical Laboratory Sciences and Medical Biotechnology, College of Medicine, National Taiwan University, Taipei, Taiwan.
  6. Department of Microbiology, College of Medicine, National Taiwan University, Taipei, Taiwan.
  7. Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan.
  8. Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei, Taiwan.
PMID 32507105 2020 J Biomed Sci eng epublish

Article

Publication summary

SARS-CoV-2 began spreading in December 2019 and has since become a pandemic that has impacted many aspects of human society. Several issues concerning the origin, time of introduction to humans, evolutionary patterns, and underlying force driving the SARS-CoV-2 outbreak remain unclear. Genetic variation in 137 SARS-CoV-2 genomes and related coronaviruses as of 2/23/2020 was analyzed. After correcting for mutational bias, an excess of low-frequency mutations at both synonymous and nonsynonymous sites was revealed, which is consistent with the recent outbreak of the virus. In contrast to the adaptive evolution previously reported for SARS-CoV during its brief epidemic in 2003, our analysis of SARS-CoV-2 genomes shows signs of relaxed selection. The sequence similarity in the spike receptor binding domain between SARS-CoV-2 and a sequence from pangolin is probably due to an ancient intergenomic introgression that occurred approximately 40 years ago. The current outbreak of SARS-CoV-2 was estimated to have originated on 12/11/2019 (95% HPD 11/13/2019-12/23/2019). The effective population size of the virus showed an approximately 20-fold increase from the onset of the outbreak to the lockdown of Wuhan (1/23/2020) and ceased to increase afterwards, demonstrating the effectiveness of social distancing in preventing its spread. Two mutations, 84S in orf8 protein and 251V in orf3 protein, occurred coincidentally with human intervention. The former first appeared on 1/5/2020 and plateaued around 1/23/2020. The latter rapidly increased in frequency after 1/23/2020. Thus, the roles of these mutations in infectivity need to be elucidated. Genetic diversity of SARS-CoV-2 collected from China is two times higher than that of genomes derived from the rest of the world. A network analysis found that haplotypes collected from Wuhan were interior and had more mutational connections, both of which are consistent with the observation that the SARS-CoV-2 outbreak originated in China.
SARS-CoV-2 might have cryptically circulated within humans for years before being discovered. Data from the early outbreak and hospital archives are needed to trace its evolutionary path and determine the critical steps required for effective spreading.
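
To make the "excess of low-frequency mutations" signal concrete, here is a minimal Python sketch that computes nucleotide diversity and Tajima's D from a small alignment. It is not the authors' pipeline: it applies no correction for mutational bias and does not separate synonymous from nonsynonymous sites. The toy alignment and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt

def pairwise_differences(a, b):
    # Number of positions at which two aligned sequences differ.
    return sum(1 for x, y in zip(a, b) if x != y)

def mean_pairwise_differences(seqs):
    # Average number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)

def segregating_sites(seqs):
    # Count alignment columns that carry more than one nucleotide.
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def tajimas_d(seqs):
    # Tajima's D compares mean pairwise differences with Watterson's
    # estimate S / a1. Negative values indicate an excess of rare
    # (low-frequency) variants, as expected after a recent expansion.
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical toy alignment (equal-length sequences): every variant is a singleton.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "TCGTACGTAC",
]

k = mean_pairwise_differences(toy_alignment)
pi_per_site = k / len(toy_alignment[0])   # nucleotide diversity per site
d = tajimas_d(toy_alignment)
print(f"S={segregating_sites(toy_alignment)} pi={pi_per_site:.3f} D={d:.3f}")
```

Because every variant in the toy alignment occurs in a single sequence, D comes out negative. The same per-site diversity calculation, applied separately to two groups of genomes, is one simple way to compare diversity between regions.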

Coronavirus Mutational bias Population genetics Positive selection Disease Outbreaks Genetic Variation Genome, Viral Betacoronavirus China Coronavirus Infections COVID-19 Humans Pandemics Pneumonia, Viral SARS-CoV-2

Structured evidence records

Evidence records

7 total
4 records
Extraction confidence 0.95
Key finding

Phylogenetic comparison between SARS-CoV-2 and pangolin coronavirus sequences indicated that their similarity in the spike receptor binding domain is probably due to an ancient intergenomic introgression approximately 40 years ago.

Virus
Location
Not specified
Supporting text

Genetic variation in 137 SARS-CoV-2 genomes and related coronaviruses as of 2/23/2020 was analyzed ... The sequence similarity in the spike receptor binding domain between SARS-CoV-2 and a sequence from pangolin is probably due to an ancient intergenomic introgression that occurred approximately 40 years ago.

Genes or proteins
spike
Analysis methods
phylogenetic analysis; comparative genomic analysis
Extraction confidence 0.95
Key finding

Genome-wide analysis identified two mutations, 84S in ORF8 and 251V in ORF3, that rose in frequency during the SARS-CoV-2 outbreak.

Virus
Location
Not specified
Supporting text

Genetic variation in 137 SARS-CoV-2 genomes and related coronaviruses as of 2/23/2020 was analyzed ... Two mutations, 84S in orf8 protein and 251V in orf3 protein, occurred coincidentally with human intervention.

Genes or proteins
ORF8; ORF3
Analysis methods
population genetic analysis
Extraction confidence 0.95
Key finding

Phylodynamic analysis showed an approximately 20-fold expansion of SARS-CoV-2 effective population size between the onset of the outbreak and the Wuhan lockdown (1/23/2020), with no further increase afterwards.

Virus
Location
Not specified
Supporting text

The effective population size of the virus showed an approximately 20-fold increase from the onset of the outbreak to the lockdown of Wuhan (1/23/2020) and ceased to increase afterwards, demonstrating the effectiveness of social distancing in preventing its spread.

Analysis methods
phylodynamic analysis
Extraction confidence 0.90
Key finding

Network analysis of SARS-CoV-2 haplotypes indicated that strains from Wuhan occupied central positions, supporting an origin in China.

Virus
Location
Not specified
Supporting text

A network analysis found that haplotypes collected from Wuhan were interior and had more mutational connections, both of which are consistent with the observation that the SARS-CoV-2 outbreak originated in China.

Analysis methods
haplotype network analysis
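
As a rough illustration of what "interior" haplotypes with "more mutational connections" means, the Python sketch below collapses sequences into distinct haplotypes and counts, for each one, how many other haplotypes lie a single mutation away. This is a simplification, not the network method used in the paper (which is not specified in this record). The toy sequences and function names are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    # Number of differing positions between two aligned sequences.
    return sum(1 for x, y in zip(a, b) if x != y)

def one_step_connections(seqs):
    # Map each distinct haplotype to the number of other haplotypes
    # that differ from it by exactly one mutation.
    haplotypes = sorted(set(seqs))
    degree = {h: 0 for h in haplotypes}
    for a, b in combinations(haplotypes, 2):
        if hamming(a, b) == 1:
            degree[a] += 1
            degree[b] += 1
    return degree

# Hypothetical toy sample (aligned, equal-length sequences).
toy_sample = ["ACGTAC", "ACGTAC", "ACGTAT", "ACGAAC", "TCGTAC", "TCGTAT"]

degree = one_step_connections(toy_sample)
most_connected = max(degree, key=degree.get)
for haplotype, connections in degree.items():
    print(haplotype, connections)
```

In this toy sample the haplotype with the most one-step neighbours sits in the interior of the network, while haplotypes with a single connection sit at the tips.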
2 records
Extraction confidence 0.70
Key finding

SARS-CoV-2 acquired two amino-acid substitutions (orf8 84S and orf3 251V) during the outbreak; their effect on infectivity has not yet been determined.

Virus
Host
Not specified
Location
Not specified
Supporting text

Two mutations, 84S in orf8 protein and 251V in orf3 protein, occurred coincidentally with human intervention. The former first appeared on 1/5/2020 and plateaued around 1/23/2020. The latter rapidly increased in frequency after 1/23/2020. Thus, the roles of these mutations in infectivity need to be elucidated.

Genes or proteins
orf8; orf3
Mutations
orf8 84S; orf3 251V
Mechanism types
infectivity; molecular_adaptation
Extraction confidence 0.70
Key finding

The SARS-CoV-2 spike receptor-binding domain shows similarity to a pangolin coronavirus sequence, probably due to an ancient intergenomic introgression approximately 40 years ago.

Virus
Host
Not specified
Location
Not specified
Supporting text

The sequence similarity in the spike receptor binding domain between SARS-CoV-2 and a sequence from pangolin is probably due to an ancient intergenomic introgression that occurred approximately 40 years ago.

Genes or proteins
spike receptor binding domain
Mechanism types
receptor_binding; intergenomic_introgression
1 record
Extraction confidence 0.85
Key finding

SARS-CoV-2 likely experienced an ancient recombination event with a pangolin coronavirus in the spike receptor binding domain around 40 years ago.

Host
Not specified
Location
Not specified
Supporting text

The sequence similarity in the spike receptor binding domain between SARS-CoV-2 and a sequence from pangolin is probably due to an ancient intergenomic introgression that occurred approximately 40 years ago.

Event type
recombination
Genes or segments
spike receptor binding domain