Literature detail

Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin.

Vladimir Makarenkov (1), Bogdan Mazoure (2), Guillaume Rabusseau (2, 3), Pierre Legendre (4)
Affiliations (4 institutions)
  1. Département d'informatique, Université du Québec à Montréal, Montreal, QC, Canada. [email protected].
  2. Montreal Institute for Learning Algorithms (Mila), Montreal, QC, Canada.
  3. Département d'informatique et de Recherche Opérationnelle, Université de Montréal and Canada CIFAR AI Chair, Montreal, QC, Canada.
  4. Département de Sciences Biologiques, Université de Montréal, C. P. 6128, Succursale Centre-Ville, Montreal, QC, H3C 3J7, Canada.
PMID 33514319; 2021; BMC Ecol Evol; eng; epublish

Article

Publication summary

The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus strains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g. civets for SARS-CoV and camels for MERS-CoV. The discovery of an intermediate host of SARS-CoV-2 and the identification of the specific mechanism of its emergence in humans are topics of primary evolutionary importance. In this study we investigate the evolutionary patterns of the 11 main genes of SARS-CoV-2. Previous studies suggested that the genome of SARS-CoV-2 is highly similar to the horseshoe bat coronavirus RaTG13 for most of the genes and to some Malayan pangolin coronavirus (CoV) strains for the receptor-binding (RB) domain of the spike protein.
We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of the 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215-1425] of gene S and region [534-727] of gene N. Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N. Furthermore, topology-based clustering of gene trees inferred for 25 CoV organisms revealed a three-way evolution of coronavirus genes, with gene phylogenies of ORF1ab, S and N forming the first cluster, gene phylogenies of ORF3a, E, M, ORF6, ORF7a, ORF7b and ORF8 forming the second cluster, and the phylogeny of gene ORF10 forming the third cluster.
The results of our horizontal gene transfer and recombination analysis suggest that SARS-CoV-2 could be not only a chimera virus resulting from recombination of the bat RaTG13 and Guangdong pangolin coronaviruses but also a close relative of the bat CoV ZC45 and ZXC21 strains. They also indicate that a GD pangolin may be an intermediate host of this dangerous virus.
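
The idea of a region-specific closest relative, as in the reported regions [1215-1425] of gene S and [534-727] of gene N, can be illustrated with a minimal sketch. This is not the authors' method, which relies on statistically validated horizontal gene transfer and recombination detection; it only compares per-region identity between aligned sequences. The function names `region_identity` and `closest_relative`, the 1-based inclusive coordinate convention, and the short synthetic sequences are all assumptions made for illustration, not data from the study.

```python
from typing import Dict, Tuple


def region_identity(a: str, b: str, start: int, end: int) -> float:
    """Fraction of identical non-gap columns of two aligned sequences.

    Coordinates are assumed to be 1-based and inclusive (an assumption;
    the summary does not state the convention used for its regions).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not (1 <= start <= end <= len(a)):
        raise ValueError("region is outside the alignment")
    matches = 0
    compared = 0
    for x, y in zip(a[start - 1:end], b[start - 1:end]):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    return matches / compared if compared else float("nan")


def closest_relative(
    query: str, candidates: Dict[str, str], start: int, end: int
) -> Tuple[str, Dict[str, float]]:
    """Return the candidate with the highest identity to query in the region."""
    scores = {
        name: region_identity(query, seq, start, end)
        for name, seq in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic toy alignment (NOT real viral sequences): the query matches
# one candidate at its 5' end and the other at its 3' end.
query = "ACGTACGTACGT"
toy = {
    "bat_like": "ACGTACGTTTTT",
    "pangolin_like": "TTTTACGTACGT",
}

for region in [(1, 4), (9, 12)]:
    best, scores = closest_relative(query, toy, *region)
    print(region, best, scores)
```

A change of closest relative from one region to the next is the kind of mosaic signal that motivates a formal recombination test; on real alignments the identity differences would need statistical support, for example bootstrap validation, before any conclusion is drawn.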

Consensus tree; Evolution of SARS-CoV-2; Gene evolution; Horizontal gene transfer; Phylogenetic network; Recombination; COVID-19; SARS-CoV-2; Animals; Evolution, Molecular; Gene Transfer, Horizontal; Genome, Viral; Humans

Structured evidence records

Evidence records

3 total
2 records
Extraction confidence 0.95
Key finding

Genomic recombination analyses showed intragenic recombination in SARS-CoV-2 genes S and N between bat coronavirus RaTG13 and Guangdong pangolin CoVs, revealing evolutionary relationships that may explain its origin.

Virus
Location
Not specified
Supporting text

We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215-1425] of gene S and region [534-727] of gene N.

Genes or proteins
S; N
Analysis methods
recombination analysis; horizontal gene transfer analysis; phylogenetic analysis
Extraction confidence 0.95
Key finding

Comparative genomic analysis identified significant recombination signals among ancestors of SARS-CoV-2, bat RaTG13, Guangdong pangolin CoV, and bat CoV ZC45/ZXC21 across multiple genes including ORF1ab, S, ORF3a, ORF7a, ORF8, and N.

Virus
Location
Not specified
Supporting text

Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N.

Genes or proteins
ORF1ab; S; ORF3a; ORF7a; ORF8; N
Analysis methods
recombination analysis; phylogenetic network; comparative genomics
1 record
Extraction confidence 0.98
Key finding

SARS-CoV-2 contains intragenic recombination in the S and N genes involving bat RaTG13 and Guangdong pangolin coronaviruses.

Host
Not specified
Location
Not specified
Supporting text

Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs.

Event type
recombination
Genes or segments
S; N