Literature detail

Confronting data sparsity to identify potential sources of Zika virus spillover infection among primates.

Barbara A Han (1), Subhabrata Majumdar (2), Flavio P Calmon (3), Benjamin S Glicksberg (4), Raya Horesh (5), Abhishek Kumar, Adam Perer (6), Elisa B von Marschall (7), Dennis Wei (5), Aleksandra Mojsilović (5), Kush R Varshney (5)
Affiliations 7 institutions
  1. Cary Institute of Ecosystem Studies, Box AB Millbrook, NY 12545, USA.
  2. University of Florida Informatics Institute, 432 Newell Drive, CISE Bldg E251, Gainesville, FL 32611, USA.
  3. Harvard University, 29 Oxford St, Cambridge, MA 02138, USA.
  4. Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, 94158, USA.
  5. IBM Research, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA.
  6. Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA.
  7. IBM Watson Media & Weather, 550 Assembly St, Columbia, SC 29201, USA.
PMID: 30902616 | Year: 2019 | Journal: Epidemics | Language: eng | Status: ppublish

Article

Publication summary

The recent Zika virus (ZIKV) epidemic in the Americas ranks among the largest outbreaks in modern times. Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. Identifying sylvatic reservoirs is critical to mitigating spillover risk, but relevant surveillance and biological data remain limited for this and most other zoonoses. We confronted this data sparsity by combining a machine learning method, Bayesian multi-label learning, with a multiple imputation method on primate traits. The resulting models distinguished flavivirus-positive primates with 82% accuracy and suggest that species posing the greatest spillover risk are also among the best adapted to human habitations. Given pervasive data sparsity describing animal hosts, and the virtual guarantee of data sparsity in scenarios involving novel or emerging zoonoses, we show that computational methods can be useful in extracting actionable inference from available data to support improved epidemiological response and prevention.
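
The combination described in the summary above (impute missing traits several times, fit a multi-label model on each completed table, then pool the predictions) can be illustrated with a toy sketch. This is not the authors' model: it assumes a simple hot-deck imputation and swaps Bayesian multi-label learning for a nearest-centroid classifier fitted separately per label. All trait names, values, labels, and function names are invented for illustration.

```python
import random

def impute_once(rows, rng):
    """Fill each missing trait (None) with a random draw from the
    observed values in the same column (simple hot-deck imputation)."""
    n_cols = len(rows[0])
    observed = [[row[j] for row in rows if row[j] is not None] for j in range(n_cols)]
    return [
        [row[j] if row[j] is not None else rng.choice(observed[j]) for j in range(n_cols)]
        for row in rows
    ]

def centroid(rows):
    """Column-wise mean of a non-empty list of numeric rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_labels(train_x, train_y, query):
    """One nearest-centroid classifier per label (binary relevance).
    Assumes every label has at least one positive and one negative example."""
    preds = []
    for k in range(len(train_y[0])):
        pos = [x for x, y in zip(train_x, train_y) if y[k] == 1]
        neg = [x for x, y in zip(train_x, train_y) if y[k] == 0]
        closer_to_pos = sq_dist(query, centroid(pos)) < sq_dist(query, centroid(neg))
        preds.append(1 if closer_to_pos else 0)
    return preds

def pooled_predictions(traits, labels, query_index, m=20, seed=0):
    """Repeat imputation m times and return, for each label, the fraction
    of imputed datasets in which the query species is predicted positive."""
    rng = random.Random(seed)
    votes = [0] * len(labels[0])
    for _ in range(m):
        completed = impute_once(traits, rng)
        train_x = [x for i, x in enumerate(completed) if i != query_index]
        train_y = [y for i, y in enumerate(labels) if i != query_index]
        preds = predict_labels(train_x, train_y, completed[query_index])
        votes = [v + p for v, p in zip(votes, preds)]
    return [v / m for v in votes]

# Invented toy data: columns are [body_mass_kg, group_size]; None marks a missing trait.
traits = [
    [3.0, 20.0], [2.5, None], [3.2, 18.0],
    [8.0, 5.0], [None, 4.0], [7.5, 6.0],
    [2.8, 22.0],  # query species with unknown infection status
]
# Invented labels: [positive_for_virus_a, positive_for_virus_b]; the last row is a placeholder.
labels = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 0], [0, 1], [0, 0]]

scores = pooled_predictions(traits, labels, query_index=6)
print(scores)
```

Pooling across imputed datasets is what lets the uncertainty from missing traits carry through to the prediction: a score near 0 or 1 means the call is stable no matter how the gaps are filled, while an intermediate score means it depends on the imputed values.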

Keywords: Arbovirus; Bayesian multi-task learning; Ecology; Flavivirus; Imputation; Machine learning; Neotropical; Non-human primate; Predictive analytics; Spillback; Spillover; Surveillance; Animals; Bayes Theorem; Humans; Primates; Risk; Zika Virus

Structured evidence records

3 total
1 record
Extraction confidence 0.90
Key finding

Zika virus circulates among non-human primates in sylvatic cycles, and species most adapted to human habitats are predicted to be key reservoirs contributing to spillover risk.

Virus
Host
Location
Supporting text

Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. The resulting models distinguished flavivirus-positive primates with 82% accuracy and suggest that species posing the greatest spillover risk are also among the best adapted to human habitations.

Method
machine learning; Bayesian multi-label learning; multiple imputation
Geographic raw
Neotropical
1 record
Extraction confidence 0.70
Key finding

Zika virus is transmitted from non-human primates to humans through spillover from sylvatic cycles.

Virus
Location
Supporting text

ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans.

Method
Bayesian multi-label learning; multiple imputation
Study design
computational modeling
Transmission direction
animal-to-human
Geographic raw
Americas
1 record
Extraction confidence 0.80
Key finding

Computational modeling identified potential non-human primate reservoirs of Zika virus to inform and improve surveillance of sylvatic cycles and spillover risk.

Virus
Host
Location
Supporting text

Like other mosquito-borne flaviviruses, ZIKV circulates in sylvatic cycles among primates that can serve as reservoirs of spillover infection to humans. ... we show that computational methods can be useful in extracting actionable inference from available data to support improved epidemiological response and prevention.

Method
machine learning; Bayesian multi-label learning; multiple imputation
Geographic raw
Americas