Affiliations 

  • 1 School of Data Sciences, Perdana University, Serdang, Selangor, Malaysia
  • 2 School of Medicine, Johns Hopkins University, Baltimore, MD, United States of America
PeerJ, 2020;7:e7954.
PMID: 32518710 DOI: 10.7717/peerj.7954

Abstract

Background: Influenza A (H5N1) virus is a global concern with potential as a pandemic threat. High sequence variability of influenza A viruses is a major challenge for effective vaccine design. A continuing goal towards this is a greater understanding of influenza A (H5N1) proteome sequence diversity in the context of the immune system (antigenic diversity), the dynamics of mutation, and effective strategies to overcome the diversity for vaccine design.

Methods: Herein, we report a comprehensive study of the dynamics of H5N1 mutations by analysis of the aligned overlapping nonamer positions (1-9, 2-10, etc.) of more than 13,000 protein sequences of avian and human influenza A (H5N1) viruses, reported over at least 50 years. Entropy calculations were performed on 9,408 overlapping nonamer positions of the proteome to study the diversity in the context of the immune system. The nonamers represent the predominant length of the binding cores for peptides recognized by the cellular immune system. To further dissect the sequence diversity, each overlapping nonamer position was quantitatively analyzed for four patterns of sequence diversity motifs: index, major, minor and unique.
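As an illustration of the sliding-window entropy idea described above, the following is a minimal sketch, not the authors' pipeline. It computes plain Shannon entropy (in bits) over the distinct nonamers at each overlapping position of a toy alignment. The function name `nonamer_entropy`, the handling of gaps and ambiguous residues, and the absence of any sample-size correction are assumptions made for this example.

```python
import math
from collections import Counter


def nonamer_entropy(aligned_seqs, k=9):
    """Shannon entropy (bits) at each overlapping k-mer position.

    aligned_seqs: equal-length strings from one protein alignment.
    Windows containing a gap ('-') or an ambiguous residue ('X') are
    skipped; this filtering rule is an assumption of the sketch.
    """
    length = len(aligned_seqs[0])
    entropies = []
    for start in range(length - k + 1):  # positions 1-9, 2-10, ...
        windows = [s[start:start + k] for s in aligned_seqs]
        windows = [w for w in windows if "-" not in w and "X" not in w]
        counts = Counter(windows)
        total = sum(counts.values())
        if total == 0:
            entropies.append(0.0)
            continue
        h = 0.0
        for c in counts.values():
            p = c / total
            h -= p * math.log2(p)
        entropies.append(h)
    return entropies


# Toy alignment of length 10, giving two nonamer positions.
toy = ["ABCDEFGHIJ", "ABCDEFGHIJ", "ABCDEFGHIK", "ABCDEFGHIK"]
print(nonamer_entropy(toy))
```

In this toy input the first position is fully conserved, while the second is split evenly between two nonamers. Entropy is zero when a position has a single sequence and grows as more distinct sequences share the position more evenly.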

Results: Almost all of the aligned overlapping nonamer positions of each viral proteome exhibited variants (major, minor, and unique) of the predominant index sequence. Each variant motif displayed a characteristic pattern of incidence change as total variants increased. The major variant exhibited a restrictive pyramidal incidence pattern, with peak incidence at 50% total variants. Beyond this peak, minor variants became the predominant motif for the majority of positions. Unique variants, each a sequence observed only once, were present at nearly all of the nonamer positions. The diversity motifs (index and variants) demonstrated complex inter-relationships, with motif switching being a common phenomenon. Additionally, 25 highly conserved sequences were identified as shared across viruses of both hosts, with half of them also conserved in several other influenza A subtypes.
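The four diversity motifs can be sketched for a single nonamer position as follows. The abstract only states that the index is the predominant sequence and that unique variants are observed once. Treating the major variant as the most frequent non-index sequence seen more than once, and minor variants as the remaining sequences seen more than once, is an assumption of this example, as are the function name `classify_motifs` and the tie-breaking by first occurrence.

```python
from collections import Counter


def classify_motifs(nonamers):
    """Split the sequences at one nonamer position into diversity motifs.

    Assumed definitions (only 'index' and 'unique' come from the abstract):
      index  - the most frequent sequence
      major  - the most frequent variant observed more than once
      minor  - other variants observed more than once
      unique - variants observed exactly once
    Ties are broken by first occurrence (Counter.most_common ordering).
    """
    counts = Counter(nonamers)
    ranked = counts.most_common()
    index_seq, index_n = ranked[0]
    variants = ranked[1:]

    motifs = {"index": {index_seq: index_n}, "major": {}, "minor": {}, "unique": {}}
    repeated = [(s, n) for s, n in variants if n > 1]
    if repeated:
        major_seq, major_n = repeated[0]
        motifs["major"][major_seq] = major_n
        for s, n in repeated[1:]:
            motifs["minor"][s] = n
    for s, n in variants:
        if n == 1:
            motifs["unique"][s] = n

    # Total variants: percentage of sequences that differ from the index.
    total = len(nonamers)
    total_variants_pct = 100.0 * (total - index_n) / total
    return motifs, total_variants_pct


# Toy data for one position: 11 sequences, 4 distinct nonamers.
position = (["AAAAAAAAA"] * 5 + ["AAAAAAAAB"] * 3
            + ["AAAAAAAAC"] * 2 + ["AAAAAAAAD"])
motifs, pct = classify_motifs(position)
print(motifs)
print(round(pct, 1))
```

Running the same classification on every aligned position and plotting each motif's incidence against `total_variants_pct` is one way to look for incidence patterns of the kind the Results describe.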

Discussion: The presence of distinct sequences (nonatypes) at nearly all nonamer positions represents a large repertoire of reported viral variants in the proteome, which influence the variability dynamics of the viral population. This work elucidated and provided important insights into the components that make up the viral diversity, delineating inherent patterns in the organization of sequence changes that function in viral fitness selection. Additionally, it provides a catalogue of all the mutational changes involved in the dynamics of H5N1 viral diversity for both avian and human host populations. This work provides data relevant for the design of prophylactics and therapeutics that overcome the diversity of the virus, and can aid in the surveillance of existing and future strains of influenza viruses.

* Title and MeSH Headings from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.