VIROLOGY

A Pyrosequencing-Based Approach to High-Throughput Identification of Influenza A(H3N2) Virus Clades Harboring Antigenic Drift Variants

Vasiliy P. Mishin (a), Tatiana Baranovich (a, b), Rebecca Garten (a), Anton Chesnokov (a, c), Anwar I. Abd Elal (a, c), Michelle Adamczyk (a, c), Jennifer LaPlante (d), Kirsten St. George (d), Alicia M. Fry (a), John Barnes (a), Stephanie C. Chester (e), Xiyan Xu (a), Jacqueline M. Katz (a), David E. Wentworth (a), Larisa V. Gubareva (a)

Influenza Division, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention (CDC), Collaborating Center for Surveillance, Epidemiology and Control of Influenza, Atlanta, Georgia, USA (a); Carter Consulting, Inc., Atlanta, Georgia, USA (b); Battelle Memorial Institute, Atlanta, Georgia, USA (c); Wadsworth Center, New York State Department of Health (NYSDOH), Albany, New York, USA (d); Association of Public Health Laboratories, Silver Spring, Maryland, USA (e)

ABSTRACT The rapid evolution of influenza A(H3N2) viruses necessitates close monitoring of their antigenic properties so the emergence and spread of antigenic drift variants can be rapidly identified. Changes in hemagglutinin (HA) acquired by contemporary A(H3N2) viruses hinder antigenic characterization by traditional methods, thus complicating vaccine strain selection. Sequence-based approaches have been used to infer virus antigenicity; however, they are time consuming and midthroughput. To facilitate virological surveillance and epidemiological studies, we developed and validated a pyrosequencing approach that enables identification of six HA clades of contemporary A(H3N2) viruses. The identification scheme of viruses of the H3 clades 3C.2, 3C.2a, 3C.2b, 3C.3, 3C.3a, and 3C.3b is based on the interrogation of five single nucleotide polymorphisms (SNPs) within three neighboring HA regions, namely 412 to 431, 465 to 481, and 559 to 571. Two bioinformatics tools, IdentiFire (Qiagen) and FireComb (developed in-house), were utilized to expedite pyrosequencing data analysis. The assay’s analytical sensitivity was 10 focus-forming units, and respiratory specimens with threshold cycle (CT) values of <34 typically produced good quality pyrograms. When applied to 120 A(H3N2) virus isolates and 27 respiratory specimens, the assay displayed 100% agreement with clades determined by HA sequencing coupled with phylogenetics. The multi-SNP analysis described here was readily adopted by another laboratory with pyrosequencing capabilities. The implementation of this approach enhanced the findings from virological surveillance and epidemiological studies between 2013 and 2016, which examined more than 3,000 A(H3N2) viruses.

Received 2 September 2016; returned for modification 28 September 2016; accepted 18 October 2016; accepted manuscript posted online 26 October 2016.
Citation: Mishin VP, Baranovich T, Garten R, Chesnokov A, Abd Elal AI, Adamczyk M, LaPlante J, St George K, Fry AM, Barnes J, Chester SC, Xu X, Katz JM, Wentworth DE, Gubareva LV. 2017. A pyrosequencing-based approach to high-throughput identification of influenza A(H3N2) virus clades harboring antigenic drift variants. J Clin Microbiol 55:145–154. https://doi.org/10.1128/JCM.01840-16.
Editor: Alexander J. McAdam, Boston Children's Hospital.
Copyright © 2016 American Society for Microbiology. All Rights Reserved.
Address correspondence to Larisa V. Gubareva, [email protected].
V.P.M. and T.B. contributed equally to this work.

KEYWORDS A(H3N2), genotyping, influenza, pyrosequencing

Journal of Clinical Microbiology, January 2017, Volume 55, Issue 1 (jcm.asm.org)

Since their introduction into the human population in 1968, influenza A(H3N2) viruses have been responsible for seasonal epidemics and associated with both a prolonged duration of the epidemic season and a greater disease severity (1–4). The hemagglutinin (HA) glycoprotein is the major surface antigen of influenza viruses; it is also responsible for the receptor binding and membrane fusion required for entrance into susceptible host cells (5). Decades ago, this glycoprotein was named “hemagglutinin” for its ability to bind and bridge erythrocytes. The other surface antigen, neuraminidase (NA), is a receptor-destroying enzyme (6). As with other seasonal influenza viruses, A(H3N2) viruses undergo genetic and antigenic changes, called antigenic drift, which allow them to elude the host immune response (3, 4), necessitating frequent updates of seasonal influenza vaccines. Historically, the A(H3N2) vaccine component has been updated more frequently due to the faster antigenic drift in this subtype (7).

Worldwide influenza surveillance is conducted by laboratories participating in the WHO Global Influenza Surveillance and Response System (GISRS). The roles of the WHO GISRS are to monitor antigenic and genetic evolution of influenza viruses and to provide recommendations for disease control measures, including updates to the seasonal vaccine composition. Hemagglutination inhibition (HI) is considered a gold-standard, high-throughput phenotypic characterization assay for influenza viruses and is commonly used for understanding their antigenic evolution (8). The ability of recent A(H3N2) viruses to agglutinate erythrocytes of avian and mammalian origins has been severely compromised (9, 10). This new trait curtails the usefulness of the HI assay in detecting antigenic drift variants and hinders vaccine strain selection for A(H3N2) viruses. Furthermore, recent A(H3N2) viruses propagated in cell culture (e.g., in MDCK cells) often acquire changes in NA, transforming this receptor-destroying enzyme into a receptor-binding glycoprotein (11–13). This feature further impedes antigenic analysis with the HI assay.

Evolutionary changes in the viral genome are monitored by nucleotide sequencing and phylogenetic analyses (14). Correlating sequencing and antigenic data facilitates identification of the HA changes responsible for antigenic differences between the vaccine and emerging viruses (8). Based on the HA phylogeny, recent A(H3N2) viruses were divided into seven clades, 1 to 7 (14, 15), and their subsequent emerging subgroups have been given alphanumeric names.
For example, A/Texas/50/2012, which was recommended as the vaccine strain for the 2013 to 2014 and 2014 to 2015 Northern Hemisphere seasons (16, 17), belongs to the H3 clade 3C.1. By the spring-summer of 2014, A(H3N2) viruses belonging to six H3 clades (3C.2, 3C.2a, 3C.2b, 3C.3, 3C.3a, and 3C.3b) were cocirculating globally (18–20). Initial information gained from HI data indicated antigenic differences between the vaccine strain, A/Texas/50/2012, and circulating viruses from two H3 clades, 3C.2a and 3C.3a. Notably, viruses from these two clades often displayed a diminished ability to agglutinate red blood cells, thereby hindering the ability to test them in the HI assay (21). Therefore, monitoring the spread of these antigenically drifted viruses requires sequencing and phylogenetic analysis of the HA genes. Although informative, genome segment sequencing and phylogenetic analysis are time consuming, labor-intensive, and midthroughput. To expedite clade identification, it is essential to develop a method that facilitates high-throughput identification of cocirculating A(H3N2) virus clades.

Pyrosequencing, utilizing the Qiagen testing platform, is a powerful genotyping approach used in many different fields of research and clinical diagnostics (reviewed in references 22–24). In this study, we describe the development and validation of a pyrosequencing-based strategy for differentiating the clades/subgroups of recent influenza A(H3N2) viruses and show how it can be used to infer antigenic properties and monitor clade prevalence in the human population.

RESULTS

Selection of SNPs and assay design. Clades 3C.2a and 3C.3a were reported to harbor HA antigenic drift variants, and each has a different substitution at amino acid residue position 159 (nucleotide triplet 475 to 477), which resides at the receptor-binding site of HA (21).
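The correspondence between HA nucleotide positions and amino acid residue numbers used in this section (for example, nucleotide triplet 475 to 477 for residue 159) can be illustrated with a short Python sketch. It assumes that numbering starts at the first nucleotide of codon 1 with no offset, which is consistent with the positions quoted in the text. The function names are hypothetical, and the codon table holds only the three standard-genetic-code entries needed for the illustration.

```python
# Minimal sketch: relate HA nucleotide numbering to amino acid residue numbering.
# Assumption: nucleotide 1 is the first base of codon 1 (no numbering offset).

# Subset of the standard genetic code, limited to the codons used in this example.
CODON_TABLE = {"TTC": "F", "TAC": "Y", "TCC": "S"}


def residue_number(nt_position: int) -> int:
    """Return the 1-based residue number encoded by a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1


def codon_offset(nt_position: int) -> int:
    """Return the 0-based position (0, 1, or 2) of a nucleotide within its codon."""
    return (nt_position - 1) % 3


def substitute(codon: str, nt_position: int, new_base: str) -> str:
    """Return the codon after replacing the base located at nt_position."""
    i = codon_offset(nt_position)
    return codon[:i] + new_base + codon[i + 1:]


if __name__ == "__main__":
    # Nucleotide 476 falls in codon 159, at the middle base of the triplet.
    print(residue_number(476), codon_offset(476))
    for base in ("A", "C"):
        mutant = substitute("TTC", 476, base)
        print("TTC ->", mutant, CODON_TABLE["TTC"] + "159" + CODON_TABLE[mutant])
```

Under this assumption, nucleotide 476 is the second base of codon 159, so a single change at that position is enough to alter the encoded amino acid.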
FIG 1 Hemagglutinin clades (genotypes) of influenza A(H3N2) viruses from the 2013 to 2014 and 2014 to 2015 influenza seasons. (A) Phylogenetic tree of HA gene sequences (1,653 nt). Influenza vaccine strain for the 2013 to 2014 influenza season is shown in bold italics. Selected amino acid differences among representative sequences of each clade are shown. Amino acids at position 159 (nucleotide triplet 475 to 477) are shown in bold. (B) HA sequence alignment of representative viruses from six clades. RT-PCR primers are indicated by solid arrows, RT-PCR amplicons by the black solid lines, sequencing primers by the broken arrows, and regions 1, 2, and 3 are denoted by black squares. The nucleotide triplet encoding the amino acid at position 159 is marked with bold black dots.

With a few exceptions, clade 3C.2a viruses carry amino acid substitution F159Y (TTC→TAC), while clade 3C.3a viruses contain F159S (TTC→TCC) (Fig. 1A). Therefore, the single nucleotide polymorphism (SNP) at nucleotide 476 (476-SNP) can be used to distinguish between the antigenically drifted variants and vaccine-like viruses. Pyrosequencing is well suited for detecting SNPs because an individual pyrosequencing reaction generates a short sequence (20 to 40 nucleotides [nt]) at the region of interest. Although the detection of 476-SNP is straightforward, it alone does not provide the necessary genetic information for identifying the remaining clades in circulation. Furthermore, a 476-SNP does not enable proper identification of a few 3C.3a viruses that, like 3C.2a viruses, contain F159Y (TTC→TAC) (Table S3). Thus, additional markers for distinguishing individual clades were needed. Therefore, we assembled a set of 975 HA sequences of influenza A(H3N2) viruses (see Materials and Methods; see also Table S3 in the supplemental material). According to the phylogenetic analysis, each of these sequences belonged to one of the six H3 clades, 3C.2, 3C.2a, 3C.2b, 3C.3, 3C.3a, or 3C.3b (an HA phylogenetic tree is not shown). Sequences flanking the 476-SNP were interrogated for additional markers (i.e., SNPs) to distinguish the clades of interest (Fig. 1B). We identified five SNPs (at nucleotides 412, 424, 431, 476, and 561) within three pyrosequencing regions of the HA gene that together enabled unequivocal identification of the six cocirculating clades (Fig. 2, Table 1). The deductive algorithm for clade identification was that 424-SNP separates the 3C.2-related viruses (A424) from the 3C.3-related viruses (G424), and the combination of A424 and A431 is sufficient to classify 3C.2 viruses (Fig. 2, region 1). By expanding the analysis to region 2, three additional subgroups, 3C.2a, 3C.2b, and 3C.3a, were identified. Specifically, G431 and A476, in combination, unequivocally identified the 3C.2a viruses (Fig. 2). The three SNPs, G412, G431, and T476, identified the 3C.2b viruses, while the 3C.3a viruses share T412, G424, A431, and C476. Region 3 was added to the pyrosequencing scheme for

FIG 2 Algorithm for HA clade identification using the pyrosequencing assay. The assay resolves the short sequences within regions 1 to 3 of the HA. SNPs used for identification of clade 3C.2 are in red, for 3C.2a, 3C.2b, and 3C.3a are in green, and in purple for 3C.3 and 3C.3b. *, nucleotide variations, as listed in Table S3 in the supplemental material.

TABLE 1 SNP profiles of the six clades

Nucleotide position(a) (amino acid position)

HA clade   412 (138)   424 (142)   431 (144)   476 (159)   561 (187)
3C.2       G(b)        A           A           T           G
3C.2a      G           A(b)        G           A           G(b)
3C.2b      G           A           G           T           G
3C.3       G           G           A(b)        T(b)        G(b)
3C.3a      T           G           A           C           G(b)
3C.3b      G(b)        G           A           T           A

(a) Bold font designates defining SNPs for the respective clade.
(b) Nucleotide variations were observed as outlined in Table S1 in the supplemental material.

discrimination of the two remaining clades, 3C.3 and 3C.3b. A561 is shared by all 3C.3b sequences but is absent in 3C.3 sequences (Fig. 2). The 975 sequences were analyzed for the presence of highly conserved sequences proximal to each of the three regions for the design of primers for pyrosequencing reactions (Fig. 1B). Two conserved sequences were also selected for the forward and biotinylated-reverse primers for generating a single reverse transcription (RT)-PCR amplicon that encompasses all three pyrosequencing regions (Fig. 1B).

To test the proposed identification scheme and assay design, we used a panel of six A(H3N2) virus isolates that represent each of the distinct clades of interest (Table 2). RT-PCR yielded amplicons of the expected size, 276 bp (data not shown). Pyrosequencing reactions were then carried out using the primers designed for the specific regions. For each reference virus, good quality pyrograms and the expected sequence readouts were obtained (see Fig. S1 to S3). The pyrosequencing data were analyzed, and the deduced clades were in complete agreement with the clades determined by codon-complete HA sequencing and phylogenetic analysis (data not shown).

Bioinformatics tools for data analysis. Pyrosequencing platforms offer a high-throughput format conducive to virological surveillance needs. However, the proposed identification scheme necessitated a visual analysis of three sequence readouts for each virus, accurate recording of the results of five SNP analyses, and deduction of the clade’s identity. Bioinformatics tools were developed to expedite the sequence analysis and curtail human error. IdentiFire software (Qiagen) was utilized for analyzing the sequences generated by pyrosequencing. This software performs an automated comparison of sequence readouts against reference sequences, which are deposited into a custom sequence library.
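The five-SNP identification scheme summarized in Table 1 and Fig. 2 can be expressed as a simple lookup. The Python sketch below is only an illustration under stated assumptions: it uses the nucleotides listed for each clade in Table 1, it does not model the nucleotide variations flagged in the table footnote (which the assay handles through its identifier library), and the names `POSITIONS`, `CONSENSUS_PROFILES`, and `classify` are hypothetical rather than part of the published software.

```python
from typing import Dict, Tuple

# HA nucleotide positions of the five SNPs, in the order used for the profile key.
POSITIONS: Tuple[int, ...] = (412, 424, 431, 476, 561)

# Nucleotides per clade as listed in Table 1 (footnoted variants are not modeled).
CONSENSUS_PROFILES: Dict[Tuple[str, ...], str] = {
    ("G", "A", "A", "T", "G"): "3C.2",
    ("G", "A", "G", "A", "G"): "3C.2a",
    ("G", "A", "G", "T", "G"): "3C.2b",
    ("G", "G", "A", "T", "G"): "3C.3",
    ("T", "G", "A", "C", "G"): "3C.3a",
    ("G", "G", "A", "T", "A"): "3C.3b",
}


def classify(snps: Dict[int, str]) -> str:
    """Return the clade whose Table 1 profile matches the five SNPs exactly.

    Any profile that is not listed is reported as "Unknown" instead of being
    assigned to the nearest clade.
    """
    key = tuple(snps[position].upper() for position in POSITIONS)
    return CONSENSUS_PROFILES.get(key, "Unknown")


if __name__ == "__main__":
    sample = {412: "G", 424: "A", 431: "G", 476: "A", 561: "G"}
    print(classify(sample))
```

Reporting unmatched profiles as "Unknown" mirrors the conservative behavior described for the assay, in which unexpected results are flagged for retesting rather than forced into a clade.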
To this end, we constructed a library of reference nucleotide sequences (also known as identifiers) for each of the three target regions using the data set of 975 HA sequences. This library contained 27, 25, and 7 unique identifiers for regions 1, 2, and 3, respectively. If an exact match was found, the software reported it as the identifier (e.g., R1-3C.2a-v1 identifies region 1, clade 3C.2a, variant 1). Notably, if an exact match was not found, the sequence readout was flagged for further investigation. To identify the clades, a second bioinformatics tool that automatically combined the results of the IdentiFire analyses was needed. FireComb, an in-house Excel-based

TABLE 2 Validation panel of reference A(H3N2) influenza viruses and analytical sensitivity

RT-PCR results are given as test 1/test 2/test 3/test 4 for the indicated FFU/reaction.

H3 clade(a)  Alphanumeric name      GISAID accession no.  Titer(b) (log10 FFU50/ml)  10^4.0   10^3.0   10^2.0   10^1.0   10^0.0   10^-1.0  Limit of detection (FFU/ml)(c)
3C.2         A/Puerto Rico/22/2014  EPI540034             7.38                       +/+/+/+  +/+/+/+  +/+/+/+  +/+/+/+  +/-/-/+  -/-/-/-  10^3.3
3C.2a        A/Florida/11/2015      EPI540034             7.35                       +/+/+/+  +/+/+/+  +/+/+/+  +/+/+/+  +/-/+/-  -/-/-/-  10^3.3
3C.2b        A/Utah/08/2014         EPI533499             7.27                       +/+/+/+  +/+/+/+  +/+/+/+  +/+/+/+  +/-/-/+  -/-/-/-  10^3.3
3C.3         A/Florida/37/2014      EPI545749             7.92                       +/+/+/+  +/+/+/+  +/+/+/+  +/+/+/+  -/+/-/+  -/-/-/+  10^3.3
3C.3a        A/Oregon/09/2014       EPI551035             7.90                       +/+/+/+  +/+/+/+  +/+/+/+  +/+/+/+  -/+/-/+  -/-/-/-  10^3.3
3C.3b        A/Texas/42/2014        EPI551067             7.20                       +/+/+/+  +/+/+/+  +/+/+/+  +/+/+/+  +/-/-/+  -/-/-/+  10^3.3

(a) Determined by full HA gene sequencing/phylogenetic analysis.
(b) Determined in MDCK-SIAT1 cells.
(c) Corresponds to 10 FFU per RT-PCR (5 µl).


TABLE 3 H3 clades identified by pyrosequencing assays in respiratory specimens

H3 clade      No. of viruses(a)  CT value ± SD  CT range
3C.2a         12                 29.97 ± 3.36   25.09–34.06(b)
3C.3          4                  28.83 ± 4.42   25.36–36.20(b)
3C.3a         8                  28.65 ± 1.81   26.23–31.07
3C.3b         8                  30.42 ± 1.04   28.10–31.60
“Unknown”(c)  8                  35.96 ± 1.79   32.76–38.41

(a) Respiratory specimens were typed/subtyped using the CDC real-time RT-PCR diagnostic assay.
(b) Group has 1 respiratory specimen with CT of >34.
(c) The combination of identifiers was not found and the results were reported as “Unknown” by the FireComb program; CT values for most of these respiratory specimens were above the cutoff value (CT >34) established for the assay.

program, was created to infer the clade based on the combination of the three identifiers. FireComb ensures that the clade identifiers are in agreement. For example, if the result for region 1 was an identifier R1-3C.3/3C.3b-v1 but the region 2 result was R2-3C.2a-v1, the contradiction would be caught by FireComb. FireComb also flags a previously unseen combination of SNP identifiers as an indeterminate outcome (“unknown”). In such instances, retesting and additional investigation were required. In the subsequent experiments, both of the bioinformatics tools (IdentiFire and FireComb) were utilized for the analysis of pyrosequencing data.

Assay validation. The pyrosequencing data obtained from six reference viruses representing the clades of interest (see above) were analyzed using IdentiFire and FireComb. The efficiency and reproducibility of the pyrosequencing assay were determined by performing three analytical runs in an analyst-blinded fashion on two consecutive days. The efficiency (the number of samples yielding the predicted genotype divided by the total number of samples analyzed) was 100% (data not shown). To validate the robustness of the assay, we examined a set of 120 A(H3N2) virus isolates collected in the United States between October 2014 and December 2014. A total of five clades were identified, namely 3C.2a (n = 81), 3C.2b (n = 1), 3C.3 (n = 28), 3C.3a (n = 5), and 3C.3b (n = 5), and the results were in 100% agreement with the clades identified by the codon-complete HA gene sequencing (see Table S4).

Analytical sensitivity. Next, we investigated the limit of detection (LOD) of the pyrosequencing assay by testing dilutions of the six viruses from the validation panel (Table 2). The LOD was defined as the lowest infectious virus titer at which all replicates yielded pyrosequencing data and correct clade identification. The LOD for each of the 6 clades was 10 focus-forming units (FFU)/reaction, which corresponds to 10^3.3 FFU/ml (Table 2).
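The relationship between the per-reaction and per-milliliter limits of detection can be checked with a short calculation. The Python sketch below assumes the 5-µl RT-PCR input volume given in the Table 2 footnote and further assumes that the RNA extract carries the same virus-equivalent concentration as the starting material (that is, elution volume equal to sample volume and complete recovery). The function name `ffu_per_ml` is hypothetical.

```python
import math


def ffu_per_ml(ffu_per_reaction: float, input_volume_ul: float) -> float:
    """Convert FFU per reaction into FFU per milliliter of input material.

    Assumes the reaction input has the same concentration as the original
    sample (no concentration or dilution during extraction).
    """
    return ffu_per_reaction * 1000.0 / input_volume_ul


# 10 FFU in a 5-µl input corresponds to 2,000 FFU per milliliter.
lod_ffu_per_ml = ffu_per_ml(ffu_per_reaction=10, input_volume_ul=5)

if __name__ == "__main__":
    print(f"{lod_ffu_per_ml:.0f} FFU/ml = 10^{math.log10(lod_ffu_per_ml):.1f} FFU/ml")
```

Because log10(2,000) is approximately 3.3, the per-reaction and per-milliliter values quoted for the limit of detection are consistent with each other under these assumptions.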
This sensitivity indicated that the assay was suitable for testing viruses present in respiratory specimens.

Analysis of viruses in original respiratory specimens. To further evaluate the assay’s performance, we tested a set of 27 respiratory specimens for which H3 clades were determined using HA gene sequencing/phylogenetic analyses (see Table S5). All 27 specimens were correctly classified by this pyrosequencing strategy. Subsequently, we tested 40 A(H3N2) respiratory specimens with threshold cycle (CT) values of >25 that were collected between December 2014 and February 2015 (Table 3). The clade was identified for 30 respiratory specimens that had CT values equal to or less than 34 and for two specimens that had CT values higher than 34 (Fig. 3). Among the specimens that failed the test, seven had CT values higher than 34, while one had a lower value (32.76). Therefore, the CT value of 34 was established as the cutoff for inclusion of respiratory specimens for genetic classification using this pyrosequencing strategy.

Public health laboratory. The pyrosequencing technology is available to many U.S. public health laboratories conducting influenza virological surveillance. Therefore, to assess the robustness and readiness of the pyrosequencing-based H3 clade identification approach for adoption by other surveillance laboratories, the standard operating procedure was shared with the Wadsworth Center, NYSDOH. The CDC also provided the

FIG 3 Determination of CT value for original respiratory specimens. CT values were determined using the CDC real-time RT-PCR diagnostic assay (detailed description of the assay is available at http://www.accessdata.fda.gov/cdrh_docs/pdf8/k080570.pdf).

necessary primers, reference A(H3N2) viruses, and bioinformatics tools. Next, a blinded panel of A(H3N2) viruses, which contained 12 virus isolates and six clinical specimens, was provided for assessment. The results of the H3 clade identification for the test panel samples were 100% accurate (see Table S6). The complete project was accomplished in a short period of time. Therefore, the clade identification of influenza A(H3N2) viruses using the pyrosequencing-based approach can be readily adopted by laboratories proficient with pyrosequencing technology, and can thus be utilized to enhance antigenic drift monitoring.

DISCUSSION

Influenza is an important vaccine-preventable disease, and vaccine effectiveness is highest when its components are antigenically similar to predominantly circulating strains. Historically, the HI assay provided high-throughput antigenic characterization of influenza viruses. However, receptor binding changes in HA of recent A(H3N2) viruses have diminished the usefulness of the HI assay. This presents a challenge for virus surveillance as it relates to vaccine strain selection and underscores the importance of developing new high-throughput methods for monitoring seasonal A(H3N2) viruses. In recent years, nucleotide sequencing methods have improved rapidly through technological advances, such as various next generation sequencing (NGS) approaches, and the development of powerful bioinformatics tools. While sequencing coupled with phylogenetics provides information on the virus evolution, data from antigenic analysis are required for detecting the emergence of antigenic drift variants. Once genetic signatures of antigenic drift are identified, they can be used to determine the prevalence of drift genotypes and to infer antigenic properties (7). Over the years, a number of RT-PCR-based approaches have been developed for expediting influenza virus characterization.
For example, clade distribution of A(H3N2) viruses circulating between 1997 and 2006 was studied using RT-PCR/electrospray ionization mass spectrometry (ESI MS), an approach based on the identification of unique nucleotide compositions of less variable internal genes (25). Although powerful, this approach is limited by access to costly equipment and requires an external database (Abbott Molecular, Carlsbad, CA) for data analysis and interpretation. In contrast, pyrosequencing (Qiagen) has been utilized widely by surveillance laboratories in the United States and other countries, mainly for monitoring drug resistance among influenza viruses (26–30). Unlike drug resistance testing, which commonly requires the analysis of a single SNP, identification of six H3 clades presents a more challenging task. Upon interrogation of a large set of H3 sequences with known clade identities, three short information-rich regions were selected for assay design. By applying the pyrosequencing technology combined with two bioinformatics tools, IdentiFire (Qiagen) and FireComb (in-house), we achieved high-throughput and accurate clade determination of influenza A(H3N2) viruses collected globally during two consecutive influenza seasons between 2013 and 2015.

The assay entails the generation of a single RT-PCR amplicon that encompasses the three target regions of the HA gene, which are analyzed using three pyrosequencing reactions (one reaction per target region). The resulting sequence readouts are analyzed to determine a total of five SNPs necessary for clade discrimination. Two bioinformatics tools were used to facilitate data analysis, expedite testing, and reduce human error. The first of these bioinformatics tools, IdentiFire software, requires the use of a library of reference sequences (unique identifiers). Such a library was built to reflect all known genetic variations within each region. Notably, the library contained a large number of reference sequences, which is a reflection of the high genetic variability of HA due to natural evolution and to artifacts of virus culturing (19, 31). The result of the IdentiFire software was only accepted if the homology with the reference sequence was 100%. When the homology was lower and retesting did not resolve the issue, further investigation (i.e., complete HA sequencing and phylogenetic analysis) was carried out; the IdentiFire library was updated if a new sequence variation was detected. FireComb, the second of the bioinformatics tools, was used for expediting the process of combining and reconciling the results generated by IdentiFire. This software either makes the final clade determination or, if the combination of unique identifiers is new, flags the result as unknown. Of note, there were no misidentifications of clades (false-positive results) using the developed pyrosequencing approach. We demonstrated that the assay can readily be utilized by surveillance laboratories, thus enhancing the capabilities for monitoring antigenic drift variants. 
Not only did this approach meet the requirements of influenza surveillance (identification of all currently circulating H3 clades), but it was also successfully applied for testing thousands of viruses collected during the 2013 to 2014 and 2014 to 2015 seasons. Additionally, the assay proved useful in the expedient characterization of respiratory specimens collected for vaccine effectiveness studies during the 2014 to 2015 (32, 33) and 2015 to 2016 (unpublished) seasons. The attractiveness of this approach stems from the fast turnaround time, high throughput, relatively low cost, and availability of equipment and trained personnel at many public health laboratories. A 96-well format for both viral (v)RNA extraction and pyrosequencing (PyroMark Q96 ID) is conducive to high-throughput testing. Starting from the RT-PCR amplification step, the entire process requires about 6 h to analyze 96 samples. This is a substantial reduction in turnaround time compared to that for HA gene sequencing, using either Sanger or NGS technologies. Moreover, clade identification by pyrosequencing can be applied to specimens with high CT values that typically fail the complete HA gene amplification required for phylogenetic analysis. Despite the highlighted advantages, pyrosequencing is not a substitute for either antigenic characterization (e.g., the HI assay) or phylogenetic analysis. The latter methods provide detailed information and are vital for the successful design of the pyrosequencing approach described in this study. It is expected that the IdentiFire library and primers for pyrosequencing will require periodic updates to reflect HA changes acquired by continuously evolving influenza viruses. In conclusion, we developed a pyrosequencing-based assay for the rapid, reproducible, and unambiguous clade identification of recent A(H3N2) influenza viruses. This approach represents an important advance in the methodology for virological surveillance.
MATERIALS AND METHODS

Influenza virus isolates and respiratory specimens. We used A(H3N2) influenza virus isolates (n = 120) and respiratory specimens (throat swabs, nasal swabs and washes, nasopharyngeal swabs, and sputum specimens) (n = 27) that were collected between 1 October 2013 and 1 July 2015 in the United States and other countries and were submitted to the WHO Collaborating Center for Surveillance, Epidemiology and Control of Influenza at the CDC in Atlanta, Georgia, for virological surveillance. The detailed list of viruses is provided in Tables S1 and S2 in the supplemental material. Viruses were propagated in MDCK-SIAT1 cells or embryonated chicken eggs as described elsewhere (34). Samples were subtyped based on the CDC real-time RT-PCR diagnostic assay and/or HI assay (35). Infectious virus titers were determined using focus formation assays and MDCK-SIAT1 cells (36). The resulting virus titers were expressed as the number of focus-forming units (FFU) per milliliter. Virological surveillance is deemed a public health practice and, as such, is exempt from review by the CDC internal review board.

TABLE 4 RT-PCR and sequencing primers

Primer           Purpose              Sequence (5'–3')                     Target nucleotide region(a)  Amplicon size (bp)
H3-F370          RT-PCR forward       AGC TTC AAT TGG GCT GGA GTC          NA(b)                        276
H3-R645-biotin   RT-PCR reverse       Biotin-CG GGA TTA CAG CTT GTT GGC
H3clade-R1-F395  Sequencing region 1  AAA ACG GAA CAA GTT CT               412–435                      NA
H3clade-R2-F445  Sequencing region 2  AGT AGA TTA AAT TGG TTG AC           465–481                      NA
H3clade-R3-F545  Sequencing region 3  TTC ACC ACC CGG GTA                  560–571                      NA

(a) H3 HA numbering.
(b) NA, not applicable.

RNA isolation and HA gene fragment amplification by RT-PCR. Viral RNA (vRNA) was extracted as described previously (26). Briefly, 100 µl of sample underwent total nucleic acid extraction using the MagNA Pure 96 system (Roche Diagnostics, Basel, Switzerland). RNA was eluted in 50 µl (for original respiratory specimens) or 100 µl (for virus isolates) and stored at −30°C. A SuperScript III one-step RT-PCR system with Platinum Taq high fidelity enzyme (Invitrogen, Carlsbad, CA) was used for cDNA synthesis and HA fragment amplification using 5 µl of vRNA in a 50-µl reaction according to the manufacturer’s guidelines. The amplicons were analyzed by electrophoresis on 2% agarose E-gels (Invitrogen, Carlsbad, CA). Sequencing of the full-length HA gene was performed on isolates and/or original respiratory specimens using either conventional Sanger sequencing (Applied Biosystems, Foster City, CA/Grand Island, NY) or the next-generation sequencing (NGS) MiSeq system (Illumina, San Diego, CA) (37).

Design of the clade identification pyrosequencing assay. BioEdit software (38) was used for the alignment of the 975 HA sequences of A(H3N2) human influenza viruses that were collected in 48 countries and 40 U.S. states between October 2013 and November 2014 and were available in the GISAID database as of 1 December 2014. The H3 HA gene phylogeny was constructed with HA clade reference sequences using MEGA5 (39). The evolutionary distances were computed using the Tamura-Nei method, and the phylogeny was inferred using the neighbor-joining method with bootstrap analysis of 1,000 replicates; the HA clade for each sequence was assigned. Consensus amino acid sequences (80% threshold) and positional frequency tables for sequences within each of the six HA clades of interest were generated in BioEdit, and unique amino acids for each HA clade were identified.
The PSQ Assay Design software, version 1.0.6 (Qiagen, Valencia, CA), was used to design the primers for the reverse transcription and HA gene fragment amplification using RT-PCR and the primers for pyrosequencing reactions targeting nucleotide regions 412 to 431, 465 to 481, and 559 to 571. The specificity of the designed primers was tested by using the BLAST suite against sequences in the GenBank database (data not shown) (40). Several pairs of RT-PCR primers were evaluated to achieve the best results using serially diluted vRNA preparations. The primer pair demonstrating the highest sensitivity was used in the study (Table 4).

Pyrosequencing and data analysis. Pyrosequencing reactions were performed on a PyroMark Q96 ID instrument as described previously (26). The three sequencing reactions were conducted using the sequencing primers at a final concentration of 0.45 µM. Fifteen microliters of RT-PCR amplicon was used in each pyrosequencing reaction. The pyrosequencing reaction (SQA mode) was run using the nucleotide dispensation order (CATG)₁₁ for primer H3clade-R1-F395, (CATG)₇ for primer H3clade-R2-F445, and ATATCATCG(CATG)₄ for primer H3clade-R3-F545. A purpose-built library of reference nucleotide sequences (also known as identifiers) was constructed based on the full HA gene sequences. A unique identifier was assigned to each sequence variant; IdentiFire software identified each sequence readout by comparing it to identifiers in the purpose-built library (see Table S3). We developed an in-house program, FireComb (available upon request), to assist with the final identification of a clade. This program was written in Microsoft Excel and utilizes both Excel datasheets and Visual Basic for Applications (VBA). The VBA code searches an existing datasheet in the same Excel file for combinations of three identifiers, which are entered into the main Excel datasheet.
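The combination step performed by FireComb can be sketched in a few lines. The example below is written in Python rather than Excel/VBA and is not the authors' code; it only illustrates the lookup of a user-defined combination of three identifiers. Only the identifier naming pattern (for example, R1-3C.2a-v1) comes from the text. The specific region 2 and region 3 identifiers and the contents of the table are hypothetical.

```python
from typing import Dict, Tuple

# One identifier per pyrosequencing region, in the order region 1, 2, 3.
Combination = Tuple[str, str, str]

# Hypothetical, user-defined table of accepted identifier combinations.
# The region 2 and region 3 identifiers below are invented for illustration.
KNOWN_COMBINATIONS: Dict[Combination, str] = {
    ("R1-3C.2a-v1", "R2-3C.2a-v1", "R3-G561-v1"): "3C.2a",
    ("R1-3C.3/3C.3b-v1", "R2-3C.3/3C.3b-v1", "R3-G561-v1"): "3C.3",
    ("R1-3C.3/3C.3b-v1", "R2-3C.3/3C.3b-v1", "R3-A561-v1"): "3C.3b",
}


def infer_clade(region1: str, region2: str, region3: str) -> str:
    """Return the clade for an exact, previously defined identifier combination.

    Combinations that were never defined, including contradictory ones,
    are reported as "unknown" so that they can be retested or investigated.
    """
    return KNOWN_COMBINATIONS.get((region1, region2, region3), "unknown")


if __name__ == "__main__":
    print(infer_clade("R1-3C.2a-v1", "R2-3C.2a-v1", "R3-G561-v1"))
    # Region 1 points to 3C.3/3C.3b but region 2 points to 3C.2a: a contradiction.
    print(infer_clade("R1-3C.3/3C.3b-v1", "R2-3C.2a-v1", "R3-G561-v1"))
```

Using an exact-match table keeps the behavior conservative: a contradictory or previously unseen combination is never assigned to a clade by default.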
FireComb searches for a combination of identifiers that were previously defined by the user and reports the matching clade. If the exact combination is not found, the program reports the result as "unknown." The Visual Basic code used for the clade identifications is provided in the supplemental material (FireComb script and command line input).

Accession number(s). GISAID accession numbers for the HA genes of A/Puerto Rico/22/2014 (H3N2), A/Florida/11/2015 (H3N2), A/Utah/08/2014 (H3N2), A/Florida/37/2014 (H3N2), A/Oregon/09/2014 (H3N2), and A/Texas/42/2014 (H3N2) are EPI540034, EPI540034, EPI533499, EPI545749, EPI551035, and EPI551067, respectively.
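The FireComb lookup described under "Pyrosequencing and data analysis" amounts to an exact match on a combination of three identifiers, one per sequenced region. The sketch below is a Python approximation of that idea, not the authors' VBA script. The identifier strings and the table contents are hypothetical placeholders; the real identifiers come from the purpose-built library, and the real combinations are defined by the user in the Excel datasheet.

```python
UNKNOWN = "unknown"

# Hypothetical user-defined table: (region 1, region 2, region 3) identifiers -> clade.
# The identifier strings are placeholders, not entries from the study's library.
CLADE_BY_COMBINATION = {
    ("R1-id1", "R2-id1", "R3-id1"): "3C.2a",
    ("R1-id2", "R2-id1", "R3-id2"): "3C.3a",
    ("R1-id2", "R2-id2", "R3-id2"): "3C.3b",
}


def identify_clade(region1, region2, region3, table=CLADE_BY_COMBINATION):
    """Return the clade for an exact three-identifier match, else 'unknown'."""
    key = (region1.strip(), region2.strip(), region3.strip())
    return table.get(key, UNKNOWN)


def identify_batch(rows, table=CLADE_BY_COMBINATION):
    """Apply the lookup to (sample, id1, id2, id3) rows and return {sample: clade}."""
    return {sample: identify_clade(a, b, c, table) for sample, a, b, c in rows}


# Made-up sample rows, standing in for identifiers reported per sequencing reaction.
rows = [
    ("sample-01", "R1-id1", "R2-id1", "R3-id1"),
    ("sample-02", "R1-id2", "R2-id2", "R3-id2"),
    ("sample-03", "R1-id1", "R2-id2", "R3-id9"),  # combination not in the table
]
print(identify_batch(rows))
```

Requiring an exact match on all three identifiers, with no partial or nearest-match fallback, mirrors the behavior described above: a combination that the user has not defined is reported as "unknown" rather than assigned to the closest clade.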

SUPPLEMENTAL MATERIAL
Supplemental material for this article may be found at https://doi.org/10.1128/JCM.01840-16.

TEXT S1, PDF file, 0.04 MB.
TEXT S2, PDF file, 0.08 MB.
TEXT S3, PDF file, 0.07 MB.
TEXT S4, PDF file, 0.07 MB.
TEXT S5, PDF file, 0.03 MB.
TEXT S6, PDF file, 0.02 MB.
TEXT S7, PDF file, 0.09 MB.
TEXT S8, PDF file, 0.05 MB.
TEXT S9, PDF file, 0.04 MB.
TEXT S10, PDF file, 0.05 MB.

January 2017 Volume 55 Issue 1

Journal of Clinical Microbiology

ACKNOWLEDGMENTS
We thank the laboratories of the WHO GISRS for influenza virus submissions. We also greatly appreciate the valuable contributions to this project made by the members of the Virology, Surveillance and Diagnosis Branch, Influenza Division, Centers for Disease Control and Prevention. This work was supported by the Centers for Disease Control and Prevention. The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

REFERENCES
1. Centers for Disease Control and Prevention (CDC). 2013. Influenza activity—United States, 2012–13 season and composition of the 2013–14 influenza vaccine. MMWR Morb Mortal Wkly Rep 62:473–479.
2. Appiah GD, Blanton L, D'Mello T, Kniss K, Smith S, Mustaquim D, Steffens C, Dhara R, Cohen J, Chaves SS, Bresee J, Wallis T, Xu X, Abd Elal AI, Gubareva L, Wentworth DE, Katz J, Jernigan D, Brammer L, Centers for Disease Control and Prevention (CDC). 2015. Influenza activity—United States, 2014–15 season and composition of the 2015–16 influenza vaccine. MMWR Morb Mortal Wkly Rep 64:583–590.
3. Kawaoka Y, Krauss S, Webster RG. 1989. Avian-to-human transmission of the PB1 gene of influenza A viruses in the 1957 and 1968 pandemics. J Virol 63:4603–4608.
4. Doshi P. 2008. Trends in recorded influenza mortality: United States, 1900–2004. Am J Public Health 98:939–945. https://doi.org/10.2105/AJPH.2007.119933.
5. Wilson IA, Cox NJ. 1990. Structural basis of immune recognition of influenza virus hemagglutinin. Annu Rev Immunol 8:737–771. https://doi.org/10.1146/annurev.iy.08.040190.003513.
6. Gamblin SJ, Skehel JJ. 2010. Influenza hemagglutinin and neuraminidase membrane glycoproteins. J Biol Chem 285:28403–28409. https://doi.org/10.1074/jbc.R110.129809.
7. Bedford T, Suchard MA, Lemey P, Dudas G, Gregory V, Hay AJ, McCauley JW, Russell CA, Smith DJ, Rambaut A. 2014. Integrating influenza antigenic dynamics with molecular evolution. eLife 3:e01914. https://doi.org/10.7554/eLife.01914.
8. Smith DJ, Lapedes AS, de Jong JC, Bestebroer TM, Rimmelzwaan GF, Osterhaus AD, Fouchier RA. 2004. Mapping the antigenic and genetic evolution of influenza virus. Science 305:371–376. https://doi.org/10.1126/science.1097211.
9. Ito T, Suzuki Y, Mitnaul L, Vines A, Kida H, Kawaoka Y. 1997. Receptor specificity of influenza A viruses correlates with the agglutination of erythrocytes from different animal species. Virology 227:493–499. https://doi.org/10.1006/viro.1996.8323.
10. Lin YP, Xiong X, Wharton SA, Martin SR, Coombs PJ, Vachieri SG, Christodoulou E, Walker PA, Liu J, Skehel JJ, Gamblin SJ, Hay AJ, Daniels RS, McCauley JW. 2012. Evolution of the receptor binding properties of the influenza A(H3N2) hemagglutinin. Proc Natl Acad Sci U S A 109:21474–21479. https://doi.org/10.1073/pnas.1218841110.
11. Mohr PG, Deng YM, McKimm-Breschkin JL. 2015. The neuraminidases of MDCK grown human influenza A(H3N2) viruses isolated since 1994 can demonstrate receptor binding. Virol J 12:67. https://doi.org/10.1186/s12985-015-0295-3.
12. Zhu X, McBride R, Nycholat CM, Yu W, Paulson JC, Wilson IA. 2012. Influenza virus neuraminidases with reduced enzymatic activity that avidly bind sialic acid receptors. J Virol 86:13371–13383. https://doi.org/10.1128/JVI.01426-12.
13. Lin YP, Gregory V, Collins P, Kloess J, Wharton S, Cattle N, Lackenby A, Daniels R, Hay A. 2010. Neuraminidase receptor binding variants of human influenza A(H3N2) viruses resulting from substitution of aspartic acid 151 in the catalytic site: a role in virus attachment? J Virol 84:6769–6781. https://doi.org/10.1128/JVI.00458-10.
14. World Health Organization (WHO) Influenza Centre, London. 2012. Report prepared for the WHO annual consultation on the composition of influenza vaccine for the Southern Hemisphere 2013. WHO Influenza Centre, London, United Kingdom. https://www.crick.ac.uk/media/221897/interim_report_september_2012_2.pdf.
15. Barr IG, Russell C, Besselaar TG, Cox NJ, Daniels RS, Donis R, Engelhardt OG, Grohmann G, Itamura S, Kelso A, McCauley J, Odagiri T, Schultz-Cherry S, Shu Y, Smith D, Tashiro M, Wang D, Webby R, Xu X, Ye Z, Zhang W, Writing Committee of the World Health Organization Consultation on Northern Hemisphere Influenza Vaccine Composition for 2013–2014. 2014. WHO recommendations for the viruses used in the 2013-2014 Northern Hemisphere influenza vaccine: epidemiology, antigenic and genetic characteristics of influenza A(H1N1)pdm09, A(H3N2) and B influenza viruses collected from October 2012 to January 2013. Vaccine 32:4713–4725. https://doi.org/10.1016/j.vaccine.2014.02.014.
16. World Health Organization (WHO). 2013. Influenza A(H3N2) candidate vaccine viruses and potency testing reagents for vaccine development and production for the northern hemisphere 2013–14. http://www.who.int/influenza/vaccines/virus/candidates_reagents/summary_a_h3n2_cvv_nh1314.pdf?ua=1.
17. World Health Organization (WHO). 2014. Influenza A(H3N2) candidate vaccine viruses and potency testing reagents for development and production of vaccines for use in the northern hemisphere 2014–2015 influenza season. http://www.who.int/influenza/vaccines/virus/candidates_reagents/summary_a_h3n2_cvv_nh1415.pdf?ua=1. Accessed 6 June 2016.
18. Pebody RG, Warburton F, Ellis J, Andrews N, Thompson C, von Wissmann B, Green HK, Cottrell S, Johnston J, de Lusignan S, Moore C, Gunson R, Robertson C, McMenamin J, Zambon M. 2015. Low effectiveness of seasonal influenza vaccine in preventing laboratory-confirmed influenza in primary care in the United Kingdom: 2014/15 mid-season results. Euro Surveill 20:pii=21025. http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=21025.
19. Skowronski DM, Chambers C, Sabaiduc S, De Serres G, Dickinson JA, Winter AL, Drews SJ, Fonseca K, Charest H, Gubbay JB, Petric M, Krajden M, Kwindt TL, Martineau C, Eshaghi A, Bastien N, Li Y. 2015. Interim estimates of 2014/15 vaccine effectiveness against influenza A(H3N2) from Canada's Sentinel Physician Surveillance Network, January 2015. Euro Surveill 20:pii=21022. http://www.eurosurveillance.org/Viewarticle.aspx?ArticleId=21022.
20. Broberg E, Snacken R, Adlhoch C, Beaute J, Galinska M, Pereyaslov D, Brown C, Penttinen P, on behalf of the WHO European Region and the European Influenza Surveillance Network. 2015. Start of the 2014/15 influenza season in Europe: drifted influenza A(H3N2) viruses circulate as dominant subtype. Euro Surveill 20:pii=21023. http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=21023.
21. Chambers BS, Parkhouse K, Ross TM, Alby K, Hensley SE. 2015. Identification of hemagglutinin residues responsible for H3N2 antigenic drift during the 2014–2015 influenza season. Cell Rep 12:1–6. https://doi.org/10.1016/j.celrep.2015.06.005.
22. Elahi E, Ronaghi M. 2004. Pyrosequencing: a tool for DNA sequencing analysis. Methods Mol Biol 255:211–219. https://doi.org/10.1385/1-59259-752-1:211.
23. Novais RC, Thorstenson YR. 2011. The evolution of pyrosequencing for microbiology: from genes to genomes. J Microbiol Methods 86:1–7. https://doi.org/10.1016/j.mimet.2011.04.006.
24. Harrington CT, Lin EI, Olson MT, Eshleman JR. 2013. Fundamentals of pyrosequencing. Arch Pathol Lab Med 137:1296–1303. https://doi.org/10.5858/arpa.2012-0463-RA.
25. Deyde VM, Sampath R, Gubareva LV. 2011. RT-PCR/electrospray ionization mass spectrometry approach in detection and characterization of influenza viruses. Expert Rev Mol Diagn 11:41–52. https://doi.org/10.1586/erm.10.107.
26. Levine M, Sheu TG, Gubareva LV, Mishin VP. 2011. Detection of hemagglutinin variants of the pandemic influenza A (H1N1) 2009 virus by pyrosequencing. J Clin Microbiol 49:1307–1312. https://doi.org/10.1128/JCM.02424-10.
27. Deyde VM, Sheu TG, Trujillo AA, Okomo-Adhiambo M, Garten R, Klimov AI, Gubareva LV. 2010. Detection of molecular markers of drug resistance in 2009 pandemic influenza A (H1N1) viruses by pyrosequencing. Antimicrob Agents Chemother 54:1102–1110. https://doi.org/10.1128/AAC.01417-09.
28. Sheu TG, Deyde VM, Garten RJ, Klimov AI, Gubareva LV. 2010. Detection of antiviral resistance and genetic lineage markers in influenza B virus neuraminidase using pyrosequencing. Antiviral Res 85:354–360. https://doi.org/10.1016/j.antiviral.2009.10.022.
29. Tamura D, Okomo-Adhiambo M, Mishin VP, Guo Z, Xu X, Villanueva J, Fry AM, Stevens J, Gubareva LV. 2015. Application of a seven-target pyrosequencing assay to improve the detection of neuraminidase inhibitor-resistant influenza A(H3N2) viruses. Antimicrob Agents Chemother 59:2374–2379. https://doi.org/10.1128/AAC.04939-14.
30. Deng YM, Caldwell N, Barr IG. 2011. Rapid detection and subtyping of human influenza A viruses and reassortants by pyrosequencing. PLoS One 6:e23400. https://doi.org/10.1371/journal.pone.0023400.
31. Chambers C, Skowronski DM, Sabaiduc S, Murti M, Gustafson R, Pollock S, Hoyano D, Allison S, Krajden M. 2015. Detection of influenza A(H3N2) clade 3C.2a viruses in patients with suspected mumps in British Columbia, Canada, during the 2014/15 influenza season. Euro Surveill 20:pii=30015. http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=21239.
32. Bissielo A, Pierse N, Huang QS, Thompson MG, Kelly H, Mishin VP, Turner N, Shivers. 2016. Effectiveness of seasonal influenza vaccine in preventing influenza primary care visits and hospitalisation in Auckland, New Zealand in 2015: interim estimates. Euro Surveill 21:pii=30101. http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=21342.
33. Flannery B, Zimmerman RK, Gubareva LV, Garten RJ, Chung JR, Nowalk MP, Jackson ML, Jackson LA, Monto AS, Ohmit SE, Belongia EA, McLean HQ, Gaglani M, Piedra PA, Mishin VP, Chesnokov AP, Spencer S, Thaker SN, Barnes JR, Foust A, Sessions W, Xu X, Katz J, Fry AM. 2016. Enhanced genetic characterization of influenza A(H3N2) viruses and vaccine effectiveness by genetic group, 2014–2015. J Infect Dis 214:1010–1019. https://doi.org/10.1093/infdis/jiw181.
34. World Health Organization (WHO) Global Influenza Surveillance Network. 2011. Manual for the laboratory diagnosis and virological surveillance of influenza. http://whqlibdoc.who.int/publications/2011/9789241548090_eng.pdf.
35. Centers for Disease Control (CDC). 2008. 510(k) Summary for Centers for Disease Control and Prevention human influenza virus real-time RT-PCR detection and characterization panel. http://www.accessdata.fda.gov/cdrh_docs/pdf8/k080570.pdf.
36. Sleeman K, Mishin VP, Deyde VM, Furuta Y, Klimov AI, Gubareva LV. 2010. In vitro antiviral activity of favipiravir (T-705) against drug-resistant influenza and 2009 A(H1N1) viruses. Antimicrob Agents Chemother 54:2517–2524. https://doi.org/10.1128/AAC.01739-09.
37. Zhou B, Donnelly ME, Scholes DT, St George K, Hatta M, Kawaoka Y, Wentworth DE. 2009. Single-reaction genomic amplification accelerates sequencing and vaccine production for classical and swine origin human influenza A viruses. J Virol 83:10309–10313. https://doi.org/10.1128/JVI.01109-09.
38. Hall TA. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41:95–98.
39. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28:2731–2739. https://doi.org/10.1093/molbev/msr121.
40. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402. https://doi.org/10.1093/nar/25.17.3389.
