YJINF3660_proof ■ 30 January 2016 ■ 1/10 Journal of Infection (2016) xx, 1–10


www.elsevierhealth.com/journals/jinf


Genomic dissection of Australian Bordetella pertussis isolates from the 2008–2012 epidemic

Azadeh Safarchi a, Sophie Octavia a, Sunny Z. Wu a, Sandeep Kaur a, Vitali Sintchenko b,c, Gwendolyn L. Gilbert b,c, Nicholas Wood b, Peter McIntyre b, Helen Marshall d, Anthony D. Keil e, Ruiting Lan a,*

a School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
b Centre for Infectious Diseases and Microbiology – Public Health, Institute of Clinical Pathology and Medical Research, Pathology West, Westmead Hospital, New South Wales, Australia
c Marie Bashir Institute for Infectious Diseases and Biosecurity, The University of Sydney, New South Wales, Australia
d Vaccinology and Immunology Research Trials Unit, Women’s and Children’s Hospital and School of Medicine and Robinson Research Institute, University of Adelaide, South Australia, Australia
e Department of Microbiology, PathWest Laboratory Medicine WA, Princess Margaret Hospital for Children, Perth, Australia

Accepted 14 January 2016
Available online - - -

KEYWORDS Bordetella pertussis; Molecular epidemiology; Whole genome sequencing; Microevolution; Single nucleotide polymorphism

Summary Objectives: Despite high pertussis vaccination coverage, Australia experienced a prolonged epidemic in 2008–2012. The predominant Bordetella pertussis genotype harboured the pertussis toxin promoter allele ptxP3 and the pertactin gene allele prn2. The emergence and expansion of isolates that do not express Prn (Prn negative) were also observed. We aimed to investigate the microevolution and genomic diversity of epidemic B. pertussis isolates. Methods: We sequenced 22 B. pertussis isolates collected in 2008–2012 from two geographically widely separated states of Australia. Ten of the 22 were Prn negative isolates with three different modes of silencing of prn (prn::IS481F, prn::IS481R and prn::IS1002). Five pre-epidemic isolates were also sequenced for comparison. Results: Five single nucleotide polymorphisms were common to the epidemic isolates and differentiated them from pre-epidemic isolates. The Australian epidemic isolates can be divided into five lineages (EL1–EL5), with EL1 containing only Prn negative isolates. Comparison

* Corresponding author. Tel.: +61 2 9385 2095; fax: +61 2 9385 1483. E-mail address: [email protected] (R. Lan). http://dx.doi.org/10.1016/j.jinf.2016.01.005 0163-4453/© 2016 The British Infection Association. Published by Elsevier Ltd. All rights reserved. Please cite this article in press as: Safarchi A, et al., Genomic dissection of Australian Bordetella pertussis isolates from the 2008–2012 epidemic, J Infect (2016), http://dx.doi.org/10.1016/j.jinf.2016.01.005


with global isolates showed that three lineages remained geographically and temporally distinct, whereas two lineages mixed with isolates from the 2012 UK outbreak. Conclusion: Our results suggest significant diversification and microevolution of B. pertussis within the 2008–2012 Australian epidemic. © 2016 The British Infection Association. Published by Elsevier Ltd. All rights reserved.


Introduction

Bordetella pertussis, a small Gram-negative bacterium, is the causative agent of the respiratory infection pertussis. The infection can be life threatening in infants and young children. Introduction of whole cell vaccine (WCV) during the 1950s significantly reduced the morbidity and mortality of pertussis globally. However, due to the side effect profile of WCV, acellular vaccine (ACV) was developed in the 1980s.1 In Australia, ACV replaced the WCV initially as a booster in 1997 and then for all scheduled doses by 2000.2 Australia predominantly uses 3-component ACVs from GlaxoSmithKline containing detoxified pertussis toxin (Ptx), pertactin (Prn) and filamentous haemagglutinin (FHA). The 5-component ACV from Sanofi-Aventis, which additionally contains 2 types of fimbriae (Fim2 and Fim3), has also been used in Australia.3,4

Re-emergence of pertussis has been reported in many countries with high vaccination coverage, including the US, Canada, Japan, European countries and Australia.5–9 Despite high pertussis vaccine uptake in Australia (91–92%), pertussis incidence is still the highest amongst all vaccine-preventable diseases. The latest pertussis epidemic commenced in 2008 and reached its peak in 2011.10,11 In the state of New South Wales (NSW), the pertussis notification and hospitalisation rates were 2.7 and 3.9 times higher, respectively, than the previous five-year average. In addition, there was a significant increase in notification and hospitalisation rates for infants aged less than one year.10

Although it has been suggested that improvement in diagnostic laboratory techniques and increased awareness by general practitioners may explain the high rate of pertussis in developed countries, there is evidence that organism adaptation and antigenic drift may increase the incidence of pertussis in highly immunised populations.12–14 Strain variation and pathogen adaptation have been reported in different countries, as evidenced by polymorphisms in several virulence-associated genes and their promoters, including those encoding ACV components: the ptx genes and their promoter ptxP, prn, fhaB and fim.14–16 Recent studies in several countries have reported the emergence and increasing circulation of isolates that do not express Prn.17–22 Comparative genomics of pre-vaccination and modern B. pertussis strains showed that single nucleotide polymorphisms (SNPs) in important genes, including virulence-associated genes and regulatory genes, may have helped the organism to survive under vaccine selection pressure.23–25

Previously, SNP typing classified 208 Australian B. pertussis isolates collected since the 1970s into 30 SNP profiles (SPs), which were further grouped into five SNP clusters.26 An increase in prevalence of SNP cluster I was documented after the introduction of ACV.26 Australia experienced a prolonged outbreak from 2008 to 2012.

Typing of 194 B. pertussis isolates collected from the epidemic found three predominant SPs, with SP13, SP14 and SP16 representing 49%, 24% and 12% of the total isolates genotyped, respectively.26 During the epidemic, Prn negative isolates emerged in 2008 and increased to 78% by 2012.27

In this study, we used whole genome sequencing to investigate the genetic diversity of B. pertussis isolates associated with the Australian pertussis epidemic of 2008–2012 and to determine the evolutionary characteristics of Australian epidemic B. pertussis. We sequenced 22 SP13 B. pertussis isolates collected in 2008–2012 from two geographically widely separated states of Australia, NSW and Western Australia (WA). Ten of the 22 were Prn negative isolates, which allowed us to determine the origin and relationships of these Prn negative isolates.

Materials and methods

Bacterial strains

A total of 27 B. pertussis SP13 isolates were selected based on year and state of isolation and the inactivation mechanism of the prn gene. Of these, 22 were from the Australian 2008–2012 epidemic and are referred to henceforth as epidemic isolates. The remaining five were isolated prior to the 2008–2012 epidemic and are referred to as pre-epidemic isolates. Bacterial isolates were maintained on Bordet-Gengou agar (Oxoid) supplemented with 10% horse blood (Oxoid) at 37 °C for 3–5 days. Genomic DNA was extracted and purified from pure culture using the phenol-chloroform method as described previously.28

DNA sequencing and assembly

Paired-end DNA libraries with an insert size of 250 bp were constructed using the Nextera XT kit (Illumina) and sequenced on the MiSeq (Illumina). Genome sequencing was done in a multiplex of 24. De novo assembly was performed for all sequencing data using Velvet (version 1.2.08)29 to combine reads into contigs. These contigs were aligned to the reference B. pertussis strain Tohama I (GenBank accession number BX470248) using progressiveMauve (version 2.3.1).30 Tohama I was used as the reference to generate comparable data, as all previous genomic studies used it as a reference. Some studies have revealed genomic regions that are present in other B. pertussis strains but not in Tohama I.31,32 Therefore, the fully sequenced genome of B. pertussis strain CS33 was also used to investigate the genomic diversity within these regions of difference. The


raw reads were submitted to the GenBank database under BioProject No. PRJNA280658.

SNP identification

SNP detection was performed as described previously34 using a combination of the Burrows-Wheeler Aligner (BWA) (version 0.7.5),35 Samtools (version 0.1.19)36 and progressiveMauve. Short insertions/deletions (indels) of less than 100 bp were also identified using Samtools.

Insertion sequence elements analysis

The distribution of insertion sequence (IS) elements in the SP13 isolates was investigated using a custom script. Briefly, reads that contained partial matches to IS481 and IS1002 sequences were captured, and BLAST was then used to identify the insertion site and the direction of the IS. We sequenced the Tohama I genome to confirm the ability of this script to accurately locate the IS positions on the genome.
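The custom script itself is not described in further detail. The general idea of capturing junction reads and placing them on the reference can be sketched as follows. This is a toy illustration only: it swaps BLAST for exact string matching, and the sequences and function names (`junction_flank`, `locate_insertion`, `REFERENCE`, `IS_END`) are hypothetical rather than taken from the study.

```python
# Toy sketch of IS junction detection. Exact matching stands in for the
# partial-match capture and BLAST step used in the study.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_flank(read, is_end, min_flank=8):
    """Return the chromosomal sequence that follows the IS end in a read.

    Both strands of the read are checked. None is returned when the read
    does not cross the junction or the flank is shorter than min_flank.
    """
    for candidate in (read, reverse_complement(read)):
        idx = candidate.find(is_end)
        if idx != -1:
            flank = candidate[idx + len(is_end):]
            if len(flank) >= min_flank:
                return flank
    return None


def locate_insertion(reference, flank):
    """Place a flank on the reference.

    Returns (junction position, strand): '+' when the IS points in the same
    direction as the reference, '-' when it points the opposite way, or
    None when the flank is not found.
    """
    pos = reference.find(flank)
    if pos != -1:
        return pos, "+"
    rc = reverse_complement(flank)
    pos = reference.find(rc)
    if pos != -1:
        return pos + len(rc), "-"
    return None


# Hypothetical sequences, for illustration only.
REFERENCE = "TTGACCGTAGGCTAACGTTCAGGATCCGATTAGC"
IS_END = "GGGTCTAGA"

read = "AC" + IS_END + REFERENCE[16:28]
print(locate_insertion(REFERENCE, junction_flank(read, IS_END)))  # (16, '+')
```

A real implementation would need to tolerate sequencing errors and repeated flanks, which is presumably why an alignment tool was used for the placement step.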

Phylogenetic analysis

Phylogenetic analysis was conducted using MEGA (version 5).37 The minimum evolution algorithm was applied based on the Nucleotide Maximum Composite Likelihood analysis of all positions. Bootstrap analysis was based on 1000 replicates. B. pertussis strain Tohama I was used as the outgroup. The mutation rate and ancestral node date for the SP13 epidemic lineage were estimated using Bayesian analysis in the BEAST (version 1.7.5) package38 as described previously by Bart et al.14 Analyses were run using the variable sites within the SP13 lineage and the isolation dates expressed as the number of days before the present, under a GTR model of evolution, with all combinations of constant, expansion, logistic and extended skyline population size models and strict, relaxed exponential and relaxed lognormal clock models. For each combination, three independent Markov chains were run for 10 million generations each, with parameter values sampled every 1000 generations. Chains were manually checked for reasonable effective sample size values and for convergence between the three replicate chains using Tracer (version 1.5). Tracer was also used to identify a suitable burn-in period to remove from the beginning of each chain and to identify the model that best fit the data using Bayes factors. The strict clock and extended Bayesian skyline population stepwise models were found to be the most appropriate, so this combination of models was used for further analyses. Chains were combined and down-sampled to every 10,000 generations using LogCombiner with 10% of burn-in removed. A maximum clade credibility tree was computed with TreeAnnotator, which is included with the BEAST package.
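The size of the model comparison and of the combined posterior sample follows from the settings above. The sketch below works through that bookkeeping. It assumes that state 0 is logged and that both ends of the retained range are inclusive; LogCombiner may treat the boundaries slightly differently, so the per-chain count should be read as approximate. All names in the code are illustrative.

```python
from itertools import product

# Models named in the text.
POPULATION_MODELS = ["constant", "expansion", "logistic", "extended skyline"]
CLOCK_MODELS = ["strict", "relaxed exponential", "relaxed lognormal"]

CHAIN_LENGTH = 10_000_000  # generations per chain
SAMPLE_EVERY = 1_000       # sampling interval during the run
RESAMPLE_EVERY = 10_000    # down-sampling interval when combining
BURN_IN_PERCENT = 10       # burn-in removed before combining
REPLICATE_CHAINS = 3


def retained_states(chain_length, sample_every, resample_every, burn_in_percent):
    """List the states kept from one chain after burn-in and down-sampling.

    Assumption: state 0 is logged and both boundaries are inclusive.
    """
    burn_in = chain_length * burn_in_percent // 100
    logged = range(0, chain_length + 1, sample_every)
    return [s for s in logged if s >= burn_in and s % resample_every == 0]


combinations = list(product(POPULATION_MODELS, CLOCK_MODELS))
per_chain = len(
    retained_states(CHAIN_LENGTH, SAMPLE_EVERY, RESAMPLE_EVERY, BURN_IN_PERCENT)
)

print(len(combinations))                     # 12 model combinations
print(len(combinations) * REPLICATE_CHAINS)  # 36 chains across the comparison
print(per_chain)                             # 901 states kept per chain
print(per_chain * REPLICATE_CHAINS)          # 2703 states in the combined sample
```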

Results and discussion

Selection and sequencing of epidemic isolates

In this study, whole genome sequencing was used to investigate the microevolution and genomic diversity of 22 epidemic B. pertussis isolates. The isolates were collected from two geographically separated states of Australia: NSW, the most populous eastern state, and WA, the western state. All isolates belonged to SP13, the predominant SNP profile causing the epidemic. We selected 12 isolates from NSW and 10 from WA, isolated during 2008–2012, with at least two isolates per state per year. Ten of these were Prn negative with different mechanisms of inactivation: one prn::IS481F, seven prn::IS481R and two prn::IS1002 disruptions, where F and R denote insertion of IS481 into prn in the forward and reverse orientation, respectively, relative to the IS-encoded tnpA gene. Strains inactivated by IS481R were predominant. Additionally, five Australian pre-epidemic SP13 isolates from 1997 to 2002 were selected for comparison. The use of pre-epidemic SP13 isolates facilitates identification of genetic changes present in all epidemic strains. The average number of reads generated per genome was 2,274,658. The average coverage depth of the contigs for all genomes was 48.35, with the lowest coverage of 33 for strain L1423. The percentage of reads matching the reference strain Tohama I chromosome ranged from 96.23% to 98.93% using Qualimap (version 2).39 The number of contigs ranged from 640 to 3184, with 999 on average (Supplementary Table 1).

Significant polymorphisms in SP13 strains

Two approaches, de novo assembly and read mapping to the reference Tohama I genome, were used for SNP calling. A total of 305 SNPs (mutation in one or more of the 27 SP13 isolates) were detected, 44 of which (14.14%) were located in intergenic (IG) regions (Supplementary Table 2). More than half of the SNPs (188 or 62%) were common to all SP13 isolates. These SNPs were also observed in other studies14,16,40 and will not be discussed further. There were 117 SNPs variably present in the SP13 isolates, and only five were shared by all epidemic SP13 isolates (Table 1). Of the five SNPs present in all epidemic SP13 isolates, two were located in IG regions while the remaining three were located in genes. The two IG SNPs were located between BP0137, encoding a transposase for IS481, and BP0138, a pseudogene. Thus, they were likely to be neutral. Two of the three genic SNPs were non-synonymous, one in a virulence-associated gene (BP0216, sphB1) and the other in a gene encoding a transport binding protein (BP3570), while the third genic SNP was synonymous and in a regulatory gene (BP0814). The significance of these two non-synonymous SNPs will be discussed in a separate section below.

Of the 117 SNPs, 102 were located within genes, 54 of which were non-synonymous. Fourteen of the 102 genic SNPs (13.72%) were in genes regulated by the bvg regulon (7 activated and 7 repressed),6 with four being non-synonymous (Supplementary Table 2). Non-synonymous SNPs in virulence-associated genes may reflect host selection pressure driving their evolution. This finding is consistent with the study by Seeley et al., which sequenced 100 UK B. pertussis isolates collected from 1920 to 2012 and showed that virulence genes have a higher substitution rate.16 Higher numbers of SNPs in virulence genes have also been observed in Finnish, Chinese and other isolates globally.14,40
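Several of the counts in this section are derived from one another, and the relationships can be checked directly. The short sketch below uses only numbers quoted in the text; the variable names and the `percent` helper are illustrative. The count of variable SNPs outside genes is derived here as 117 minus 102 and is not a figure stated by the authors.

```python
# Counts quoted in the text.
TOTAL_SNPS = 305
COMMON_TO_ALL_SP13 = 188
GENIC_VARIABLE = 102
NONSYNONYMOUS_VARIABLE = 54
BVG_REGULATED = 14


def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100 * part / whole, 1)


variable = TOTAL_SNPS - COMMON_TO_ALL_SP13
non_genic_variable = variable - GENIC_VARIABLE  # derived, not quoted in the text

print(variable)                                         # 117
print(non_genic_variable)                               # 15
print(percent(COMMON_TO_ALL_SP13, TOTAL_SNPS))          # 61.6
print(percent(NONSYNONYMOUS_VARIABLE, GENIC_VARIABLE))  # 52.9
print(percent(BVG_REGULATED, GENIC_VARIABLE))           # 13.7
```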


Table 1 Single nucleotide polymorphisms unique to epidemic lineages.

Columns: Lineage; Position in Tohama I; Locus; Gene; Product; Category.

Positions in Tohama I, by lineage:
2008–2012 epidemic: 136138, 136140, 223961, 838886, 3783560
EL1: 1162706, 2244138, 3758055
EL2: 259371, 1288344, 1358703, 3060100
EL3: 1692984, 2657330
EL4: 2570056, 2570058, 2681216, 3347400, 3485064, 3555632, 3927872
EL5: 2191410, 2514582, 3388793

Loci: BP0216, BP0814, BP3570, BP1110, BP2120, BP3546, BP0250, BP1226, BP1285, BP1610, BP2509, BP2427, BP2427, BP2528A, BP3267, BP3333, BP3718, BP2072, BP3177

Genes: sphB1, sphB3, gor, livJ, dapB, apaG, pykA

Products: Autotransporter subtilisin-like protease; Probable LysR-family transcriptional regulator; 30S ribosomal protein S8; Serine protease; Glutathione reductase; Putative branched-chain amino acid transport system protein; Tripartite tricarboxylate transporter family receptor; Putative exported protein (pseudogene); Leu/Ile/Val-binding protein; Putative autotransporter (pseudogene); Dihydrodipicolinate reductase; Threonine synthase (pseudogene); Putative exported protein (pseudogene); Conserved hypothetical protein; Pyruvate kinase; Branched-chain amino acid transport ATP-binding protein; Putative lipoprotein; Putative methylaspartate ammonia-lyase

Categories: Virulence-associated genes; Regulation; Ribosome constituents; Virulence-associated genes; Central/intermediary metabolism; Transport/binding proteins; Cell surface; Pseudogenes (phase-variable); Receptor family ligand binding; Pseudogenes; Amino acid biosynthesis; Pseudogenes; Pseudogenes; Conserved hypothetical; Energy metabolism; Transport/binding proteins; Cell surface; Energy metabolism

Additionally, 12 SNPs were located in putative promoter regions, which may affect the transcription of downstream genes (Supplementary Table 2). We compared the SNP densities of genes from different functional categories, but no category was found to be statistically different from the chromosomal average (data not shown). We also mapped the sequence reads to regions that are absent in Tohama I, using the Chinese CS strain as reference.33 We found only one SNP, located in BPTD_0392, which encodes a hypothetical protein. The SNP was variably present among the epidemic isolates.

Phylogenetic relationships of the Australian SP13 isolates

The SNPs identified were used to generate a minimum evolution tree to illustrate the genetic relationships of the 22 epidemic SP13 isolates and the five pre-epidemic SP13 isolates (Fig. 1). The genome sequence of B. pertussis strain Tohama I was used as an outgroup to root the tree. All Australian 2008–2012 epidemic isolates grouped together with a single origin and were clearly differentiated from pre-epidemic isolates by five SNPs (Table 1). The genomic tree revealed five epidemic lineages (EL), although the SNP differences between the lineages were very small, with two to seven SNPs supporting a lineage. EL1, EL2, EL3, EL4 and EL5 were supported by three, four, two, seven and three SNPs, respectively. Three lineages (EL1 to EL3) contained isolates from both NSW and WA, indicating interstate spread of these lineages during the epidemic and highlighting the rapid spread of this respiratory pathogen. Two lineages contained isolates from one state only: EL4 was a NSW cluster with five isolates from 2008 to 2012, while EL5 was a WA lineage with two isolates from 2008 to 2009. However, as the sample numbers were small, EL4 and EL5 may also have spread to other states. Temporal clustering was also evident. One pair of isolates (L1493 and L1507) collected in 2011 from NSW was identical. More isolates need to be sequenced to obtain a better picture of the spatiotemporal clustering and the relative frequency of the different epidemic lineages. Alternatively, the lineage-specific SNPs detected can be used as markers for SNP typing to determine the spread and expansion of the epidemic lineages.

In the 2008–2012 epidemic, Prn negative B. pertussis strains emerged at the start of the epidemic and increased to nearly 80% by 2012. Multiple mechanisms of Prn non-expression were found, with the insertion of IS481 as the main mechanism of gene disruption.27 Therefore, this study analysed the genomes of 10 Prn negative strains including


Figure 1 Phylogenetic relationships of 27 Australian Bordetella pertussis SP13 isolates. The minimum evolution tree was constructed based on 305 single nucleotide polymorphisms (SNPs). The numbers on the internal and terminal branches correspond to the number of SNPs supporting each branch. Epidemic strains grouped into 5 epidemic lineages (EL1 to EL5). The distribution of insertion sequence (IS) elements is also displayed, with a black box indicating the presence of an IS. Pertactin negative isolates with different modes of inactivation are coloured.

Tip labels (isolate, state, year, Prn status): […]481R); L1779 (WA 2012 prn::IS481R); L1780 (WA 2012 prn::IS481R); L1770 (WA 2011 Prn+); L1421 (NSW 2010 prn::IS1002); L1658 (NSW 2012 prn::IS1002); L1397 (WA 2010 Prn+); L1376 (WA 2010 Prn+); L1380 (WA 2008 Prn+); L1382 (WA 2009 Prn+); L1216 (NSW 2009 Prn+); L1037 (NSW 2008 Prn+); L1419 (NSW 2009 Prn+); L1663 (NSW 2012 prn::IS481F); L1423 (NSW 2010 Prn+); L1214 (NSW 2009 Prn+); L1042 (NSW 2008 Prn+); L1391 (WA 2008 Prn+); L1361 (WA 2009 prn::IS481R). Pre-epidemic: L475 (2000 Prn+); L462 (1999 Prn+); L482 (2001 Prn+); L524 (1997 Prn+); L490 (2002 Prn+). Outgroup: Bordetella pertussis strain Tohama I.

IS insertion sites shown as columns: BP2907; BP3440; BP2764; BP2839-IG-BP2840; BP2609; BP2497; IG-BP2468; IG-BP2327; BP1912; BP1987; BP1560-IG-BP1561; BP1399; BP1442-IG-BP1443; BP1395; BP1209; BP1054; BP0983; BP0985; BP0976-IG; BP0935; BP0551-IG-BP0552; BP0764-IG-BP0765; BP0344-IG-BP0345; BP0326; BP0276.

seven prn::IS481R, one prn::IS481F and two prn::IS1002 isolates. Phylogenetic analysis showed that the 10 Prn negative isolates were distributed into four ELs (EL1, EL2, EL4 and EL5), showing that Prn negative strains arose independently at different time points. However, EL1 contained six of the seven Prn negative isolates with IS481R insertion as the mode of inactivation. These isolates were from both NSW and WA and from 2011 to 2012, suggesting a single Prn negative strain spreading over the two years across both states (Fig. 1). The results confirmed that Prn negative strains can arise multiple times from different lineages. Considering that Prn negative strains have been reported in many countries17,18,22,27,41–43 and have been demonstrated to be fitter under ACV selection pressure,41,44 there are likely to be many independent IS-mediated prn inactivations, displaying the remarkable role IS elements have played in the adaptive evolution of B. pertussis. Further investigation with a larger number of Prn negative strains would be needed to determine the extent of independent inactivations and any global expansion of the Prn negative strains.

The date of the most recent common ancestor (MRCA) of the epidemic SP13 isolates was estimated to be around late 1999 (95% confidence interval of late 1996 to mid 2004), giving a few years for B. pertussis to diversify and spread geographically before contributing to an epidemic. A prolonged epidemic allowed mutations to accumulate, yet sequencing of isolates from different years revealed that the number of SNPs accumulated was quite small. Using the core genome size of 3,485,846 bp,45 the mutation rate was estimated to be 3.81 × 10⁻⁷ substitutions/site/year (95% CI 2.28 × 10⁻⁷ to 5.46 × 10⁻⁷). Bart et al.14 previously found the mutation rate of B. pertussis to be 2.24 × 10⁻⁷ per site per year, which equates to about 1 SNP per genome per year. Our estimate was therefore about 1.7 times (3.81/2.24) that derived from the global B. pertussis data. As seen from Fig. 1, most of the closest related isolates from different years differed by one or two SNPs, which was consistent with the mutation rate estimate. However, it is interesting to note that the study by Xu et al.40 reports that the mutation rate may vary among different B. pertussis populations. They estimated the rate in Finnish and Dutch B. pertussis populations (1.59 × 10⁻⁵, 95% CI 3.98 × 10⁻⁶ to 2.87 × 10⁻⁵) to be higher than that in the Chinese population (3.06 × 10⁻⁶, 95% CI 1.25 × 10⁻⁶ to 5.18 × 10⁻⁶).
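The per-site rates can be converted into expected SNPs per genome per year by multiplying by the genome size, which makes the comparison with the one or two SNPs separating closely related isolates easier to see. The sketch below does this with the core genome size quoted in the text. Applying the same core genome size to the rate of Bart et al. is an assumption made here for comparison only, and the function and constant names are illustrative.

```python
CORE_GENOME_BP = 3_485_846      # core genome size used in this study
RATE_THIS_STUDY = 3.81e-7       # substitutions/site/year
RATE_CI = (2.28e-7, 5.46e-7)    # 95% interval quoted in the text
RATE_BART = 2.24e-7             # global estimate from Bart et al.


def snps_per_genome_per_year(rate, genome_bp=CORE_GENOME_BP):
    """Expected substitutions per genome per year for a per-site rate."""
    return rate * genome_bp


print(round(snps_per_genome_per_year(RATE_THIS_STUDY), 2))            # 1.33
print([round(snps_per_genome_per_year(r), 2) for r in RATE_CI])       # [0.79, 1.9]
print(round(snps_per_genome_per_year(RATE_BART), 2))                  # 0.78
print(round(RATE_THIS_STUDY / RATE_BART, 2))                          # 1.7
```

On these figures, isolates collected one or two years apart would be expected to differ by roughly one to three SNPs.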

The relationship of Australian SP13 isolates with contemporary global isolates

We extended the phylogenetic analysis to establish the position of our epidemic lineages in the global picture by incorporating ptxP3 isolates that have been sequenced in three separate studies, including 75 isolates from the global genome study,14 96 isolates from the 2012 UK outbreak study16 and seven isolates from the Chinese/Finnish study,40 giving a total of 205 isolates together with the 27 sequenced here (Fig. 2). Unfortunately, Prn expression data were not available for these genomes


Figure 2 Phylogenetic relationships of Australian epidemic isolates and global ptxP3 isolates. Australian epidemic isolates and pre-epidemic isolates are indicated by red closed circles and red open circles, respectively. The UK 2012 outbreak (blue closed circle) and pre-2012 (blue open circle) isolates are also indicated. The major SNP changes discussed in the text are indicated on the branches. a. The three SNPs uniquely found in EL1, including two non-synonymous SNPs (gor, SNP position 2244138, and BP3546, SNP position 3788055); b. The five SNPs found in the Australian SP13 epidemic lineage, including two non-synonymous SNPs (sphB1, SNP position 223961, and BP3570, SNP position 3783560); c. Non-synonymous SNP in sphB1 (SNP position 224066) found in 21 isolates mainly of UK origin but not in the Australian SP13 epidemic lineage; d. Non-synonymous SNP in BP3570 (SNP position 3784687) only found in two Swedish isolates.

so we cannot draw a conclusion on the overall global diversity of Prn negative strains. We used the raw reads from these studies to obtain the SNPs and inferred the phylogenetic tree using the minimum evolution method. Globally there is no evidence of geographic restriction of B. pertussis, as ptxP3 strains have spread across the world.14 However, the timeframe and the speed of the spread are unknown. The UK 2012 outbreak isolates were divided into two tight clusters, with the larger cluster carrying the fim3-1 allele and the smaller cluster carrying the fim3-2 allele. All Australian 2008–2012 epidemic isolates grouped together with the larger UK outbreak cluster. Three ELs (EL1, EL2 and EL5) remained as distinctive lineages within the large cluster. EL3 contained two UK 2008


isolates while EL4 isolates were dispersed among UK isolates. The data suggest rapid global dissemination of pertussis strains. The UK and Australian data also showed that strains causing epidemics can be heterogeneous, with two major clusters in the UK 2012 outbreak. We have previously shown that the Australian 2008–2012 epidemic was caused by several SNP profiles, with SP13 being predominant, and only a selected number of SP13 isolates were used in this study. The heterogeneity could result from strains circulating locally over the years and from continuous importation of strains from different countries.
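The suggestion made earlier, that lineage-specific SNPs could serve as typing markers, can be illustrated with a small sketch. It assumes that an isolate is represented by the set of Tohama I positions at which it carries the derived allele, and that the positions listed in Table 1 fall into lineages in the order and numbers given in the text (five epidemic SNPs, then three, four, two, seven and three for EL1 to EL5). The function `assign_lineage`, its return labels and the example isolates are hypothetical.

```python
# Marker positions from Table 1, grouped by lineage under the assumption
# stated above.
EPIDEMIC_MARKERS = {136138, 136140, 223961, 838886, 3783560}
LINEAGE_MARKERS = {
    "EL1": {1162706, 2244138, 3758055},
    "EL2": {259371, 1288344, 1358703, 3060100},
    "EL3": {1692984, 2657330},
    "EL4": {2570056, 2570058, 2681216, 3347400, 3485064, 3555632, 3927872},
    "EL5": {2191410, 2514582, 3388793},
}


def assign_lineage(derived_positions):
    """Assign an isolate to an epidemic lineage from its derived SNP positions.

    An isolate must carry all five epidemic markers to be called epidemic,
    and all markers of exactly one lineage to be assigned to that lineage.
    """
    derived = set(derived_positions)
    if not EPIDEMIC_MARKERS <= derived:
        return "not in epidemic lineage"
    matches = [name for name, markers in LINEAGE_MARKERS.items() if markers <= derived]
    if len(matches) == 1:
        return matches[0]
    return "epidemic, lineage unresolved"


# Hypothetical isolates for illustration.
print(assign_lineage(EPIDEMIC_MARKERS | LINEAGE_MARKERS["EL1"]))  # EL1
print(assign_lineage({136138, 136140}))                           # not in epidemic lineage
```

Requiring every marker of a lineage is deliberately strict; a practical typing scheme would more likely use one or two robust markers per lineage.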

Potential role of adaptive SNPs in epidemic SP13 B. pertussis isolates The non-synonymous SNP present in BP0216 (sphB1) and in BP3570 in all epidemic SP13 isolates and one nonsynonymous SNP in the gor gene (BP2120) present only in EL1 isolates may be adaptive. Both non-synonymous SNPs found in BP0216 (sphB1) and in BP3570 in all SP13 isolates were also found in 59 other global isolates. All of these isolates including the epidemic SP13 isolates formed a separate lineage (Fig. 2), suggesting that these SNPs arose only once and no convergent evolution has occurred. The non-synonymous SNP (genome position 223961) in BP0216 (sphB1) found in all epidemic isolates is of particular interest as it may affect the maturation of Fha. sphB1 is a virulence associated gene positively regulated by BvgAS46 and encodes autotransporter SphB1, a 109 kDa exported protein, belonging to the superfamily of subtilisin-like serine proteases.47 Unlike other proteases which mainly proteolyse host proteins, its function is proteolysis maturation of Fha, the major adhesion in B. pertussis.48,49 Secretion of 230 kDa Fha needs maturation of 367 kDa precursors and removal of its w130 kDa C-terminal domain which is done by SphB1.49 It has been shown that the release of Fha depends on its maturation and SphB1 is required for its normal maturation.50 In SphB1deficient mutants, the ability of colonisation in mouse respiratory tract was significantly affected.49 SphB1 is a 931 amino acid linear protein consisting of five major regions (IeV) and two main domains. The subtilisin elike activity of the protein located in the region III with a putative Asp184-His221-Ser412 catalytic triad.48 The SNP, found in sphB1, changed amino acid 121 from valine to isoleucine which is located before catalytic site of the protein. Although the change was conservative with both being hydrophobic amino acids, it may affect the function of the protein as it is located just outside region III. 
This non-synonymous SNP was one of the five found in sphB1 by Bart et al. [14]; the other four include one in another set of ptxP3 strains and three in subsets of ptxP1 strains. The second non-synonymous SNP (at genome position 224066) in the ptxP3 strains is present in a cluster of 21 isolates mainly of UK origin, including 14 UK 2012 outbreak isolates, one 2006 Finnish isolate and one 2007 Australian isolate. These recurrent changes suggest that sphB1 is under selection pressure to change.

The other non-synonymous SNP found in all epidemic isolates was located in BP3570, which encodes a branched-chain amino acid ATP-binding cassette (ABC) transporter. ABC transporters are relatively specific for their own particular substrate and can transfer small or large molecules with different hydrophobicity [51]. The BP3570 product is a cytoplasmic membrane transport/binding protein and consists of two major domains, which are similar to two subfamilies of ABC transporters in other bacteria such as LivM (N-terminal) and LivG (C-terminal) in Escherichia coli [52,53]. The SNP changed tyrosine into another polar amino acid, serine, and may affect the function of the protein as it is located in the first domain, which is involved in branched-chain amino acid transport activity. Analysis of global ptxP3 isolates revealed that there was another non-synonymous SNP, which changed an amino acid from aspartic acid to asparagine. However, this SNP was found in only two isolates from Sweden, which were grouped together on the phylogenetic tree (Fig. 2).

A non-synonymous SNP in the gor gene (BP2120) unique to lineage EL1 may also be adaptive. The SNP has not been found in any other isolates globally. It resulted in an amino acid change from tyrosine, an aromatic polar hydrophobic amino acid, to histidine, a positively charged polar hydrophilic amino acid. gor encodes glutathione reductase, which reversibly catalyses the reduction of glutathione disulfide to two molecules of glutathione. It has been shown that a high concentration of reduced glutathione in the respiratory tract lining fluid acts as an important defence against oxidative damage [54]. Stenson et al. studied the influence of cysteine-containing compounds on transcription, assembly and secretion of Ptx in B. pertussis and found that, in their in vivo study, Ptx secretion was promoted efficiently by reduced glutathione [55]. There is a possible indirect interplay between glutathione reductase and Ptx secretion. Therefore, the changes in these three genes could have broader implications for B. pertussis evolution.
Recent studies have focused on adaptive changes in the genes encoding vaccine antigens [44,56,57] that may have been driven by vaccine selection pressure. Adaptive changes that may indirectly affect vaccine antigen genes have not been explored, and further studies will be required to understand the biological significance of the changes observed in this study.
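The amino acid replacements discussed in this section (valine to isoleucine, tyrosine to serine, aspartic acid to asparagine and tyrosine to histidine) are each reachable by a single nucleotide substitution under the standard genetic code. The following Python sketch is an illustration only and is not part of the authors' pipeline: it enumerates the codon pairs that differ at exactly one position for a given replacement. The codons actually present in the isolates are not given in the text, and the function name single_step_changes is hypothetical.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def single_step_changes(aa_from, aa_to):
    """Return (codon_from, codon_to, codon_position) for every single-base
    substitution that turns a codon for aa_from into a codon for aa_to.
    Amino acids are given as one-letter codes; positions are 1-based."""
    changes = []
    for codon, aa in CODON_TABLE.items():
        if aa != aa_from:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa_to:
                    changes.append((codon, mutant, pos + 1))
    return changes


if __name__ == "__main__":
    # Replacements named in the text: V->I (SphB1), Y->S, D->N, Y->H (gor).
    for aa_from, aa_to in [("V", "I"), ("Y", "S"), ("D", "N"), ("Y", "H")]:
        print(aa_from, "->", aa_to, single_step_changes(aa_from, aa_to))
```

Each of the four replacements yields at least one single-base route, which is what would be expected for changes reported as SNPs; the sketch says nothing about which route occurred in any isolate.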

No specific small indels are associated with epidemic lineages

A total of 43 indels were found in SP13 isolates; 21 were common to all 27 isolates (Supplementary Table 3). Thirty-one indels were located in genes, including 14 in pseudogenes and five in bvg-activated genes. There were 25 frameshifts, of which 15 resulted from single base pair (bp) indels, six from 2 bp indels and one each from 4, 7, 8 and 31 bp indels. Five annotated pseudogenes, BP0880, BP2000, BP2738, BP2899 and BP3762, appeared to produce a complete coding sequence, suggesting reversion to functional genes. These changes were common to all SP13 isolates, except for BP2738. In contrast, four genes including BP2232, BP2595, BP2928, BP2946 and BP3465 had a stop codon resulting in proteins that were >20% shorter than expected and were therefore considered pseudogenes [58]. Six indels, located in BP0967 (cysU), BP1624 (kpsT), BP2141, BP1186, BP3258 and BP3580, were in multiples


of three bases and thus were not likely to affect the reading frame. Twelve indels were located in IG regions, some of them in promoters, including those of the fim2 and fim3 genes, where indels are known to affect expression of these genes [59,60]. Thirteen indels were located in homopolymeric tracts in genic regions and are likely to be subject to phase variation, including one in bapC, which is known to be phase variable. Two indels, one each in BP0880 and BP2946, were previously reported to be present in ptxP3 strains [24]. These shared indels are likely to have arisen only once. None of the indels reported by Sealey et al. [16] to be specific to the UK ptxP3 strains was observed in our isolates.

Overall, no indels were specific to the 2008–2012 epidemic strains. There were two, two, one and three indels specific to lineages EL1 to EL4, respectively. EL5 had no lineage-specific indel. The two indels specific to EL1 were located in BP1186 and BP2018, the former encoding a putative cytoplasmic protein, aldolase, and the latter encoding a transposase. The indel in BP1186 was a deletion of 3 bp and did not disrupt the reading frame. Of the two indels specific to EL2, one was located in the IG region between BP1052 and BP1053, while the other was located in BP2738 (bapC), a pseudogene which, if intact, would encode an outer membrane protein. The 2 bp indel in bapC removed the early stop codon and the open reading frame appeared to be intact, suggesting a possible reversion of the pseudogene. bapC is phase variable [61] and it has recently been shown that BapC is an adhesion factor [62]. EL2 consisted of two Prn negative strains, and the adhesion role of BapC may become more important without Prn. The other indel, located in the poly-C region of bapC, was found in only one isolate (L1423).
The only EL3-specific indel was located in the promoter region of BP1418, which encodes methionine aminopeptidase, and the three EL4-specific indels were all located in IG regions, including the promoters of BP0191 and BP1568 (fim3) and the IG region upstream of BP2086. Considering that some of the genes disrupted by indels are either activated by the bvg regulon or encode cell surface proteins, it is likely that these changes affect the virulence of the strains.
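The distinction drawn above between frameshifting indels and indels in multiples of three bases follows from simple modular arithmetic on the indel length. The Python sketch below is an illustration rather than the authors' method: it applies that rule to the length breakdown reported in the text. The names is_frameshift and reported_frameshift_lengths are hypothetical, and the rule assumes the indel lies within a coding sequence.

```python
from collections import Counter


def is_frameshift(indel_length_bp):
    """An indel inside a coding sequence shifts the reading frame unless
    its length is a multiple of three."""
    return indel_length_bp % 3 != 0


# Lengths of the 25 frameshifting indels as broken down in the text:
# fifteen of 1 bp, six of 2 bp and one each of 4, 7, 8 and 31 bp.
reported_frameshift_lengths = [1] * 15 + [2] * 6 + [4, 7, 8, 31]

# Count how many of the reported lengths shift the frame under the rule above.
frameshift_count = sum(is_frameshift(n) for n in reported_frameshift_lengths)
length_breakdown = Counter(reported_frameshift_lengths)

if __name__ == "__main__":
    print(frameshift_count)        # lengths that are not multiples of three
    print(is_frameshift(3))        # the 3 bp deletion in BP1186
    print(dict(length_breakdown))  # number of indels per length
```

Under this rule every length in the reported breakdown shifts the frame, whereas a 3 bp deletion such as the one in BP1186 does not, in agreement with the description in the text.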


Micro-evolution of B. pertussis by gene loss

Several studies have investigated gene loss in B. pertussis [24,45,66–69]. Currently circulating ptxP3 strains appear to have lost a large 24 kb region (BP1948–BP1966) compared with non-ptxP3 strains. The contigs of the SP13 isolates were compared with the reference B. pertussis strain Tohama I using progressiveMauve [30] to identify deletions in our epidemic strains. Three loci known as regions of difference (RD), BP0910A–BP0937 (RD3), BP1134–BP1141 (RD5) and BP1947–BP1968 (RD10), were deleted in all SP13 isolates; these deletions have been reported by others, with RD10 deleted in all ptxP3 strains [65,68]. There were no unique gene losses in the epidemic SP13 isolates.

Conclusion

Our comparative genomic analysis of the Australian 2008–2012 epidemic SP13 strains provided new insights into the evolution of epidemic B. pertussis in Australia. Our findings indicate that small changes in the genome, including SNPs, may have contributed to the expansion of the epidemic clone in Australia, with five SNPs unique to the epidemic SP13 isolates. We also found small indels leading to pseudogenisation as well as reversion of pseudogenes to functional genes, and parallel IS insertions. The epidemic SP13 isolates can be divided into five lineages (EL1–EL5). EL4 and EL5 were found only in NSW and WA, respectively, while the remaining three lineages were found in both states. All but one of the Prn negative isolates with IS481R disruption belonged to EL1, consistent with a single origin followed by spread to both states. The data also showed that the same mechanism of prn inactivation by IS481 insertion can occur independently. Comparison with recent global isolates showed that three lineages remained distinct while two lineages were mixed with isolates from the 2012 UK outbreak. This heterogeneity could result from continuous circulation of local strains over the years together with importation of strains from other countries. Our results shed light on the microevolution of the 2008–2012 Australian epidemic B. pertussis.

Insertion sequence elements

IS elements play an important role in the genome evolution of B. pertussis [63–65]. There are more than 200 copies of IS elements in the genome. Parkhill et al. reported 238 copies of IS481, six of IS1002 and 17 of IS1663 in the Tohama I genome [66]. A total of 25 IS insertions that were absent from the Tohama I genome were detected in one or more SP13 isolates, six of which were common to all SP13 isolates (Fig. 1). There was no unique IS insertion site common to the SP13 epidemic strains. Thirteen IS insertions were unique to a single strain. All new IS insertion sites except those in the prn gene were due to IS481R (Fig. 1). Most of the isolates with the same IS insertion were not grouped together, suggesting that either the site is a hotspot for IS insertion or disruption of the gene is advantageous. However, a common insertion in the gene BP2327 was found in all four strains of EL4 (Fig. 1).

Conflicts of interest

All authors declare no conflicts of interest.

Statement of funding

The funding for this project was provided by the National Health and Medical Research Council of Australia (No. 1011942).

Acknowledgements

This study was supported by the National Health and Medical Research Council (NHMRC) of Australia. Azadeh Safarchi was supported by an international postgraduate research award at UNSW. Helen Marshall, Nicholas Wood and Vitali Sintchenko were supported by NHMRC Career Development Fellowships.


Appendix A. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.jinf.2016.01.005.

References


1. Mattoo S, Cherry JD. Molecular pathogenesis, epidemiology, and clinical manifestations of respiratory infections due to Bordetella pertussis and other Bordetella subspecies. Clin Microbiol Rev 2005;18:326–82.
2. Octavia S, Sintchenko V, Gilbert GL, Lawrence A, Keil AD, Hogg G, et al. Newly emerging clones of Bordetella pertussis carrying prn2 and ptxP3 alleles implicated in Australian pertussis epidemic in 2008–2010. J Infect Dis 2012;205:1220–4.
3. Kurniawan J, Maharjan RP, Chan WF, Reeves PR, Sintchenko V, Gilbert GL, et al. Bordetella pertussis clones identified by multilocus variable-number tandem-repeat analysis. Emerg Infect Dis 2010;16:297–300.
4. Quinn HE, Snelling TL, Macartney KK, McIntyre PB. Duration of protection after first dose of acellular pertussis vaccine in infants. Pediatrics 2014;133:e513–9.
5. Poynten M, McIntyre PB, Mooi FR, Heuvelman KJ, Gilbert GL. Temporal trends in circulating Bordetella pertussis strains in Australia. Epidemiol Infect 2004;132:185–93.
6. Kallonen T, He Q. Bordetella pertussis strain variation and evolution postvaccination. Expert Rev Vaccines 2009;8:863–75.
7. Elomaa A, Advani A, Donnelly D, Antila M, Mertsola J, He Q, et al. Population dynamics of Bordetella pertussis in Finland and Sweden, neighbouring countries with different vaccination histories. Vaccine 2007;25:918–26.
8. Mooi FR. Bordetella pertussis and vaccination: the persistence of a genetically monomorphic pathogen. Infect Genet Evol 2010;10:36–49.
9. Tan T, Dalby T, Forsyth K, Halperin SA, Heininger U, Hozbor D, et al. Pertussis across the globe: recent epidemiologic trends from 2000–2013. Pediatr Infect Dis J 2015.
10. Spokes PJ, Quinn HE, McAnulty JM. Review of the 2008–2009 pertussis epidemic in NSW: notifications and hospitalisations. N S W Public Health Bull 2010;21:167–73.
11. Pillsbury A, Quinn HE, McIntyre PB. Australian vaccine preventable disease epidemiological review series: pertussis, 2006–2012. Commun Dis Intell Q Rep 2014;38:E179–94.
12. Mooi FR, van Loo IH, King AJ. Adaptation of Bordetella pertussis to vaccination: a cause for its reemergence? Emerg Infect Dis 2001;7:526–8.
13. Wood N, McIntyre P. Pertussis: review of epidemiology, diagnosis, management and prevention. Paediatr Respir Rev 2008;9:201–11. quiz 11–12.
14. Bart MJ, Harris SR, Advani A, Arakawa Y, Bottero D, Bouchez V, et al. Global population structure and evolution of Bordetella pertussis and their relationship with vaccination. MBio 2014;5.
15. Mooi FR, NA VDM, De Melker HE. Pertussis resurgence: waning immunity and pathogen adaptation – two sides of the same coin. Epidemiol Infect 2013:1–10.
16. Sealey KL, Harris SR, Fry NK, Hurst LD, Gorringe AR, Parkhill J, et al. Genomic analysis of isolates from the United Kingdom 2012 pertussis outbreak reveals that vaccine antigen genes are unusually fast evolving. J Infect Dis 2015;212:294–301.
17. Bouchez V, Brun D, Cantinelli T, Dore G, Njamkepo E, Guiso N. First report and detailed characterization of B. pertussis isolates not expressing Pertussis Toxin or Pertactin. Vaccine 2009;27:6034–41.
18. Otsuka N, Han HJ, Toyoizumi-Ajisaka H, Nakamura Y, Arakawa Y, Shibayama K, et al. Prevalence and genetic characterization of pertactin-deficient Bordetella pertussis in Japan. PLoS One 2012;7:e31985.
19. Queenan AM, Cassiday PK, Evangelista A. Pertactin-negative variants of Bordetella pertussis in the United States. N Engl J Med 2013;368:583–4.
20. Zeddeman A, van Gent M, Heuvelman CJ, van der Heide HG, Bart MJ, Advani A, et al. Investigations into the emergence of pertactin-deficient Bordetella pertussis isolates in six European countries, 1996 to 2012. Euro Surveill 2014;19.
21. Quinlan T, Musser KA, Currenti SA, Zansky SM, Halse TA. Pertactin-negative variants of Bordetella pertussis in New York State: a retrospective analysis, 2004–2013. Mol Cell Probes 2014;28:138–40.
22. Barkoff AM, Mertsola J, Guillot S, Guiso N, Berbers G, He Q. Appearance of Bordetella pertussis strains not expressing the vaccine antigen pertactin in Finland. Clin Vaccine Immunol 2012;19:1703–4.
23. Maharjan RP, Gu C, Reeves PR, Sintchenko V, Gilbert GL, Lan R. Genome-wide analysis of single nucleotide polymorphisms in Bordetella pertussis using comparative genomic sequencing. Res Microbiol 2008;159:602–8.
24. Bart MJ, van Gent M, van der Heide HG, Boekhorst J, Hermans P, Parkhill J, et al. Comparative genomics of prevaccination and modern Bordetella pertussis strains. BMC Genom 2010;11:627.
25. van Gent M, Bart MJ, van der Heide HG, Heuvelman KJ, Mooi FR. Small mutations in Bordetella pertussis are associated with selective sweeps. PLoS One 2012;7:e46407.
26. Octavia S, Maharjan RP, Sintchenko V, Stevenson G, Reeves PR, Gilbert GL, et al. Insight into evolution of Bordetella pertussis from comparative genomic analysis: evidence of vaccine-driven selection. Mol Biol Evol 2011;28:707–15.
27. Lam C, Octavia S, Ricafort L, Sintchenko V, Gilbert GL, Wood N, et al. Rapid increase in pertactin-deficient Bordetella pertussis isolates, Australia. Emerg Infect Dis 2014;20:626–33.
28. Octavia S, Lan R. Frequent recombination and low level of clonality within Salmonella enterica subspecies I. Microbiology 2006;152:1099–108.
29. Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 2008;18:821–9.
30. Darling AE, Mau B, Perna NT. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 2010;5.
31. Caro V, Bouchez V, Guiso N. Is the sequenced Bordetella pertussis strain Tohama I representative of the species? J Clin Microbiol 2008;46:2125–8.
32. Park J, Zhang Y, Buboltz AM, Zhang X, Schuster SC, Ahuja U, et al. Comparative genomics of the classical Bordetella subspecies: the evolution and exchange of virulence-associated diversity amongst closely related pathogens. BMC Genom 2012;13:545.
33. Zhang S, Xu Y, Zhou Z, Wang S, Yang R, Wang J, et al. Complete genome sequence of Bordetella pertussis CS, a Chinese pertussis vaccine strain. J Bacteriol 2011;193:4017–8.
34. Pang S, Octavia S, Feng L, Liu B, Reeves PR, Lan R, et al. Genomic diversity and adaptation of Salmonella enterica serovar Typhimurium from analysis of six genomes of different phage types. BMC Genom 2013;14:718.
35. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009;25:1754–60.
36. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics 2009;25:2078–9.


37. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 2011;28:2731–9.
38. Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol 2012;29:1969–73.
39. Garcia-Alcalde F, Okonechnikov K, Carbonell J, Cruz LM, Gotz S, Tarazona S, et al. Qualimap: evaluating next-generation sequencing alignment data. Bioinformatics 2012;28:2678–9.
40. Xu Y, Liu B, Grondahl-Yli-Hannuksila K, Tan Y, Feng L, Kallonen T, et al. Whole-genome sequencing reveals the effect of vaccination on the evolution of Bordetella pertussis. Sci Rep 2015;5:12888.
41. Hegerle N, Dore G, Guiso N. Pertactin deficient Bordetella pertussis present a better fitness in mice immunized with an acellular pertussis vaccine. Vaccine 2014;32:6597–600.
42. Martin SW, Pawloski L, Williams M, Weening K, DeBolt C, Qin X, et al. Pertactin-negative Bordetella pertussis strains: evidence for a possible selective advantage. Clin Infect Dis 2015;60:223–7.
43. Pawloski LC, Queenan AM, Cassiday PK, Lynch AS, Harrison MJ, Shang W, et al. Prevalence and molecular characterization of pertactin-deficient Bordetella pertussis in the United States. Clin Vaccine Immunol 2014;21:119–25.
44. Safarchi A, Octavia S, Luu LD, Tay CY, Sintchenko V, Wood N, et al. Pertactin negative Bordetella pertussis demonstrates higher fitness under vaccine selection pressure in a mixed infection model. Vaccine 2015;33:6277–81.
45. King AJ, van Gorkom T, van der Heide HGJ, Advani A, van der Lee S. Changes in the genomic content of circulating Bordetella pertussis strains isolated from the Netherlands, Sweden, Japan and Australia: adaptive evolution or drift? BMC Genom 2010;11:64.
46. Antoine R, Alonso S, Raze D, Coutte L, Lesjean S, Willery E, et al. New virulence-activated and virulence-repressed genes identified by systematic gene inactivation and generation of transcriptional fusions in Bordetella pertussis. J Bacteriol 2000;182:5902–5.
47. Siezen RJ, Leunissen JA. Subtilases: the superfamily of subtilisin-like serine proteases. Protein Sci 1997;6:501–23.
48. Coutte L, Antoine R, Drobecq H, Locht C, Jacob-Dubuisson F. Subtilisin-like autotransporter serves as maturation protease in a bacterial secretion pathway. EMBO J 2001;20:5040–8.
49. Coutte L, Alonso S, Reveneau N, Willery E, Quatannens B, Locht C, et al. Role of adhesin release for mucosal colonization by a bacterial pathogen. J Exp Med 2003;197:735–42.
50. Mazar J, Cotter PA. Topology and maturation of filamentous haemagglutinin suggest a new model for two-partner secretion. Mol Microbiol 2006;62:641–54.
51. Higgins CF. ABC transporters: physiology, structure and mechanism – an overview. Res Microbiol 2001;152:205–10.
52. Saurin W, Koster W, Dassa E. Bacterial binding protein-dependent permeases – characterization of distinctive signatures for functionally related integral cytoplasmic membrane proteins. Mol Microbiol 1994;12:993–1004.
53. Linton KJ, Higgins CF. The Escherichia coli ATP-binding cassette (ABC) proteins. Mol Microbiol 1998;28:5–13.
54. Kelly FJ. Gluthathione: in defence of the lung. Food Chem Toxicol 1999;37:963–6.
55. Stenson TH, Patton AK, Weiss AA. Reduced glutathione is required for pertussis toxin secretion by Bordetella pertussis. Infect Immun 2003;71:1316–20.
56. Komatsu E, Yamaguchi F, Abe A, Weiss AA, Watanabe M. Synergic effect of genotype changes in pertussis toxin and pertactin on adaptation to an acellular pertussis vaccine in the murine intranasal challenge model. Clin Vaccine Immunol 2010;17:807–12.
57. van Gent M, van Loo IH, Heuvelman KJ, de Neeling AJ, Teunis P, Mooi FR. Studies on Prn variation in the mouse model and comparison with epidemiological data. PLoS One 2011;6:e18014.
58. Lerat E, Ochman H. Recognizing the pseudogenes in bacterial genomes. Nucleic Acids Res 2005;33:3125–32.
59. Willems R, Paul A, van der Heide HG, ter Avest AR, Mooi FR. Fimbrial phase variation in Bordetella pertussis: a novel mechanism for transcriptional regulation. EMBO J 1990;9:2803–9.
60. Riboli B, Pedroni P, Cuzzoni A, Grandi G, de Ferra F. Expression of Bordetella pertussis fimbrial (fim) genes in Bordetella bronchiseptica: fimX is expressed at a low level and vir-regulated. Microb Pathog 1991;10:393–403.
61. Gogol EB, Cummings CA, Burns RC, Relman DA. Phase variation and microevolution at homopolymeric tracts in Bordetella pertussis. BMC Genom 2007;8:122.
62. Bokhari H, Bilal I, Zafar S. BapC autotransporter protein of Bordetella pertussis is an adhesion factor. J Basic Microbiol 2012;52:390–6.
63. Brinig MM, Cummings CA, Sanden GN, Stefanelli P, Lawrence A, Relman DA. Significant gene order and expression differences in Bordetella pertussis despite limited gene content variation. J Bacteriol 2006;188:2375–82.
64. Caro V, Hot D, Guigon G, Hubans C, Arrive M, Soubigou G, et al. Temporal analysis of French Bordetella pertussis isolates by comparative whole-genome hybridization. Microbes Infect 2006;8:2228–35.
65. Heikkinen E, Kallonen T, Saarinen L, Sara R, King AJ, Mooi FR, et al. Comparative genomics of Bordetella pertussis reveals progressive gene loss in Finnish strains. PLoS One 2007;2:e904.
66. Parkhill J, Sebaihia M, Preston A, Murphy LD, Thomson N, Harris DE, et al. Comparative analysis of the genome sequences of Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica. Nat Genet 2003;35:32–40.
67. Bouchez V, Caro V, Levillain E, Guigon G, Guiso N. Genomic content of Bordetella pertussis clinical isolates circulating in areas of intensive children vaccination. PLoS One 2008;3:e2437.
68. King AJ, van Gorkom T, Pennings JLA, van der Heide HGJ, He Q, Diavatopoulos D, et al. Comparative genomic profiling of Dutch clinical Bordetella pertussis isolates using DNA microarrays: identification of genes absent from epidemic strains. BMC Genom 2008;9:311.
69. Lam C, Octavia S, Sintchenko V, Gilbert GL, Lan R. Investigating genome reduction of Bordetella pertussis using a multiplex PCR-based reverse line blot assay (mPCR/RLB). BMC Res Notes 2014;7:727.
