ARTICLE Received 4 Sep 2014 | Accepted 6 Feb 2015 | Published 27 Mar 2015

DOI: 10.1038/ncomms7560

OPEN

Capturing the cloud of diversity reveals complexity and heterogeneity of MRSA carriage, infection and transmission

Gavin K. Paterson1,*,w, Ewan M. Harrison1,*, Gemma G.R. Murray2, John J. Welch2, James H. Warland1, Matthew T.G. Holden3,w, Fiona J.E. Morgan1, Xiaoliang Ba1, Gerrit Koop4, Simon R. Harris3, Duncan J. Maskell1, Sharon J. Peacock3,5, Michael E. Herrtage1, Julian Parkhill3 & Mark A. Holmes1

Genome sequencing is revolutionizing clinical microbiology and our understanding of infectious diseases. Previous studies have largely relied on the sequencing of a single isolate from each individual. However, it is not clear what degree of bacterial diversity exists within, and is transmitted between individuals. Understanding this ‘cloud of diversity’ is key to accurate identification of transmission pathways. Here, we report the deep sequencing of methicillin-resistant Staphylococcus aureus among staff and animal patients involved in a transmission network at a veterinary hospital. We demonstrate considerable within-host diversity and that within-host diversity may rise and fall over time. Isolates from invasive disease contained multiple mutations in the same genes, including inactivation of a global regulator of virulence and changes in phage copy number. This study highlights the need for sequencing of multiple isolates from individuals to gain an accurate picture of transmission networks and to further understand the basis of pathogenesis.

1 Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK. 2 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK. 3 Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK. 4 Department of Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht 3584 CL, The Netherlands. 5 Department of Clinical Medicine, University of Cambridge, Cambridge CB2 0QQ, UK. * These authors contributed equally to this work. w Present addresses: School of Biological, Biomedical and Environmental Sciences, University of Hull, Cottingham Road, Hull, UK (G.K.P.); School of Medicine, University of St Andrews, Fife, UK (M.T.G.H.). Correspondence and requests for materials should be addressed to M.A.H. (email: [email protected]).

NATURE COMMUNICATIONS | 6:6560 | DOI: 10.1038/ncomms7560 | www.nature.com/naturecommunications

& 2015 Macmillan Publishers Limited. All rights reserved.


The use of rapid whole-genome sequencing in clinical microbiology is showing considerable potential for the diagnosis, characterization and surveillance of pathogens1–3. The power of genome sequencing to identify outbreaks and transmission events and to track the source of bacterial pathogens has been exemplified by a number of studies, and sequencing will become an invaluable tool to inform hospital infection control, particularly in view of the increasing problems posed by multidrug-resistant pathogens4–7. However, studies to date have largely relied upon sequencing of single colonies from individual hosts, while recent data indicate that within-host populations of bacterial pathogens may be heterogeneous. Recent studies sequencing multiple individual isolates from Staphylococcus aureus carriers have shown that individual colonies that differ by up to 40 single-nucleotide polymorphisms (SNPs) can be isolated from a single colonized individual5,8,9. Currently, little data are available about how this within-host ‘cloud of diversity’ fluctuates temporally in newly or long-term colonized or infected individuals. Importantly, no data are available regarding the degree of bacterial diversity that is transferred from one individual to another in a transmission event. To further understand the cloud of diversity that exists within colonized individuals and during transmission, we undertook deep sequencing of a methicillin-resistant S. aureus (MRSA) ‘outbreak’ at a veterinary hospital involving both staff and animal patients. We show that there is considerable within-host diversity during carriage and infection, which may rise and fall over time, and can include multiple genotypes. Our data also provide new insights into the degree of diversity transmitted between individuals. Finally, we highlight the need for sequencing of multiple isolates for the accurate determination of transmission networks.

Results

Investigation of MRSA transmission in a veterinary hospital. The index case was a 4-year-old German Shepherd dog admitted to a veterinary hospital with suspected toxic epidermal necrolysis manifesting in an open abdominal wound. A wound swab on admission produced a positive culture for Pasteurella multocida, which was fully susceptible to all antibiotics tested. Five days after admission, a second wound swab grew MRSA. The animal’s condition deteriorated and it died 11 days after admission. A post mortem was performed, confirming the cause of death as S. aureus sepsis. Additional swabs of the index case taken at 8 days and samples taken at post mortem were all positive for MRSA. To identify the potential source of infection and other transmission events, we screened the hospital staff and animal patients for MRSA. A total of 97 members of staff and 158 animal patients from the veterinary hospital were screened for MRSA carriage. Any MRSA-positive staff members were re-swabbed at intervals to identify persistent carriers (Fig. 1). Seven staff members (Staff A–G) and three more animal patients (Dogs 30, 150 and 158) were

Figure 1 | Time line of the presence and swabbing dates of staff members and animal patients in the veterinary hospital. Each circle represents an individual swab, the size of the circle is proportional to the number of colonies recovered after culture and the colours represent the relative proportions of a particular clade in the colonies assessed by whole-genome sequencing. Legend: MRSA ST22 clades 1, 2 and 3, and ST772; horizontal axis, day since the index case was admitted (days 0 to 71); rows, Staff A to G, the index dog and Dogs 30, 150 and 158.


positive for MRSA (prevalence of 7.2% and 6.8%, respectively). Six of the seven MRSA-positive staff members (Staff A, B, C, D, F and G) were involved in the direct care of the index case, whereas all three of the MRSA-positive animals first entered the hospital after the death of the index case (Fig. 1). Dense sampling of both staff and animals, during the initial case and in the time immediately afterwards, therefore presented the opportunity to investigate the bacterial diversity present within, and potentially transferred between, colonized hosts.

Genomic characterization of MRSA populations. Whole-genome sequencing was carried out on 20 separate colonies (or all the colonies grown if the total was less than 20) from each MRSA-positive swab from the staff and animals (Fig. 1). Multilocus sequence types extracted from the genome sequences identified the presence of two different sequence types: ST22 (EMRSA15)10 and ST772 (Bengal Bay clone)11,12. However, ST772 was only found in a single member of staff, whereas ST22 was isolated from the index case, five members of staff (staff A, B, C, D and F) and three animal patients (dogs 30, 150 and 158; Fig. 1). A phylogenetic tree constructed using SNPs present in the core genome identified three distinct clades of ST22 circulating in the veterinary hospital (clades 1, 2 and 3), separated by >180 SNPs from each other (Figs 1 and 2). The majority (141 of 143) of the isolates from the index case were clade 1, suggesting that organisms from this clade were causing the pathology. A swab taken from the axilla in the index case was also positive for two clade 2 isolates (Fig. 1). Co-colonization with isolates from both clades 1 and 2 was also seen in three staff members (staff A (samples: A1 and A3), staff B (B1) and staff F (F1) in Fig. 1).
In staff member A, the relative proportions of the two populations fluctuated over the three sample points taken over 57 days (% proportions of clade 1:clade 2 at day 12 = 45:55, at day 22 = 100:0 and at day 69 = 35:65; Fig. 1). Staff member F’s first swab also produced colonies from both clade 1 and 2 (clade 1:clade 2 = 15:85), although their second and third swabs only produced isolates from clade 2 (staff F2 and F3 in Fig. 1).

Temporal changes in within-host diversity in persistent MRSA carriers. We analysed the MRSA populations in the three staff members who were persistent carriers to define changes in diversity over time. The 34 clade 1 isolates from staff member A were differentiated by a total of 22 SNPs, 4 deletions (two intergenic) and 1 insertion (Figs 3a and 4). Diversity was observed across the three time points, with both unique and identical isolates (defined henceforth as the same genome-type) in the second and third swabs compared with the first (Fig. 3a). The 24 clade 2 isolates from staff member A were differentiated by 11 SNPs and two intergenic deletions (Fig. 3b). The 11 isolates from swab one were identical except for one SNP variant, but the 13 isolates from swab two split into two distinct sub-clades (Fig. 3b). The second distinct sub-clade of seven isolates shared a common ancestor with the first clade and was differentiated by five SNPs. The clade 2 isolates from staff member F were differentiated by a total of 19 SNPs and a single intergenic deletion, with changes in diversity observed over time. The 16 isolates from swab one (Staff F1), taken on day 14, varied by 16 SNPs and a single intergenic insertion (Figs 3c and 5a). The second swab (Staff F2), taken on day 19, produced 16 isolates, 14 of which were identical to isolates from the first swab, whereas the isolates from swab three (Staff F3), taken on day 69, were more homogeneous, with 16 of identical genome-type and three isolates each differing by only a single SNP (Fig. 3c and Supplementary Fig. 2). The predominant genome-type from swab three was detected at

Figure 2 | Phylogenetic relationship of three distinct clades of ST22 isolates from the veterinary hospital in comparison to a global collection of ST22 isolates13,39. Isolates are coloured according to country of origin. UK = red, Germany = dark green, Australia = dark blue, Portugal = purple, Denmark = light blue, Czech Republic = light green, Sweden = brown, Nigeria = maroon, Hungary = yellow, Singapore = yellow and New Zealand = pink.

lower frequency in the two previous swabs, indicating a shift over time in the predominant colonizing population. Staff member D was the only carrier of clade 3 isolates, which were more diverse than the other two clades combined (84 SNPs, 7 intergenic deletions, 6 deletions, 4 intergenic insertions and 1 insertion; Figs 3d and 5b). The higher diversity was confirmed by tests of pairwise diversity and Watterson’s theta (Supplementary Fig. 2). One explanation for this higher diversity would be a higher mutation rate in clade 3 compared with clades 1 and 2, but this was not shown experimentally (Supplementary Table 1). Furthermore, maximum a posteriori estimates from the genome sequence data revealed that isolates from clade 3 had a mutation rate of 8.1 × 10⁻⁶ substitutions per nucleotide site per year (95% Bayesian credible intervals: 4.6 × 10⁻⁶ to 1.3 × 10⁻⁵), comparable to a previous measurement for the ST22 population: 1.3 × 10⁻⁶ substitutions per nucleotide site per year (95% Bayesian credible intervals: 1.2 × 10⁻⁶ to 1.4 × 10⁻⁶)13. An estimate for time to most recent common ancestor was 179 days (95% highest posterior density (HPD): 177–288 days). The majority of the isolates (14 of 18) in the third swab were part of a distinct clade not seen in the first two swabs, although directly descended from populations that were (Fig. 3d). All 14 isolates in this clade had an 8.5-kb deletion between sdrC and sdrE (deleting sdrC, sdrD and sdrE; Fig. 3d). A single isolate (Staff D_3_H) from this clade had also lost the φSa3 phage, which encodes modulators of the human innate immune response14,15. We also identified that all the isolates from staff member D had a premature stop codon (Trp76STOP) present in agrC (AgrC is the autoinducer sensor protein component of the accessory gene regulator (agr), a quorum sensing system and global transcriptional regulator of staphylococcal virulence16).
The loss of agr function was confirmed by a lack of δ-haemolytic activity (a proxy for agr activity17; Supplementary Table 2 and Supplementary Fig. 1).

Analysis of within-host diversity during infection. We next analysed the isolates from the index case. A total of 141 isolates were differentiated by 24 SNPs. In all, 57 of the 141 (40%) isolates were genetically indistinguishable (the same genome-type) on the basis of core genome SNPs (isolates identical to Dog_Index_1_C


Figure 3 | Individual phylogenetic trees for Staff member A, F, D and Index Dog isolates. Bayesian phylogenetic trees generated using SNPs, indels and the presence/absence of mobile genetic elements (MGEs) to inform the topology for Staff member A’s (a) Clade 1 isolates and (b) Clade 2 isolates, (c) Staff member F (Clade 2 isolates), (d) Staff member D (Clade 3 isolates) and (e) the Index Dog (Clade 1 isolates). Coloured branches represent strains that share the presence/absence of an indel/MGE. Some indels/MGEs are not coloured if they are not topologically informative, they share their presence/absence with another indel or they are the ancestral state. Ancestral states are described at the root and at some nodes of the phylogeny. All nodes present in the trees have >0.5 support and starred nodes have >0.95 support. The presence of insertions, deletions and changes in phage copy numbers are shown on branches or individual taxa. The figure of the dog shows the location of the samples; the colours of the circles marked on the representation of the dog match those on the tip labels of the tree in e. Annotation key: deletion, intergenic deletion, insertion, intergenic insertion, phage copy number increase and phage copy number decrease. Sample sites on the dog: 1, abdominal wound swab; 2, nasal swab; 3, axilla; 4, left hind leg; 5, subcutaneous penis; 6, left inner thigh; 7, left antebrachium; 8, prescapular lymph node; 9, catheter. Sampling times: day 4, day 8 and day 12 (post mortem).

in Fig. 3e). Another 53 isolates only differed by a single SNP from this population, meaning that 78% of isolates from the index case varied by a maximum of two core genome SNPs. All the isolates from the first MRSA-positive swab taken on day 5 were either the predominant genome-type (7 isolates) or belonged to one of two single-SNP sub-clades (3 and 10 isolates; Fig. 3e). The isolates from the nasal swab taken on day 8 (Dog Index 2) contained the most SNPs (nine SNPs between ten isolates) and were the most diverse sample from the index case (Supplementary Fig. 2). Clusters of isolates from individual anatomical sites were differentiated into sub-clades that descended from the predominant genome-type (Dog Index 1, 3, 4, 5, and 8 in Fig. 3e). Three isolates (Dog_Index_3B, C and S) from the axilla had also lost the φSa3 phage. Interestingly, as in staff member D, two of the three isolates that had lost the φSa3 phage also had increased φSa2HO 5096 0412 copy number (Fig. 3e and Supplementary Fig. 4). Unlike the broad distribution of isolates with altered φSa2HO 5096 0412 copy number in staff A, F and D, all but two of the isolates from the index case with an increased φSa2HO 5096 0412 copy number were phylogenetically distinct isolates from the axilla (Dog Index 3, 8 of 11 isolates) and the prescapular lymph node (Dog Index 8, 14 of 19 isolates; Fig. 3e and Supplementary Fig. 4). Furthermore, all


Figure 4 | Maximum likelihood tree generated from SNPs in the core genome for isolates from Clade 1. Numbers above the branches indicate the number of differentiating SNPs. Isolates from repeated swabs from the same individual staff member or animal are numbered sequentially and coloured, with darker tones representing later isolates. The location and time of the swabs from the index case are highlighted in the graphical representation of a dog. Sample sites on the dog: 1, abdominal wound swab; 2, nasal swab; 3, axilla; 4, left hind leg; 5, subcutaneous penis; 6, left inner thigh; 7, left antebrachium; 8, prescapular lymph node; 9, catheter. Sampling times: day 4, day 8 and day 12 (post mortem).

isolates from the only invasive site sampled, the prescapular lymph node (Dog Index 8), were differentiated from the predominant genome-type of the index case by two different SNPs present in the same gene: agrA (AgrA is the response regulator component of the accessory gene regulator (agr)18). The first SNP, present in 13 isolates, was a non-synonymous mutation causing a Ser202Asn substitution (isolates around Dog_Index_8_F in Fig. 3e). The second mutation, a premature stop codon (Lys236STOP) at the C-terminal end of AgrA, was present in six isolates (isolates around Dog_Index_8_B in Fig. 3e). The loss of δ-haemolytic activity was confirmed in multiple isolates with both of these agrA mutations, demonstrating that the mutations caused a loss of AgrA function (Supplementary Table 2 and Supplementary Fig. 1). Furthermore, two of the isolates (Dog_Index_8_S and K) with the S202N substitution had two further non-synonymous mutations (Gln1256Glu and Asn1329Asp) in the same gene: fmtB, which encodes a surface-anchored protein associated with methicillin resistance19.

Interpretation of transmission pathways for clade 1. Next we analysed the entire clade 1 data set to elucidate possible transmission events. The isolates from staff member A were consistently the most basal, with identical and unique basal isolates being present in all three swabs (Figs 3a and 4 and Supplementary Fig. 2). Furthermore, the isolates from staff member A’s first swab (Staff A1 in Figs 3a and 4) had the greatest pairwise diversity of all clade 1 samples (Supplementary Fig. 2a). For the index case, all isolates except for two basal isolates (Dog_Index_2_D and Dog_Index_2_A) in the nasal swab taken on day 8 were descended from the basal population present in staff member A (Fig. 4). Directly basal isolates to the sub-clade of isolates from the left antebrachium (forearm; Dog_index_7 in Fig. 4) were present only in staff member A (Staff_A_2_N and C in Fig. 3). At days 12 and 22, staff member A also carried isolates that were either identical (Staff_A_1_A) or that differed by a single SNP (Staff_A_2_L) from the predominant genome-type in the index case (Fig. 4). Interestingly, all the isolates from the three other clade 1-positive staff members (Staff B, C and F) were identical to the predominant genome-type found in the index case. By contrast, the two clade 1-positive animal patients, dog 30 (positive on day 16) and dog 158 (positive on day 71, 59 days after the death of the index case), were both colonized by distinct sub-populations of isolates


Figure 5 | Phylogenetic trees of Clade 2 and Clade 3 isolates. Maximum likelihood trees generated from SNPs in the core genome for isolates from (a) Clade 2, (b) Clade 3. Numbers above the branches indicate the number of differentiating SNPs. Isolates from repeated swabs from the same individual staff member or animal are numbered sequentially and coloured, with darker tones representing later isolates.

that displayed very low levels of diversity (differing by up to 4 SNPs) and that descended from the predominant genome-type in the index case (Fig. 4 and Supplementary Fig. 2). Dog 30 also produced a single isolate identical to the predominant genome-type of the index case (Fig. 4).

Evidence for transmission of clade 2 isolates. As for clade 1, one staff member was persistently colonized with diverse basal isolates of clade 2, namely, staff member F (Fig. 5a and Supplementary Fig. 2). All the clade 2 isolates from staff member A (the other persistent carrier that was co-colonized with clades 1 and 2) from both positive swabs were descended from the basal population present in staff member F (Staff A1 and A3 in Fig. 5a). Furthermore, an isolate from staff member F (Staff_F_1_T) was directly basal to one of the two sub-clades from staff member A’s third swab (Fig. 5a). Dog 150, which was both nasally colonized and had a wound infected by clade 2 isolates, first entered the veterinary hospital on day 58, 10 days before the third swab was taken from staff member F (Fig. 1). Dog 150 was populated with isolates that were identical to the predominant genome-type (or that differed by a single SNP) of isolates from staff member F in their third swab (Staff_F_3; Fig. 5a). In contrast to the homogeneity of clade 2 isolates seen in dog 150, the four clade 2 isolates from staff member B (co-colonized with clade 1 and clade 2) differed by a total of ten SNPs (with two identical isolates from the predominant sub-clade seen in dog 150 and staff F3 in Fig. 5a). Despite the degree of diversity present being equivalent to that of staff member F, staff member B was only positive for clade 2 isolates on day 12 and was negative 2 days later, suggesting they were only a transient carrier (Figs 1 and 5a and Supplementary Fig. 2).
Finally, the clade 2 isolates from staff member G (single isolate: Staff_G_1_A) and the index case (two isolates: Dog_Index_3_E and N) both differed by a single SNP from the genome-type of one of the basal clades that made up the majority of staff member F’s population at their first two swabs taken at the same time (Staff_F1 and Staff_F2 in Fig. 5a).
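The two summary statistics used in the Results to compare samples, pairwise diversity and Watterson’s theta, can be illustrated with a minimal sketch. This is not the authors’ analysis pipeline: the function names and the toy alignment are hypothetical, sequences are assumed to be pre-aligned core-genome SNP strings of equal length, and both values are reported per alignment rather than per site.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean number of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def wattersons_theta(seqs):
    """Segregating sites divided by a_n = sum of 1/i for i = 1 .. n-1."""
    n = len(seqs)
    if n < 2:
        return 0.0
    # A site is segregating if more than one allele is seen in that column.
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n


# Hypothetical toy alignment of four isolates (not data from this study).
toy_isolates = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
print(pairwise_diversity(toy_isolates))
print(wattersons_theta(toy_isolates))
```

Pairwise diversity is weighted by how common each variant is, whereas Watterson’s theta counts every segregating site equally, so a sample dominated by one genome-type with a few singleton variants tends to score lower on the former than on the latter.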

Discussion We have used whole-genome sequencing of multiple individual colonies to investigate MRSA carriage and transmission among human staff and animal patients at a veterinary hospital. Using a combination of epidemiological and genome sequence data it is possible to elucidate the probable MRSA transmission events. For the clade 1 isolates (the cause of the index case infection), the most parsimonious explanation for the source was staff member A, because: (i) their isolates were consistently the most basal in the phylogenetic tree, (ii) the isolates in their first swab exhibited the greatest pairwise diversity, (iii) they possessed isolates representing the predominant genome types in the index case, (iv) they were the only known persistent carrier of clade 1 and (v) they were directly involved in care of the index case (Fig. 6). Similar reasoning suggests that for staff members B, C and F it was most likely that they acquired their isolate by transmission from the index case (Fig. 6). In the case of dog 30, either staff members A, B and F or the index case (indirectly through environmental transmission) might have been the source (Figs 4 and 6). For dog 158, although colonized by a population derived from the predominant genome-type of the index case, it first entered the hospital 32 days after the death of index case (Fig. 1). The only known carrier at this time was staff member A, who was still populated with basal isolates, therefore the chain of transmission is not clear (Fig. 6). We also found that at the same time, a second distinct clone of ST22 (clade 2) had been transmitted between staff and animal patients. Staff member F was the most likely source of clade 2 as they were a persistent carrier, and were populated with the most diverse and basal isolates. Staff member A most likely acquired their clade 2 isolates from staff member F, as staff members F’s isolates were basal to the population in staff member A (Figs 5a and 6). 
For staff member B, the direction of transmission was less clear, as their four isolates were as diverse as, and representative of, the population present in staff member F at approximately the same time (Staff B cf. Staff F1 and F2 in Fig. 5a and

NATURE COMMUNICATIONS | 6:6560 | DOI: 10.1038/ncomms7560 | www.nature.com/naturecommunications

& 2015 Macmillan Publishers Limited. All rights reserved.

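[Figure 6 graphic. Legend: pie charts show the proportions of each ST22 clade (clade 1, clade 2); circle size indicates the number of colonies isolated from the swab (scale marks shown: >20, 5, 1); one arrow style marks strong evidence for the direction of transmission and another marks multiple potential sources or indirect transmission. Nodes: Staff A, Staff B, Staff C, Staff F, Staff G, index dog, Dog 30, Dog 150, Dog 158 and an unidentified source, with the source of clade 1 and the source of clade 2 labelled. Time windows: days 0–16 and days 40–71.]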

Figure 6 | Summary of inferred transmission pathways. The figure shows the inferred transmission pathways between staff members and dogs, based on combined genomic and epidemiological evidence. The size of each circle is proportional to the number of colonies recovered after culture from the individual’s first swab, except for the index dog, where the clade 2 segment is not drawn in proportion to the number of isolates so that it remains visible. The colours represent the relative proportions of each clade among the colonies assessed by whole-genome sequencing.

Supplementary Fig. 2). However, staff member B was not a persistent carrier, and most likely acquired a diverse population from staff member F (Figs 1 and 6). This clade was subsequently acquired by staff member G and the index case, again most likely from staff member F, as their isolates were identical to, or differed by a single SNP from, staff member F’s contemporaneous isolates. Clade 2 isolates were also found in dog 150. These isolates were identical to, or differed by a single SNP from, the predominant genome type in staff member F’s third swab (Fig. 5a). Dog 150 first entered the veterinary hospital close to the time of staff member F’s third swab (Fig. 1), suggesting that staff member F was a likely source of the clade 2 isolates in dog 150 (Fig. 6).

This study yielded some interesting insights concerning within-host bacterial diversity during colonization and following transmission. Co-colonization by distinct spa-types has been described20 and here we provide evidence of colonization by separate sub-populations of the same multi-locus sequence type. The almost equal proportions of clade 1 and 2 isolates in staff member A’s initial swab indicate that the probability of correctly identifying their involvement in clade 1 transmission using a single isolate chosen at random would have been only 45% (Staff A 1 in Fig. 1). The implications of this finding are not limited to S. aureus, as co-infection or co-colonization by a number of bacterial and viral pathogens has been reported10,21–28.

The deep sequencing of temporally spaced samples showed that varying degrees of diversity are present in colonized individuals and that the composition of colonizing populations can shift dramatically, including becoming less diverse over time (as in the case of the third swab from staff member F; Fig. 5a and Supplementary Fig. 2).
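As a rough illustration of the 45% figure quoted above for staff member A’s initial swab, the following sketch treats colony picking as independent random draws from a swab in which a lineage is present at a fixed fraction. This sampling model, the function name detection_probability and the example of five colonies are assumptions made for illustration; they are not part of the analysis reported in this study.

```python
def detection_probability(fraction: float, n_colonies: int) -> float:
    """Probability that at least one of n independently picked colonies
    belongs to a lineage making up `fraction` of the swab population.

    Assumption (not from the paper): picks are independent draws, which is
    a reasonable approximation when the population is large relative to n.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if n_colonies < 0:
        raise ValueError("n_colonies must be non-negative")
    # Complement of the event that every picked colony misses the lineage.
    return 1.0 - (1.0 - fraction) ** n_colonies


if __name__ == "__main__":
    # 0.45 is the clade 1 fraction implied by the text for Staff A's first swab.
    for n in (1, 5, 20):
        print(n, round(detection_probability(0.45, n), 4))
```

Under these assumptions a single colony detects clade 1 with probability 0.45, whereas five colonies raise this to roughly 0.95, which is consistent with the case made here for sequencing multiple isolates per individual.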
The picture of within-host diversity may be more complex than reported here using only nasal swabs, as variation in populations from different body sites has been reported for S. aureus29–31. The extent of the diversity of S. aureus in host populations will be influenced by the amount of diversity transferred during a transmission event and by the time elapsed between the transmission event and sampling. The data presented here suggest that the populations present in post-transmission recipients can be either homogeneous (Clade 1: Staff B, C, F, Dog 30 and 158; Clade 2: Staff G and Dog 150) or heterogeneous (Clade 1: Index case; Clade 2: Staff B) (Figs 4 and 5a). Of course, no data were available about the number of transmission events or their exact timing, or about the nature of population bottlenecks occurring post transmission.

Three staff members (staff A, D and F) were persistent MRSA carriers but had different rates of transmission. Staff member D, although an active member of the clinical team looking after the index case, neither transmitted their clade (clade 3) nor acquired any of the other clades. Staff member A, who was the apparent source of clade 1, also carried clade 2 but did not transmit this clade. Behavioural characteristics, the nature of the contact, and adherence to hygiene and aseptic technique may all be expected to influence the likelihood of transmission and acquisition. Furthermore, biological characteristics of the individual bacterial lineages and hosts are also likely to influence the likelihood of transmission and colonization.

Studies using theoretical modelling have highlighted caveats in the use of sequencing data from single colonies alone to infer transmission32,33. Our data provide empirical evidence from a real-world situation that supports the concerns raised by modelling work. The selection of 20 individual isolates to sequence was arbitrary; is there an optimal number of isolates? A rarefaction and extrapolation analysis of isolates obtained from the index case (Supplementary Fig. 3) suggests that there would be a near-linear increase in the number of genomic variants obtained with increased sampling.
One answer might be the use of shotgun metagenomics34–37 to assess the degree of diversity present in clinical samples (with microbial DNA enrichment38) or cultured isolates, providing an empirical basis for selecting the number of colonies to sequence.

Previously, we have shown that a shared population of CC22 MRSA circulates in humans, cats and dogs39. Here, we demonstrate direct transmission occurring in both directions, substantiating
the view that ST22 has a broad host range and behaves as a nosocomial pathogen within a veterinary health-care setting just as it does within human hospitals39,40.

An investigation of mutations among isolates within the index case sheds light on the evolution of the pathogen over the course of infection. We found that all isolates in the population present in the lymph node of the index case had one of two different mutations causing inactivation of the same gene: agrA. No other clade 1 isolates had the same mutations, suggesting that they were either present at very low frequency or generated de novo during infection. Selection of agr mutants by the presence of antibiotics (the index case was treated throughout its time in hospital) has been reported, probably due to the fitness cost of RNAIII expression, which gives agr-defective ‘cheaters’ a distinct fitness advantage41–44. agr dysfunction has also been identified as a risk factor for persistent bacteremia in human patients and is associated with a poor clinical outcome45–47. Our findings, extending these observations to a canine host, suggest that inactivation of the agr system might be advantageous for survival irrespective of the host species.

The isolates from lymph node and axilla all had an increased copy number of φSa2HO 5096 0412. Given the proximity of the axilla to the prescapular lymph node, this might suggest that the increase in phage copy number occurred in the ancestral population in the axilla and that this population then seeded the invasive infection. Further work is required to investigate the role of both φSa2HO 5096 0412 and phage copy number variation in S. aureus pathogenesis.

In conclusion, this study confirms the value of whole-genome sequencing in the epidemiological investigation of nosocomial transmission of MRSA while identifying potential pitfalls. It highlights the potential complexity of transmission networks and the potential limitations of obtaining sequencing data from a single isolate from a host.
Although this study was performed in a veterinary context, its results are highly relevant to any health-care setting.

Methods

Ethical review. The proposed study was submitted to the Ethical Review Committee at the Department of Veterinary Medicine at the University of Cambridge. It was approved on 22 April 2013 (CR88).

MRSA screening and bacterial isolation. Sterile Ames media transport swabs (Medical Wire) were used to sample both anterior nares of staff members and animal patients. The personnel who conducted the swabbing and processing of samples were also swabbed and found to be negative for MRSA. Swabs were then inoculated into 4 ml of Müller Hinton broth with 6.5% NaCl and grown statically for 24 h at 37 °C in air. One hundred microlitres of broth was then plated onto an MRSA Brilliance Agar 2 plate (Oxoid) and incubated for 24 h at 37 °C in air. Plates with no growth were incubated for a further 24 h to confirm the negative result. Putative MRSA were confirmed by PCR for mecA and femB48.

Whole-genome sequencing. Genomic DNA was extracted from cultures grown from single colonies in 5 ml of tryptic soy broth overnight at 37 °C using the MasterPure Gram Positive DNA Purification Kit. Illumina library preparation was carried out as described by Quail et al.49, and MiSeq or HiSeq sequencing was carried out following the manufacturer’s standard protocols (Illumina, Inc.). Nucleotide sequences have been deposited in the European Nucleotide Archive (Supplementary Data set 1).

Bioinformatic analysis and phylogenetics. Fastq files for the isolates were mapped against the ST22 MRSA reference genome HO 5096 0412 (EMBL accession code HE681097) using SMALT (http://www.sanger.ac.uk/resources/software/smalt/) in order to identify SNPs, as previously described (Supplementary Data sets 2, 3 and 4)50.
SNPs located in mobile genetic elements or low-quality regions (insertions and deletions (indels)/low coverage/repeat regions) were identified by manual inspection and removed from the alignments (Supplementary Data sets 5, 6 and 7). High-quality SNPs were then manually inspected in the alignment and using BAM files mapped on the reference genome. Any isolates with large numbers of N’s (uncalled SNPs) in the alignment were removed from the analysis as potentially contaminated. The maximum likelihood tree was generated from the
resulting SNPs in the core genome using RAxML51. Indels were identified as previously described52 (Supplementary Data sets 2, 3 and 4). Indels were manually assessed using BAM files mapped on the reference. The mobile genetic content of the isolates was compared by BLAST analysis against Velvet de novo assemblies53.

Experimental measurement of mutation rate. The mutation rates for representative isolates from each clade were measured using the methods described by O’Neill and Chopra54. Briefly, three independent colonies were picked for three representative isolates distributed throughout the phylogeny of each ST22 clade (Clades 1, 2 and 3) and grown in 5 ml Iso-Sensitest broth (Oxoid) at 37 °C at 200 r.p.m. until they reached an optical density of ~1 at 595 nm. One hundred microlitres of each culture was then serially diluted and plated onto either Iso-Sensitest agar plates (Oxoid) containing 4× the minimum inhibitory concentration of rifampicin (as determined by agar dilution55) or onto Iso-Sensitest agar plates without rifampicin. Mutation rates were calculated as the number of resistant colonies recovered on the rifampicin plates as a proportion of the total population as determined on Iso-Sensitest plates (Supplementary Table 1). A previously described mutS-knockout strain RN4220mutS and its isogenic wild-type RN4220 (ref. 54) were included as controls with mutation rates of 3.02(±0.13) × 10⁻⁷ and 3.19(±0.13) × 10⁻⁶, respectively, a 9.5-fold difference, and similar to that previously reported for these strains50.

Haemolysis assay. A haemolysis assay was carried out as previously described17. A single colony of S. aureus RN4220 was streaked down the centre of a Columbia agar with sheep blood plate (Oxoid). Single colonies of representative isolates were then cross-streaked horizontally up to the RN4220 streak. Plates were then incubated for 18 h at 37 °C followed by 6 h at 4 °C.
Isolates that produced enhanced haemolysis in the area close to RN4220, which produces only β-haemolysin, were scored as positive for production of δ-haemolysin (Supplementary Table 2). Representative photographs are shown in Supplementary Fig. 1.

BEAST analysis. The presence of temporal signal was tested separately in the isolates sampled from staff A, D, F and the index dog by fitting a regression of root-to-tip genetic distance against sampling date using Path-O-Gen v1.4 (http://tree.bio.ed.ac.uk/software/pathogen/). Temporal signal was judged to be present only in the isolates from staff D. For the samples from this individual, the linear regression of root-to-tip distance against sampling time had a large and positive correlation coefficient (r = 0.68). For other individuals, this was not the case (staff A, r = 0.11; staff F, r = −0.26; index dog, r = 0.06). The date of the most recent common ancestor and the evolutionary rate were estimated using BEAST for isolates from staff D56. A HKY substitution model with a 4-category gamma distribution of rates and no partitioning of sites was used. Because of low levels of divergence, a strict molecular clock model and constant population size model were used. Estimates were similar to those obtained from a relaxed clock model.

Bayesian phylogenetic analysis. SNPs, indels and mobile genetic elements (MGEs) from the index dog and staff members A, D and F were used to estimate the topologies of phylogenies of the strains found within these individuals. Separate analyses were performed for the two clades of MRSA found within staff member A. The data sets for each individual were divided into two partitions: (i) SNPs and (ii) indels and MGEs (Supplementary Data sets 2, 3 and 4). Evolutionary rates were estimated separately for these two partitions. A GTR model with a gamma distribution of rates over sites was used for the SNP partition.
A two-state F81 model with no rate variation over sites was used for the indel/MGE partition. MrBayes was used to estimate the phylogenies57. No information about fixed sites was included in the models. Convergence and burn-in were established through use of Tracer v1.6 (http://beast.bio.ed.ac.uk/Tracer). All MrBayes runs were well converged, with effective sample size (ESS) values >200. The topologies that resulted from the analysis of SNPs, indels and MGEs are described in Fig. 4. Trees represent the consensus tree topology, with nodes present in <50% of trees collapsed to polytomies, and rooting according to the maximum likelihood (ML) topology.

Calculation of diversity. The genetic diversity within populations of sequenced isolates was measured using standard population genetic statistics, namely Watterson’s theta (a measure based on the number of segregating sites) and nucleotide diversity, π (the mean proportion of sites that differ between pairs of sequences)58 (Supplementary Fig. 2).

Rarefaction and extrapolation. To estimate the number of additional variants that would be obtained from the further sequencing of isolates from a single host, a rarefaction and extrapolation analysis was performed using data from the nine swabs taken from the index dog59 (Supplementary Fig. 3). This analysis was performed using EstimateS 9.1.0 (EstimateS: Statistical estimation of species richness and shared species from samples. Version 9. http://purl.oclc.org/estimates)60.
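The two statistics named under ‘Calculation of diversity’ can be illustrated with a minimal sketch. It assumes an alignment of equal-length sequences with no missing calls and reports both quantities per site; the function names, the toy alignment and the per-site scaling of Watterson’s theta are assumptions made for illustration and may differ from the implementation used in this study.

```python
from itertools import combinations


def _check_alignment(seqs):
    """Require at least two sequences of identical length."""
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")


def segregating_sites(seqs):
    """Number of alignment columns in which at least two sequences differ."""
    _check_alignment(seqs)
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def wattersons_theta(seqs):
    """Watterson's theta per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    _check_alignment(seqs)
    n, length = len(seqs), len(seqs[0])
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * length)


def nucleotide_diversity(seqs):
    """Pi: mean proportion of sites that differ between all pairs of sequences."""
    _check_alignment(seqs)
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return differences / (len(pairs) * length)


if __name__ == "__main__":
    # Hypothetical toy alignment of four isolates, eight sites each.
    toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
    print(segregating_sites(toy), wattersons_theta(toy), nucleotide_diversity(toy))
```

For the toy alignment there are two segregating sites, giving a theta of 3/22 per site and a π of 0.125. Real alignments would also need a policy for uncalled bases (N), which this sketch deliberately leaves out.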

Copy number variation. To screen genomes for copy number variation, we used the cn.MOPS R package61 with the haplocn.mops algorithm, which is adjusted for haploid genomes. All genomes were screened visually for regions with potentially changed copy number. Next, the mean read count per base was generated and an R script was then applied to all regions to identify those with an increase in copy number relative to the mean copy number of the entire genome. Regions identified as having an increased copy number were then manually inspected in the annotated genome sequence and by comparative genomics to identify the boundaries of mobile genetic elements. One region encoding a likely bacteriophage (φSa2HO 5096 0412, present at coordinates 1520142–1566315 in the reference genome HO 5096 0412 (ref. 13)) was found to be variable in isolates in all three clades. To take into account the difference in coverage due to proximity to the origin of replication and terminus, the fold difference in copy number of φSa2HO 5096 0412 sequences between isolates was calculated by dividing the mean read count of the entire region spanning φSa2HO 5096 0412 by the mean read count of 100 kb upstream and 100 kb downstream. Isolates with a φSa2HO 5096 0412 copy number ≥10% above or below the median for the entire clade (median copy numbers: 1.05, 1.36 and 1.19 for clades 1, 2 and 3, respectively) were deemed to have a changed copy number (Supplementary Figs 4 and 5).
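The flank-normalized fold calculation and the 10% rule described above can be sketched as follows. This is an illustrative reading of the text rather than the R script used in the study: the function names, the input format (a list of per-base read counts for one isolate) and the interpretation of the threshold as 10% of the clade median are all assumptions.

```python
from statistics import mean, median

FLANK = 100_000  # 100 kb on each side of the phage region, as stated in the text


def phage_fold_copy_number(depth, start, end, flank=FLANK):
    """Mean read count across the phage region divided by the mean read
    count of the combined upstream and downstream flanks.

    depth: per-base read counts along the reference (list index 0 = base 1)
    start, end: 1-based inclusive coordinates of the phage region
    """
    region = depth[start - 1:end]
    upstream = depth[max(0, start - 1 - flank):start - 1]
    downstream = depth[end:end + flank]
    return mean(region) / mean(upstream + downstream)


def changed_copy_number(fold_by_isolate, threshold=0.10):
    """Return isolates whose fold value lies at least `threshold` (as a
    fraction of the clade median) above or below that median."""
    clade_median = median(fold_by_isolate.values())
    return {
        isolate: fold
        for isolate, fold in fold_by_isolate.items()
        if abs(fold - clade_median) >= threshold * clade_median
    }


if __name__ == "__main__":
    # Hypothetical toy data: a 2-base "phage" at double depth, 3-base flanks.
    toy_depth = [10, 10, 10, 20, 20, 10, 10, 10]
    print(phage_fold_copy_number(toy_depth, start=4, end=5, flank=3))
    print(changed_copy_number({"iso_a": 1.0, "iso_b": 1.05, "iso_c": 1.5}))
```

Normalizing against the immediate flanks rather than the genome-wide mean follows the rationale given in the text, namely that read depth varies with distance from the origin of replication.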

References

1. Köser, C. U. et al. Routine use of microbial whole genome sequencing in diagnostic and public health microbiology. PLoS Pathog. 8, e1002824 (2012).
2. Didelot, X., Bowden, R., Wilson, D. J., Peto, T. E. & Crook, D. W. Transforming clinical microbiology with bacterial genome sequencing. Nat. Rev. Genet. 13, 601–612 (2012).
3. Bryant, J. M. et al. Whole-genome sequencing to identify transmission of Mycobacterium abscessus between patients with cystic fibrosis: a retrospective cohort study. Lancet 381, 1551–1560 (2013).
4. Reuter, S. et al. Rapid bacterial whole-genome sequencing to enhance diagnostic and public health microbiology. JAMA Intern. Med. 173, 1397–1404 (2013).
5. Harris, S. R. et al. Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study. Lancet Infect. Dis. 13, 130–136 (2013).
6. Snitkin, E. S. et al. Tracking a hospital outbreak of carbapenem-resistant Klebsiella pneumoniae with whole-genome sequencing. Sci. Transl. Med. 4, 148ra116 (2012).
7. Lewis, T. et al. High-throughput whole-genome sequencing to dissect the epidemiology of Acinetobacter baumannii isolates from a hospital outbreak. J. Hosp. Infect. 75, 37–41 (2010).
8. Golubchik, T. et al. Within-host evolution of Staphylococcus aureus during asymptomatic carriage. PLoS ONE 8, e61319 (2013).
9. Young, B. C. et al. Evolutionary dynamics of Staphylococcus aureus during progression from carriage to disease. Proc. Natl Acad. Sci. USA 109, 4550–4555 (2012).
10. Didelot, X. et al. Genomic evolution and transmission of Helicobacter pylori in two South African families. Proc. Natl Acad. Sci. USA 110, 13880–13885 (2013).
11. Richardson, J. F. & Reith, S. Characterization of a strain of methicillin-resistant Staphylococcus aureus (EMRSA-15) by conventional and molecular methods. J. Hosp. Infect. 25, 45–52 (1993).
12. Afroz, S. et al. Genetic characterization of Staphylococcus aureus isolates carrying Panton-Valentine leukocidin genes in Bangladesh. Jpn J. Infect. Dis. 61, 393–396 (2008).
13. Holden, M. T. et al. A genomic portrait of the emergence, evolution, and global spread of a methicillin-resistant Staphylococcus aureus pandemic. Genome Res. 23, 653–664 (2013).
14. Sako, T. & Tsuchida, N. Nucleotide sequence of the staphylokinase gene from Staphylococcus aureus. Nucleic Acids Res. 11, 7679–7693 (1983).
15. van Wamel, W. J., Rooijakkers, S. H., Ruyken, M., van Kessel, K. P. & van Strijp, J. A. The innate immune modulators staphylococcal complement inhibitor and chemotaxis inhibitory protein of Staphylococcus aureus are located on beta-hemolysin-converting bacteriophages. J. Bacteriol. 188, 1310–1315 (2006).
16. Novick, R. P. Autoinduction and signal transduction in the regulation of staphylococcal virulence. Mol. Microbiol. 48, 1429–1449 (2003).
17. Traber, K. E. et al. agr function in clinical Staphylococcus aureus isolates. Microbiology 154, 2265–2274 (2008).
18. Geisinger, E., Muir, T. W. & Novick, R. P. agr receptor mutants reveal distinct modes of inhibition by staphylococcal autoinducing peptides. Proc. Natl Acad. Sci. USA 106, 1216–1221 (2009).
19. Komatsuzawa, H. et al. Tn551-mediated insertional inactivation of the fmtB gene encoding a cell wall-associated protein abolishes methicillin resistance in Staphylococcus aureus. J. Antimicrob. Chemother. 45, 421–431 (2000).
20. Votintseva, A. A. et al. Multiple-strain colonization in nasal carriers of Staphylococcus aureus. J. Clin. Microbiol. 52, 1192–1200 (2014).
21. Smith, J. A. et al. Dynamic coinfection with multiple viral subtypes in acute hepatitis C. J. Infect. Dis. 202, 1770–1779 (2010).

22. Jost, S. et al. A patient with HIV-1 superinfection. N. Engl. J. Med. 347, 731–736 (2002).
23. Eyre, D. W. et al. Clostridium difficile mixed infection and reinfection. J. Clin. Microbiol. 50, 142–144 (2012).
24. Forbes, K. J. et al. Campylobacter immunity and coinfection following a large outbreak in a farming community. J. Clin. Microbiol. 47, 111–116 (2009).
25. Lynn, F. et al. Genetic typing of the porin protein of Neisseria gonorrhoeae from clinical noncultured samples for strain characterization and identification of mixed gonococcal infections. J. Clin. Microbiol. 43, 368–375 (2005).
26. Wright, M. S. et al. New insights into dissemination and variation of the health care-associated pathogen Acinetobacter baumannii from genomic analysis. mBio 5, e00963–13 (2014).
27. Brugger, S. D., Frey, P., Aebi, S., Hinds, J. & Mühlemann, K. Multiple colonization with S. pneumoniae before and after introduction of the seven-valent conjugated pneumococcal polysaccharide vaccine. PLoS ONE 5, e11638 (2010).
28. Köser, C. U. et al. Whole-genome sequencing for rapid susceptibility testing of M. tuberculosis. N. Engl. J. Med. 369, 290–292 (2013).
29. Rosenthal, A., White, D., Churilla, S., Brodie, S. & Katz, K. C. Optimal surveillance culture sites for detection of methicillin-resistant Staphylococcus aureus in newborns. J. Clin. Microbiol. 44, 4234–4236 (2006).
30. Lauderdale, T. L. et al. Carriage rates of methicillin-resistant Staphylococcus aureus (MRSA) depend on anatomic location, the number of sites cultured, culture methods, and the distribution of clonotypes. Eur. J. Clin. Microbiol. Infect. Dis. 29, 1553–1559 (2010).
31. Hombach, M., Pfyffer, G. E., Roos, M. & Lucke, K. Detection of methicillin-resistant Staphylococcus aureus (MRSA) in specimens from various body sites: performance characteristics of the BD GeneOhm MRSA assay, the Xpert MRSA assay, and broth-enriched culture in an area with a low prevalence of MRSA infections. J. Clin. Microbiol. 48, 3882–3887 (2010).
32. Worby, C. J., Lipsitch, M. & Hanage, W. P. Within-host bacterial diversity hinders accurate reconstruction of transmission networks from genomic distance data. PLoS Comput. Biol. 10, e1003549 (2014).
33. Didelot, X. et al. Bayesian inference of infectious disease transmission from whole genome sequence data. Mol. Biol. Evol. 31, 1869–1879 (2014).
34. Doughty, E. L., Sergeant, M. J., Adetifa, I., Antonio, M. & Pallen, M. J. Culture-independent detection and characterisation of Mycobacterium tuberculosis and M. africanum in sputum samples using shotgun metagenomics on a benchtop sequencer. PeerJ 2, e585 (2014).
35. Loman, N. J. et al. A culture-independent sequence-based metagenomics approach to the investigation of an outbreak of Shiga-toxigenic Escherichia coli O104:H4. JAMA 309, 1502–1510 (2013).
36. Hasman, H. et al. Rapid whole-genome sequencing for detection and characterization of microorganisms directly from clinical samples. J. Clin. Microbiol. 52, 139–146 (2014).
37. Wilson, M. R. et al. Actionable diagnosis of neuroleptospirosis by next-generation sequencing. N. Engl. J. Med. 370, 2408–2417 (2014).
38. Seth-Smith, H. M. et al. Generating whole bacterial genome sequences of low-abundance species from complex samples with IMS-MDA. Nat. Protoc. 8, 2404–2412 (2013).
39. Harrison, E. M. et al. A shared population of epidemic methicillin-resistant Staphylococcus aureus 15 circulates in humans and companion animals. mBio 5, e00985–13 (2014).
40. Vincze, S. et al. Molecular analysis of human and canine Staphylococcus aureus strains reveals distinct extended-host-spectrum genotypes independent of their methicillin resistance. Appl. Environ. Microbiol. 79, 655–662 (2013).
41. Butterfield, J. M. et al. Predictors of agr dysfunction in methicillin-resistant Staphylococcus aureus (MRSA) isolates among patients with MRSA bloodstream infections. Antimicrob. Agents Chemother. 55, 5433–5437 (2011).
42. Mwangi, M. M. et al. Tracking the in vivo evolution of multidrug resistance in Staphylococcus aureus by whole-genome sequencing. Proc. Natl Acad. Sci. USA 104, 9451–9456 (2007).
43. Paulander, W. et al. Antibiotic-mediated selection of quorum-sensing-negative Staphylococcus aureus. mBio 3, e00459–12 (2013).
44. Pollitt, E. J., West, S. A., Crusz, S. A., Burton-Chellew, M. N. & Diggle, S. P. Cooperation, quorum sensing, and evolution of virulence in Staphylococcus aureus. Infect. Immun. 82, 1045–1051 (2014).
45. Schweizer, M. L. et al. Increased mortality with accessory gene regulator (agr) dysfunction in Staphylococcus aureus among bacteremic patients. Antimicrob. Agents Chemother. 55, 1082–1087 (2011).
46. Park, S. Y. et al. agr dysfunction and persistent methicillin-resistant Staphylococcus aureus bacteremia in patients with removed eradicable foci. Infection 41, 111–119 (2013).
47. Fowler, Jr. V. G. et al. Persistent bacteremia due to methicillin-resistant Staphylococcus aureus infection is associated with agr dysfunction and low-level in vitro resistance to thrombin-induced platelet microbicidal protein. J. Infect. Dis. 190, 1140–1149 (2004).

48. Paterson, G. K. et al. The newly described mecA homologue, mecALGA251, is present in methicillin-resistant Staphylococcus aureus isolates from a diverse range of host species. J. Antimicrob. Chemother. 67, 2809–2813 (2012).
49. Quail, M. A. et al. A large genome center’s improvements to the Illumina sequencing system. Nat. Methods 5, 1005–1010 (2008).
50. Köser, C. U. et al. Rapid whole-genome sequencing for investigation of a neonatal MRSA outbreak. N. Engl. J. Med. 366, 2267–2275 (2012).
51. Stamatakis, A., Ludwig, T. & Meier, H. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21, 456–463 (2005).
52. Croucher, N. J. et al. Rapid Pneumococcal evolution in response to clinical interventions. Science 331, 430–434 (2011).
53. Zerbino, D. R. & Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008).
54. O’Neill, A. J. & Chopra, I. Insertional inactivation of mutS in Staphylococcus aureus reveals potential for elevated mutation frequencies, although the prevalence of mutators in clinical isolates is low. J. Antimicrob. Chemother. 50, 161–169 (2002).
55. A guide to sensitivity testing: report of the working party on antibiotic sensitivity testing of the British Society for Antimicrobial Chemotherapy. J. Antimicrob. Chemother. 27, 1–50 (1991).
56. Drummond, A. J., Suchard, M. A., Xie, D. & Rambaut, A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012).
57. Huelsenbeck, J. P. & Ronquist, F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755 (2001).
58. Watterson, G. A. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7, 256–276 (1975).
59. Colwell, R. K., Mao, C. X. & Chang, J. Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology 85, 2717–2727 (2004).
60. Colwell, R. K. et al. Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages. J. Plant Ecol. 5, 3–21 (2012).
61. Klambauer, G. et al. cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate. Nucleic Acids Res. 40, e69 (2012).

Acknowledgements

Thanks to Dr Alex O’Neill, University of Leeds and Dr Matthew Ellington, Public Health England for provision of RN4220 and RN4220mutS. We thank the core sequencing and informatics team at the Wellcome Trust Sanger Institute for sequencing of the isolates described in this study. This work was supported by a Medical Research Council Partnership grant (G1001787/1) held between the Department of Veterinary Medicine, University of Cambridge (M.A.H.), the School of Clinical Medicine, University of Cambridge (S.J.P.), the Moredun Research Institute and the Wellcome Trust Sanger Institute (J.P. and S.J.P.). S.J.P. receives support from the NIHR Cambridge Biomedical Research Centre. M.T.G.H., S.R.H. and J.P. were funded by Wellcome Trust grant no. 098051. G.G.R.M. was funded by an MRC studentship.

Author contributions

G.K.P. contributed to the initiation and design of the study, the collection and processing of samples, the interpretation of data and wrote the manuscript. E.M.H. designed and conducted the bioinformatics analysis, analysed the data and wrote the manuscript. G.G.R.M. and J.J.W. carried out some phylogenetic analyses and contributed to the manuscript. J.H.W. collected clinical isolates and provided epidemiological data. M.T.G.H. conducted bioinformatic analysis and contributed to the manuscript. F.J.E.M. carried out microbiological isolations of the isolates. X.B. carried out experimental validation of mutations. G.K. contributed to the analysis and wrote coding scripts. S.R.H. wrote coding scripts, contributed to the analysis of the data and the manuscript. D.J.M. contributed to the study design and the manuscript. S.J.P. contributed to the interpretation of the data and the manuscript. M.E.H. contributed to the collection of isolates, provided epidemiological data and contributed to the manuscript. J.P. contributed to the initiation and design of the study, interpretation of the data and contributed to the manuscript. M.A.H. initiated and coordinated the study, contributed to the collection of isolates, analysed data and wrote the manuscript.

Additional information

Accession numbers: Nucleotide sequences have been deposited in the European Nucleotide Archive under the study number: ERP003396. Individual isolate accessions are in Supplementary Data set 1.
Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications
Competing financial interests: There are no competing financial interests.
Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/
How to cite this article: Paterson, G. K. et al. Capturing the cloud of diversity reveals complexity and heterogeneity of MRSA carriage, infection and transmission. Nat. Commun. 6:6560 doi: 10.1038/ncomms7560 (2015).

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
