Virus Research 202 (2015) 144–150


Virus Research journal homepage: www.elsevier.com/locate/virusres

Re-emergence of a genetic outlier strain of equine arteritis virus: Impact on phylogeny

F. Steinbach, D.G. Westcott, S.L. McGowan, S.S. Grierson, J.P. Frossard, B. Choudhury ∗

Department of Virology, Animal and Plant Health Agency, Weybridge, Surrey KT15 3NB, United Kingdom

Article info

Article history: Available online 17 December 2014

Keywords: Equine arteritis virus; Phylogeny; Classification; Full genome sequencing

Abstract

Equine arteritis virus (EAV) is the causative agent of equine viral arteritis (EVA), a respiratory and reproductive disease of equids, which is notifiable in some countries, including Great Britain (GB), and to the OIE. Herein, we present the case of a persistently infected stallion and the phylogenetic tracing of the virus strain isolated. In discussing EAV occurrence and phylogenetic analysis, we review features that may help to harmonise and enhance the classification of EAV.

Crown Copyright © 2015 Published by Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +44 01932 357559. E-mail address: [email protected] (B. Choudhury). http://dx.doi.org/10.1016/j.virusres.2014.12.009 0168-1702/Crown Copyright © 2015 Published by Elsevier B.V. All rights reserved.

1. The EAV genome

EAV is an enveloped, linear, positive-sense, single-stranded RNA virus and the prototype virus of the family Arteriviridae, in the order Nidovirales. This family also includes simian haemorrhagic fever virus, porcine reproductive and respiratory syndrome virus (PRRSV) and lactate dehydrogenase-elevating virus (Cavanagh, 1997). The genome of EAV (Fig. 1) is approximately 12.7 kb in length and consists of 10 overlapping open reading frames (ORFs), which are flanked by 5′ and 3′ untranslated regions (UTRs) (Snijder and Meulenberg, 1998; Firth et al., 2011; Balasuriya et al., 2014). ORFs 1a and 1b occupy approximately 75% of the genome and both are translated to produce polyproteins, which are then extensively processed into 13 non-structural proteins (Table 1) (Snijder and Meulenberg, 1998). At the 3′ end, occupying approximately 25% of the genome, the remaining eight ORFs (2–7) are nested and encode seven envelope proteins and the viral nucleocapsid (Table 1) (Snijder and Meulenberg, 1998; Firth et al., 2011).

Although there is only one serotype of EAV, field strains of the virus vary in their neutralising properties, i.e. neutralising epitopes are not shared across all strains, as is evident from the use of monoclonal antibodies (Glaser et al., 1995; Balasuriya et al., 1997). ORF5 encodes the major envelope protein, GP5, which contains the best-characterised neutralisation determinants. It is also a highly variable region of the genome and hence is commonly used for phylogenetic analysis (Balasuriya et al., 1995; Stadejek et al., 1999; Hornyak et al., 2005; Mittelholzer et al., 2006; Metz et al., 2011). Phylogenetic analysis has also been conducted using ORFs 3, 6 and 7 (Lepage et al., 1996; Hedges et al., 2001; Chirnside et al., 1994). Despite differences in the gene, phylogenetic model and number of sequences used, the general consensus of such studies is a segregation of sequences into European and North American lineages, of which the European lineage can be further divided into two clusters. Unsurprisingly, some European lineage viruses have been documented in North America and vice versa, highlighting the transport of viruses between the two continents via trade in horses and semen.

Excluding patent applications and molecular clones, there are more than 800 EAV sequences available in GenBank (as of June 2014). Thirty-two of these are full genomes: two (accession numbers NC002532 and DQ846750) are different iterations of Bucyrus, sharing 99.6% homology at the nucleotide level; five are from the French outbreak in 2007 and the other 25 are from the USA. Of the remaining sequences, around 25% are of North American origin, another 25% were reported from Western Europe and a further 25% from Eastern Europe. There is only one Asian sequence, from China (JX868590), and three from the Middle East (Qatar; accession no. AY453329-32). The African continent is also poorly represented, with only 16 sequences, all reported from South Africa. There are around sixty sequences for which the location was not stated in the GenBank record or associated manuscripts.

As mentioned, the majority of phylogenetic analyses so far have been conducted on ORF5. Of the other segments used for phylogenetic analysis, ORF3 offers over 250 sequences, ORF6 over 210 sequences and ORF7 over 280 sequences, compared to nearly 600 sequences which will allow use of ORF5, e.g. including those which cover ORFs 2–7. However, where the focus is solely on ORF5, the majority (188/278) of sequences deposited in GenBank are partial sequences, 518 nucleotides in length, which cover genome positions 11296–11813 (Bucyrus) and represent the most variable region within EAV ORF5.

Fig. 1. EAV genome organisation (adapted from Snijder and Kikkert, 2013 and GenBank's genome section on EAV). ORF1ab (blue) is expressed as a fusion protein from the genomic RNA, despite the −1 programmed ribosomal frameshift required for ORF1b (light blue). The genes encoding the structural proteins (red) are translated from subgenomic RNAs. Polyprotein cleavage sites are depicted above ORF1ab. Blue arrowheads: sites cleaved by PLP domains of nsp1/nsp2; red arrowheads: sites cleaved by nsp4. The graph displays the relative sizes and locations of the resulting proteins. Proteins expressed in frame with each other are displayed on the same axis. NSP: non-structural protein; GP: glycoprotein; M: membrane protein; E: envelope protein; N: nucleocapsid protein. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

2. EAV distribution and occurrence

EAV was first isolated in Bucyrus, OH, USA in 1953, from foetal lung tissue during an outbreak of respiratory disease and abortions in horses (Doll et al., 1957; reviewed in Balasuriya et al., 2014). The virus is thought to have an almost global distribution: serological surveys have detected EAV in equine populations in Europe, Africa, Asia, the Americas and Australia (Timoney and McCollum, 1993). The lack of unified control measures, poor biosecurity and the increased international trade in horses have assisted the spread of EAV. The only countries considered to be disease free are Iceland and Japan. Recently, McFadden et al. (2013) demonstrated a lack of EAV prevalence in the general horse population in New Zealand. A hallmark of EAV is its persistence in and spread through stallions. Larger clinical outbreaks occur only periodically, and genetic markers for pathogenicity have been suggested but lack further characterisation (Balasuriya et al., 2014).

The first clinical appearance of EAV in the United Kingdom was in 1993 (Wood et al., 1995). The source of the outbreak was a stallion imported from Poland. The outbreak began on a non-thoroughbred stud and spread to five other premises via chilled semen used for artificial insemination. Around 100 animals became infected and the outbreak was contained by means of voluntary movement restrictions. Subsequently, EAV was made notifiable in the UK (EAV order 1995). Since then, there have been sporadic cases up to and including the case of the imported Spanish stallion in 2012, which is described in this manuscript.

During the last decade, clinical outbreaks of EAV occurred in several countries on both sides of the Atlantic. In the summer of 2007, the largest EVA outbreak in France occurred, causing mortality and economic loss due to disruption of race schedules (Pronost et al., 2010). The outbreak affected 18 premises in 5 counties in western France, comprising the index, 8 primary and 9 secondary premises. Artificial insemination in draught horses was responsible for the index case. In a similar timeframe (2006–2007), a multistate US outbreak occurred and, for the first time, disease was observed in Quarter Horses (Powell and Timoney, 2006). Because EAV is disseminated through infectious semen, EVA prevalence can differ markedly between horse breeds; e.g. prior to that outbreak, seroprevalence in Quarter Horses was recorded at 0.6% (Anonymous, 2000).

There is limited information regarding EAV prevalence in South America. Echeverría et al. (2003) reported the first isolation of EAV in Argentina, and perhaps South America, in 2001. The virus was isolated from the semen of an imported seropositive stallion held in isolation at a breeding farm in Buenos Aires Province. Three additional isolates were obtained in 2007, with phylogenetic analysis suggesting a European lineage (Echeverría et al., 2007). Shortly thereafter, in 2010, another significant outbreak occurred in Argentina, which was linked to a semen import from the Netherlands (Barrandeguy, 2010).

Table 1 EAV genes, proteins and their functions.

ORF | Protein | Function
1a | nsp1 | Accessory protease: important role in virion biogenesis and crucial in processing of the replicase polyprotein and the production of subgenomic RNA (Tijms et al., 2007).
1a | nsp2 | Accessory protease: auto-protease activity between nsp2 and 3. Co-factor for the successful processing of nsp4/5 (Snijder et al., 1994). Deubiquitinating enzyme activity (van Kasteren et al., 2013).
1a | nsp3 | Transmembrane non-structural protein thought to be involved in double-membrane vesicle formation (Posthuma et al., 2008).
1a | nsp4 | Main protease: responsible for cleaving nsp3–8 and nsp3–12 polypeptides (Snijder et al., 1994).
1a | nsp5 | Suggested role in the membrane association of the EAV replication complex (Snijder et al., 1994).
1a | nsp6 | Function not determined.
1a | nsp7 | Further cleaved into nsp7α and nsp7β; function not fully determined, critical role in RNA synthesis (Manolaridis et al., 2011).
1a | nsp8 | Function not determined.
1b | nsp9 | RNA-dependent RNA polymerase (Snijder et al., 1994).
1b | nsp10 | Helicase (van Dinten et al., 1997).
1b | nsp11 | Multifunctional, including endoribonuclease activity; critical role in RNA synthesis (Posthuma et al., 2006).
1b | nsp12 | Function not determined.
2a | E | Predicted to be a membrane protein which is essential for the production of infectious progeny (Snijder et al., 1999).
2b | GP2 | Minor envelope protein: membrane protein which is essential for the production of infectious progeny (Snijder et al., 1999).
3 | GP3 | Membrane associated protein: function not fully determined but thought essential for viral replication (Wieringa et al., 2002).
4 | GP4 | Membrane glycoprotein: function not fully determined but thought essential for viral replication (Wieringa et al., 2002).
5a | ORF5a | Membrane protein: function not fully determined but thought important for viral replication (Firth et al., 2011).
5 | GP5 | Major envelope protein, location of neutralisation domains (Balasuriya et al., 1997).
6 | M | Envelope protein suggested to be involved in virus budding (de Vries et al., 1995).
7 | N | Nucleocapsid protein suggested to bind to envelope protein domains during budding (Tijms et al., 2002).


[Fig. 2 image: phylogenetic tree of the 32 GenBank full genome sequences plus GB_Glos_2012, with bootstrap values at nodes; scale bar 0.02.]

Fig. 2. Phylogenetic comparison of EAV full length genome sequences. GB Glos 2012 was compared against all of the full genome sequences (n = 32; excluding molecular clones) available in GenBank. The arrows indicate the strains selected from each cluster for genomic comparison (Table 2) with GB Glos 2012. The maximum-likelihood phylogeny was constructed using the GTR+G+I model. Sequences are listed by their accession numbers followed by strain and country of isolation. Percentages of 1000 bootstrap replicates are cited at nodes for values ≥50%.

Only scant information is available regarding the disease situation in Africa. Paweska et al. (1996) reported high seroprevalence of an asinine strain of EAV in South African donkeys. Guthrie et al. (2003) reported lateral transmission among Lipizzaner stallions caused by a stallion imported from Yugoslavia in 1981. A few years later, Stadejek et al. (2006) reported the isolation of a highly divergent EAV strain from a South African donkey. No internationally published information appears to be available for the remainder of Africa, Asia and the Middle East; however, the fact that sequences from these regions are available in GenBank shows the disease to be prevalent there.

3. Methods

3.1. Virus isolation

In September 2012 we analysed a semen sample from a 9-year-old Andalusian stallion which had been imported to Gloucestershire, UK, from Spain. Virus was isolated following the protocol described in chapter 2.5.10 of the OIE Terrestrial Manual 2013 (Anonymous, 2013). The resultant isolate was named GB Glos 2012, reflecting the country, province and year of isolation.

3.2. Molecular testing

RNA was extracted from the semen using the QIAamp Viral RNA Mini Kit (Qiagen Ltd., Manchester, UK) according to the manufacturer's instructions. A 564 nucleotide portion of the ORF5 gene was amplified using the CR2 and EAV32 primers described by Stadejek et al. (1999). In parallel, a 395 nucleotide portion of the ORF7 gene was amplified using the 14a and 15 primers described by Belak et al. (1994). Following purification, the PCR products were Sanger sequenced using the same primer pairs on an ABI PRISM 3130xl Genetic Analyser (Applied Biosystems Inc.). For phylogenetic analysis, sequences representing all described EAV genetic groups were downloaded from GenBank and compared to the sequence (GB Glos 2012) generated in this study. The sequences were aligned, and maximum likelihood and neighbour joining phylogenies were constructed using the MEGA5 programme (Tamura et al., 2011).
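As an illustration of the distance-based side of such an analysis, the following Python sketch computes the pairwise p-distance (the proportion of differing sites) between aligned sequences, which is the simplest distance from which a neighbour-joining tree could be built. It is not the MEGA5 procedure used in this study, which also offers model-corrected distances; the function names, the pairwise-deletion handling of gaps and the toy sequences are assumptions made purely for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped (pairwise deletion); this handling is an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ("-", "N") or b in ("-", "N"):
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Map each unordered pair of sequence names to its p-distance."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    }


# Hypothetical toy alignment, not real EAV data.
toy_alignment = {
    "isolate_1": "ATGCTAGCTA",
    "isolate_2": "ATGCTAGCTT",
    "isolate_3": "ATG-TTGCAA",
}
matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, f"{100 * (1 - dist):.1f}% identity")
```

Percentage identity is reported here as 100 × (1 − p-distance), which corresponds to the sense in which "homology" percentages are quoted in this article.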

Table 2 The percentage sequence homology between GB Glos 2012 and the other selected full genomes.

The GB Glos 2012 isolate was also used as the starting material for next generation sequencing (NGS) on the Roche 454 FLX platform following protocols described by Rasmussen et al. (2008, 2010).

4. Results

The semen sample tested positive for EAV both by virus isolation and by PCRs targeting ORF5 and ORF7.

4.1. Comparison of GB Glos 2012 isolate to other full length sequences

It was possible to recover the full genome of GB Glos 2012 (GenBank: LC000003) via NGS. All of the full genome sequences available on GenBank (n = 32, excluding molecular clones) were compared phylogenetically. As expected, comparing these genomes did not elucidate any details regarding origins, as the limited geographic distribution of the available data hinders any meaningful discovery. However, GB Glos 2012 fell outside all existing full genome sequences, and by using the existing clusters it was possible to take a reference sequence from each to compare genome features against GB Glos 2012 (Fig. 2). The GB Glos 2012 isolate is 12,702 nucleotides in length, slightly shorter than Bucyrus (12,704 nucleotides), CW01 (12,708 nucleotides), F27 (12,710 nucleotides) and S4445 (12,731 nucleotides). At the nucleotide level, the isolate shares greatest homology with Bucyrus (85.5%), and 84.4%, 84.7% and 84.6% homology with CW01, F27 and S4445, respectively (Table 2). At the amino acid level for specific ORFs, the least similarity was noted across ORF3, at 78.5% with F27, and the greatest similarity across ORF7, at 99.1% with Bucyrus, CW01 and S4445. GB Glos 2012 had a three-nucleotide deletion in nsp2 (codon 459 of protein 1ab), apart from which no other indels were noted.

After a thorough analysis of GenBank entries, phylogenies were initially constructed using 211 sequences. Sequences were selected to cover the greatest geographic and time differences; thus molecular clones, Bucyrus derivatives and laboratory strains were not selected for the analysis.
Reference sequences from each of the manuscripts cited in Fig. 3 were used to inform clusters. Phylogenies were constructed using both neighbour-joining and maximum likelihood methods, which made little difference to topology (data not shown). For ease of viewing, Fig. 3 shows an abridged version of the phylogenetic tree containing 101 sequences. Analysis of the GB Glos 2012 ORF5 PCR product sequence (Fig. 3) revealed the isolate to be an outlier, clustering only with isolates KY63 (GenBank: U81014), H10 (GenBank: AY453306) and H21 (GenBank: AY453306), supported by bootstrap values of 70% and 99% respectively. GB Glos 2012 shared 87%, 94.6% and 95% nucleotide homology across the ORF5 GL segment with KY63, H10 and H21 respectively. The next closest neighbours were strains isolated in South Africa (GenBank: AY453337 and AY4533379), which were thought to be of Yugoslavian origin, and Hungarian strains (GenBank: JN314862 and JN314866).

Following tree construction, 7 clusters, alphabetically labelled A–G, became apparent (Fig. 3). In our hands, three new clusters, A, B and G, were visible, supported by bootstrap values of 96%, 70% and 100% respectively. Clusters A and B have previously been included in group EU1 and cluster G within group EU2 (Zhang et al., 2007). However, taking the bootstrap values and the within- and between-group variation (shown in Table 3) into account, separate clusters appear warranted. There were three clusters, C, D and F, into which the majority of isolates segregated. The existence of one cluster, C, however, was not supported by bootstrap analysis, and regardless of the phylogenetic scheme used it has the greatest within-group variation, with 9.9%/10.10%/9.9% and 13.41% respectively (Table 3). This, along with the lack of bootstrap support, suggests that this cluster could be broken down further. Comparing the previously published phylogenies, the IA and IB groupings cited in Stadejek et al. (1999) were not sustainable with current information and were thus merged into one group, as already suggested by Mittelholzer et al. (2006) and Zhang et al. (2007), here labelled group F in Fig. 3. Regardless of the phylogenetic scheme used, there are isolates which did not fit into any of the defined clusters (Fig. 3): the use of full genome data and/or further sampling may help to resolve this, either by generating further clusters or by including the viruses in existing ones.
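The within-group and between-group values referred to above (Table 3) are averages of pairwise nucleotide distances. A minimal Python sketch of that averaging is given below, assuming uncorrected p-distances on a toy alignment; the cluster labels "X" and "Y", the sequences and the function names are invented for illustration and do not reproduce the values or the software settings of this study.

```python
from itertools import combinations, product
from statistics import mean


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    return sum(a != b for a, b in pairs) / len(pairs)


def within_group(seqs):
    """Mean pairwise distance within one group.

    Returns None for a single-member group, where no pair exists
    (reported as not applicable).
    """
    if len(seqs) < 2:
        return None
    return mean(p_distance(a, b) for a, b in combinations(seqs, 2))


def between_groups(seqs_1, seqs_2):
    """Mean distance over all pairs drawn one from each group."""
    return mean(p_distance(a, b) for a, b in product(seqs_1, seqs_2))


# Hypothetical aligned sequences grouped into two illustrative clusters.
clusters = {
    "X": ["ATGCATGCAT", "ATGCATGCAA"],
    "Y": ["TTGCATGGAT", "TTGCATGGAA", "TTGCTTGGAT"],
}
within_x = within_group(clusters["X"])
within_y = within_group(clusters["Y"])
between_xy = between_groups(clusters["X"], clusters["Y"])
print(f"within X: {100 * within_x:.2f}%")
print(f"within Y: {100 * within_y:.2f}%")
print(f"between X and Y: {100 * between_xy:.2f}%")
```

In this toy case the between-group distance exceeds both within-group distances, which is the pattern that, together with bootstrap support, is used above to argue for treating clusters as separate.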

5. Discussion

Reviewing the relation of GB Glos 2012 with other EAV strains, we focused our phylogeny on ORF5, as did previous authors (Balasuriya et al., 1995; Stadejek et al., 1999; Hornyak et al., 2005; Mittelholzer et al., 2006; Metz et al., 2011). Whilst it would have been interesting to compare further gene segments, an appropriate geographic and time point distribution was observed across ORF5 sequences only. As seen from Fig. 2, limited information could be gleaned from the phylogeny of full genome data, as these data are limited in number and location. It is anticipated that, with the development of full genome sequencing into a routine technology, more viruses from all currently existing clades will soon be added to allow for a wider comparison. With regard to GB Glos 2012, it would be particularly interesting to compare it against the South African and Hungarian strains with which the isolate clustered, as here we were limited to the 518 nucleotides of ORF5 available for the other strains.

As with PRRSV, variability of nsp2 is also noted for EAV. We reported a three-nucleotide deletion at codon 459 of protein 1ab, and comparison with other full genome sequences highlighted numerous alterations in this region, e.g. strain S4445 contains a 5 amino acid insertion between codons 448 and 489 of protein 1ab. As with the phylogeny, we were again limited in our ability to collate such observations, as the publicly available sequence data for the nsp2 region are restricted to data from the USA or France, with the majority of these sequences being linked to a single outbreak in each of these countries.

The re-appearance of a rare genotype of EAV provided us with reason to re-visit the relationship of EAV strains and their nomenclature. GB Glos 2012 demonstrates the re-emergence of a genotype across decades and around the globe.
KY63 is a field strain isolated in Kentucky, USA in 1963; this isolate has always clustered separately from the other US isolates of the same time period, suggesting a different origin. Strains H10 and H21 appear to have been isolated in the early 2000s in Hungary and the "South African" strains were isolated in the mid-1990s with a history of Eastern European imports, whereas the isolate characterised here (GB Glos 2012) is from a Spanish stallion. Even considering a potential lack of sampling, the case demonstrates that established strains do not become extinct and through global trade may reappear unexpectedly. This is surprising since Arteriviruses, such as EAV or PRRSV, are known for their rapid evolution, linked to the lack of a proofreading enzyme such as the exoribonuclease ExoN, which Coronaviruses, for example, contain and which is vital for their genetic integrity (Minskaia et al., 2006; Nga et al., 2011). With viruses being subjected to host pressure, the general notion was that they are constantly changing, as is perhaps reflected in the phylogenetic trees for EAV and PRRSV getting ever more complicated and seemingly highlighting new clades and genotypes. However, this case, as well as the recent analysis of PRRSV and bovine viral diarrhoea virus (BVDV; another persistent RNA virus, of ruminants) in the United Kingdom, demonstrates the persistence of strain variations in animal populations over decades (Frossard et al., 2013; Strong et al., 2013). This significantly broadens our understanding of the evolution of such viruses: in the absence of appropriate control measures to eradicate certain viruses (strains), host pressure will lead to a continuous divergence of viruses, but with the parallel persistence of "old" variants.

If phylogenies are to be used for epidemiological tracing, further data points need to be included to fill in sampling gaps. That will require a more active sampling and control policy than is currently established in most countries. The fact that this case (like some recent cases) was detected in a horse of some economic value, which was supposed to be used for breeding purposes, emphasises the need for further vigilance, in particular in relation to the health horse concept advocated by the OIE. Additionally, while ORF5 represents a good starting point, with its gene product GP5 being directly under host pressure from the antibody response, other genes are important for pathogenicity too and recombination has been described for Arteriviruses; thus full genome data certainly represent the state of the art for the description of index cases in outbreaks. The available data needs to be labelled appropriately, with the sequence record containing the location and date of isolation: as mentioned above, there are multiple sequences at present in GenBank without any such information. Similarly, manuscripts citing sequences should state accession numbers and, particularly if presenting novel sequences, provide details in the manuscript.

Following the emergence of the Asian lineage of highly pathogenic H5N1 influenza, the WHO along with OIE and FAO created a working group to implement a unified nomenclature system (Anonymous, 2012). The system set out the minimum requirements to define a new clade, including a minimum bootstrap value after 1000 bootstrap replicates (≥60) and the average percentage pairwise nucleotide distances between and within clades (>1.5% and

Fig. 3. Phylogenetic analysis of ORF5. Maximum-likelihood tree using the GTR+G+I model with 101 EAV sequences shown. Phylogenetic groupings as suggested by Stadejek et al. (1999), Mittelholzer et al. (2006) and Zhang et al. (2007) are labelled on the right hand side of the tree. The phylogenetic scheme suggested from this work is displayed on the left of the tree. Sequences are listed by their accession numbers followed by strain name, country of isolation and, where available, year of isolation. Percentages of 1000 bootstrap replicates are cited at nodes for values ≥50%.

Table 3a The average percentage pairwise nucleotide distances within clades.

Scheme | Clade: distance
Stadejek et al. (1999) | IA: 7.04%; IB: NA; IC: 5.98%; IIA: 6.97%; IIB: 10.10%
Mittelholzer et al. (2006) | EAV-1: 6.93%; EAV-2: 8.25%; EAV-3: 9.90%
Zhang et al. (2007) | NA: 8.23%; EU1: 9.87%; EU2: 13.41%
This work | A: 4.76%; B: 7.85%; C: 9.9%; D: 6.93%; E: 5.6%; F: 8.96%; G: 0.13%

Table 3b The average percentage pairwise nucleotide distances between clades.

Scheme | G1 vs G2: Dist
Stadejek et al. (1999) | IA vs IB: 9.4; IA vs IC: 15.34; IA vs IIB: 16.77; IC vs IB: 14.29; IIA vs IA: 15.64; IIA vs IIB: 14.56; IIA vs IB: 14.32; IIA vs IC: 14.15; IIB vs IB: 16.25; IIB vs IC: 13.58
Mittelholzer et al. (2006) | EAV-1 vs EAV-3: 14.35; EAV-2 vs EAV-1: 15.48; EAV-2 vs EAV-3: 16.44
Zhang et al. (2007) | NA vs EU1: 15.84; NA vs EU2: 17.84; EU1 vs EU2: 16.2
This work | B vs A: 15.44; B vs C: 15.44; B vs E: 15.37; B vs G: 26.93; C vs A: 16.08; C vs E: 13.34; C vs G: 25.18; D vs A: 15.46; D vs B: 14.93; D vs C: 14.35; D vs E: 13.95; D vs G: 26.60; E vs A: 16.31; E vs G: 27.80; F vs A: 16.88; F vs B: 17.43; F vs C: 16.26; F vs D: 15.37; F vs E: 14.32; F vs G: 27.06; G vs A: 26.45

NA: not applicable; G1: group 1; G2: group 2; Dist: distance expressed as a percentage.
