Journal of General Virology (1990), 71, 1433-1441. Printed in Great Britain


RNA2 of grapevine fanleaf virus: sequence analysis and coat protein cistron location

M. A. Serghini,1 M. Fuchs,2 M. Pinck,1 J. Reinbolt,3 B. Walter2 and L. Pinck1

1Institut de Biologie Moléculaire des Plantes du CNRS et Université Louis Pasteur, Laboratoire de Virologie, 12 rue du Général Zimmer, 67084 Strasbourg, 2Station de Recherches Vigne et Vin, Laboratoire de Pathologie Végétale INRA, 28 rue de Herrlisheim, 68021 Colmar and 3Institut de Biologie Moléculaire et Cellulaire du CNRS, 15 rue Descartes, 67084 Strasbourg, France

The nucleotide sequence of the genomic RNA2 (3774 nucleotides) of grapevine fanleaf virus strain F13 was determined from overlapping cDNA clones and its genetic organization was deduced. Two rapid and efficient methods were used for cDNA cloning of the 5' region of RNA2. The complete sequence contained only one long open reading frame of 3555 nucleotides (1184 codons, 131K product). The analysis of the N-terminal sequence of purified coat protein (CP) and identification of its C-terminal residue have allowed the CP cistron to be precisely positioned within the polyprotein. The CP produced by proteolytic cleavage at the Arg/Gly site between residues 680 and 681 contains 504 amino acids (Mr 56019) and has hydrophobic properties. The Arg/Gly cleavage site deduced by N-terminal amino acid sequence analysis is the first for a nepovirus coat protein and for plant viruses expressing their genomic RNAs by polyprotein synthesis. Comparison of GFLV RNA2 with M RNA of cowpea mosaic comovirus and with RNA2 of two closely related nepoviruses, tomato black ring virus and Hungarian grapevine chrome mosaic virus, showed strong similarities among the 3' non-coding regions but less similarity among the 5' end non-coding sequences than reported among other nepovirus RNAs.

Introduction

Grapevine fanleaf virus (GFLV) is a member of the nepovirus group and is responsible for an economically significant disease in vineyards. The genome is composed of two single-stranded positive-sense polyadenylated RNAs which carry a genome-linked protein (VPg) at their 5' ends (Pinck et al., 1988) and each RNA is separately encapsidated in isometric particles. The capsid is composed of a single protein species of 54K (Quacquarelli et al., 1976). The lengths of the two genomic RNAs of GFLV strain F13 (GFLV-F13; RNA1 and RNA2) were estimated as 6800 nucleotides (nt) and 3900 nt respectively and an additional smaller satellite RNA (RNA3, 1114 nt) has also been identified (Pinck et al., 1988). In vitro translation of each species of virion RNA in a wheatgerm extract induces the synthesis of a polyprotein corresponding to its entire coding capacity, that is 225K for protein P1 of RNA1 and 127K for protein P2 of RNA2 (Pinck et al., 1988). In vitro protein synthesis under the direction of genomic RNAs of a non-specified GFLV isolate and processing studies carried out in rabbit reticulocyte lysates have shown that a 125K protein is translated from this RNA2 in the presence of amino acid analogues, which were used to inhibit proteolytic cleavage. In the absence of these analogues, a specific protease induced by RNA1 catalysed the cleavage of the 125K protein into two proteins of 68K and 58K. Peptide mapping after partial proteolysis of the 58K protein strongly suggested that it was the viral coat protein (Morris-Krsinich et al., 1983). To substantiate these reports the sequence of RNA2 and the location of GFLV coat protein in the polyprotein translation product have been determined. Sequence comparisons between RNA2 of the nepoviruses GFLV-F13, tomato black ring virus (TBRV; Meyer et al., 1986) and Hungarian grapevine chrome mosaic virus (GCMV; Brault et al., 1989) and RNA M of a comovirus, cowpea mosaic virus (CPMV; Van Wezenbeek et al., 1983) are presented.

0000-9451 © 1990 SGM

Methods

Virus purification and nucleic acids extraction. GFLV-F13, originally collected from a grapevine Vitis vinifera cv. Muscat near Frontignan in the south of France, was propagated on Chenopodium quinoa after mechanical inoculation and virus particles were purified as described previously (Pinck et al., 1988). RNA was extracted from purified virions by the conventional SDS-phenol method and concentrated by ethanol precipitation. Plasmid DNA was prepared either from minilysates (Serghini et al., 1989) or by alkaline lysis and purification on a CsCl gradient containing ethidium bromide (Maniatis et al., 1982).

Synthesis and cloning of double-stranded cDNA. Partial clones containing 3'-terminal sequences of GFLV-F13 RNA2 were constructed by cDNA synthesis using oligo(dT)-tailed pUC9 primer extension (Heidecker & Messing, 1983). cDNA copies of the 5' part of GFLV-F13 RNA2 were synthesized using a synthetic oligonucleotide (18-mer) primer P638 (5' GGTTGAGGGTCCCCTCCA 3'), complementary to nt 1286 to 1303. The first strand and the double-stranded cDNA were synthesized by primer extension (Rutledge et al., 1988) in a one-tube reaction: 7 µg of purified total virion RNA was incubated with 0.25 µg of primer P638 in a final volume of 50 µl containing 100 mM-Tris-HCl pH 8.3, 130 mM-KCl, 10 mM-MgCl2, 2.5 mM-DTT, 1 mM of each dNTP and avian myeloblastosis virus (AMV) reverse transcriptase (Life Sciences) to a final concentration of 800 units/ml. Ten µl of this mixture was immediately transferred to a separate tube containing 1 µl of [α-32P]dATP (3000 Ci/mmol, 10 mCi/ml) and both tubes were incubated at 37 °C for 5 min and then at 42 °C for 25 min. The labelled first-strand cDNA was used as a reference in 1.4% alkaline agarose gel electrophoresis (Maniatis et al., 1982) or for parallel migration with heterologous sequencing products in an 8% polyacrylamide sequencing gel. To synthesize the second strand of the cDNA, 1 unit of RNase H (BRL), 20 units of DNA polymerase I (Kornberg polymerase, Boehringer Mannheim), 4 units of T4 DNA ligase (Pharmacia) and 4 µl of [α-32P]dATP (3000 Ci/mmol, 10 mCi/ml) were added directly to the tube containing 40 µl of the first-strand cDNA. The mixture was incubated for 2 h at 15 °C.
The reaction was stopped by extraction with phenol:chloroform:isoamyl alcohol (25:24:1, v/v/v) and the double-stranded cDNA (ds cDNA) was then precipitated in the presence of 2 M-ammonium acetate and ethanol (Okayama & Berg, 1982) before gel filtration. In a second step the longest cDNA molecules were selected by Sephacryl S500 gel filtration; ds cDNA was pelleted, solubilized in 40 µl elution buffer (10 mM-Tris-HCl, 50 mM-NaCl, 2 mM-EDTA pH 7.8) containing 10% glycerol and 0.05% bromophenol blue, loaded on a Sephacryl S500 column packed in a 5 ml plastic pipette and equilibrated with the elution buffer. The fractions containing the longest DNA molecules, as judged by monitoring the radioactivity and by gel electrophoresis of aliquots, were pooled and precipitated. In a third step the ds cDNA was cloned. Recombinant clones were obtained by overnight ligation of the selected ds cDNA molecules with 5' dephosphorylated HincII-cut pUC9 in 10 µl of the mixture described in Rutledge et al. (1988). Aliquots of the ligation medium were used to transform Escherichia coli C6005K (Hubacek & Glover, 1970). To clone recombinant cDNA plasmids extending further toward the 5' terminus of RNA2, the primer P915 (5' GCGCAAAATAAAAGAAC 3') complementary to nt 142 to 158 was chosen. The first strand of cDNA was synthesized as described before, purified in an 8% polyacrylamide gel and dC-tailed. The dC-tailed cDNA molecules were hybridized with dG-tailed PstI-cut pUC9 and further digested with SmaI in order to obtain one blunt extremity. This hybrid was treated with DNA polymerase I (Klenow fragment, Pharmacia) to synthesize the second-strand cDNA. After ligation, the mixture was used to transform E. coli C6005K.

Screening of the recombinant clones. Plasmids were screened for the presence of viral cDNA inserts by in situ hybridization using either a nick-translated RNA2-specific cDNA probe (Pinck et al., 1988) or the P638 and P915 primers 5'-32P-labelled using T4 polynucleotide kinase.
DNA from mini-preparations of recombinant plasmids was screened by double enzymic digestion (EcoRI-HindIII) followed by electrophoresis in 4% non-denaturing polyacrylamide gels or in 1% agarose gels.

Nucleotide sequence analysis. cDNA inserts were sequenced both by partial chemical degradation (Maxam & Gilbert, 1980) and by dideoxynucleotide chain termination (Sanger et al., 1977) after subcloning the appropriate restriction fragments into pUC9. The sequence of both strands of the denatured recombinant plasmid DNA was determined using direct or reverse-sense synthetic oligonucleotide sequencing primers and the modified bacteriophage T7 DNA polymerase (Tabor & Richardson, 1987; USB or Pharmacia). Additional sequence information was also obtained from primer extension using reverse transcriptase and the P638 or P915 primers as previously described (Fuchs et al., 1989). Sequence data were analysed using the UWGCG programs (Devereux et al., 1984) on a Microvax II computer. The COMPARE and DOTPLOT algorithms were used for RNA and protein sequence comparisons. Alignments of homologous nucleotide or amino acid sequences were obtained using the GAP and BESTFIT algorithms.

N- and C-terminal amino acid sequence analysis of the viral coat protein. Particles of the fastest sedimenting bottom component of GFLV were purified by centrifugation through a 10 to 50% (w/v) linear sucrose density gradient prepared in P buffer (10 mM-Na2HPO4, 10 mM-KH2PO4, pH 7) for 5 h at 25000 r.p.m. in an SW28 Beckman rotor. The fractions containing bottom particles were diluted with 2 vol. of P buffer and pelleted by a 5 h centrifugation at 145000 g (Pinck et al., 1988). Coat protein (50 pmol) was purified on a 10% SDS-PAGE gel, transferred onto Immobilon PVDF membrane according to the manufacturer's Immobilon Tech Protocol (Millipore) and directly subjected to automated Edman degradation using an Applied Biosystems 470A protein sequencer (Hewick et al., 1981). For digestion of the coat protein with carboxypeptidase A (Sigma), the coat protein prepared from purified bottom component by the acetic acid procedure (Fraenkel-Conrat, 1957) was incubated overnight at 37 °C in 0.1 M-N-ethylmorpholine pH 7.8 buffer. The amino acid content was analysed using a Pico-Tag analyser (Waters) after a 24 h hydrolysis with 6 M-HCl.

Results and Discussion

Determination of the sequence of RNA2

The clones corresponding to the 3' region of RNA2 were generated using PstI-cut oligo(dT)-tailed pUC9 plasmid DNA to prime the first strand synthesis on the 3' poly(A) sequence of total GFLV-F13 virion RNA (Heidecker & Messing, 1983). Screening of the recombinant clones was based on in situ hybridization with a nick-translated partial cDNA probe for RNA2 as described previously (Pinck et al., 1988) and on restriction enzyme analysis. Nearly 30% of the transformants obtained were RNA2-specific and had an average size of 1500 nt. The longest cDNA insert (plasmid pG38) obtained by this procedure was 2552 nt in length and contained a 22 nt long poly(A) tail at one extremity which allowed the orientation of the sequence. The pG38 cDNA insert was entirely sequenced on both strands by the partial chemical degradation method and the primary structure determined was in perfect agreement with the partial sequence information determined from independent clones corresponding to the 3' region of RNA2 (data not shown).


The 5' region of RNA2 was cloned using the synthetic 18-mer P638, located 2543 nt from the 3' end, to prime cDNA synthesis on total virion RNA of GFLV-F13. Of the recombinant clones obtained 29% hybridized with the 5' end-labelled primer P638 used as a probe. The cloning method used proved very rapid and efficient. The longest cDNA insert (1226 nt) was in clone pS38. No sequence heterogeneity was detected upon analysis of four clones corresponding to parts of clone pS38. The assembly of clones pG38 and pS38 corresponded to 3697 nt, i.e. about 95% of the length expected from size estimation of the RNA2. Direct sequencing of RNA2 using the primer P915, located 3632 nt from the 3' end, showed many stops across all four lanes of sequencing gels, suggesting that the reverse transcriptase had difficulty in extending synthesis across this region. This may be the reason for the absence of cDNA copies longer than that in clone pS38. The sequence of the 5'-terminal part of RNA2 was therefore obtained by primer extension and quasi-end sequencing using the synthetic oligonucleotide P915 and reverse transcriptase. In order to eliminate the ambiguities due to the stops present in the reverse-transcribed cDNA, the full-length cDNA that migrated as a single band in a polyacrylamide gel after extension of the 5'-labelled primer P915 was analysed in two ways: the material was eluted from the gel and either sequenced by partial chemical degradation or dC-tailed and cloned into PstI-cut, oligo(dG)-tailed pUC9 which had been further digested by SmaI to yield clone pW5. From the direct sequence analysis, the sequence was determined up to the 5' penultimate nucleotide, at which point the intense radioactive band of non-degraded cDNA extended across all four lanes of the sequencing gel and obscured the pattern. Sequence analysis of clone pW5 yielded the same sequence with AT in the first and second positions respectively.
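
The clone lengths and primer positions quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid written for this purpose: the variable names are illustrative, and it assumes the 3774 nt length reported for the complete sequence (poly(A) excluded) as the reference for converting a distance from the 3' end into a coordinate counted from the 5' end.

```python
# Figures quoted in the text (all lengths in nucleotides).
PG38_INSERT = 2552       # longest 3' clone, poly(A) tail included
PG38_POLY_A = 22         # poly(A) tail carried by pG38
ASSEMBLED = 3697         # assembly of clones pG38 and pS38
ESTIMATED_RNA2 = 3900    # gel estimate of RNA2 length (Pinck et al., 1988)
FINAL_RNA2 = 3774        # complete sequence, poly(A) excluded (assumed reference)
P915_FROM_3_END = 3632   # position of primer P915, counted from the 3' end


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Viral sequence carried by pG38 once the poly(A) tail is subtracted.
viral_nt_in_pg38 = PG38_INSERT - PG38_POLY_A

# Share of the gel-estimated RNA2 length covered by the pG38 + pS38 assembly.
coverage_of_estimate = percent(ASSEMBLED, ESTIMATED_RNA2)

# Distance of P915 from the 3' end, subtracted from the full RNA2 length.
p915_5_prime_coordinate = FINAL_RNA2 - P915_FROM_3_END

print(viral_nt_in_pg38, coverage_of_estimate, p915_5_prime_coordinate)
```

The coverage works out to just under 95%, in line with the "about 95%" quoted above, and the subtraction for P915 gives 142, the 5' limit of the nt 142 to 158 interval given for this primer in Methods.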
The complete sequence of GFLV-F13 RNA2 contains 3774 nt excluding the 3'-terminal poly(A) tail (Fig. 1), which is slightly smaller than the 3900 nt estimated by electrophoresis in agarose gels containing formaldehyde (Pinck et al., 1988). Analysis of the putative open reading frames (ORFs) shows a single large ORF of 3555 nt from nt 8 to the UAG termination codon at nt 3560 to 3562 in the positive orientation of RNA2. The cistron of the resulting 1184 amino acid polypeptide (Mr 131607) represents 94.1% of the RNA2 sequence. There is a second in-phase AUG at nt 233 to 235. If this second AUG codon acts as initiation codon the resulting 1109 amino acid polypeptide would have an Mr of 122706 (122K). Both Mr values are close to the value of 127K determined by 10% SDS-PAGE for the major translation product of RNA2 synthesized in wheatgerm extracts (Pinck et al., 1988). Thus the in vitro translation data do not allow us to choose which AUG is used to initiate the 127K protein. However, three arguments favour the second AUG as the functional initiation codon for the 127K protein. (i) The G + C content of the sequence between these two putative initiation codons is 38%, which is lower than the overall G + C content of the rest of the RNA2 coding region (46.8%) and is similar to the low G + C content in the non-translated leader sequences of TBRV-S RNA2 (37.8%; Meyer et al., 1986) and GCMV (40.1%; Brault et al., 1989). (ii) The sequence flanking the second AUG codon is in better agreement with the consensus sequence determined by Kozak (1981) for translation initiation codons than is that flanking the first AUG. (iii) After subtraction of 56K corresponding to the coat protein from the 122K or the 131K polyprotein either 66K or 75K remain for the N-terminal part of this protein. The value of 66K is in good agreement with the 68K protein found in the translation products of GFLV RNA2 in reticulocyte lysates by Morris-Krsinich et al. (1983). Computer analysis of the RNA sequence revealed no other long ORFs in either the positive- or negative-sense orientation of the RNA. Only short ORFs of no more than 61 codons (369 nt to 551 nt) in the positive-sense and of less than 130 codons (1702 nt to 1312 nt) in the negative-sense orientation of the RNA were found. The total base composition of RNA2 is 24.2% A, 21.5% C, 25.3% G and 29% T. These values are similar (within 5%) to those of CPMV M RNA and RNA2 of TBRV-S and GCMV.

Location of the coat protein cistron

By analogy with CPMV RNA M (Van Wezenbeek et al., 1983) and as suggested for RNA2 of TBRV-S (Meyer et al., 1986) and GCMV (Brault et al., 1989), the coat protein of GFLV is likely to be encoded in the 3' part of RNA2. However, the products obtained from in vitro translation of purified RNA2 in wheatgerm extracts did not react with polyclonal antibodies in immunoblots. The location of the coat protein cistron on RNA2 and the exact site of cleavage at which the coat protein is released from the polyprotein precursor can therefore be deduced only by the determination of the amino acid sequence at the N terminus and identification of the C terminus of the coat protein. First, the sequence of 27 residues at the N terminus was determined by automated Edman degradation. This sequence is identical to that located between residues 681 and 708 of the polyprotein (Fig. 1), indicating that the first amino acid at the coat protein N terminus (Gly) resulted from cleavage at the Arg/Gly site located at residues 680 to 681. The high yield of amino acid derivatives recovered during this analysis indicates

[Fig. 1: nucleotide sequence of GFLV-F13 RNA2 with the deduced amino acid sequence of the polyprotein.]

Fig. 1. Complete sequence of GFLV-F13 RNA2. The bar over nucleotides 2 to 9 indicates the correspondence with the 5' consensus. The first two putative initiation codons are indicated by boldface M. The underlined amino acids correspond to the four peptides homologous with the polyprotein encoded by TBRV RNA2 (see text). The black arrow denotes the Arg/Gly cleavage site. The boldface amino acids correspond to those determined by microsequencing of the coat protein N terminus.
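
The ORF coordinates annotated on the sequence of Fig. 1 are mutually consistent, as the following sketch illustrates. It is a plain arithmetic check rather than a description of how the original analysis was carried out: the function name is illustrative, coordinates are taken as 1-based and inclusive, and leaving the stop codon out of the coding fraction is an assumption made here because it reproduces the 94.1% quoted in the text.

```python
RNA2_LENGTH = 3774   # nt, poly(A) excluded
ORF_START = 8        # first nt of the first AUG
ORF_END = 3562       # last nt of the UAG termination codon
SECOND_AUG = 233     # first nt of the second in-phase AUG


def orf_stats(start, end):
    """Return (length in nt, number of amino acids) for an ORF given by
    1-based inclusive coordinates that include the stop codon."""
    length = end - start + 1
    if length % 3:
        raise ValueError("ORF length is not a multiple of three")
    # One codon is the stop codon and encodes no residue.
    return length, length // 3 - 1


full_len, full_aa = orf_stats(ORF_START, ORF_END)
short_len, short_aa = orf_stats(SECOND_AUG, ORF_END)

# The second AUG is in phase if it lies a whole number of codons downstream.
in_phase = (SECOND_AUG - ORF_START) % 3 == 0

# Coding fraction of RNA2, stop codon excluded (assumption, see lead-in).
coding_percent = round(100 * (full_len - 3) / RNA2_LENGTH, 1)

print(full_len, full_aa, short_len, short_aa, in_phase, coding_percent)
```

Run as written, this gives an ORF of 3555 nt encoding 1184 residues, and 1109 residues when translation starts at the second AUG, matching the values reported in the text.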

Table 1. Amino acid composition of the coat protein

Amino acid    Experimental    Expected
D + N         43.8 ± 4.4*     41
E + Q         41.4 ± 4.1      37
S             37.7 ± 3.8      37
G             57.9 ± 5.8      41
H             11.5 ± 1.1      10
R             27.9 ± 2.8      22
T             34.7 ± 3.5      37
A             37.7 ± 3.8      33
P             31.1 ± 3.1      26
Y             14.0 ± 1.4      16
V             39.5 ± 3.9      39
M             11.8 ± 1.2      12
I             28.2 ± 2.8      28
L             54.7 ± 5.5      48
F             37.7 ± 3.8      34
K             24.4 ± 2.4      22

* Upper and lower values represent the limit of confidence of the Pico-Tag method.

that true coat protein has been sequenced: 30 pmol of PTH-Gly, the first amino acid, were recovered from an input of 50 pmol coat protein bound onto the PVDF membrane and subjected to sequencing, i.e. a yield of 60%. In addition, the amino acid composition data obtained from the acid hydrolysis of purified coat protein (Table 1) are in good agreement with the values expected from the RNA sequence. The coat protein could therefore contain as many as 504 residues (amino acids 681 to 1184) with a calculated Mr of 56019 if it extends to the C terminus of the polyprotein. This value is in good agreement with the Mr of 57000 estimated from the mobility of the GFLV-F13 capsid protein in a 10% SDS-polyacrylamide gel (not shown). However, the coat protein of tobacco ringspot virus, another nepovirus, although migrating mainly as a 57K polypeptide on PAGE, has been proposed to be a tetramer of a 13K protein that is difficult to dissociate into its monomeric form (Chu & Francki, 1979). This raises the possibility that a similar situation might exist for the coat protein of GFLV. If the GFLV coat protein extends all the way to residue 1184 then digestion of the coat protein with carboxypeptidase A should liberate the C-terminal valine (1184) adjacent to the proline (1183). Analysis of the amino acids released by the digestion of coat protein prepared from purified bottom component with carboxypeptidase A proved unambiguously the presence of valine as the C-terminal residue of the coat protein (data not shown). For TBRV-S, the cleavage site at which coat protein is released from the polyprotein precursor has not been precisely determined (Meyer et al., 1986). For GCMV, cleavage at an Arg/Ala site has been proposed (Brault et al., 1989). To be certain that the Arg/Gly site proposed here is the actual cleavage site would require that the sequence be determined at both sides of the cleavage but this is not possible at present as the non-structural protein ending at this site cannot be isolated. However, an additional point in favour of the Arg/Gly cleavage site is that the 58K protein observed after translation of GFLV RNA in reticulocyte lysates and which would correspond to the N-terminal part of the polyprotein precursor was absent from translation products when canavanine, an amino acid analogue of Arg, was used to inhibit maturation (Morris-Krsinich et al., 1983). Incorporation of canavanine instead of Arg into the polyprotein could affect the correct protein folding needed for recognition by the protease. Except in the case of the flaviviruses, no other Arg/Gly site has been reported as a polyprotein cleavage site (Wellink & van Kammen, 1988). The Arg/Gly cleavage site was identified by N-terminal sequencing of the NS5 non-structural protein for the yellow fever flavivirus (Rice et al., 1985) and in the carboxy-terminal sequence of the C structural protein of West Nile virus (Nowak et al., 1989).
For other flaviviruses, non-structural proteins

[Fig. 2: alignment of the 3' non-coding sequences of GFLV, TBRV and GCMV RNA2 and of CPMV M RNA, ending at the poly(A) tails.]

Fig. 2. Sequence homologies in the 3' non-coding region of RNA2 from nepoviruses GFLV-F13, TBRV-S, GCMV and from comovirus CPMV RNA M.

may also be produced after processing at Arg/Gly cleavage sites as shown by sequence comparisons between homologous proteins (Rice et al., 1986; Sumiyoshi et al., 1987). Comparison of the amino acid sequence flanking the Arg/Gly sites in the polyproteins of flaviviruses and GFLV RNA2 reveals no significant sequence homologies that could account for the specificity of the protease. Furthermore, in the nepovirus group the RNA2 polyprotein is cleaved by a specific proteolytic activity associated with translation products of RNA1 (Morris-Krsinich et al., 1983; Forster & Morris-Krsinich, 1985; C. Fritsch, personal communication). No information is available about the cleavage sites or the location of the protease cistron in RNA1. The Arg/Gly cleavage site reported here is therefore the first reported for proteolytic processing of a plant virus polyprotein, assuming that no trimming of part of the N terminus by an exopeptidase occurs after the release of the coat protein from the precursor.
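
The figures used in this section to place the coat protein within the polyprotein can be verified with simple arithmetic. The sketch below is only a check on the numbers quoted in the text; the function and variable names are illustrative and are not taken from the original work.

```python
POLYPROTEIN_LENGTH = 1184          # residues encoded by the long ORF
LAST_RESIDUE_BEFORE_CLEAVAGE = 680 # Arg of the Arg/Gly site
COAT_PROTEIN_K = 56                # coat protein size, in K


def cleavage_products(total, last_upstream):
    """Split a polyprotein of `total` residues after residue `last_upstream`
    and return the lengths of the N-terminal and C-terminal products."""
    return last_upstream, total - last_upstream


n_terminal_part, coat_protein = cleavage_products(
    POLYPROTEIN_LENGTH, LAST_RESIDUE_BEFORE_CLEAVAGE
)

# Initial sequencing yield: 30 pmol PTH-Gly recovered from 50 pmol loaded.
edman_yield_percent = 100 * 30 / 50

# Size left for the N-terminal part after removing the coat protein,
# for each candidate polyprotein size (131K or 122K).
n_terminal_remainders = {k: k - COAT_PROTEIN_K for k in (131, 122)}

print(n_terminal_part, coat_protein, edman_yield_percent, n_terminal_remainders)
```

Cleavage after residue 680 leaves 504 residues (681 to 1184) for the coat protein, and the remainders of 75K and 66K are the values compared in the text with the 68K product seen in reticulocyte lysates.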

Comparison between non-coding regions of genomic and satellite RNAs

A close similarity was found between the 3' non-coding regions of RNA2 molecules of GFLV-F13, TBRV-S and GCMV. Some 60% of the sequences are identical and many long stretches are exactly conserved as shown in Fig. 2. Comparison with the 3' non-coding region of CPMV RNA M showed a less significant homology. However, a stretch of 17 nt with a single nucleotide mismatch was found at different distances from the 3' ends of these four RNAs. There is no significant homology between the RNA2 and satellite RNA of GFLV-F13, except at the 5' end where a consensus sequence U.G/UGAAAAU/AU/AU/A has been reported by Fuchs et al. (1989). This is in line with the fact that satellites generally have little if any sequence homology with the genomes of their helper viruses (Murant & Mayo, 1982). In the 5' leader regions, homologies between RNA2 of GFLV-F13, TBRV-S and GCMV and M RNA of CPMV are restricted to the sequence UGAAAAU of the consensus sequence adjacent to the VPg (Fig. 1). The 5'-terminal nucleotide is A for GFLV-F13 RNA2 and C is the ultimate nucleotide for two satellite RNAs (RNA G and RNA E) of TBRV (Hemmer et al., 1987). This nucleotide is unambiguously present on the cloned cDNA of RNA2 of GFLV-F13 (results not shown). This provides additional evidence that the nucleotide sequence deduced from the cDNA clones in fact extends up to the 5' end of the RNA2. To show that the 5'-terminal A is the ultimate nucleotide linked to the VPg, rather than the U more frequently found in this position, sequence determination of the 3' ends of double-stranded RNAs is needed.
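The search for a conserved stretch that tolerates a single mismatch, and for its distance from the 3' end, can be illustrated with a minimal Python sketch. It is not the program used in this work; the function names and the short toy sequences are hypothetical and serve only to show the principle (a window of fixed length compared by Hamming distance).

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def shared_stretches(seq_a: str, seq_b: str, length: int = 17, max_mismatches: int = 1):
    """Return (start_a, start_b, mismatches) for every pair of windows of the
    given length that differ at no more than max_mismatches positions.
    Start coordinates are 0-based."""
    hits = []
    for i in range(len(seq_a) - length + 1):
        window = seq_a[i:i + length]
        for j in range(len(seq_b) - length + 1):
            mismatches = hamming(window, seq_b[j:j + length])
            if mismatches <= max_mismatches:
                hits.append((i, j, mismatches))
    return hits


def distance_from_3prime(seq: str, start: int, length: int) -> int:
    """Number of nucleotides between the end of the stretch and the 3' end."""
    return len(seq) - (start + length)


if __name__ == "__main__":
    # Toy sequences (hypothetical, not taken from the viral RNAs); an 8 nt
    # window is used here only to keep the example short.
    rna_a = "CCCCGGACACAAUU"
    rna_b = "AGGACUCAAGGGGG"
    for i, j, mm in shared_stretches(rna_a, rna_b, length=8, max_mismatches=1):
        print(i, j, mm,
              distance_from_3prime(rna_a, i, 8),
              distance_from_3prime(rna_b, j, 8))
```

With real 3' non-coding sequences the default window of 17 nt and one permitted mismatch would correspond to the criterion described above; the poly(A) tail would have to be excluded or included consistently when distances are compared.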


Fig. 3. Hydrophilicity profile of the polyprotein encoded by RNA2 of GFLV-F13. The hydrophilicity values corresponding to the average values of heptapeptides are plotted against the position of the central amino acid. The mean value of hydrophilicity for the whole protein is indicated by the line of small squares and the 0.7 standard deviation of the mean by the dotted lines. The parts of the peaks filled in black represent 50% of the total.

In addition to the homologies reported above, an octanucleotide (5' UUUCUUUU 3') was found once, at variable distances from the 3' terminus, in the 3' non-coding region of RNA2 from GFLV-F13, TBRV-S and GCMV. This sequence was also present once and four times, respectively, in the 5' non-coding regions of GFLV-F13 and TBRV-S RNA2.
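Counting the occurrences of such an octanucleotide in a non-coding region is a simple exact-match search. The following Python sketch is only an illustration under stated assumptions: the function name and the toy sequence are hypothetical, and overlapping occurrences are counted separately, which may or may not match the counting used above.

```python
import re

MOTIF = "UUUCUUUU"  # octanucleotide reported in the text


def motif_positions(seq: str, motif: str = MOTIF):
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of the motif. DNA input (T) is converted to RNA (U)."""
    rna = seq.upper().replace("T", "U")
    # A zero-width lookahead lets the regex engine report overlapping matches.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", rna)]


if __name__ == "__main__":
    toy = "GGUUUCUUUUCAUUUCUUUUGA"  # hypothetical sequence, not viral data
    for start in motif_positions(toy):
        # Distance between the end of the motif and the 3' terminus.
        print(start, len(toy) - (start + len(MOTIF)))
```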

Properties of the 131K polyprotein

The polypeptide of 1184 amino acids encoded by RNA2 contains 10.9% acidic and 11.8% basic amino acids and has a neutral charge overall. The distribution of triplets used to encode the polyproteins in GFLV-F13 RNA2, TBRV-S RNA2, GCMV RNA2 and CPMV M RNA showed that the frequency of occurrence of each amino acid is remarkably similar in the four viruses. The frequency of NCG is low in Ser, Pro, Thr and Ala codons. The hydrophilicity plot of this polyprotein (Fig. 3) can be divided into three regions: the first 80 residues at the N terminus have mainly hydrophobic properties, the region between residues 80 and 680 is predominantly hydrophilic and the domain of the coat protein from residue 681 to the C terminus is mainly hydrophobic in the N-terminal moiety while hydrophilic and hydrophobic peaks alternate in the C-terminal region after residue 930.
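A heptapeptide-averaged hydrophilicity profile and the proportions of acidic and basic residues can be computed as in the Python sketch below. Several points are assumptions rather than statements about the method used here: the Hopp & Woods hydrophilicity scale is chosen for illustration because the scale is not named in this section, acidic residues are taken as Asp and Glu and basic residues as Lys and Arg (His excluded), and the function names and toy peptides are hypothetical.

```python
from statistics import mean

# Hopp & Woods hydrophilicity values (assumed scale; positive = hydrophilic).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein: str, window: int = 7):
    """Return (position, average) pairs, where position is the 1-based
    number of the central residue of each window (heptapeptide by default)."""
    values = [HOPP_WOODS[aa] for aa in protein.upper()]
    half = window // 2
    return [
        (i + half + 1, mean(values[i:i + window]))
        for i in range(len(values) - window + 1)
    ]


def charged_fractions(protein: str):
    """Return (acidic, basic) fractions; D/E acidic, K/R basic (assumption)."""
    protein = protein.upper()
    n = len(protein)
    acidic = sum(protein.count(aa) for aa in "DE") / n
    basic = sum(protein.count(aa) for aa in "KR") / n
    return acidic, basic


if __name__ == "__main__":
    toy = "DDDDDDDLLLLLLL"  # hypothetical peptide, not part of the polyprotein
    for position, score in hydrophilicity_profile(toy):
        print(position, round(score, 2))
    print(charged_fractions("DDKRLLLLLL"))
```

Applied to the 1184-residue polyprotein, such a profile would be expected to show the three regions described above, although the exact peak heights depend on the scale chosen.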

Comparison of the 131K polyprotein with other viral proteins

The 131K polyprotein is clearly shorter than the corresponding polyproteins of the other nepoviruses. Although analysis of the polyprotein with the BESTFIT program indicated that TBRV-S and GCMV share an overall homology of 58%, the homology at this level was reduced to 27% between GFLV and TBRV-S, to 19% between GFLV and GCMV and to 16% between GFLV and CPMV, which corresponded to increasingly distant relationships between these viruses. Comparison among the amino acid sequences of polyproteins encoded by the RNA2 of GFLV-F13 (131K), of TBRV-S (150K), of GCMV (146K) and RNA M of CPMV (105K) revealed only restricted similarities. The COMPARE program allowed the detection of four regions of homology that fell on a diagonal line within the coat protein domain of GFLV-F13, GCMV and TBRV-S. Analysis of those regions (underlined in Fig. 1) revealed two homologous hexapeptides with the sequences LPANAF (residues 736 to 741 for GFLV-F13 and residues 896 to 901 for TBRV-S) if P and A residues were considered similar and


Fig. 4. Vector diagrams of RNA2-encoded polyprotein for (a) GFLV-F13, (b) GCMV and (c) TBRV-S. The zero position corresponds to the N terminus of the polyprotein. The numbers correspond to the position of the residues in the polypeptide. The arrows point to the N termini of coat protein cistrons. The dotted lines underline the coat protein domain.

LGMGGT (residues 907 to 912 for GFLV-F13, 1068 to 1083 for TBRV-S and 1042 to 1047 for GCMV) if M and I were considered similar. Two tetrapeptides (FDAY and FYGR) also exist in the polyproteins encoded by RNA2 of GFLV-F13 (residues 750 to 753 and 1176 to 1179), RNA2 of TBRV-S (residues 668 to 671, 911 to 914 and 1334 to 1337) and GCMV RNA2 (residues 883 to 886 and 1312 to 1315). In addition to the comparisons described above, the structure of the polyproteins encoded by RNA2 of GFLV-F13, GCMV and TBRV-S was analysed using the 'Vectorial Representation of Protein' (V.R.P.) algorithm (Poch et al., 1988), which gives a two-dimensional vectorial representation of a protein sequence by a one-to-one amino acid versus vector correspondence as deduced from French & Robson (1983). The average slope resulting from the sum of several vectors, and the analysis of any important change between two average slopes, allow delimitation of different domains. V.R.P. diagrams of isofunctional proteins frequently yield analogous patterns and slope break points for the different functional domains (Poch et al., 1988). In Fig. 4(b) and (c) the overall V.R.P. profiles of the polyproteins encoded by RNA2 of TBRV-S and GCMV are nearly identical. The domain corresponding to the coat protein in the polyproteins of GFLV, TBRV and GCMV RNA2 has a similar slope and is clearly distinguishable (dotted line in Fig. 4). However, despite these similarities between the coat protein domains, polyclonal antibodies against GFLV-F13 coat protein do not cross-react with TBRV-S and vice versa. The same lack of cross-reactivity is observed with antibodies against TBRV and GCMV. The remaining part of the polyprotein showed a complex pattern without distinct features indicating clearly the limits of the additional domains.

We are very grateful to A. M. Loudes for skilful technical assistance and to M. Le Ret for amino acid analysis. We thank Dr K. Richards for improving the manuscript. This work was supported in part by a grant of Région Alsace. The EMBL Databank accession number for grapevine fanleaf virus RNA2 is X16907.

References

BRAULT, V., HIBRAND, L., CANDRESSE, T., LE GALL, O. & DUNEZ, J. (1989). Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2. Nucleic Acids Research 17, 7809-7819.
CHU, P. W. G. & FRANCKI, R. I. B. (1979). The chemical subunit of tobacco ringspot virus coat protein. Virology 93, 398-412.
DEVEREUX, J., HAEBERLI, P. & SMITHIES, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Research 12, 387-395.
FORSTER, R. L. S. & MORRIS-KRSINICH, B. A. M. (1985). Synthesis and processing of the translation products of tobacco ringspot virus in reticulocyte lysates. Virology 144, 516-519.
FRAENKEL-CONRAT, H. (1957). Degradation of tobacco mosaic virus with acetic acid. Virology 4, 1-4.
FRENCH, S. & ROBSON, B. (1983). What is a conservative substitution? Journal of Molecular Evolution 19, 171-175.
FUCHS, M., PINCK, M., SERGHINI, M. A., RAVELONANDRO, M., WALTER, B. & PINCK, L. (1989). The nucleotide sequence of satellite RNA in grapevine fanleaf virus, strain F13. Journal of General Virology 70, 955-962.
HEIDECKER, G. & MESSING, J. (1983). Sequence analysis of zein cDNAs obtained by an efficient mRNA cloning method. Nucleic Acids Research 11, 4891-4906.
HEMMER, O., MEYER, M., GREIF, C. & FRITSCH, C. (1987). Comparison of the nucleotide sequences of five tomato black ring virus satellite RNAs. Journal of General Virology 68, 1823-1833.
HEWICK, R. M., HUNKAPILLER, M. W., HOOD, L. E. & DREYER, W. J. (1981). A gas-liquid solid phase peptide and protein sequenator. Journal of Biological Chemistry 256, 7990-7997.
HUBACEK, J. & GLOVER, S. W. (1970). Complementation analysis of a temperature-sensitive host specificity mutation in E. coli. Journal of Molecular Biology 50, 111-127.
KOZAK, M. (1981). Possible role of flanking nucleotides in recognition of the AUG initiator codon by eukaryotic ribosomes. Nucleic Acids Research 9, 5233-5252.
MANIATIS, T., FRITSCH, E. F. & SAMBROOK, J. (1982). Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor Laboratory.
MAXAM, A. M. & GILBERT, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Methods in Enzymology 65, 499-560.
MEYER, M., HEMMER, O., MAYO, M. A. & FRITSCH, C. (1986). The nucleotide sequence of tomato black ring virus RNA-2. Journal of General Virology 67, 1257-1271.
MORRIS-KRSINICH, B. A. M., FORSTER, R. L. S. & MOSSOP, D. W. (1983). The synthesis and processing of the nepovirus grapevine fanleaf virus proteins in rabbit reticulocyte lysate. Virology 130, 523-526.
MURANT, A. F. & MAYO, M. A. (1982). Satellites of plant viruses. Annual Review of Phytopathology 20, 49-70.
NOWAK, T., FARBER, P. M., WENGLER, G. & WENGLER, G. (1989). Analyses of the terminal sequences of West Nile virus structural proteins and of the in vitro translation of these proteins allow the proposal of a complete scheme of the proteolytic cleavages involved in their synthesis. Virology 169, 365-376.
OKAYAMA, H. & BERG, P. (1982). High-efficiency cloning of full-length cDNA. Molecular and Cellular Biology 2, 161-170.
PINCK, L., FUCHS, M., PINCK, M., RAVELONANDRO, M. & WALTER, B. (1988). A satellite RNA in grapevine fanleaf virus strain F13. Journal of General Virology 69, 233-239.
POCH, O., DANEY DE MARCILLAC, G., EXINGER, F., ROY, A. & LOSSON, R. (1988). Functional domains of the regulatory protein PPR1: use of the V.R.P. computer program. Yeast 4, S416.
QUACQUARELLI, A., GALLITELLI, D., SAVINO, V. & MARTELLI, G. P. (1976). Properties of grapevine fanleaf virus. Journal of General Virology 32, 349-360.
RICE, C. M., LENCHES, E. M., EDDY, S. R., SHIN, S. J., SHEETS, R. L. & STRAUSS, J. H. (1985). Nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution. Science 229, 726-733.
RICE, C. M., AEBERSOLD, R., TEPLOW, D. B., PATA, J., BELL, J. R., VORNDAM, A. V., TRENT, D. W., BRANDRISS, M. W., SCHLESINGER, J. J. & STRAUSS, J. H. (1986). Partial N-terminal amino acid sequences of three nonstructural proteins of two flaviviruses. Virology 151, 1-9.
RUTLEDGE, R. G., SELIGY, V. L., COTE, M. J., DIMOCK, K., LEWIN, L. L. & TENNISWOOD, M. P. (1988). Rapid synthesis and cloning of complementary DNA from any RNA molecule into plasmid and phage M13 vectors. Gene 68, 151-158.
SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.
SERGHINI, M. A., RITZENTHALER, C. & PINCK, L. (1989). A rapid and efficient 'miniprep' for isolation of plasmid DNA. Nucleic Acids Research 17, 3604.
SUMIYOSHI, H., MORI, C., FUKE, I., MORITA, K., KUHARA, S., KONDOU, J., KIKUCHI, Y., NAGAMATU, H. & IGARASHI, A. (1987). Complete nucleotide sequence of the Japanese encephalitis virus genome RNA. Virology 161, 497-510.
TABOR, S. & RICHARDSON, C. C. (1987). DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proceedings of the National Academy of Sciences, U.S.A. 84, 4767-4771.
VAN WEZENBEEK, P., VERVER, J., HARMSEN, J., VOS, P. & VAN KAMMEN, A. (1983). Primary structure and gene organization of the middle-component RNA of cowpea mosaic virus. EMBO Journal 2, 941-946.
WELLINK, J. & VAN KAMMEN, A. (1988). Proteases involved in the processing of viral polyproteins. Archives of Virology 98, 1-26.

(Received 4 January 1990; Accepted 12 March 1990)
