Eur. J. Biochem. 53, 397-404 (1975)

A Prediction of the Structure of Tobacco-Mosaic-Virus Protein

Anthony C. H. DURHAM
Department of Biochemistry and Biophysics, University of California, San Francisco

P. Jonathan G. BUTLER
Medical Research Council Laboratory of Molecular Biology, Cambridge

(Received November 23, 1974/January 21, 1975)

The location of amino acid residues within the tobacco mosaic virus protein subunit is discussed. Sequence data, X-ray crystallographic measurements, and the availability of specific residues for enzymic, immunological or chemical reaction are amongst the information used to trace roughly how the tobacco mosaic virus polypeptide chain winds in and out from the virus axis. Published rules for predicting secondary structure are then applied to obtain a diagram of the course of the polypeptide chain. This map should be useful for the interpretation of X-ray diffraction data and already permits an outline of the main features of the inner third of the subunit to be suggested.

Tobacco mosaic virus (TMV) protein presents a difficult problem for X-ray crystallographic structure determination. It can be crystallised as disks [1,2] but the asymmetric unit is very large (about 600000 daltons), and oriented gels of the virus itself are not sufficiently ordered to yield information to atomic resolution. Therefore, as an aid to the interpretation of current X-ray results, we attempt here to make deductions about the structure of the protein subunit from other available evidence. Such deductions impose limitations on the range of interactions between amino acid residues that may be proposed to explain other experimental data and, of course, the work also represents an exploration of the current state of the art of protein structure prediction.

TMV protein is an unusually favourable subject for structure prediction because so much is already known about it. The information summarised here was obtained by many methods: X-ray crystallography, the most important, together with electron microscopy, amino acid sequencing, titration studies and studies of the chemical, enzymic and immunological reactivity of various amino acid residues. The secondary structure of the polypeptide chain is then predicted, using the rules of Chou and Fasman [3,4]. Together these results enable us to propose a diagram of the approximate path of the polypeptide chain.

Abbreviation. TMV, tobacco mosaic virus.
Enzyme. Carboxypeptidase A (EC 3.4.12.2).
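The secondary-structure step mentioned above relies on the rules of Chou and Fasman [3,4], which are not reproduced in this paper. As a rough illustration only, the sketch below shows the general idea of averaging per-residue conformational propensities over a sliding window and calling each window helix-like, sheet-like or neither. It is a simplification and not the full nucleation and extension procedure; the function names, the window width, the cutoff and the three-letter toy propensity tables are all assumptions made for this example and are not the published parameters.

```python
from typing import Dict, List


def window_means(seq: str, propensity: Dict[str, float], width: int) -> List[float]:
    """Mean propensity of every full window of `width` residues along `seq`."""
    if width < 1 or width > len(seq):
        raise ValueError("width must be between 1 and len(seq)")
    values = [propensity[res] for res in seq]
    return [sum(values[i:i + width]) / width for i in range(len(seq) - width + 1)]


def call_windows(seq: str, p_helix: Dict[str, float], p_sheet: Dict[str, float],
                 width: int = 6, cutoff: float = 1.0) -> List[str]:
    """Label each window 'a' (helix-like), 'b' (sheet-like) or '-' (neither).

    A window is assigned to the conformation whose mean propensity reaches
    the cutoff and exceeds the mean for the competing conformation.
    """
    calls = []
    for h, s in zip(window_means(seq, p_helix, width),
                    window_means(seq, p_sheet, width)):
        if h >= cutoff and h > s:
            calls.append("a")
        elif s >= cutoff and s > h:
            calls.append("b")
        else:
            calls.append("-")
    return calls


# Placeholder propensities for a toy three-residue alphabet.
# These numbers are invented for illustration; they are NOT published values.
TOY_HELIX = {"E": 1.5, "V": 1.0, "G": 0.5}
TOY_SHEET = {"E": 0.4, "V": 1.7, "G": 0.8}

if __name__ == "__main__":
    print("".join(call_windows("EEEEVVVVGGGG", TOY_HELIX, TOY_SHEET, width=4)))
```

To apply the same scheme to a real sequence, one would pass in complete 20-residue propensity tables taken from the cited papers; the toy tables are there only to keep the example self-contained.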


Earlier attempts at partial TMV protein structure prediction have been made by Schiffer and Edmundson [5], Fraenkel-Conrat [6] and Leberman [7]. We gladly acknowledge that our thinking is based upon the labours of many people, not always explicitly mentioned here. In particular we acknowledge our debt to the crystallographic studies of Dr K. C. Holmes and his colleagues, started in Cambridge and continued, more recently, in Heidelberg.

Low-Resolution Structural Information

The TMV protein subunit contains 158 amino acid residues and has a molecular weight of 17493 [8,9]. Electron microscope and X-ray studies upon the virus (reviewed by Klug and Caspar [10] and more recently by Barrett et al. [11]) show that each subunit can be contained within a sector-shaped domain, like a slice of cake, within an angle of about 22°. It extends radially from about 20 Å to about 90 Å from the virus helix axis, and is about 25 Å high. The RNA fits between the protein subunits in a groove at a radius of about 40 Å, with three nucleotides per protein subunit. The radial density distribution of the protein helix alone shows slight troughs at radii of about 40 Å and 60 Å. Electron density maps at resolutions approaching 10 Å are available from the X-ray work of K. C. Holmes and his colleagues in Heidelberg on oriented gels of the virus [11] (and personal communications) and of P. F. C. Gilbert, J. N. Champness and A. C. Bloomer in Cambridge on the protein crystals [12] (and personal communication).

The physical chemistry of TMV protein in solution can be interpreted on the assumption that the subunit is essentially rigid and always forms approximately the same intersubunit contacts [13], though small changes can be seen to accompany the allosteric transition from the helical (i.e. virus-like) to the disk or two-layer mode of association [14,15]. Many spectroscopic techniques confirm this conclusion [16,17]; Sarkar [18] and Rentschler [19] have shown that proteins from four TMV strains can copolymerise in vitro. Thus it seems likely that most TMV strains and mutants have fundamentally similar structures, an observation which may greatly improve the accuracy of any structural predictions.

Optical rotatory dispersion [20], circular dichroism [21] and hydrogen exchange measurements [22] have been interpreted to mean that TMV protein contains 30 ± 5 % α-helix. Unfortunately, no redeterminations of this figure have been made in the light of more recent theory. M. A. Lauffer and his colleagues in Pittsburgh have shown that the bonding between subunits is entropic, presumably due to the matching up of hydrophobic regions on the surfaces of the subunits (reviewed in [23]).

Locations of Individual Amino Acids

The radial locations of various heavy metal atoms or ions attached to the virus particle have been determined by X-ray crystallography. These and other reasonably certain position assignments are summarised in Fig. 1. The mercury of the methyl mercury derivative of cysteine 27 lies at radius 57 Å, and the dimercury acetic acid derivative of cysteine 139 in the mutant Ni 2068 [24] lies at radius 72 Å [11]. A derivative of lysine-68 has been located at radius 70 Å and the N-terminus of strain U2 lies at radius 90 Å [25,26]. Some other heavy metal atoms bind to the virus or protein crystals, but the actual residues involved have not been determined.
The 25 Å lead-binding site described by Caspar [27] probably involves the two aspartic acid residues 115 and 116 [28]. The second lead ion site at radius 84 Å, also reported by Caspar, is very likely to involve glutamic acid-145, and we now feel that glutamic acid-131 is probably the other half of the carboxyl-carboxylate pair there. However, other workers have had difficulty in locating this 84 Å lead site (K. C. Holmes, personal communication), perhaps for technical reasons, although the evidence for a second pK 7 group is strong [29]. The analogy of the raised carboxylic acid pK values in ionophores, such as nigericin, monensin and X-537A [30], suggests that a carboxyl-hydroxyl hydrogen bond might also be possible in TMV. We suspect that these lead-binding, raised-pK carboxyls in TMV could also interact with calcium ions. The trigger for the intracellular uncoating of TMV (and possibly also of some other plant viruses) might then be the dissociation of Ca2+ from the protein in the low Ca2+ concentration of the cytoplasm. This argument will be developed in a later publication.

Fig. 1. Diagram of the radial positions of amino acid residues known from X-ray crystallography. [Horizontal axis: radius (Å), 20 to 90. Labelled positions: residues 115 and 116; (90, 92, 113?); RNA; 27; 68; 139; (131?) and 145; 1.]

Surface Availability of Residues

It is possible to infer which residues are exposed on the surface of the protein subunit, either in the intact virus or in the dissociated protein, from experiments with enzymes, specific chemical reagents, antibodies or various other analytical techniques. The C-terminal threonine residue is available in the virus for digestion with carboxypeptidase A [31], and in a mutant lacking proline at position 156 the digestion can proceed through two further residues [32]. Removal of the terminal residue exposes an extra binding site for the cationic dye phenosafranin [33], which has been interpreted as being due to the unmasking of a carboxyl group [34]. The lack of reactivity of the virus towards other proteolytic enzymes suggests that few peptide groups are exposed and freely accessible on the external surface.

Only two side-chain carboxylic acid groups (on residues 64 and 66) out of 15, together with the C-terminus, are readily available in the virus for reaction with a water-soluble carbodiimide [35], so presumably only these two are on the outside surface. Some carboxylic acid residues in the peptide from 93 to 112 also react slowly, perhaps because they face the central hole of the virus rod. Lysine-68 reacts with a variety of reagents, even in the virus, but access to it seems to be partially hindered [34,36-39]. Lysine-53 is not available in the virus, but becomes available for reaction in the dissociated protein [36,37].
In mutants, lysines at positions 9 and 140 are accessible for reaction in the virus, but 33 is accessible only in the dissociated protein [40]. Of the four tyrosine residues, only 139 is readily available for iodination or acetylation [41,36]. Tyrosine-2 is slowly available for reaction, while tyrosines 70 and 72 are unreactive. None of the three tryptophan residues is available in the virus either for reaction [42,43] or for interaction with the solvent [44], but residues 17 and 52 become available upon dissociation of the protein.

Serological studies upon TMV mutants have been described by Von Sengbusch [45] and Van Regenmortel [46]. Mutations of residues 65, 66, 107, 136, 138, 140, 148 and 156 produce serologically detectable changes in the virus, whereas mutations of residues 20, 21, 25, 33, 46, 59, 63, 81, 97, 99, 126 and 129 are not detectable in the virus. These observations, except that concerning residue 107, are all readily interpreted on the simple hypothesis that a change must be in a residue very close to the outside surface of the virus in order to be detected. Antibodies to dissociated Vulgare protein appear to react especially strongly with the tripeptide Ala-Thr-Arg, corresponding to residues 110 to 112 [47]. The charge change caused by mutation of residue 97 from glutamic acid to glycine is detectable by electrophoresis of the dissociated protein but not of the intact virus [45].

Sequence Information

Complete amino acid sequences of four TMV strains are available: see [48] and [49] for details and references to their determination. Partial sequences of several others are also available, and many point mutations have been isolated and their proteins characterised, principally by the Berkeley and Tübingen groups. A full list is deposited with the Centre de Documentation du C.N.R.S. and is available on request.

Only two extensive regions of completely conserved sequence exist, extending in the four wholly sequenced strains between residues 87 and 94, and between 113 and 122 [48]. Following this idea, various authors have pointed out that these regions probably comprise the RNA-binding site, since they contain four of the six conserved positive charges in the protein (on arginine residues 90, 92, 113 and 122), which could neutralise the negative charges of the RNA chain phosphate backbone. Apart from these regions, only residues 2, 4, 18, 31, 36-38, 61, 62, 79, 102, 128, 131, 132, 137, 144, 145 and 152 are unchanged in all the strains and mutants studied. (Any such list is automatically biased by the uneven availability of sequence data for different parts of the molecule.)

The most mutable regions of the chain seem to be from residue 19 to 28, from 49 to 68, from 97 to 101, and from 138 to the C-terminal end (158). Since these regions also include a high proportion of hydrophilic residues, it seems likely that they are in contact with solvent in the intact virus. Compared with Vulgare protein, other strains have insertions after residues 66 and 70, and deletions after residues 145 and 148 (see list and [49a, 49b]), suggesting that the regions around these residues are not constrained either by interactions with other subunits or by requirements for the folding of the polypeptide chain.

Other Information

There are many other weaker, and even less direct, pieces of information bearing on the chain folding. From optical rotatory dispersion data it has been postulated that an aromatic residue, possibly a tryptophan, is situated near the RNA in the virus [50], and it has also been suggested that the unreactivity of TMV RNA in the virus towards formaldehyde is a chemical, rather than a steric, consequence of the RNA-protein interaction [51]. TMV strains differ in the ability of their proteins to protect the RNA from damage by absorbed ultraviolet light [52]. Also, the trinucleotide-binding site on a protein subunit must have some partial sequence specificity to explain the specificity of nucleoprotein assembly [53,54]. All of these observations imply that the protein must interact with the nucleotide bases and not just with the sugar-phosphate backbone of the RNA.

Birefringence measurements show that the RNA bases are aligned with the particle axis [55], and this is confirmed, and extended to some of the aromatic amino acid side chains as well, by flow dichroism measurements [56-58]. Infrared dichroism measurements suggest that some α-helical regions are aligned approximately perpendicular to the virus axis [59].

Besides the hydrogen-bonded carboxyls there must be other ionic interactions (i.e. salt links or hydrogen bonds) either within or between protein subunits. In the virus, though not the protein, about six carboxylic acid groups fail to titrate with their expected pK values [60], and, of these, only two can be the member of a carboxyl-carboxylate pair with a lowered pK. A free amino group on lysine-53 is essential for helix formation [36,37]. Many groups of workers have suggested that the tyrosine residues are involved in special interactions [44,43,61,16,60,62]. Paulsen's comparison of the strains showed that a tyrosine at position 139 is titratable in the virus, but one in position 67 is titratable only in the dissociated protein [60]. Her data are also consistent with tyrosines at positions 2 and 68 being titratable in the virus, whereas of the pairs 12 and 17, and 70 and 72, only one of each pair can be titrated in the dissociated protein, while the other member of each pair requires denaturation of the protein to be titrated.

A number of point mutations at certain sites of the protein, in particular residues 19 and 20, correlate with characteristic changes in the symptoms in host plants and also temperature-sensitive behaviour in the isolated protein [45,63]. A mutation of residue 112 to cysteine from arginine, in the mutant PM5, renders the protein unable to coat the viral RNA, whereas it can still form the protein helix [64]; this is further discussed below. Cysteine-27 reacts anomalously with fluorodinitrobenzene, which led to the suggestion that it exists as a thiazoline or thiazolidine ring in the protein [65]. Proteolytic cleavage in the region of residues 85, 99 and 110 was detected in protein polymerised "irreversibly" into stacked-disk rods [66]. The observation that proteolysis can also lead to protein rods with unusually wide central holes [67] weakly suggests that these residues may lie near the centre of the virus.

Fig. 2. Radial distribution of the TMV polypeptide chain. (A) Path of the chain deduced from the evidence collated in this paper, with the exception of the secondary-structure assignments. (B) Radial density profile for the protein in the helical rods calculated by Franklin [72]. [Horizontal axis in both panels: radius (Å), 20 to 90.]

Chain Tracing

It is not yet possible to give a complete account of the charges on TMV. The isoelectric point of the virus is at about pH 3.2 [68], but, from osmotic pressure data, the net charge on TMV is nearly zero at neutral pH, and removal of Ca2+ makes it substantially negative at this pH [69]. TMV was found to bind approximately two Ca2+ ions per protein subunit with a high affinity and considerable selectivity over other cations [70], while it was also found that about twenty molecules of a polyamine bind to each virus particle [71]. Clearly the net charge on particles in any preparation of TMV will be markedly dependent upon the conditions used both for its isolation and for any subsequent treatment.
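Although the real charge state depends on bound ions and on the anomalously titrating groups, the ionisable side chains themselves can at least be counted from the sequence. The sketch below is a minimal illustration, not part of the original analysis: the string `VULGARE` is the Vulgare sequence of Table 1 transcribed into one-letter code, and the "formal charge" it reports naively assumes every Asp, Glu, Lys and Arg side chain and the free C-terminal carboxyl are fully ionised, with the acetylated N-terminus uncharged. The function names are hypothetical.

```python
from collections import Counter

# Vulgare coat-protein sequence from Table 1, one-letter code, ten residues per chunk.
VULGARE = "".join([
    "SYSITTPSQF", "VFLSSAWADP", "IELINLCTNA", "LGNQFQTQQA", "RTVVQRQFSE",
    "VWKPSPQVTV", "RFPDSDFKVY", "RYNAVLDPLV", "TALLGAFDTR", "NRIIEVENQA",
    "NPTTAETLDA", "TRRVDDATVA", "IRSAINNLIV", "ELIRGTGSYN", "RSSFESSSGL",
    "VWTSGPAT",
])


def positions(seq: str, residue: str) -> list:
    """1-based positions of a given residue type in the sequence."""
    return [i + 1 for i, aa in enumerate(seq) if aa == residue]


def naive_formal_charge(seq: str) -> int:
    """Side-chain charge assuming full ionisation, plus one free C-terminal carboxyl.

    Assumption for this sketch: the N-terminus is acetylated (as in Table 1)
    and therefore contributes no charge; bound Ca2+ and shifted pK values
    are ignored entirely.
    """
    counts = Counter(seq)
    positive = counts["K"] + counts["R"]
    negative = counts["D"] + counts["E"]
    return positive - negative - 1


if __name__ == "__main__":
    counts = Counter(VULGARE)
    print("length:", len(VULGARE))
    print("side-chain carboxyls (Asp + Glu):", counts["D"] + counts["E"])
    print("Lys + Arg:", counts["K"] + counts["R"])
    print("tyrosines at:", positions(VULGARE, "Y"))
    print("naive formal charge:", naive_formal_charge(VULGARE))
```

The tally reproduces the figures quoted in the text (15 side-chain carboxylic acid groups, four tyrosines, three tryptophans), which is a useful check on the transcription; the formal charge itself should be read only as a bookkeeping baseline, given the dependence on conditions described above.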

From all this information, it is possible to draw a tentative path for the polypeptide chain, as shown in Fig. 2A. Although this map really contains only one-dimensional information about the probable radii of various residues (in part because of the lack of angular resolution at low radii of the present X-ray data), it has been drawn in two dimensions to show the space inside the sector shape available to the protein subunit. Also shown (Fig. 2B), on the same radial scale, is the radial density distribution calculated from X-ray diffraction from the nucleic-acid-free helical rods [72]. Heavy atom locations are shown as well. The general distribution of the polypeptide chain is reasonably delineated by the known constraints, except for an uncertainty about the region between residues 30 and 50. Notice the considerable extension of the region between residues 116 and 131 involved in the carboxylate pairs, and the more gentle extension between 66 and 90.
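The geometric bookkeeping behind such a radial map can be sketched in a few lines. The example below is illustrative only and uses just the approximate figures quoted in this paper: a sector of about 22°, radii of 20 to 90 Å, a height of 25 Å, the RNA at 40 Å with three nucleotides per subunit, and the heavy-atom radii of Fig. 1. All names are hypothetical. Two simplifications are assumptions of this sketch: the nucleotide spacing is taken along a flat circle (the helical pitch is ignored), and markers from different strains and mutants (N-terminus of U2, residue 139 of Ni 2068, the tentative assignments for 115, 116 and 145) are pooled as if they belonged to one chain.

```python
import math

# Approximate values quoted in the text.
SECTOR_ANGLE_DEG = 22.0
R_INNER, R_OUTER = 20.0, 90.0   # radial extent of the subunit (Å)
HEIGHT = 25.0                   # axial height of the subunit (Å)
R_RNA = 40.0                    # radius of the RNA groove (Å)
NT_PER_SUBUNIT = 3

# Residue number -> radius (Å). 115, 116 and 145 are the tentative lead-site assignments.
RADIAL_MARKERS = {1: 90.0, 27: 57.0, 68: 70.0, 115: 25.0, 116: 25.0, 139: 72.0, 145: 84.0}


def subunits_per_turn(angle_deg: float = SECTOR_ANGLE_DEG) -> float:
    """Number of sector-shaped subunits that fill one full turn."""
    return 360.0 / angle_deg


def sector_volume(angle_deg: float = SECTOR_ANGLE_DEG) -> float:
    """Volume (Å^3) of the annular sector available to one subunit."""
    return (angle_deg / 360.0) * math.pi * (R_OUTER ** 2 - R_INNER ** 2) * HEIGHT


def nucleotide_spacing(angle_deg: float = SECTOR_ANGLE_DEG) -> float:
    """Arc length (Å) per nucleotide at the RNA radius, ignoring the helical pitch."""
    per_turn = subunits_per_turn(angle_deg) * NT_PER_SUBUNIT
    return 2.0 * math.pi * R_RNA / per_turn


def markers_inside(markers: dict = RADIAL_MARKERS) -> bool:
    """True if every marker radius lies within the subunit's radial extent."""
    return all(R_INNER <= r <= R_OUTER for r in markers.values())


def radial_rise_per_residue(markers: dict = RADIAL_MARKERS) -> list:
    """Radial distance covered per residue between consecutive markers (Å/residue)."""
    items = sorted(markers.items())
    return [(a, b, abs(rb - ra) / (b - a)) for (a, ra), (b, rb) in zip(items, items[1:])]


if __name__ == "__main__":
    print(f"subunits per turn  ~ {subunits_per_turn():.1f}")
    print(f"sector volume      ~ {sector_volume():.0f} Å^3")
    print(f"nucleotide spacing ~ {nucleotide_spacing():.2f} Å")
    for a, b, rise in radial_rise_per_residue():
        print(f"residues {a}-{b}: {rise:.2f} Å per residue")
```

With these inputs the spacing per nucleotide comes out close to 5 Å, and every pair of consecutive markers requires well under the roughly 3.8 Å per residue that separates neighbouring α-carbons in a polypeptide (a standard outside figure, not taken from this paper), so the pooled constraints are at least mutually compatible.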

Secondary Structure


Having summarised all the chain location information available to us, we next attempt to predict the secondary structure of the polypeptide chain. Applying the rules of Chou and Fasman [4] to all the available sequences of TMV strains yields predictions of α-helix, β-sheet and β-turns. Only the Vulgare sequence and prediction are shown here (Table 1), but a full listing


Table 1. Amino acid sequence and secondary structure predictions for the protein of TMV Vulgare
Structural predictions were made using the rules of Chou and Fasman [4] and the places where these were overridden are discussed in the text. The resulting structure assignments are shown, in block form, in Fig. 3A. Structure predictions are shown as: α = α-helix; β = β structure; T = β-turn and - = random structure

Number Residue

Prediction

1 Acetyl-Ser, 2 Tyr

T T T

3 Ser, 4 Ile, 5 Thr, 6 Thr, 7 Pro, 8 Ser, 9 Gln, 10 Phe
11 Val, 12 Phe, 13 Leu, 14 Ser, 15 Ser, 16 Ala, 17 Trp, 18 Ala, 19 Asp, 20 Pro
21 Ile, 22 Glu, 23 Leu, 24 Ile, 25 Asn, 26 Leu, 27 Cys, 28 Thr, 29 Asn, 30 Ala
31 Leu, 32 Gly, 33 Asn, 34 Gln, 35 Phe, 36 Gln, 37 Thr, 38 Gln, 39 Gln, 40 Ala
41 Arg, 42 Thr, 43 Val, 44 Val, 45 Gln, 46 Arg, 47 Gln, 48 Phe, 49 Ser, 50 Glu
51 Val, 52 Trp, 53 Lys, 54 Pro

T

T T T T

P 11

P B T T T T -

a a

a a I

a

a T T

T T T T 7 T

P I;

B P P //

B

P

P P B P P P B

-

T T


Number Residue

55 Ser, 56 Pro, 57 Gln, 58 Val, 59 Thr, 60 Val
61 Arg, 62 Phe, 63 Pro, 64 Asp, 65 Ser, 66 Asp, 67 Phe, 68 Lys, 69 Val, 70 Tyr
71 Arg, 72 Tyr, 73 Asn, 74 Ala, 75 Val, 76 Leu, 77 Asp, 78 Pro, 79 Leu, 80 Val
81 Thr, 82 Ala, 83 Leu, 84 Leu, 85 Gly, 86 Ala, 87 Phe, 88 Asp, 89 Thr, 90 Arg
91 Asn, 92 Arg, 93 Ile, 94 Ile, 95 Glu, 96 Val, 97 Glu, 98 Asn, 99 Gln, 100 Ala
101 Asn, 102 Pro, 103 Thr, 104 Thr, 105 Ala, 106 Glu, 107 Thr, 108 Leu, 109 Asp, 110 Ala

Prediction

T T

B P P B P T T T T -

T T T T

Table 1 (continued) Number Residue

111 Thr, 112 Arg, 113 Arg, 114 Val, 115 Asp, 116 Asp, 117 Ala, 118 Thr, 119 Val, 120 Ala
121 Ile, 122 Arg, 123 Ser, 124 Ala, 125 Ile, 126 Asn, 127 Asn, 128 Leu, 129 Ile, 130 Val
131 Glu, 132 Leu, 133 Ile, 134 Arg

Prediction

Number Residue

135 Gly, 136 Thr, 137 Gly, 138 Ser, 139 Tyr, 140 Asn
141 Arg, 142 Ser, 143 Ser, 144 Phe, 145 Glu, 146 Ser, 147 Ser, 148 Ser, 149 Gly, 150 Leu
151 Val, 152 Trp, 153 Thr, 154 Ser, 155 Gly, 156 Pro, 157 Ala, 158 Thr

Prediction

T T T T -

T T T T -

7 T T T

P B P P P T T T T

-

a a a a c1

a a a a ci

a T T T T a c1

a CI

a a ci

a T T T T G!

a G! G!

a a

is available on request from the Centre de Documentation du C.N.R.S. The other predictions are reasonably consistent, which suggests a common overall structure for the various strains and also that the prediction rules are reasonably valid. Furthermore, most of the known point mutations are compatible with these secondary structure assignments.

We have chosen to overrule the naive secondary structure predictions for Vulgare protein in only three minor details. From 78 to 88 the choice between α and β structure is marginal and the opportunity for hydrogen bonding to other β regions suggests that β is the more reasonable assignment, but the alternative α-structure could readily be accommodated. Likewise from 23 to 30 we choose β structure, which also enables cysteine 27 to lie at its correct radius. From 125 to 128 there is a weakly predicted turn which must be overruled to allow the necessary extension between carboxyls 116 and 131.

With these modifications, the whole secondary structure prediction for Vulgare protein can be fitted into the chain path already deduced, as shown in block form in Fig. 3A. In addition Fig. 3B shows the resulting distribution of charged, hydrophilic and hydrophobic residues in Vulgare protein. The arrangement of the regions of β-structure into the large and small β-sheets is, of course, largely arbitrary and therefore can only be a suggestion of one possible arrangement. The basis for our suggestion, as also for the assignment of 78 to 88 as β rather than α-structure, has been an attempt to maximise the possible hydrogen bonding between regions of the peptide chain which are predicted as having β-structure and which also have to run in approximately parallel directions, as shown by the constraints leading to the map in Fig. 2A. While this seems to us to be a reasonable basis upon which to formulate a working hypothesis, this hypothesis will clearly need modification in the light of any direct experimental data with which it is not consistent. Indeed, preliminary results from a new electron density map for the virus helix suggest that the β-sheet predicted here would be too wide and therefore that the chains cannot in fact hydrogen bond as closely as is suggested here (K. C. Holmes and G. J. Stubbs, personal communication). Such a modification does not, however, affect the validity of either the secondary structure predictions or of the overall distribution of the chain and the resulting discussion.

Fig. 3. Final prediction of TMV polypeptide chain folding, with secondary-structure assignments. Radial positions of residues are probably reliable, but azimuthal arrangements were chosen arbitrarily to illustrate the space available for the chain. Some compromises have been necessary to represent the three-dimensional structure in two dimensions. (A) Block diagram of assignments and locations, with heavy dots representing heavy atom positions and amino acid side chains extending towards these indicated by broken lines. (B) Nature of the individual amino acid residues. Open circles represent hydrophobic residues (Ala, Ile, Leu, Phe, Trp, Val), and filled circles represent hydrophilic residues (Asn, Gln, Ser, Thr). Charged residues are indicated. Note the localisation of the charged residues on the aqueous surfaces of the RNA-binding site and the hydrophobic nature of the protein between 40 Å and 60 Å, where the few hydrophilic residues present are largely amides. [Horizontal axis in both panels: radius (Å).]

DISCUSSION

So many diverse pieces of information have fitted together harmoniously that we are hopeful Fig. 3 will turn out to be a realistic outline of the basic features of the protein structure. Though it would not be impossible to try building a three-dimensional model based on this map, we will merely comment here on some general features and show that the inner third of the molecule, including the RNA-binding site, is quite strongly delineated.

A backbone of β-structure runs radially along the molecule, providing rigidity and linking the two main reactive regions, exemplified by the anomalously titrating groups involved in the protein's "allosteric" behaviour. However, there are only the four strands of β-structure in the region between radii 45 and 65 Å, and this requires a substantial penetration of solvent into this region (i.e. a large hole). Such penetration is indeed observed in the three-dimensional electron density map for the protein crystals [12]. The residues predicted to occur in this region are largely hydrophobic and could thus contribute to the observed inter-subunit bonding. While most of the charged residues lie at the hydrophilic surfaces, arginines 41 and 122 do lie in this middle segment of the molecule. Presumably they could be neutralised in the final structure by ions in the penetrating solvent and, if the elongation of the nucleoprotein helix does occur with disks as the protein source [54], these positive charges might assist in the intercalation of the RNA [73].

From 93 to 112 the structure prediction as helix-turn-helix is strikingly consistent among the strains despite considerable variation in sequence. Because of the limited space at the inner radius, these two helices can only fit one on top of the other. The involvement of residues 115 and 116 in the 24 Å carboxyl-carboxylate pair and the probable involvement of arginines 90, 92 and 113 in RNA phosphate neutralisation also serve to define this region fairly precisely. The concentration of "interesting" amino acid residues in this region, including the RNA binding site, together with the ability of α-helices to fold up first after synthesis of a protein, suggests that this region may be the keystone of TMV protein architecture. The sequence of a TMV RNA fragment reported to bind to TMV protein [74] codes for this part of the protein, from residues 95 to 130, and it has been suggested that, by analogy with some bacteriophage systems [75], this binding may enable the coat protein to act as a repressor for its own synthesis [76]. It is noticeable that in the first base position of each codon in this sequence, A and G outnumber U and C in the ratio of 6 : 1. If the reported binding of the RNA fragment to the protein is in the normal nucleotide-binding site, this would accord with the selectivity of the protein for purines rather than pyrimidines previously observed [53,54].

Some possible features of the RNA-binding site can now be suggested. The positive charges on arginines 90 and 92 are probably splayed out by the β-turn on to one surface of the subunit, while residue 113 could well be on the other. If this were so, the RNA phosphates would be neutralised by a repeating pattern of one arginine in one subunit and two on its neighbour. Such an alternating pattern would serve to shorten the RNA backbone to the approximately 5 Å repeat distance necessary to fit in the groove at 40 Å radius. The RNA groove region is made up of the two portions of conserved sequence already noted plus another portion containing the only other three conserved residues in a row, 36 to 38. Residues close enough to interact with the RNA include a number bearing amide or hydroxyl groups, with hydrogen bonding potential. Two phenylalanines, residues 35 and 87, could stack against the nucleotide bases, particularly as they are likely to be aligned axially sticking out from the β-sheet, but no tyrosines or tryptophans are within range. The conserved aspartate residue 88 is a curious feature, paralleled by a conserved aspartate residue in the AMP-binding site of several dehydrogenases, where it forms a hydrogen bond with a ribose hydroxyl group [77]. Another residue with an, as yet, poorly characterised function is arginine 112.
This is largely conserved or, in cucumber green mottle mosaic virus, conservatively mutated to lysine [78], but in the strains U2 and Holmes Rib Grass it is changed to glutamine and this loss of a positive charge could explain the increased ultraviolet radiation sensitivity of U2 compared to Vulgare. Also, in the mutant PM5, the change to cysteine could shorten the α-helix 105 to 112 and cause sufficient local perturbation to account for the inability of the protein to coat the RNA, even though it otherwise forms the usual aggregates.

In conclusion, it appears that our study has achieved its three objectives reasonably well. The likely radial separations of amino acid residues are clearly indicated and predictions of the secondary structure and overall arrangement of the polypeptide chain should prove useful for the interpretation of early low-resolution X-ray maps. We were surprised by the natural way in which the secondary structure predictions fitted into place so well with the other data, and this augurs well for the success of the pragmatic approach to structure adopted by Chou and Fasman. Further confirmation must now await elucidation of the high resolution electron density map by X-ray crystallography.

We would particularly like to thank Dr K. C. Holmes and G. J. Stubbs from the Max-Planck-Institut für Medizinische Forschung, Heidelberg, W. Germany and Mr M. W. Rees from the John Innes Institute, Norwich, England, for permission to quote their unpublished results. A.C.H.D. gratefully acknowledges a fellowship from the Helen Hay Whitney Foundation.

ANNEXES

The following documents have been deposited at the Archives originales du centre de documentation du C.N.R.S., F-75971 Paris-Cedex-20, France, where they may be ordered as microfiche or photocopies. Reference No.: A.O.-541.
Appendix 1. Sequence and secondary-structure predictions from TMV strains.
Appendix 2. List of TMV-coat-protein mutants of which details of the amino-acid exchanges have been published.

REFERENCES 1. Macleod, R., Hills, G . J . & Markham, R. (1963) Nacure (Lond.)

200, 932 - 934. 2. Finch, J. T., Leberman, R., Chang, Y . 3 . & Klug, A. (1966) Nature (Lond.) 212,349- 350. 3. Chou, P. Y. & Fasman, G. D. (1974) Biochemistry, 13, 211222. 4. Chou, P. Y. & Fasman, G. D. (1974) Biochemistry, 13, 222245. 5. Schiffer, M. & Edmundson, A . B. (1967) Biophys. J . 7, 121235. 6. Fraenkel-Conrat, H. (1968) The Molecular Busis of Virology, p. 134, Reinhold Publishing Corporation, New York. 7. Leberman, R. (1971) J . Mol. Biol. 55, 23-30. 8. Anderer, F. A,, Uhlig, H., Weber, E. & Schramm, G. (1960) Naturr (Lond.) 186,922-925. 9. Tsugita, A., Gish, D. T., Young. J., Fraenkel-Conrat, H., Knight, C. A. & Stanley, W. M. (1960) Proc. Nictl Acad. Sci. U.S.A. 46, 1463-1469. 10. Klug, A. & Caspar, D. L. D. (1960) Adv. Virus Res. 7,225- 325. 11. Barrett, A. N., Barrington-Leigh, J., Holmes, K. C., Leberman, R., Mandelkow, E., Von Sengbusch, P. & Klug, A. (1973) CoM Spring Harbor Symp. Quant. Biol. 36, 433-448. 12. Gilbert. P. F. C. & Klug, A. (1974) J . Mol. Bid. 86, 193-207. 13. Durham, A. C. H. & k l u g , A . (1971) Nut. New Biol. 229, 42-46. 14. Durham, A. C. H., Finch, J. T. & Klug, A. (1972) Nut. New Biol. 229, 37-42. 15. Finch, J. T. & Klug, A, (1974) J. Mol. Biol. 87, 633-640. 3 6. Budzynski, A. 2. (1971) Biochim. Biophys. Acta, 251,292- 302. 17. Guttenplan, .I.B. & Calvin, M . (1973) Btochim. Biophys. Actu, 322, 294 - 300. 18. Sarkar, S. (1960) Z . Naturforsch. TeiEB, 15, 778-786. 19. Rentschler, L. (1967) Mol. Gen. Genet. 100, 96- 108. 20. Simmons, N. S . & Blout, E. R. (1968) Photochem. Photohiol. 8, 81 -92. 21. Schubert, D. & Krafczyk, B. (1969) Biochim. Biophys. Actu, 188, 155-157.

A . C. H. Durham arid P. J. G. Butler: Predictions of Structure of Tobacco-Mosaic-Virus Protein

404

22. Budzynski, A. Z. & Fraenkel-Conrat, H. (1970) Biochemistry, 9, 3301-3309.
23. Lauffer, M. A. & Stevens, C. L. (1968) Adv. Virus Res. 13, 1-63.
24. Wittmann, H. G. (1964) Z. Vererbungsl. 95, 333-344.
25. Gallwitz, U., King, L. & Perham, R. N. (1974) J. Mol. Biol. 87, 257-264.
26. Mandelkow, E. & Holmes, K. C. (1974) J. Mol. Biol. 87, 265-273.
27. Caspar, D. L. D. (1963) Adv. Prot. Chem. 18, 37-121.
28. Butler, P. J. G. & Durham, A. C. H. (1972) J. Mol. Biol. 72, 19-24.
29. Butler, P. J. G., Durham, A. C. H. & Klug, A. (1972) J. Mol. Biol. 72, 1-18.
30. Pressman, B. C. (1973) Fed. Proc. 32, 1698-1703.
31. Harris, J. I. & Knight, C. A. (1955) J. Biol. Chem. 214, 215-230.
32. Tsugita, A. & Fraenkel-Conrat, H. (1962) J. Mol. Biol. 4, 73-82.
33. Ginoza, W. & Atkinson, D. A. (1956) J. Am. Chem. Soc. 78, 2401-2404.
34. King, P. L. & Perham, R. N. (1971) Biochemistry, 10, 981-987.
35. King, P. L. & Leberman, R. (1973) Biochim. Biophys. Acta, 322, 279-393.
36. Fraenkel-Conrat, H. & Colloms, M. (1967) Biochemistry, 6, 2740-2745.
37. Perham, R. N. & Richards, F. M. (1968) J. Mol. Biol. 33, 795-807.
38. Scheele, R. B. & Lauffer, M. A. (1969) Biochemistry, 8, 3597-3602.
39. Perham, R. N. & Thomas, J. O. (1971) J. Mol. Biol. 62, 415-418.
40. Perham, R. N. (1973) Biochem. J. 131, 119-126.
41. Fraenkel-Conrat, H. & Sherwood, M. (1967) Arch. Biochem. Biophys. 120, 571-577.
42. Ramachandran, L. K. & Witkop, B. (1959) J. Am. Chem. Soc. 81, 4028-4032.
43. Fairhead, S. M., Steel, J. S., Wreford, L. J. & Walker, I. O. (1969) Biochim. Biophys. Acta, 194, 584-593.
44. Chien, Y. J., Chang, Y. S. & Tsao, T. C. (1965) Sci. Sin. 14, 998-1008.
45. Von Sengbusch, P. (1965) Z. Vererbungsl. 96, 364-386.
46. Van Regenmortel, M. H. V. (1967) Virology, 31, 467-480.
47. Benjamini, E., Shimizu, M., Young, J. D. & Leung, C. Y. (1968) Biochemistry, 7, 1261-1264.
48. Wittmann-Liebold, B. & Wittmann, H. G. (1967) Mol. Gen. Genet. 100, 358-363.
49. Dayhoff, M. O. (1972) Atlas of Protein Sequence and Structure, vol. 5, D285-D287, National Biomedical Res. Found., Washington, D.C.
49a. Hennig, B. (1970) Doctoral Thesis, University of Tübingen.
49b. Nozu, Y., Tochihara, H., Komura, Y. & Okada, Y. (1971) Virology, 45, 577-585.
50. Cheng, P.-Y. (1968) Biochemistry, 7, 3367-3373.
51. Dobrov, E. N., Kust, S. V. & Tikhonenko, T. I. (1972) J. Gen. Virol. 16, 161-172.
52. Streeter, D. G. & Gordon, M. P. (1968) Photochem. Photobiol. 8, 81-92.
53. Fraenkel-Conrat, H. & Singer, B. (1964) Virology, 23, 354-362.
54. Butler, P. J. G. & Klug, A. (1971) Nat. New Biol. 229, 47-50.
55. Franklin, R. E. (1955) Biochim. Biophys. Acta, 18, 313-314.
56. Gabler, R. & Bendet, I. (1972) Biopolymers, 11, 2393-2413.
57. Allen, F. S. & Van Holde, K. E. (1971) Biopolymers, 10, 865-881.
58. Taniguchi, M., Yamaguchi, A. & Taniguchi, T. (1971) Biochim. Biophys. Acta, 251, 164-171.
59. Beer, M. (1958) Biochim. Biophys. Acta, 29, 423-423.
60. Paulsen, G. (1972) Z. Naturforsch. Teil B, 27, 421-444.
61. Eiskamp, J. G. (1969) Ph. D. Thesis, University of Oregon.
62. Ohno, T., Yamaura, R., Kuriyama, K., Inoue, H. & Okada, Y. (1972) Virology, 50, 76-83.
63. Jockusch, H. (1966) Z. Vererbungsl. 98, 344-362.
64. Hariharasubramanian, V. & Siegel, A. (1969) Virology, 37, 203-208.
65. Ramachandran, L. K. (1971) Indian J. Biochem. Biophys. 8, 247-253.
66. Durham, A. C. H. (1972) FEBS Lett. 27, 147-152.
67. Durham, A. C. H. & Finch, J. T. (1972) J. Mol. Biol. 67, 307-314.
68. Kramer, E. & Wittmann, H. G. (1958) Z. Naturforsch. Teil B, 13, 30-33.
69. Adiarte, A. L. & Lauffer, M. A. (1973) Arch. Biochem. Biophys. 158, 75-83.
70. Loring, H. S., Fujimoto, Y. & Tu, A. T. (1962) Virology, 16, 30-40.
71. Johnson, M. W. & Markham, R. (1962) Virology, 17, 276-281.
72. Franklin, R. E. (1956) Nature (Lond.) 177, 928-930.
73. Butler, P. J. G. (1971) Nature (Lond.) 233, 25-27.
74. Guilley, H., Jonard, G. & Hirth, L. (1974) Biochimie, 56, 181-184.
75. Bernardi, A. & Spahr, P.-F. (1972) Proc. Natl Acad. Sci. U.S.A. 69, 3033-3037.
76. Richards, K. E., Guilley, H., Jonard, G. & Hirth, L. (1974) FEBS Lett. 43, 31-32.
77. Rossmann, M. G., Liljas, A., Branden, C. & Banaszak, L. J. (1975) in The Enzymes (Boyer, P. D., ed.) vol. 10, Academic Press, New York, in press.
78. Kurachi, K., Funatsu, G., Funatsu, M. & Hidaka, S. (1972) Agric. Biol. Chem. 36, 1109-1116.

A. C. H. Durham, Department of Microbiology, University of Cape Town, Rondebosch, South Africa

P. J. G. Butler, M.R.C. Laboratory of Molecular Biology, Postgraduate Medical School, University of Cambridge, Hills Road, Cambridge, Great Britain, CB2 2QH

