J. Mol. Biol. (1990) 212, 737-761

Refinement of the Influenza Virus Hemagglutinin by Simulated Annealing

William I. Weis†, Axel T. Brünger‡
The Howard Hughes Medical Institute and Department of Molecular Biophysics and Biochemistry, Yale University, 260 Whitney Ave., New Haven, CT 06511, U.S.A.

John J. Skehel
National Institute for Medical Research, Mill Hill, The Ridgeway, London NW7 1AA, U.K.

and Don C. Wiley
The Howard Hughes Medical Institute and Department of Biochemistry and Molecular Biology, Harvard University, 7 Divinity Ave, Cambridge, MA 02138, U.S.A.

(Received 8 September 1989; accepted 17 November 1989)

We have applied the method of simulated annealing to the refinement of the 3 Å resolution crystal structure of the influenza virus hemagglutinin glycoprotein, using the program X-PLOR. Two different methods were introduced into X-PLOR to treat the non-crystallographic symmetry present in this and in other crystal structures. In the first, only the unique protomer atoms are refined; by application of the non-crystallographic symmetry operators to the protomer atoms, the X-ray structure factor derivatives are effectively averaged, and a non-bonded energy term models the interactions of the protomer with its neighbors in the oligomer without explicit refinement of the other protomers in the crystallographic asymmetric unit. In the second method, the entire asymmetric unit is refined, but an effective energy term is added to the empirical energy that restrains symmetry-related atomic positions to their average values after least-squares superposition. Several other modifications and additions were made to previously published X-PLOR protocols, including weighting of the X-ray terms, maintenance of the temperature of the molecular dynamics simulation, treatment of charged groups, changes in the values of certain empirical energy parameters, and the use of N-linked carbohydrate empirical energy parameters. The hemagglutinin refinement proceeded in several stages. An initial round of simulated annealing of the monomer was followed by rigid-body refinement of the 3-fold non-crystallographic symmetry axis position and a second round of monomer refinement. A third round was performed on the trimer using non-crystallographic symmetry restraints in all regions except those in lattice contacts showing obvious deviations from 3-fold symmetry.
The refinement was completed with several rounds of conventional positional and isotropic temperature factor refinement needed to correct bad model geometry introduced by high-temperature molecular dynamics in regions of weak electron density. This structure was then used as the basis for refinement of three crystallographically isomorphous hemagglutinin structures, including complexes with the influenza virus receptor, sialic acid. Model geometry comparable to well-refined high-resolution structures was obtained with relatively little manual intervention, demonstrating the ability of simulated annealing refinement to produce highly idealized structures at moderate resolution.

† Present address: The Howard Hughes Medical Institute and Department of Biochemistry and Molecular Biophysics, Columbia University, 630 W. 168th St, New York, NY 10032, U.S.A.

‡ Author to whom all correspondence should be addressed.

0022-2836/90/080737-25 $03.00/0 © 1990 Academic Press Limited

1. Introduction

The hemagglutinin glycoprotein (HA†) of the influenza virus membrane mediates the receptor binding and membrane fusion activities required for entry of the virus into a host cell, and is also the primary antigen of the virus (for a review, see Wiley & Skehel, 1987). The HA is a homotrimer consisting of a large ectodomain, a small transmembrane region, and a small domain inside the virus. Each monomer consists of two disulfide-linked chains, HA1 and HA2, formed by post-translational cleavage of a single polypeptide precursor; the C-terminal region of HA2 anchors the HA in the membrane. Treatment of the X:31 virus (a recombinant influenza strain containing A/Aichi/68 HA) with the protease bromelain produces the trimeric, water-soluble ectodomain, BHA, of molecular mass ≈ 210 kDa. Each monomer of X:31 BHA has 328 and 175 amino acid residues in HA1 and HA2, respectively, and there are seven N-linked glycosylation sites per monomer. The structure of X:31 BHA was determined to 3 Å resolution (1 Å = 0.1 nm) by Wilson et al. (1981). Structures of mutant BHAs with altered antigenic, receptor binding, and fusion activities, and complexes of BHA with cellular receptor analogues have been reported (Knossow et al., 1984; Weis et al., 1988, 1990). Despite the 3 Å resolution limit imposed by BHA crystals, there is great interest in using these structures for modeling studies as part of an effort to design anti-influenza drugs. We have therefore refined these structures with the goal of obtaining models as close to idealized as possible.

Macromolecular crystallographic refinement aims to improve the agreement between observed structure factor amplitudes ($F_o$) and those calculated from an atomic model ($F_c$), while also improving the geometry or empirical energy of the model in accordance with prior knowledge of these quantities obtained from small molecule crystallography and spectroscopy.
This is accomplished by adjusting the atomic positions and temperature factors of a model to minimize a target function of the form $E_{total} = E_{empirical} + E_{X\text{-}ray}$, where $E_{empirical}$ is the empirical energy or other function describing the geometry of the model, and $E_{X\text{-}ray}$ is a function describing the discrepancy between observed and calculated structure factors. Conventional refinement by least-squares or other minimization methods becomes trapped in local minima of $E_{total}$, and manual adjustment of the model is necessary to overcome barriers between local minima that prevent the optimization from proceeding. With currently available computing power, manual model adjustment, using computer graphics to display and move portions of the model in electron density maps, is the rate-limiting step in the refinement process.

The method of simulated annealing (SA; Kirkpatrick et al., 1983) overcomes the local minimum problem by searching the configuration space of a multiparameter function such as $E_{total}$. In contrast to conventional minimization methods, which allow only energetically "downhill" steps in $E_{total}$, SA can cross barriers between minima by taking "uphill" steps as well, with probability $e^{-\Delta E / k_b T}$, where $k_b$ is Boltzmann's constant and $T$ is an effective temperature used as a control parameter. The optimization is controlled by the "annealing schedule" (Kirkpatrick et al., 1983; Bounds, 1987), defined as the sequence of temperatures and number of configurations sampled at each temperature during the search. Brünger et al. (1987) have demonstrated the utility of SA in crystallographic refinement by using molecular dynamics (Karplus & McCammon, 1981) as a tool to heat the system. An effective energy term $E_{X\text{-}ray}$, equal to the squared difference between observed and calculated structure factor amplitudes (and optionally experimental and calculated phases) summed over all reflections, is added to the empirical energy function describing covalent bonds, bond angles, torsion angles and non-bonded interactions used in conventional molecular dynamics simulations to restrain the dynamics trajectory to the experimental observations. It has been shown that SA refinement can produce an optimized structure with much less manual intervention than is required in conventional refinement (Brünger, 1988a; Brünger et al., 1989; Fujinaga et al., 1989).

Refinement at 3 Å resolution is problematic because the diffraction data do not determine the atomic positions sufficiently to provide a highly accurate model. Conversely, because the data only loosely constrain the atomic positions at this resolution, in principle one should be able to produce a highly idealized model that agrees well with the data. In practice, however, least-squares methods quickly become trapped in local minima, making it difficult to obtain a satisfactory structure without a significant amount of manual model adjustment, particularly in a case such as BHA, which has more than 12,000 non-hydrogen atoms in the crystallographic asymmetric unit. Earlier refinements of HA by least-squares methods (Knossow et al., 1986; Weis et al., 1988) produced structures with low R-factors but rather large deviations from ideal geometry. Since SA refinement offers the possibility of idealizing the structure with little manual intervention, we concluded that SA refinement was well suited for our purposes. Here, we describe the refinement of BHA by simulated annealing. This work was done without reference to the previously reported refined BHA structures (Knossow et al., 1986; Weis et al., 1988).

† Abbreviations used: HA, hemagglutinin; $F_o$ and $F_c$, observed and calculated structure factor amplitudes; SA, simulated annealing; SA refinement, crystallographic refinement by simulated annealing with molecular dynamics; R-factor, $\sum_{hkl} \left| |F_o^{hkl}| - |F_c^{hkl}| \right| / \sum_{hkl} |F_o^{hkl}|$; NCS, non-crystallographic symmetry; FFT, fast Fourier transform; B-factor, isotropic temperature factor; GlcNAc, N-acetyl glucosamine; Man, mannose; r.m.s., root-mean-square; sialic acid, N-acetyl neuraminic acid.
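The uphill-step rule and annealing schedule described in this Introduction can be illustrated with a small sketch. This is not the X-PLOR protocol, which heats the system with molecular dynamics; it is a generic Monte Carlo annealer in the sense of Kirkpatrick et al. (1983), applied to a hypothetical one-dimensional double-well function. The names `accept_probability`, `anneal` and `double_well`, the step size, and the example schedule are all assumptions made for illustration.

```python
import math
import random


def accept_probability(delta_e, temperature, k_b=1.0):
    """Probability of accepting a step that changes the energy by delta_e.

    Downhill steps are always accepted; uphill steps are accepted with
    probability exp(-delta_e / (k_b * T)).
    """
    if delta_e <= 0.0:
        return 1.0
    return math.exp(-delta_e / (k_b * temperature))


def anneal(energy, x0, schedule, steps_per_temperature, step_size=0.5, seed=0):
    """Search a 1-D energy function over a sequence of temperatures.

    `schedule` is the annealing schedule: the temperatures visited in order,
    with `steps_per_temperature` trial configurations sampled at each one.
    Returns the lowest-energy configuration seen and its energy.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for temperature in schedule:
        for _ in range(steps_per_temperature):
            trial = x + rng.uniform(-step_size, step_size)
            e_trial = energy(trial)
            if rng.random() < accept_probability(e_trial - e, temperature):
                x, e = trial, e_trial
                if e < best_e:
                    best_x, best_e = x, e
    return best_x, best_e


def double_well(x):
    """Toy target: a shallow minimum near x = +1 and a deeper one near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


if __name__ == "__main__":
    # Start in the shallow minimum and cool from T = 2.0 down to T = 0.1.
    print(anneal(double_well, 1.0, [2.0, 1.0, 0.5, 0.1], 200))
```

A pure downhill minimizer started at x = 1 would stay in the shallow well; the uphill acceptances at high temperature are what allow the barrier to be crossed.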


We refined a single amino acid variant of X:31 BHA to a low R-factor and good geometry, and then used this model to refine several crystallographically isomorphous HA variants. We have incorporated methods for exploiting the non-crystallographic symmetry (NCS) present in this and other systems into SA refinement. The refinements required relatively little manual intervention, and produced structures with geometry comparable to well-refined higher-resolution structures.

2. Materials and Methods

(a) HA variants and nomenclature

In this work, the terms HA and BHA are used interchangeably. All of the HAs reported in this work are derived from variants of the X:31 virus, which contains the HA from the A/Aichi/68 (H3N2) strain. The mutant naming convention is: [single-letter code of amino acid in X:31] [residue number] [single-letter code of amino acid in mutant], with amino acids numbered 1 to 328 in HA1 and 1001 to 1175 in HA2. Thus, G146D is a variant of the X:31 HA with glycine replaced by aspartic acid at position 146 of HA1.

(b) Crystallographic data

Details of crystal growth, photographic data collection and data reduction are given elsewhere (Wilson et al., 1981; Knossow et al., 1984; Weis et al., 1988). HA crystallizes from 1.3 to 1.4 M-sodium citrate in space group $P4_1$, with one HA trimer in the crystallographic asymmetric unit. Data between 7.0 Å and the high-resolution limit of the crystals were used in the refinement; despite the significant contribution of bulk solvent scattering below 5 Å resolution, we chose the 7 Å cutoff to ensure that we had enough reflections in the refinement. No attempt was made to model the bulk solvent scattering. In all cases, there is a sharp falloff in data quality past 3.2 Å resolution (3.6 Å in the case of D1112G). We therefore included all reflections between 7.0 and 3.2 Å but accepted only those reflections for which $F_o^{hkl} \geq 2\sigma_F$ past 3.2 Å (3.6 Å in the case of D1112G). This represented a compromise between data quality and having a sufficient number of reflections in the refinement. The data are summarized in Table 1.

(c) Refinement methods

(i) Software

The program X-PLOR (Brünger, 1988b) was used for all refinement calculations, including rigid body minimization, energy minimization and SA refinement. Skewing and averaging of difference Fourier maps for inspection on computer graphics systems was performed using the programs of Bricogne (1976). The interactive graphics program FRODO (Jones, 1985) was used for model building. Analysis of model geometry was done in part with the program GEOM (G. Cohen, personal communication).

(ii) Empirical energy parameters

Protein and TIP3P water parameters from the molecular dynamics program CHARMM, version 20 (Brooks et al., 1983), were used with the modifications described by Brünger et al. (1989). Further modifications made during the course of the HA refinement are discussed below. Polar hydrogen atoms were explicitly included in the empirical energy calculations, but aliphatic hydrogen atoms were treated by using "extended atom" representations of the carbon atoms to which they are attached. For carbohydrate residues, parameters for an all-atom (i.e. including aliphatic hydrogen atoms) representation of glucose (Ha et al., 1988) were used for the pyranose ring and substituent hydroxyl groups. For other substituents and glycoside linkages, parameters were taken from the CHARMM protein parameters where possible; otherwise, reasonable values were chosen by analogy to existing parameters. Anomeric effects were ignored. The chirality at the epimeric centers was maintained by a strong improper torsion angle force constant of 500 kcal mol⁻¹ rad⁻² (1 cal = 4.184 J), the same constant as that used for the amino acid chiral centers (Brünger et al., 1989).

(iii) Treatment of non-crystallographic symmetry

Two different methods for treating non-crystallographic symmetry were introduced into X-PLOR for the HA refinement, which we term strict and restrained.

Strict NCS. Strict NCS assumes that NCS-related monomers (or, more generally, protomers) are strictly identical, which permits refinement of a single protomer. In addition to reducing the amount of model inspection and adjustment required during refinement, strict NCS reduces the empirical energy calculation by a factor of approximately n for n-fold NCS (it is not exact because of inter-protomer non-bonded interactions; see below). Moreover, imposition of strict NCS permits averaging of the structure factor derivatives with respect to the NCS, which improves the signal-to-noise ratio by averaging out noise in the data. We now discuss the modifications [...]

[...] crystallographic asymmetric unit is refined, and NCS-related atoms are restrained to their average positions $\bar{\mathbf{r}}_i = (\bar{x}, \bar{y}, \bar{z})$ after least-squares superposition (Kabsch, 1976) of N−1 protomers onto a reference protomer by adding an effective energy term

$$E_{NCS} = k_{NCS} \sum_{i} \sum_{n} \left( \mathbf{r}_{i,n} - \bar{\mathbf{r}}_i \right)^2 \qquad (8)$$

to the empirical energy function, where $k_{NCS}$ is an effective force constant used to weight this term. Isotropic temperature factors can be treated similarly with the restraint term

$$E_{NCS}^{B} = \frac{1}{\sigma_{NCS}^{2}} \sum_{i} \sum_{n} \left( B_{i,n} - \bar{B}_i \right)^2 \qquad (9)$$

where $\sigma_{NCS}$ is a target standard deviation for this term (Hendrickson, 1985). Different groups of NCS-related atoms can be weighted separately, or not included at all. This allows the imposition of NCS on only part of the structure.

(iv) Refinement of the non-crystallographic symmetry axis

To refine the position of the 3-fold NCS axis in the unit cell, we ran 20 steps of rigid body minimization on a trimer model generated from the monomer co-ordinates with the initial NCS transformations. The new rotation matrices and translation vectors required to transform the 1st monomer onto the other 2 were [...]
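The positional restraint term for restrained NCS, a force constant multiplied by the summed squared deviations of each NCS-related atom from the average position, can be sketched as follows. This is an illustration and not X-PLOR code: it assumes the protomers have already been superposed onto the reference protomer (the Kabsch least-squares step is omitted), and the function name `ncs_restraint_energy` and its data layout are hypothetical.

```python
def ncs_restraint_energy(copies, k_ncs):
    """Positional NCS restraint energy for already-superposed protomers.

    copies : list of N protomers; each protomer is a list of (x, y, z)
             tuples, with atom i of every protomer being NCS-equivalent.
    k_ncs  : effective force constant weighting the term.

    Returns k_ncs * sum_i sum_n |r_in - rbar_i|^2, where rbar_i is the mean
    position of atom i over the N superposed copies.
    """
    n_copies = len(copies)
    n_atoms = len(copies[0])
    if any(len(c) != n_atoms for c in copies):
        raise ValueError("all protomers must contain the same atoms")

    total = 0.0
    for i in range(n_atoms):
        # Average position of atom i over the NCS-related copies.
        mean = [sum(c[i][axis] for c in copies) / n_copies for axis in range(3)]
        for c in copies:
            total += sum((c[i][axis] - mean[axis]) ** 2 for axis in range(3))
    return k_ncs * total


if __name__ == "__main__":
    # Two copies of a one-atom "protomer" that differ by 2 Å along x.
    copies = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
    print(ncs_restraint_energy(copies, k_ncs=300.0))
```

Leaving a group of atoms out of `copies` corresponds to leaving that region unrestrained, which is how NCS can be imposed on only part of a structure.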

Table 1 [title, crystal labels and some column headings not recovered]

Resolution range used in refinement (Å):   7.0-3.0   7.0-2.9   7.0-3.0   7.0-3.2
No. of reflections in refinement:          67,242    74,683    64,062    54,098
[unlabelled column]:                       0.115     0.126     0.125     0.160
[unlabelled column]:                       71.4      67.9      70.8      64.7

[...] between 120 and the resolution limit $d_{min}$ of the data.
$R_I$ = [...], where $\langle I^{hkl} \rangle$ denotes the mean of the observed intensities of the reflection hkl.
b Complex with sialyllactose (Weis et al., 1988).

[...] simulated annealing or conventional minimization/temperature factor refinement protocols given above, followed by model inspection and manual adjustment. The position of the 3-fold NCS axis was also refined. Two rounds of strict NCS SA refinement were followed by one round of restrained NCS SA refinement on the trimer. This was followed by restrained NCS conventional refinement rounds during which minor model adjustments were made. We now discuss the major stages of the G146D refinement, which is summarized in Table 2.

(b) Starting model

The initial monomer co-ordinates were from the X:31 HA model described by Wilson et al. (1981).

Table 2
Course of G146D refinement

                                  Start    Round 1   Round 2   Round 3   Round 4        Final
Unit refined                               Monomer   Monomer   Trimer    Trimer         Trimer
No. non-hydrogen atoms                     3957      4038      12,156    12,156         12,189
Method^a                                   SA        SA        SA        Conventional   Conventional
R^b                               0.390    0.279     0.270^c   0.258     0.233          0.222^d
r.m.s. deviations from equilibrium
  Bond (Å)                        0.034    0.018     0.018     0.017     0.015          0.015
  Angle (°)                       4.3      3.7       3.3       3.0       2.9            2.3
  Dihedral (°)^e                  32.0     27.0      27.0      27.0      27.0           27.0
  Improper (°)                    3.4      1.4       1.4       1.4       1.5            1.4
  Cα chiral volume (Å³)^f         0.25     0.14      0.12      0.11      0.10           0.09
  Peptide ω (°)                   8.4      6.1       5.4       5.0       4.3            4.4
NCS r.m.s. deviation from superposition (Å)
  Monomer 2 → 1 (κ)^g                                          0.031 (120.0)   0.030 (120.0)   0.028 (120.0)
  Monomer 3 → 1 (κ)^g                                          0.031 (119.5)   0.030 (119.6)   0.029 (119.6)

B-factor r.m.s. deviations (Å²): bond, 1.4; angle, 2.4; NCS, 0.96.
r.m.s. to previous round (Å), main-chain/side-chain: 0.73/1.3, 0.42/0.97, 0.35/0.69, 0.069/0.13 [column assignment not recoverable].

a SA, simulated annealing; conventional, conjugate gradient positional minimization and restrained isotropic temperature factor refinement.
b $R = \sum_{hkl} \left| |F_o^{hkl}| - |F_c^{hkl}| \right| / \sum_{hkl} |F_o^{hkl}|$.
c Does not include 18 residues/monomer omitted from the calculated structure factors during this round (see the text).
d The final G146D R-factor is 1.6% higher than that reported by Knossow et al. (1986). This reflects both the greatly improved geometry of the current model, as well as the use of approximately 6000 additional, mostly high-resolution, observed reflections in this work gained upon reprocessing the G146D data as described by Weis et al. (1988).
e Includes peptide (ω) torsion angles.
f Defined by Hendrickson (1985). This quantity is not used in X-PLOR, but is displayed for comparison with refinements done with the program PROLSQ (Hendrickson, 1985). See the text for further discussion.
g Spherical polar angle (in deg.) describing rotation about the 3-fold axis (defined by Rossmann & Blow, 1962).
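The R-factor reported in Table 2 is the sum of absolute differences between observed and calculated structure factor amplitudes, divided by the sum of the observed amplitudes. A minimal sketch of that calculation is given below; the function name `r_factor` and the example amplitudes are hypothetical, and no scaling between observed and calculated amplitudes is applied (a real refinement program would scale them first).

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    f_obs, f_calc : sequences of observed and calculated structure factor
                    amplitudes for the same reflections, in the same order.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must list the same reflections")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    # Three made-up reflections, for illustration only.
    print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
```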


All N-linked carbohydrate atoms present in that model were removed. An aspartic acid side-chain was added at position 146 of HA1 using the model regularizer in FRODO (Jones, 1985). The idealized 146 side-chain conformation was found to have an adequate fit to the $F_{G146D} - F_{X:31}$ difference map described by Knossow et al. (1984) and no unfavorable contacts with surrounding atoms, and was not adjusted further. Next, we checked for very close contacts within the monomer and between NCS-related monomers. These contacts produce very large gradients in the empirical energy due to the $r^{-12}$ repulsive part of the van der Waals' potential, which can force atoms far from their initial positions and produce instabilities during minimization (Brünger, 1988a). No such contacts were found within the monomer, but residues 1 to 8 of HA1, which were not actually located in the original structure but were included in this model, interpenetrated residues 172 to 175 of HA2 of an adjacent monomer. Manual model adjustment of HA1 1 to 8 and HA2 172 to 175 relieved these bad contacts. Finally, a temperature factor of 16 Å² was assigned to all atoms.

(c) Round 1 and 3-fold axis refinement

The first round of monomer refinement used the NCS transformations obtained from refinement of heavy-atom positions used in the original HA structure determination (Wilson et al., 1981). Since the HA is 135 Å long in the direction of the molecular 3-fold symmetry axis, even a fairly small error in the orientation of the axis in the unit cell would be expected to have a dramatic impact on the quality of the averaged map; in particular, a significant error in the transformation would smear or "wash out" the averaged density relative to the unaveraged density. Comparison of averaged and unaveraged maps made with these transformations gave no indication that the axis was incorrect. We therefore reasoned that any small changes would be well within the radius of convergence of the refinement procedure. In retrospect, we should have refined the axis prior to any positional refinement but, as the results discussed below indicate, our procedure apparently did not create any problems.

This first annealing run resulted in dramatic improvements in both the R-factor and the model geometry (Table 2; the R-factor dropped from 0.390 to 0.305 in the minimization prior to annealing). No changes indicating serious errors in the initial model took place. Analysis of the geometry as a function of residue number revealed that most of the remaining poor regions were either in lattice contacts or in anticipated regions of high mobility, such as the N terminus of HA1 and the C terminus of HA2. We inspected the entire structure on averaged $2F_o - F_c$ and $F_o - F_c$ difference Fourier maps in which the residues under consideration were omitted from the model used to calculate structure factors (50, or 10% of the residues, were removed at a time), refitting and regularizing where necessary. In some regions, such as HA1 1 to 8, no density was visible, in which case the model was idealized and adjusted so as not to make contacts with other residues that might subsequently bias the model.

At this point we added N-linked carbohydrate residues to the model. Of the seven glycosylation sites in the HA (6 in HA1, 1 in HA2), unambiguous density for the first N-acetyl glucosamine (GlcNAc) was visible at asparagine residues 38, 81 and 285, while density could be seen for the first three carbohydrate residues attached to Asn165 (GlcNAc-GlcNAc-Man). The Asn165 site illustrates a problem due to the ability of SA refinement to move atoms far from their initial positions. As shown in Figure 1, Trp222, which had poorly defined side-chain density, moved into the strong carbohydrate density coming off Asn165 of an adjacent monomer and for which no model atoms were present in the first round. This resulted in severe model geometry distortion of the tryptophan and the surrounding four residues. After rebuilding this region and adding the carbohydrate, proper geometry was maintained during subsequent refinement. This example illustrates that one must look very carefully at the model, both by difference Fourier maps and geometry as a function of residue number, to detect errors. The latter is especially important, as SA refinement can move atoms far enough to at least partially compensate for missing model atoms, leading to somewhat ambiguous difference Fourier maps and relatively low R-factors. This is an important consideration for tightly bound solvent molecules, which are not put into the model until late in the refinement process.

After this round, we checked the position of the 3-fold NCS axis. The results of the axis refinement are shown in Table 3A. We express the orientation of the 3-fold NCS axis with respect to the crystallographic axes in terms of the spherical polar angles defined by Rossmann & Blow (1962).
As we expected, the refined axis changed trivially from the input, with Δφ = 0.15° and no change in ψ. We also performed rigid body minimization on the starting co-ordinates and on the round 1 co-ordinates prior to rebuilding, which both gave essentially identical transformations relating the monomers as the minimization of the rebuilt round 1 co-ordinates (Table 3B). To estimate the effect of this change on the co-ordinates, we note that the largest distance from the center-of-mass to the end of the molecule is approximately 90 Å; thus, a change in the tilt of the axis of 0.15° gives a maximal change of 90 Å × tan(0.15°) ≈ 0.2 Å in the co-ordinates. Since we use a 1.0 Å grid for FFT structure factor calculations, we concluded that this is a trivial change as far as the X-ray derivatives are concerned. Furthermore, this distance is well within the 0.6 Å fine grid used to calculate maps for real space averaging by the method of Bricogne (1976) that we use to compute NCS-averaged electron density maps. Since rigid body minimization of the starting model gives a similar result, we conclude that the use of the old transformations did not introduce any significant error into the first round of SA refinement. However, since the results indicated that the old transformations introduced a small systematic error into the $F_c$ terms, the new transformations were used for the subsequent round of strict NCS refinement, and for all subsequent electron density map averaging.

Figure 1. SA refinement can move atoms far from their initial positions to compensate for missing model atoms. Trp222, which has weak side-chain density, moved into strong N-linked carbohydrate density at Asn165 of an adjacent monomer during the 1st round of G146D refinement, before carbohydrate was added to the model. The 1.0σ contour of the $2F_o - F_c$ electron density maps is shown. The maps have been averaged about the 3-fold NCS axis. (a) Electron density and co-ordinates after round 1, showing missing density for Cβ of residue 222. (b) As (a), showing N-linked carbohydrate density at Asn165 of an adjacent monomer (numbered 2165). (c) and (d) Electron density and co-ordinates after round 5. The views in (c) and (d) correspond to those in (a) and (b), respectively. In (c), the stacking of Trp222 with GlcNAc340 (Table 9) is emphasized.

Table 3
Refinement of the position of the 3-fold axis in the unit cell

A. Results of rigid body minimization of rebuilt round 1 co-ordinates
                        R^a      φ (°)     ψ (°)     Δφ (°)   Δψ (°)   Δx (Å)   Δy (Å)   Δz (Å)
Before minimization     0.308    -16.38    -38.43
After minimization      0.302    -16.23    -38.43    0.15     0.00     0.102    0.0663   -0.0249

B. Comparison of rigid body minimization of initial and round 1 co-ordinates
                        R^a      φ (°)     ψ (°)
Starting model
  Before minimization   0.390    -16.38    -38.43
  After minimization    0.382    -16.21    -38.45
Round 1 model (before rebuilding)
  Before minimization   0.279    -16.38    -38.43
  After minimization    0.273    -16.28    -38.44

The spherical polar angles φ and ψ are as defined by Rossmann & Blow (1962).
a See Table 2.

(d) Round 2

Another round of SA refinement was run on the rebuilt monomer, using the updated NCS transformations. Lattice contact residues that were incorrectly positioned by the refinement due to the NCS (as assessed by difference Fourier maps) were left out of the structure factor calculation during this round (18 out of 503). It is not clear that anything was gained by leaving out these residues from the structure factor calculations, however; aside from preventing possibly significant errors in the calculated structure factors from inclusion of incorrectly placed residues, the placement or geometry of the residues themselves was not improved. We presume that this was due to the strict NCS imposed on non-bonded interactions.

After this run, we noticed a systematic discrepancy between the root-mean-square (r.m.s.) deviations from equilibrium values of main-chain versus side-chain bond angles; the overall r.m.s. deviation was 3.6°, but was 4.0° for main-chain and 3.2° for side-chain angles. We felt that, even accounting for the known flexibility of the N-Cα-C (τ) angle, the discrepancy was suspicious, and was likely due to the very strong force constant required to maintain the chirality of the Cα atoms at high temperatures (Brünger et al., 1989). We therefore increased the force constants on some of the main-chain bond angles (Table 4). A similar discrepancy was noted for the r.m.s. deviations from planarity of proline peptide bonds versus the other amino acids. This probably arose from the weak proline peptide torsion angle force constant used to allow cis-trans transitions during dynamics (Brünger et al., 1989).
Therefore, we inrreased the force constant for this

angle to that used for the other amino acids (Table 4). We repeated the annealing protocol with these new parameters, and found none of the discrepancies mentioned above. The overall statistics for this model are shown in Table 2. The model was once again inspected and adjusted, including the addition of one GlcNAc molecule at HA, Asn154. A trimer model was then generated, and residues in lattice contacts inspected and adjusted in unaveraged difference Fourier maps. Despite the imposition of strict NCS, many residues in lattice contacts were well placed in unaveraged 2F, - F, density and had good geometry. The mannose at Asn165, which is near but not in a lattice contact, was distorted in two of the three monomers, and its unaveraged 2F,- FC density looked somewhat different among the monomers. While this may have been an effect of the nearby lattice contact, it more likely reflects the fact that this sugar is relatively mobile, and that the torsional barriers define the hexose ring conformation are quite low, which permit distortions if the X-ray data do not sufficiently define the residue. X-PLOR provides a facility to impose extra dihedral restraints in such cases, but since onlv this sugar had distorted ring geometry. we simply rebuilt it manually. (e) Round 3 A final round of SA refinement was run on the trimer. using positional NCS restraints on nonhydrogen atoms with a strong force constant.

Table 4 Ch.anges to force constants Angle type

kOld

k new

c’“-(‘-U N--C”- c (‘-N-H H-N-C”

20.0 450 30.0 350

600 7@0 *wo 70.0 loo-0

PRO peptide dihedral angle

50

All force constants are given in units of kcal mom’ rad-*.

Table 5

“A,

“A,

All factor

positional wfinement

and

isotropic~

irrnpf~rat

Itrf’

(II)

with NC’S restraints. After t ht, last round of’ annealing, residues HA, I56 and 161. at the bordr,rs of’ a loop

NCS restraints were not applied to hydrogen atoms. a Side-chain atoms only. b Restrained in round 3 of C146D refinement: released for subsequent G146D rounds and throughout the refinement of the other 3 strurtures.

\~RS

in a crj-stal cont)act’, and HA, 1 to 8 were quit,e distort,ed. W’r released the symmct ry restraints on these regions in addit ion to 1 how rt+asrd in round 4 t,o SW if’ they might adopt mor( rrasonablt~ geomcltry in the, absent of rest raintjs. In WI rosprc*! . I to X bvf~r’f’ ~IIIV to thfhir the prohlrms in wsidrlc~s

cornplt~t~~ c,r?stallo~raf,lii(,

k,NCS = 300 kcal mol-’

A-*. We wanted to leave as few regions as possible unrestrained, since small deviations from NCS cannot be ascribed significance at 3 A resolution. Moreover, since strict NCS was not imposed, significant deviations could, in principle, still occur in restrained regions. Thus, we left unrestrained only those residues near lattice caontacts for which unaveraged I$- F, maps clearly indicated that the strict NCS had caused them to be misplaced; furthermore side-chain atoms only were left unrestrained when it was clear that only t’hey were affected by the contact (Table 5). We also left the mannose at HA, Asn165 unrestrained to see whether the restraints had caused the distortions seen after round 1 (see above). This SA run yielded furt,her improvements in R-fact’or and geometry (Table 2). Most lattice contact residues had reasonable geometry and were well positioned in unaverdensity. The lattice contact nonaged 2F,-F, bonded energy was negative, indicating favorable contacts. For the NE-restrained regions of the model, the monomers superimposed with an r.m.s. deviation of about, 0.03 A (Table 2), well below the co-ordinate error expected at 3 a resolution. At t,his poirrl. il was clear t)hat thcl refinement had in

cYmrrTgeti

wsidws

(i.e.

c,ssentiallv

the serw t,hat the thaw with clear drnsity)

the same positions

well-ordered ret urnrd

in successive

t)o

SA runs.

As shown in Table 2, the r.m.s. deviation of main-chain atoms between round 2 and 3 was 0.35 Å, which is about the co-ordinate error expected at this resolution. Comparison of the round 2 and round 3 models on the computer graphics indicated that, with the exception of lattice contact and disordered regions, the main-chain and most of the side-chain atoms returned to essentially the same positions. Thus, we did not expect that any more SA refinement would produce significant improvements in the model.
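The convergence argument, that successive SA runs return the well-ordered atoms to the same positions to within the expected co-ordinate error, amounts to a simple r.m.s. comparison of two co-ordinate sets in the same crystal frame. The sketch below is illustrative only: the function names, the optional mask for well-ordered atoms, and the use of 0.35 Å as a default threshold are assumptions made for the example, not part of the published protocol.

```python
import numpy as np

def rms_deviation(coords_a, coords_b, ordered_mask=None):
    """R.m.s. deviation (in A) between two (N x 3) co-ordinate sets.

    No superposition is applied: both models are assumed to be in the
    same crystal frame. `ordered_mask` optionally selects the atoms
    (e.g. main-chain atoms with clear density) to include.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if ordered_mask is not None:
        mask = np.asarray(ordered_mask, dtype=bool)
        a, b = a[mask], b[mask]
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

def has_converged(coords_prev, coords_next, expected_error=0.35,
                  ordered_mask=None):
    """True if two successive rounds agree to within `expected_error` A."""
    return rms_deviation(coords_prev, coords_next, ordered_mask) <= expected_error
```

Restricting the comparison with `ordered_mask` mirrors the exclusion of lattice contact and disordered regions in the text, where larger shifts between rounds do not indicate a lack of convergence.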

In general, high-temperature dynamics tends to degrade the geometry in disordered (or erroneous) regions of the model, since the X-ray restraint is weaker in these areas; once SA refinement has converged in more ordered regions, it can actually create manual work to fix these regions.

Despite the moderate resolution of the crystals, we deemed the use of isotropic temperature factor refinement worth while. At 3 Å resolution, although the scattering effect of the temperature factor varies relatively slowly, it is sufficient to distinguish well-ordered from flexible regions of the molecule. For example, at 3 Å, a B-factor of 10 Å² diminishes the scattering by a factor of 1.3, while a B-factor of 40 Å² diminishes it by a factor of three. We then ran 20 steps of positional and 20 steps of restrained temperature factor refinement on the rebuilt round 3 model, as outlined above. The statistics for this model are given in the round 4 entry in Table 2. The overall geometry improved slightly, and very little model adjustment was required.
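The sensitivity of the scattering to an isotropic temperature factor at 3 Å can be checked with the standard Debye-Waller expression, in which a structure factor amplitude is attenuated by exp(-B sin²θ/λ²) = exp(-B/4d²) at resolution d. The sketch below assumes that the attenuation of interest is that of the amplitude rather than the intensity; the function name is hypothetical.

```python
import math

def debye_waller(b_factor, d_spacing):
    """Amplitude attenuation exp(-B / (4 d^2)) for an isotropic B-factor.

    b_factor: isotropic temperature factor in A^2.
    d_spacing: resolution in A, using sin(theta)/lambda = 1 / (2 d).
    """
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))

def attenuation_factor(b_factor, d_spacing):
    """Factor by which the amplitude is diminished (always >= 1)."""
    return 1.0 / debye_waller(b_factor, d_spacing)
```

For d = 3 Å the exponent is B/36, so `attenuation_factor(10.0, 3.0)` is about 1.3 and `attenuation_factor(40.0, 3.0)` is about 3. The dependence on B is gradual, yet large enough across this range to separate well-ordered from flexible regions.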


to appear in the unaveraged map in each monomer when contoured at the same density level (≥ 2σ). (3) The peak could not be explained as an alternate side-chain conformation or as a part of something larger (e.g. a citrate ion) that was missing from the model. (4) The water molecule had to have a good geometry, i.e. make sensible hydrogen bonds to neighboring atoms and have no bad van der Waals' contacts. (5) After refinement, the water molecule had to appear in 2Fo-Fc density and have a temperature factor of less than 40 Å². Three more rounds of alternating positional and temperature factor refinement were now run; after each round, we made minor adjustments to the model, including deprotonation of some histidine residues as discussed in Materials and Methods (an example is given in Fig. 2) and addition or removal of water molecules. The final model has only 11 water molecules in each monomer, due to our conservative criteria; an example is shown in Figure 3. The overall geometry did not change significantly between round 4 and the final round, so only the final statistics are shown in Table 2.

Figure 2. Double protonation of His56 causes displacement of Lys264. Continuous lines are the 1.0σ contour of the 2Fo-Fc electron density, broken lines are the 3.0σ contour of -(Fo-Fc) electron density. The maps have been averaged about the 3-fold NCS axis. (a) Results of minimization after (G146D) round 5, with His56 doubly protonated. The positively charged Lys264 side-chain lies outside the 2Fo-Fc density, in negative Fo-Fc density, to avoid unfavorable interactions with the positively charged histidine ring. (b) Result of (G146D) round 6 minimization after deprotonating His56 Nδ1. The maps are the same as those in (a). The histidine ring was manually turned over by 180° to optimize the hydrogen-bonding between Nε2 and the main-chain carbonyl oxygen of residue 180, and between the deprotonated Nδ1 and Lys264 Nζ, prior to round 6.

(b) Refinement of isomorphous hemagglutinins

For refinement of the three structures isomorphous to G146D listed in Table 1, we first ran a round of SA refinement according to the protocol given in the Materials and Methods, except that (1) three sets of alternating positional minimization and temperature factor refinement were run prior to simulated annealing, and (2) after the usual minimization with charges turned off following annealing, another three alternating sets of positional and temperature factor refinement were run with the full charges turned on. Our motivation for annealing structures so similar to G146D with the refined G146D model was twofold: to prevent the model from getting stuck in false local minima (with possible degradation of geometry), and to obtain a relatively unbiased estimate of the co-ordinate error by comparing the co-ordinates in regions known to be identical based on Fo-Fc difference maps. The annealing round was then followed by one or two conventional minimization rounds to correct any problems introduced into disordered regions by SA refinement and to add water molecules, as discussed above. We briefly discuss each refinement below.

(ii) L226Q

The substitution of glutamine for leucine at HA1 226 results in many small conformational changes in the surrounding protein, as shown by F_L226Q - F_G146D difference Fourier maps (Weis et al., 1988). Likewise, the difference at position 146 (Gly in L226Q and Asp in G146D) causes small changes (Knossow et al., 1984). These changes are all smaller than 1 Å and thus should be well within the radius of convergence of SA refinement. We constructed an L226Q model from the round 4 G146D co-ordinates by removing the Asp side-chain at 146 and replacing Leu with Gln at 226, which was then manually adjusted into the 2F_L226Q - F_G146D density. The R-factor of this model was 0%X. The effective X-ray energy weight W_A was found by running short minimizations at several W_A values smaller than that recommended by the standard procedur
