[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
449
[2 1] M o l e c u l a r M o d e l i n g a n d D y n a m i c s of Biologically A c t i v e P e p t i d e s : A p p l i c a t i o n to N e u r o p e p t i d e Y By ALEXANDER D.
MACKERELL, JR.
Introduction The rational experimental design and development of pharmaceutical compounds is greatly enhanced by combining experimental data with information derived from theoretical approaches. Recent advances in the theoretical treatment of macromolecules offer the possibility of examining a number of structural variations, including mutations, deletions, and insertions in proteins and peptides, while simultaneously allowing the economizing of experimental resources. Theoretical analysis also has the potential to investigate structural and dynamic phenomena at a more detailed level than is experimentally attainable. ~.2Of the theoretical methods available, molecular modeling and molecular dynamics (MD) simulations 3,4 have been shown to be successful in the study of various biological systems. Biologically active peptides, which are known to be involved in an increasingly large number of physiological processes: '6 are of sizes readily accessible to molecular modeling and dynamics techniques as well as to chemical synthesis. These factors combined with biochemical or pharmacological binding and activity measurements allow the design of structure/ dynamics/activity studies yielding information which may then be applied to the development of agonists and antagonists. Of particular interest are peptides containing 10 to 50 amino acids. This group includes many of the biologically active peptides, 4'5 such as the growth factors, the pancreatic polypeptides, peptide YY (PYY), and neuropeptide Y (NPY). Although this discussion is confined to the pancreatic polypeptides the technique should be applicable to any polypeptide with a characteristic, well-folded
J C. L. Brooks III, M. Karplus, and B. M. Pettitt, "Proteins, A Theoretical Perspective of Dynamics, Structure and Thermodynamics" (Adv. Chem. Phys. 71). Wiley, New York, 1988. 2 j. A. McCammon and S. C. Harvey, "Dynamics of Proteins and Nucleic Acids," Cambridge Univ. Press, New York, 1988. 3 M. Karplus and J. A. McCammon, Crit. Reo. Biochem. 9, 293 (1981). 4 j. A. McCammon, Rep. Prog. Phys. 74, 1 (1984). 5 j. W. Taylor and E. T. Kaiser, Pharmacol. Reo. 38, 291 (1986). 6 E. T. Kaiser, Trends Biochem. Sci. 12, 305 (1987).
METHODS IN ENZYMOLOGY, VOL. 202
Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.
450
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[21]
solution structure. N P Y 7-9 may be considered an almost ideal system on which to test molecular modeling and MD approaches owing to its high homology with avian pancreatic polypeptide (APP) 1°'11 and the wealth of pharmacological data available in a variety of physiological models. Several modeled structures of NPY have been published, 12.13based on its homology with APP. Comparison of the sequences 14 indicates the absence of insertions and deletions. The positions of structurally important prolines (positions 2, 5, 8, and 13) and a glycine (position 9) are maintained, and several conservative changes (positions 10, 20, 28, and 31) are shown in Table I. Furthermore, circular dichroism ( C D ) , 15A6 N M R , 17 and Raman measurements 18indicate NPY in solution to have a helical content comparable to that found in the APP crystal structure. Investigations of structure-function relationships of NPY initially involved pharmacological measurements on various NPY fragments and the NPY free acid. 19-22The outcome of these studies indicate that the binding and activity of NPY are located primarily in the C-terminal region of the molecule, but deletion of the N terminus also results in lowered binding and activity. In order to gain further insight into the changes occurring in the C terminus on removal 7 K. Tatemoto, M. Carlquist, and V. Mutt, Nature (London) 296, 659 (1982). 8 K. Tatemoto, S. Siimesmaa, H. JOrnvall, J. M. Allen, S. R. Bloom, and V. Mutt, FEBS Lett. 179, 181 (1985). 9 V. Mutt, K. Fuxe, T. H6kfelt, and J. M. Lundberg, "Neuropeptide Y," Karolinska Nobel Conference Series, No. 14, Raven, New York, 1989. l0 T. L. Blundeil, J. E. Pitts, I. J. Tickle, S. P. Wood, and C.-W. Wu, Proc. Natl. Acad. Sci. U.S.A. 78, 4175 (1981). II I. Glover, I. Haneef, J. Pins, S. Wood, D. Moss, I. Tickle, and T. Blundell, Biopolymers 22, 293 (1983). 12j. Allen, J. Novotny, J. Martin, and G. Heinrich, Proc. Natl. Acad. Sci. U.S.A. 84, 2532 (1987). 13 I. Glover, D. J. Barlow, J. E. Pitts, S. P. Wood, I. J. Tickle, T. L. Blundell, K. Tatemoto, J. R. Kimmel, A. Wollmer, W. Strassburger, and Y.-S. Zhang, Eur. J. Biochern. 142, 379 (1985), 14 K. Tatemoto, Proc. Natl. Acad. Sci. U.S.A. 79, 5485 (1982). 15 j. L. Krstenansky and S. H. Buck, Neuropeptides 10, 77 (1987). 16j. Boublik, N. Scott, J. Taulane, M. Goodman, M. Brown, and J. Rivier, Int. J. Pept. Protein Res. 33, 11 (1989). I7 V. Saudek and J. T. Pelton, Biochemistry 29, 4509 (1990). 18A. Balasubramaniam, V. Renugopalakrishnan, D. F. Rigen, M. S. Nussbaum, R. S. Rapaka, J. C. Dobbs, L. A. Carreira, and J. E. Fischer, Biochim. Biophys. Acta 997, 176 (1989). 19 F. Rioux, H. Bachelard, J.-C. Martel, and S. St.-Pierre, Peptides 7, 27 (1986). 20 C. Wahlstedt, N. Yanaihara, and R. Hakanson, Regul. Pept. 13, 307 (1986). 21 j. M. Danger, M. C. Tonan, M. Lamacz, J. C. Martel, S. St.-Pierre, G. Pellitier, and H. Vaudry, Life Sci. 40, 1875 (1987). 22 M. O. Perlman, J. M. Perlman, M. L. Adamo, R. L. Hazelwood, and D. F. Dyckes, Int. J. Pept. Protein Res. 30, 153 (1987).
[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
TABLE I PRIMARY SEQUENCES OF AVIAN PANCREATIC POLYPEPTIDE AND NEUROPEPTIDE ya
Residue
APP
NPY
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Gly Pro Ser Gin Pro Thr Tyr Pro Gly Asp Asp Ala Pro Val Glu Asp Leu lie Arg Phe Tyr Asp Asn Leu Gin Gin Tyr Leu Asn Val Val Thr Arg His Arg Tyr-NH2
Tyr Pro Ser Lys Pro Asp Asn Pro Gly Glu Asp Ala Pro Ala Glu Asp Leu Ala Arg Tyr Tyr Ser Ala Leu Arg His Tyr lie Asn Leu Ile Thr Arg Gin Arg Tyr-NH2
From Tatemoto. ~4
451
452
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[2 1]
of the N terminus of the peptide, studies were undertaken combining molecular modeling and MD simulations with pharmacological experiments on NPY and various fragments. 23,z4
Principles Many molecular modeling and all MD simulation methodologies require the potential energy of the system to be calculated as a function of the atomic coordinates.l,2 Considering the large size of macromolecules, the potential energy function must be computationally simple in order to allow the necessary calculations to be performed within a reasonable time frame. A number of potential energy functions have been presented with this goal in mind. 25-28 These functions separate the potential energy of a system into internal and external parts. The internal terms, which are frequently assumed to be harmonic in nature, include contributions from bond deformations, angle deformations, torsional rotational barriers, and improper torsional terms. External terms calculate nonbonded atom-atom interactions in a pairwise manner and include contributions for van der Waals repulsion, dispersion attraction (London forces), and electrostatic (Coulombic) interactions. In some cases an additional external term is included for hydrogen bond stabilization. Along with the simple potential energy function, computational savings may also be gained by limiting the number of atoms in the system. This may be achieved via extended atom models that ignore nonpolar hydrogen atoms by including them in the heavy atom to which they are attached, exclude solvent (i.e., where the calculation is performed in vacuum), and truncate the nonbond atom lists such that external energies are calculated only between atoms within a selected distance of each other. The potential energy function is applied to molecular modeling by changing the structure in a systematic fashion in order to find low energy conformations of components of the system (e.g., amino acid side chains). The overall potential energy of the system is decreased in energy minimizations, generally by displacing atoms in 23 A. D. MacKerell, Jr., Comput.-Aided Mol. Des. 2, 55 (1988). 24 A. D. MacKerell, Jr., A. Hems6n, J. S. Lacroix, and J. M. Lundberg, Regul. Pept. 25, 295 (1989). 25 B. R. Brooks, R. Bruccoleri, B. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J. Comput. Chem. 4, 187 (1983). 26 j. Hermans, H. J. C. Berendsen, W. F. van Gunsteren, and J. P. M. Postma, Biopolymers 23, 1513 (1984). 27 S. J. Weiner, P. A. Koilman, D. T. Nguyen, and D. A. Case, J. Comput. Chem. 7, 230 (1986). 28 W. L. Jorgensen and J. Tirado-Rives, J. Am. Chem. Soc. 110, 1657 (1988).
[2 1]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
453
the direction of the forces acting on them. In MD simulations Newton's equations of motions are solved with respect to the potential energy function, allowing the system to move as a function of time. In order to undertake a structure/dynamics/function study a starting three-dimensional structure is required. In the optimal case an experimentally determined structure, from either X-ray crystallography or NMR nuclear Overhauser effect (NOE) data, is used as the starting point. If an experimental structure is not available a starting structure must be prepared via molecular modeling. Various strategies have been developed for such work, 29-31 several of which are presented in this treatise. The most promising are those starting with homologous structures for which experimental three-dimensional structures are known. This allows the conserved regions to be aligned 32 and the sequence variations to be modeled into the new structure. In the present case modeling started with the known structure of APP, and the mutated side chains were built into energetically acceptable starting positions and subsequently energy minimized in order to relax the entire structure. The experimental structure from which the modeling was initiated should also be subjected to an identical minimization procedure in order to gauge the influence of the applied potentials and parameters on the obtained structure. Energy minimization alone, however, limits the structure to the local minimum in the region of conformational space afforded by the homologous experimental structure and subsequent modeling. 33 This limitation may be overcome by the application of molecular dynamics simulations. The advantage of using MD simulations in combination with molecular modeling is the presence of kinetic energy, allowing the system to overcome conformational barriers between local energy minima ( - 0 . 6 kcal/ mol at room temperature).34 Thus, the system may more effectively search conformational space, yielding a less biased low energy conformation(s) than can be attained with traditional energy minimization. In the case of NPY, MD simulations 24 resulted in structures which maintained the basic secondary and tertiary characteristics of the APP crystal structure; however, significant changes occurred in the N- and C-terminal regions of the peptide. Since the biological activity of NPY has been shown to be local29 H. H.-L. Shih, J. Brady, and M. Karplus, Proc. Natl. Acad. Sci. U.S.A. 82, 1697 (1985). 3o p. Kollman, Annu. Rev. Phys. Chem. 38, 303 (1987). 31 N. L. Summers and M. Karplus, J. Mol. Biol. 210, 785 (1989). 32 T. Blundell, D. Carney, S. Gardner, F. Hayes, B. Howlin, T. Hubbard, J. Overington, D. A. Singh, B. L. Sinanda, and M. Sutcliffe, Eur. J. Biochem. 172, 513 (1988). 33 W. F. van Gunsteren, Protein Eng. 2, 5 (1988). 34 j. /~qvist, P. Sandblom, T. A. Jones, M. E. Newcomer, W. F. van Gunsteren, and O. Tapia, J. Mol. Biol. 192, 593 (1986).
454
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[2 1]
ized primarily in the C terminus of the molecule these structural differences could influence the structure-function interpretation of the results. Application of a single MD simulation to obtain a stable low energy structure relies on that particular simulation searching global, low energy regions of conformational space (i.e., it is possible to perform a simulation of reasonable length, 100 psec, and still not obtain a more global energy minima owing to that simulation not sampling regions of conformational space close to the minima; however, simulations of extended periods may overcome this problem35). To circumvent this limitation an approach has been developed in which numerous simulations are performed from the same starting structure but using different initial velocity distributions. The use of different initial velocity distributions is assumed to send the simulations into different regions of conformational space and, thereby, increase the probability of finding an energetically stable region of conformational space; a more effective sampling of conformational space should result. From each simulation a time-averaged structure is obtained. Of the structures the one with the lowest potential energy is selected as the " n e w " initial structure and further simulations performed in order to verify that the selected structure is in an energetically stable region of conformational space. This method, referred to as the "simulate and test" approach, is repeated in an iterative fashion until the energetic properties have converged as diagrammed in Fig. 1. In this approach the potential energy versus time of the simulations (N is the number of simulations in Fig. 1) is monitored along with the potential energies of the time-averaged structures from the simulations. If either the potential energy decreases during the simulation [APE(t) in Fig. 1] or the potential energy of any of the time-averaged structures [E2 in Fig. I] is less than that of the structure used to start the simulation [Ej in Fig. 1] another round of MD simulations is performed starting with the E2(N) structure with the lowest potential energy. Once it is observed that neither the potential energy versus time of the simulation nor the potential energy of the time-averaged structures has decreased an energetically stable structure has been attained. This energetic stability, however, does not guarantee that the global minima has been found. Once an energetically stable "wild-type" structure is obtained, perturbation of that structure may be performed. The perturbed structure is prepared via molecular modeling, and that structure then subjected to the simulate and test procedure in order to obtain an energetically stable structure. Comparisons are made between the wild-type energetically stable structure and perturbed structures based on data from simulations 35 B. R. Brooks, personal communication (1990).
[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
455
Initial Structure (El)
II
if E2(N) < E1 and
MD Simulations (N>2)
A P.E.(t) < 0.0 Select E2(N) with lowest P.E. E1 = E2(N)
Time-Averaged Structures (Minimized, E2(N)) if E2(N) = El and A P.E.(t) = 0.0
Time-Averaged Structures for analysis FIG. 1. Flow diagram of simulate and test s c h e m e . El represents the energy of the structure u s e d to initiate the N M D simulations, E2(N) represents the potential energies of the minimized t i m e - a v e r a g e d structures from the N simulations, and AP.E.(t) is the potential energy o f the simulation as a function of time.
performed on the energetically stable structures. Such an approach generally ensures that the structural/dynamic differences between the structures being compared are due to the introduced modifications and not caused by the movement of one of the structures into a lower energy region of conformational space as compared to the other structure. Finally, it should be emphasized that the MD approach has been shown to capture the essential physical behavior of a variety of macromolecules, although the absolute structures obtained are limited by the potentials and parameters applied in the calculations. The overall goal of the present approach is to obtain information on structural and dynamic differences between energetically stable structures, with the differences being reasonably representative of changes occurring in the experimental regime.
,*56
PROTEINS AND PEPTIDES" PRINCIPLES AND METHODS
[2 II
Application
Molecular Modeling Molecular modeling of NPY was initiated from the 1.4-.~ APP X-ray coordinates ~° obtained from the Brookhaven Protein Data Bank (PDB identifier 1PPT) 36using the program HYDRA. 37The method applied in the present case may be considered a minimal approach to modeling. Visual analysis of the sequences, which has been confirmed using more in-depth alignment algorithms,38 showed a lack of insertions or deletions, allowing the mutated residues to be substituted in a 1 : 1 fashion. Mutation started at position 1 by building the side chain in an all-trans configuration while maintaining the remainder of the APP side-chain and backbone positions. Using a fully extended atom model (i.e., no hydrogen atoms are explicitly included) the side-chain torsion angles were manually rotated starting with the XI torsion angle followed by the X2 torsion angle, etc. In each case the torsion angle having the lowest interaction energy with the surrounding protein environment was selected. This was performed with all the sidechain torsions in that residue and repeated for the same side chain in an iterative fashion until an approximate minimum interaction energy conformation was obtained. The remaining mutated residues were sequentially treated in the same fashion from the N to C termini. Such a procedure may readily be automated. Following the preliminary modeling, polar hydrogen atoms were added using the program CHARMM 25 and the PARAM19 parameter s e t . 39 The resultant structure was then subjected to 200 steps of unrestrained adopted-basis Newton-Raphson (ABNR) minimization to allow a more global relaxation of the entire structure, including backbone atoms. Root-mean-square (rms) differences between the backbone and sidechain atoms of the APP crystal structure and the minimized APP are shown in Fig. 2A and 2B, respectively; rms differences between the backbone atoms of the APP crystal structure and the NPY model are shown in Figure 2C. As seen in Fig. 2A and 2B, the minimization of APP leads to structural changes of up to 1 .A in both the side-chain and backbone atoms. These changes are related to the limitations of the potential function and 36 F. C. Bernstein, T. F. Koetzle, G. J. B. Williams, E. F. Meyer, Jr., M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi, and M. Tasumi, J. Mol. Biol. 112, 535 (1977). 37 R. E. Hubbard, in "Structure, Dynamics and Function of Biomolecules" (A. Ehrenberg, R. Rigler, A. G~slund, and L. Nilsson, eds.), Springer Series in Biophysics, Vol. 1, p. 122. Springer-Verlag, Berlin (1987). 3s N. L. Summers et al., manuscript in preparation. 19 W. E. Reiher III, Ph.D. Thesis, Harvard University, Cambridge, Massachusetts (1985).
[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
I
5
10
15
20
25
30
35
457
A
o q)
B
.+=,
n~ C
Residue FIG. 2. Root mean squarc difference(in A) between the A P P minimizcd structureand the A P P crystal structurc for thc (A) sidc-chainatoms (Gly O and excluding hydrogcns) and the (B) backbone atoms (N, C~, C, O), and (C) between the N P Y modeled and minimized structure a n d A P P crystal structure b a c k b o n e atoms. Regions u n d e r s c o r e d with dots are in the polyproline helix and t h o s e u n d e r s c o r e d with d a s h e s are in the ct helix.
to the loss of the crystal environment, including a Z n 2+ ion bound to the N terminus and the side chains of Asn-23 and His-34. The differences between the unminimized APP and NPY model (Fig. 2C) are of similar magnitude, though some larger changes do occur in the region of residues 6 to I0. Thus, the initial modeling and minimization of NPY appear not to have introduced any significant alteration of the APP starting structure in relation to minimization of APP alone. Various NPY fragments were prepared for MD simulations by excising the unwanted portion of the molecule from the energetically stable and minimized full NPY structure (see below) or by changing the C-terminal amide to a carboxylic acid in the case of the NPY free acid. Studies have also been performed on NPY analogs where the 1-7 region of the peptide is linked to the 20-36 region via a side-chain peptide bond between a glutamate or asparate residue and a lysine or ornithine residue. 4° In this work the 8-19 region of the peptide was removed and the mutated side chain built at position 7 or position 20 in an all-trans configuration. A long peptide bond was then created between the side-chain carboxyl and amino groups, completing the cross-link. Relaxation of the long peptide bond to 4o A. D. MacKerell, Jr., u n p u b l i s h e d results (1990).
458
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[21]
obtain one of standard geometry was then performed via a restrained ABNR minimization. Harmonic constraints of 95 kcal mol -~ ,~-2 were applied to the cross-link atoms (all atoms in residues 7 and 20) and decreased by 5 kcal mol -~ /~-2 every 5 steps with harmonic constraints of 5000 kcal mol-1 ,~-z applied to the remaining atoms throughout the minimization. This procedure, which may also be considered a minimal modeling approach, allows only regions of the peptide adjacent to the long peptide bond to relax while structural relaxation in other regions is negligible. Successful application of this approach has been performed in the modeling of a loop mutant of triose-phosphate isomerase. 4~ Care, however, must be taken when using such an approach to avoid the formation of cis-peptide bonds, which are only rarely observed in proteins and are a poor initial geometry for the modeled peptide bond.
Molecular Dynamics MD simulations were performed using the program CHARMM zs starting with the energy-minimized structure of the NPY model. The C-terminal amide of NPY was treated with the same parameters as formamide. 39 Nonbonded interactions were truncated at 8.0 /~. In order to obtain a gradual cutoff, the electrostatic interactions were shifted and the van der Waals interactions were switched (6.5 to 7.5 ,g,) or shifted2s to obtain gradual cutoffs at 7.5 A. Integration of Newton's equations of motion were performed using the Langevin algorithm, where random forces and frictional terms are included to compensate for the lack of solvent. A friction coefficient of 50 psec- 1 was used. SHAKE was used to restrain covalent bonds involving hydrogen atoms, allowing a time step of 0.002 psec to be used. Using the modeled and minimized NPY structure as a starting conformation, eight simulations at 300 K and three simulations at 600 K were performed. Each simulation differed only in the random number seed used to initially assign the atomic velocities and to assign the random forces in the Langevin dynamics during the simulation. Each simulation was allowed to continue until the potential energy versus time was approximately stable (see Fig. 3). Time-averaged structures were then determined over the last 30 psec of each simulation (0.1 psec interval) and subjected to a restrained minimization in order to remove bad contacts and distorted internal geometries arising from the averaging procedure. The restrained minimization was performed using the ABNR minimizer with harmonic constraints set at 15 kcal mol-i ~ - 2 applied to each atom and decreased 41 D. L. Pompliano, A. Peyman, and J. R. Knowles,
Biochemistry 29, 3186 (1990).
[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
459
by 5 kcal mol-1 A-2 every 20 steps followed by 40 unrestrained steps of ABNR minimization. Such a minimization was performed on all timeaveraged structures. The potential energy of each structure was then determined and the structures with the lowest values selected for further analysis. The selected structure was minimized for 200 additional unrestrained ABNR steps and subjected to a second round of simulations in order to test the energetic stability of the system. These two rounds of simulations comprise one simulate and test cycle. The goal of the simulations is not to perform an exhaustive search of conformational space but rather to search the region of conformational space in the vicinity of the starting structure, which contains the secondary structural elements observed in CD and Raman 15-~8 experiments and tertiary characteristics of the highly homologous APP. Results concerning the potential energy versus time for the various simulations are presented in Fig. 3A for the 300 K simulations and in Fig. 3B for the 600 K simulations. At both temperatures it may be seen that a large variation in the decay of the potential energy versus time occurred for the different simulations, implying that various regions of conformational space were being searched. This is supported by analysis of the initial and final potential energies (Table II), showing the diversity of the simulation structure potential energies. Table II also shows the large decrease in the potential energy of all of the simulation structures as compared to the starting minimized NPY and APP structures. The simulations performed at 600 K were attempted in order to expand the search of conformational space in a limited fashion; however, the resulting time-averaged structures were not of a lower potential energy than those obtained from the 300 K simulations. Selection of t h e " best" structure was performed using potential energy criteria and requiring the maintenance of secondary structure characteristics. Primary analysis was performed on the - 1343 and - 1345 kcal/mol structures and involved comparison of the backbone ~b and ~bangles with those expected for ideal ot and polyproline helices. 42'43The best agreement occurred in the - 1343 kcal/mol structure (Fig. 4), which was then used for further testing. It should be noted that the selection criteria could also be based on solvent accessibility44,45or the ratio of the solvent accessibility of hydrophobic versus hydrophilic residues.46 The latter has recently been shown to be a good indicator of properly folded structures. 42 L. Pauling, R. B. Corey, and H. R. Branson, Proc. Natl. Acad. Sci. U.S.A. 37, 205 (1951). 43 G. N. Ramachandran and V. Sasisekharam, Adv. Protein Chem. 23, 659 (1968). 44 A. Shrake and J. A. Rupley, J. Mol. Biol.79, 351 (1973). 45 R. M. Sweet and D. Eisenberg, J. Mol. Biol. 171, 479 (1983). 46 j. Novotn~,, R. Bruccoleri, and M. Karplus, J. Mol. Biol. 177, 787 (1984).
460
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS I
A
150
I 1~0
I ~-
[21]
I
.
.
.
q
.
*°°1-
-1
100
"8 50 0
0
I
I
I
i
f
I
i
I
25
50
75
B 150
100 O
50
0 0
100
Time, Picoseconds FIG. 3. Potential energy versus time of the (A) 300 K and (B) 600 K conformational searches of NPY. Inset in (A): Potential energy versus time for the three follow-up simulations initiated from the -1343 kcal/mol structure. The dashed line represents the result from MacKerell} 3 Potential energies are offset such that the lowest value of each curve was set to zero. (From MacKerell e t al. 24)
Following the selection of a structure the test portion of the simulate and test cycle was performed via three simulations, with the resulting potential energies versus time shown in the inset of Fig. 3A. Again, the only difference between the simulations was the initial random number seed resulting in different initial velocity distributions. In these simulations it can be seen that the potential energy decrease through the eourse of the simulations was minimal, indicating that the selected structure was approximately energetically stable. From these simulations the time-averaged structures were determined over the last 89 psec of the simulation; the potential energies of the resulting structures are presented in the bottom portion of Table II. Comparison of the potential energies of the last three structures with that of the structure used to initiate the round of simulations ( - 1343 kcal/mol) shows them to be similar, again indicating the energetic stability of the - 1343 kcal/mol structure. The - 1343 kcal/
[21]
M O L E C U L A R M O D E L I N G A N D D Y N A M I C S OF P E P T I D E S
461
TABLE II POTENTIAL ENERGIES OF MINIMIZED A P P , MODELED AND MINIMIZED N P Y , AND TIME-AVERAGED N P Y STIMULATION STRUCTURES Structures Minimized APP NPY
Potential energy (kcal/mol)
- 1208 - 1174
Simulation 2
1289 - 1299
3 4 5
- 1304 -1317 - 1326
6
- 1333
1
7 8 Follow-up simulation 1 2 3
-
- 1343 - 1345 - 1347 - 1343 - 1348
FIG. 4. S t e r e o d i a g r a m o f t h e - 1343 k c a l / m o l N P Y s t r u c t u r e . ( F r o m M a c K e r e l l
et al. 24)
462
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[21]
mol structure was then used for analysis of rms differences between the full NPY and the various fragments, whereas the follow-up simulations were used in the analysis of rms structural fluctuations and configurational entropy. Further studies, 4° however, have shown an additional relaxation of the - 1343 kcal/mol structure, emphasizing the difficulty in determining the true energetic stability of a compound. Preparation of the various NPY fragments for comparison with the full NPY again used the simulate and test approach. All runs were performed at 300 K. In this work no more than one simulate and test cycle was required in order to obtain energetically stable structures, although in some instances the first round of simulations were deemed stable enough for use directly. Shown in Fig. 5A and 5B are the three initial simulations of the 2-36 and 25-36 NPY fragments. With the 2-36 fragment a significant decrease in the potential energy through the course of the simulation did not occur, and the lowest potential energy time-averaged structure (last 89 psec) was used for further analysis. A relaxation was observed with the 25-36 fragment requiring that a second round of simulations be performed to test for energetic stability. The potential energy versus time, shown in the inset of Fig. 5B, was stable and the resulting low energy structure used for analysis. A second set often simulations on the full NPY and each of its fragments was performed starting from the energetically stable structures. These simulations were performed for 15 psec with the time-averaged structures determined over the last 14 psec of each trajectory. Comparison of the full NPY and the various fragments involved both structural and dynamic differences in the 25-36 region of the peptide (the smallest fragment used in the study). The rms differences were calculated by initially performing a least-squares fit of the backbone atoms (Ca, N, C, O) in the 25-36 region followed by calculation of the rms difference between the - 1343 kcal/mol full NPY structure and the lowest potential energy time-averaged fragment structures obtained from the follow-up simulations where the potential energy versus time was observed to be stable. Root mean square fluctuations were also determined from the follow-up simulations. The reported rms fluctuations are the average of the three or ten follow-up simulations performed. Determination of the configurational entropy of the molecules was performed by calculating the determinant, tr, of the covariance matrix. The matrix elements, Orij, 47'48 are given by: orij = ((qi -
(q))( qj -
(qj)))
47 M. Karplus and J. N. Kuschick, Macromolecules 14, 325 (1981). 48 j. Brady and M. Karplus, J. Am. Chem. Soc. 107, 6103 (1985).
(i)
[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
463
i
A
1,50
100 ,---4 0
50 ,-W
,- ." "'"-.--'""'"'--"-;,x'2-'~,--:'k"'S'"'~:
0
0
I
taft
!
I
i
0
150
150 100 °,,--t
50 ,,¢,a
i00 0
0
25
50
50
75
100
...: - . h
~
-.
0
.
I
I
I
25
50
75
..--...I
100
Time. P i c o s e c o n d s FIG. 5. Potential energy versus time of the (A) 2-36 and (B) 25-36 NPY fragment simulations. Inset in (B): Potential energy versus time for the three follow-up simulations initiated from the 25-36 fragment lowest potential energy structure from the first round of simulations. Potential energies are offset such that the lowest value of each curve was set to zero. The alternate line types represent different starting random seed numbers.
where qi and qj are the individual atom Cartesian coordinates and ( ) represents the time average. The configurational entropy, Sq, was then calculated:
Sq =
½ka[N(1 + In 2~r) + In o-]
(2)
where k B is the Boltzmann constant and N, the number of Cartesian
coordinates, equals 148 in the present study. Atoms included were those in residues 25-36, excluding the residue 25 amide or amino group and the C-terminal amide. Other methodologies for the calculation of configura-
464
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[21]
T A B L E III ROOT MEAN SQUAREDIFFERENCES AVERAGED OVER RESIDUES 25--36 BETWEEN N P Y AND N P Y FRAGMENTS FROM SIMILATION LOWEST ENERGY STRUCTURES a
rms difference (A) Fragment
Long
Short
Free acid
2.22 1.31 1.97 2.64 1.33 1.19 1.71 3.62
0.61 0.69 0.72 1.08 -1.33 -2.99
2-36 13-36 16-36 19-36a 19-36b 20-36 25-36
a Results are for all nonhydrogen atoms. Long represents the 89 psec structures and short the 14
psec structures. 19-36b represents results where the side chain of Ser-22 was rotated away from the N terminus. From MacKerell et al. 24
tional entropy have been presented elsewhere. 49-5~ In the calculation of the configurational entropy it has been observed that the interval between time frames (see Table V) and the length of the simulation 52 may influence the results. Values are reported as average configurational entropy difference between the full NPY 25-36 region and that of the 25-36 region of the various fragments. These were calculated by determining ten independent configurational entropy differences between the ten full NPY and ten fragment short simulations and taking the average. The standard error was determined from the ten differences. Root mean square differences between the full NPY and the various fragments are presented in Table III for all heavy atoms in the 25-36 region for both the long (89 psec) and short (14 psec) time-averaged structures. As may be seen there is a trend where the rms difference increases as the size of the fragment decreases. Exceptions, however, occur with the NPY free acid and with the 19-36 and 20-36 fragments from the 89 psec simulations. The free acid results suggest that the negative charge at the 49 O. 5o H. 51 K. s2 A.
Edholm and H. J. C. B e r e n d s e n , Mol. Phys. 51, 1011 (1984). Meirovitch, M. Vasquez, and H. A. Sheraga, Biopolymers 26, 651 (1987). K. Irikura, B. Tidor, B. R. Brooks, and M. Karplus, Science 229, 571 (1985). DiNola, H. J. C. Berendsen, and O. E d h o l m , Macromolecules 17, 2044 (1984).
[21]
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
465
C terminus of NPY may affect the structure of the C-terminal region of the peptide as well as alter interactions of NPY with a receptor owing to the replacement of the C-terminal amide group with a carboxylic acid.~2 With the 19-36 and 20-36 fragments the smaller structural change in the long simulation results as compared to the 13-36 and 16-36 fragments was due to a hydrogen bond between the hydroxyl side-chain group of Ser-22 and the N-terminal amino group, stabilizing the helical structure of the peptide. Originally, 24 this result was attributed to the lack of an explicit solvent representation in the simulations, which, if present, may attenuate this interaction. Recent work involving the 16-36, 17-36, 18-36, 19-36,and 20-36 NPY fragments, however, indicate that the calculated effect may also be occurring in experiments. 16In the Boublik et al. 16study, which included a number of NPY fragments, it was observed that the 17-36, 18-36, 19-36, and 20-36 fragments had activities differing from the full NPY and the 16-36 fragment. The helical content of the 18-36 fragment was also shown to be fairly high. Thus, those experimental results may possibly be attributed to the internal hydrogen bond observed in the 19-36 and 20-36 fragments, although further calculations on the 17-36 and 18-36 fragments are required. Combining the rms difference results from the short 14 psec simulation structures with pharmacological binding and activity measurements yields a trend where an increased potency ratio (i.e., increase in ECs0 of the fragment as compared to that of the full NPY) is associated with increased rms differences of the fragments with respect to the full NPY structure (Fig. 6). Furthermore, the results show two distinct and apparently disparate trends. A large increase in potency ratio paralleling the increase in structural rms difference is indicated in two cases. A second, much smaller increase in potency ratio versus rms difference is seen in the remaining cases. This disparity can be reconciled with the original experimental potency ratio data. The large increase in potency ratio was observed in experiments involving the binding to rat brain membranes 53 and perfusion pressure in guinea pig heart, j9 whereas the lower potency ratio increase includes studies on in vitro receptor binding and in vioo perfusion pressure measurements in pig spleen 24 and the inhibition of melanotropin release in frog pituitary. 21 These different trends suggest alternate binding modes of NPY with receptors, which was previously indicated. 2°'54 Beyond the average rms difference of the 25-36 region, detailed descriptions of the change in position of individual residues are quite useful in developing structure-function relationships. Presented in Fig. 7 are rms 53 J.-C. Martel, S. St.-Pierre, and R. Quirion, Peptides 7, 55 (1986). 54 S. P. Sheikh, R. Hakansan, and T. W. Schwartz, FEBS Lett. 245, 209 (1989).
466
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[21]
300
850 o
200 150 z o
100 50 0
0.0
0.5
1.0
1.5
2.0
2.5
3.0
RMS DIFFERENCE Fro. 6. Potency ratio [ECs0(X)/ECs0(NPY)] of various pharmacological measurements versus calculated rms difference (~) between all heavy atoms in the 25-36 region in the various fragments and the full NPY. Symbols: frog pituitary melantropin release inhibitionn ([2); rat brain binding 53 ( x ); guinea pig heart perfusion pressure 19(©); pig spleen binding (A); and pig spleen perfusion pressure 24 (#). Lines were drawn only to connect the points from the individual experiments, and lines truncated at the top of the figure represent the continuations to the potency ratios for the 25-36 fragments of the respective experiments. See MacKerell e t al. 24 and the references cited for original data. (From MacKerell e t al. 24)
differences of the side-chain and backbone atoms averaged over individual residues for both the long (89 psec) and short (14 psec) lowest energy structures. This has been performed for the total structures following leastsquares fitting of the common backbone atoms (Fig. 7A,B) and for the 25-36 region of each NPY fragment following least-squares fitting of the backbone atoms in residues 25-36 with respect to the full NPY - 1343 kcal/mol structure 25-36 region (Fig. 7C,D). Concerning the full structures it may be seen that structural changes of up to 8 A occurred and that the sites where the largest changes occur are fairly consistent throughout the different fragments. The same results are also observed with the 25-36 region of the fragments. A majority of the structural changes involve residues located in the contact region between the a and polyproline helices which shift owing to the loss of interaction with the polyproline helix in the fragments. Some significant changes are also seen toward the C terminus in the long simulations. From these observations it appears that the interaction of the polyproline helix with the a helix is important for the structural maintenance of the C-terminal region of NPY. Along with structural changes dynamic properties of molecules have an equally important influence on binding events owing to free energies of
[21]
467
MOLECULAR MODELING AND DYNAMICS OF PEPTIDES
4 0
.. r
.
8
.
.
I
.
I
a
s6
.
i
.
.
.
.
.
.
I
.
.
.
I
i
.
.
I
,
x
..
i
i
7
.
I
I
.
.
.
I
.
.
4,
.
.
.
•
a
i
B
i
.
I
.
.
.
.
I
.
I
't
)
.
~_-:._z:_-:_o7_-7-S
.
I
x
I
.
I
.
.
I
_
i
i
4 o ~.
8
~-""
....
J
_ .
I
I
.
.
.
.
u
.
.
r
.
.
I
.
.
i
I
I
I
..............
i
E
i
e~
i
i
i
i
5
10
15
20
RESIDUE
25
30
85
i
i
i
i
i
i
25
27
29
31
33
35
RESIDUE
FIG. 7. Root mean square difference (/~) between NPY and the various fragments averaged over residues and the backbone (N, C~, C, O) (top) or side-chain (Gly O and excluding hydrogens) (bottom) atoms for the complete fragments for (A) the long (89 psec) simulation lowest energy structures and (B) the short (14 psec) simulation lowest energy structures, and for the 25-36 region from the (C) long (89 psec) and (D) short (14 psec) simulation lowest energy structures. Free acid (dot), 2-36 (dot-dash), 13-36 (thin line), 16-36 (line), 19-36 (thin dash), and the 25-36 (dash) fragments. (From MacKerell e t al. 24)
binding including contributions from both enthalpic and entropic effects. 55 Enthalpic terms are generally related to potential energy and structural changes, whereas entropy is related to the sampling of conformational space by the system. One way to quantitate the entropy of peptides is via rms fluctuations. All heavy atom fluctuations in the 25-36 region of the peptides studied for both the long 89 psec simulations and the short 14 psec simulations are presented in Table IV. In both cases there appears to be a trend showing the rms fluctuations to increase as the fragment size decreases; however, the trend is by no means unambiguous. This ambiguity may be related to the limited number and length of the simulations. A second measure of the role of dynamics is via the calculation of 55 M. Pettit and M. Karplus, i n "Molecular Graphics and Drug Design" (A. S. V. Burgen, G. C. K. Roberts, and M. S. Tute, eds.), p. 75. Elsevier, Amsterdam, 1986.
468
PROTEINS AND PEPTIDES: PRINCIPLES AND METHODS
[21]
T A B L E IV R o o t MEAN SQUARE FLUCTUATIONS AVERAGED OVER RESIDUES 25--36 FOR N P Y AND THE N P Y FRAGMENTS a r m s difference (/~) Fragment
Long
Short
NPY Free acid 2-36 13-36 16-36 19-36 25-36
0.89 0.83 0.74 0.86 1.00 0.95 1.13
0.42 0.44 0.44 0.45 0.45 0.45 0.48
a R e s u l t s are for all n o n h y d r o g e n atoms. L o n g repr e s e n t s the 89 psec structures and short the 14 psec simulations with the results averaged over three and ten runs, respectively. F r o m MacKerell e t al. 24
their configurational entropy. In calculating the configurational entropy of a system the correlation between the motion of atoms or groups is taken into account. Thus, although the amplitude of motions may be large, if the motions are well correlated the contribution to the configurational entropy will be small, whereas in some cases small amplitude motions, if highly uncorrelated, will make a large contribution to the configurational entropy. Analysis of Table V showing the difference in configurational entropy between full NPY and the fragments does not show a clear trend. Further, the errors associated with these calculations are large. Thus, whereas the role of entropy is significant in understanding binding phenomena, the ability to quantitate it is difficult. It would appear that approximately an order of magnitude increase in the number of simulations or simulation time would be required in order to obtain reasonable statistics for the configurational entropy results. With these limitations in mind, combining the trends from the rms fluctuations and the fact that the configurational entropy is higher in every fragment as compared to NPY (i.e., all values in Table V are positive) suggests that the N terminus of the peptide decreases the flexibility in the C terminus of the peptide. This appears to be due to the close spatial orientation of the N and C termini in the full NPY structure (Fig. 4). Assuming that the flexibility of a peptide is decreased when it is bound to a receptor, lower flexibility in the 25-36 region of the full NPY when unbound would favor binding to a receptor
[21]
M O L E C U L A R M O D E L I N G A N D D Y N A M I C S OF P E P T I D E S
469
TABLE V CONFIGURATIONALENTROPYDIFFERENCEBETWEEN NPY AND VARIOUSFRAGMENTSa Change in configurational entropy (cal/mol K) Fragment Free acid 2-36 13-36 16-36 19-36 25-36 a
20 fsec/interval 246 260 195 103 241 412
--- 511 - 536 +- 387 -+ 370 -+ 318 -+ 314
40 fsec/interval 45 203 153 259 281 220
+ 646 -+ 630 -+ 519 + 496 -+ 282 -+ 556
Values are the averages of 10 pairwise differences between NPY and the respective fragments with the error representing the S.D. of those differences. From MacKerell e t al. 24
owing to the loss of e n t r o p y on binding being smaller. Combining this effect with the r m s difference result suggests a model where both the structural m a i n t e n a n c e , allowing the appropriate l i g a n d - r e c e p t o r interactions (enthalpic contributions), and conformational stability (entropic contributions) of the 25-36 region of the N P Y are important for binding.
Summary A m e t h o d o l o g y is presented to allow results from molecular dynamics simulations to be c o m b i n e d with pharmacological binding and activity m e a s u r e m e n t s in order to study the s t r u c t u r e - f u n c t i o n relationships of neuropeptide Y. This a p p r o a c h is a general one and should also be applicable to other peptides for which stable structures are k n o w n to exist in solution. The basis o f the m e t h o d is the calculation of energetically stable structures via a simulate and test approach. This a p p r o a c h uses molecular d y n a m i c s simulations to search conformational space in order to find low potential energy structures. T h e energetic stability of the structures is then tested via additional simulations. Once energetic stability has been achieved, perturbations of the structure m a y be p e r f o r m e d via molecular modeling. The simulate and test a p p r o a c h is then used to obtain energetically stable structures for the perturbed c o m p o u n d . C o m p a r i s o n b e t w e e n the energetically stable starting and p e r t u r b e d structures can then be m a d e concerning b o t h structural and dynamic changes. By using an energetically
470
PROTEINS AND PEPTIDES" PRINCIPLES AND METHODS
[21]
stable structure prior to the perturbation, the assumption can be made that the calculated differences are primarily due to the perturbation rather than to one or both of the structures reaching a m o r e energetically favorable state. It should be e m p h a s i z e d that the calculations are being p e r f o r m e d employing a limited physical model such that the influence of that model on the o b s e r v e d results must always be taken into account. Acknowledgments The author acknowledges support from the National Science Foundation in the form of a NATO Postdoctoral Fellowship and special thanks to Lennart Nilsson, Rudolf Rigler, and Neena Summers for helpful discussions.