FULL PAPER

WWW.C-CHEM.ORG

Accurately Modeling Nanosecond Protein Dynamics Requires at least Microseconds of Simulation Gregory R. Bowman* Advances in hardware and algorithms have greatly extended the timescales accessible to molecular simulation. This article assesses whether such long timescale simulations improve our ability to calculate order parameters that describe conformational heterogeneity on ps-ns timescales or if force fields are now a limiting factor. Order parameters from experiment are compared with order parameters calculated in three different ways from simulations ranging from 10 ns to 100 ls in length. Importantly, bootstrapping is employed to assess the variability in results for each simulation length. The results of 10–100 ns timescale simulations are highly variable, possibly explaining the variation in levels of agreement between simulation and experiment in published works examining different proteins. Fortunately, microsecond timescale simulations improve both the accuracy and precision of calculated order parameters, at least so long as the full exponential fit or truncated

average approximation is used instead of the common longtime limit approximation. The improved precision of these long timescale simulations allows a statistically sound comparison of a number of modern force fields, such as Amber03, Amber99sb-ILDN, and Charmm27. While there is some variation between these force fields, they generally give very similar results for sufficiently long simulations. The fact that so much simulation is required to precisely capture ps-ns timescale processes suggests that extremely extensive simulations are required for slower processes. Advanced sampling techniques could aid greatly, however, such methods will need to maintain accurate kinetics if they are to be of value for calcuC 2015 lating dynamical properties like order parameters. V Wiley Periodicals, Inc.

Introduction

of simulation methodologies and force fields. These parameters quantify the disorder of particular bond vectors, ranging from zero, for a bond that undergoes substantial reorientations, to one, for a bond that never changes its orientation relative to the molecular frame of reference. It is common to measure order parameters for every amide bond in a protein’s backbone and the methyl groups of its side-chains, providing a global description of dynamics occurring on short timescales. They are typically measured via relaxation-dispersion NMR experiments, which are insensitive to dynamics longer than the molecular tumbling time, and therefore describe only psns dynamics. Room temperature crystallography provides an alternative way to measure ps-ns timescale order parameters[17] and residual dipolar coupling NMR experiments provide access to microsecond–millisecond timescale motions.[18] Importantly, order parameters can also be calculated from ensembles of structures generated by computer simulations. For example, it is common to calculate order parameters from tens to hundreds of nanoseconds of molecular dynamics

Distributed computing,[1] specialized hardware,[2] and advanced sampling algorithms[3,4] have greatly expanded the scope of problems that atomically detailed simulations can address, even providing access to millisecond timescale events. Such simulations have been used to retrodict the folded structures and folding timescales of small proteins,[5–7] to discover hidden allosteric sites,[8–11] and to elucidate the mechanisms of conformational changes.[12–14] Despite these successes, important questions remain about the accuracy of modern simulations and how much sampling is required to achieve convergence. For example, recent work comparing folding simulations with hydrogen-exchange data suggest that long timescale molecular dynamics simulations still do not describe high-energy, partially unfolded states well[15] and that different force fields yield different folding pathways.[16] However, these results do not preclude the possibility that simulations accurately describe shorter timescale conformational changes or that longer simulations would converge on a more consistent result. This article explores whether our increased ability to run long simulations offers an accurate description of the native ensemble by comparing simulations with NMR measurements of generalized order parameters (S2 ). Order parameters provide a comprehensive description of conformational heterogeneity within the native ensemble and serve as a valuable benchmark for assessing the performance 558

Journal of Computational Chemistry 2016, 37, 558–566

DOI: 10.1002/jcc.23973

G. R. Bowman Department of Biochemistry & Molecular Biophysics, Department of Biomedical Engineering, and Center for Biological Systems Engineering, Washington University School of Medicine, St. Louis, Missouri, 63110 E-mail: [email protected] Contract grant sponsor: Burroughs Wellcome Fund C 2015 Wiley Periodicals, Inc. V

WWW.CHEMISTRYVIEWS.COM

FULL PAPER

WWW.C-CHEM.ORG

simulations, assuming that these are sufficient to capture the ps-ns timescale motions that are observed in standard NMR experiments. However, this assumption may not be valid. Since order parameters are an ensemble average, they have contributions from all populated states, including those separated from the native state by high-energy barriers, which are unlikely to be accessed during nanoseconds of simulation. Indeed, residual dipolar coupling experiments have revealed dynamics on longer timescales than are observed in relaxation-dispersion techniques[19] and similar increases in conformational heterogeneity have been observed in longer timescale simulations,[20–25] suggesting that such large barriers exist. Combining measurements from simulations and experiments has provided numerous insights as to the dynamic character of proteins. For example, a combined approach has been used to elucidate the liquid-like character of proteins,[26,27] to develop a conformational entropy meter,[28,29] to probe the role of dynamics in protein function,[30–33] and to understand the response of proteins to ligand binding.[34,35] Evaluations of the relative merits of different force fields have also been conducted by comparing order parameters from simulations to experiments.[22,23,36,37] Such comparisons should also be a valuable means of testing sampling methods.[38] The purpose of this study is to systematically evaluate how the level of agreement between experimental order parameters for both backbone and side-chain degrees of freedom and order parameters from computer simulations varies with different simulation lengths, force fields, and methods for calculating order parameters. This article largely focuses on ubiquitin, because it is a common model system for developing NMR methods, so there are order parameters for both the backbone amides[39] and side-chain methyl groups.[40] Furthermore, its small size—76 residues (Fig. 1)—makes it amenable to long molecular dynamics simulations. Therefore, it was possible to run 10 ls of molecular dynamics simulations of ubiquitin with each of three force fields: Amber03, Amber99sb-ILDN, and Charmm27. Similar comparisons are also performed for the backbone dynamics of E. coli ribonuclease H (RNase H)—a much larger protein with 155 residues (Fig. 1)—to test the generality of the conclusions drawn for ubiquitin. The results have important implications for the application and development of advanced sampling techniques, demonstrating that microseconds of simulation data are required for accurate calculation of much faster timescale protein dynamics.

Methods Simulations For each force field considered, a total of 10 ls of molecular dynamics simulation of ubiquitin were run. All simulations were run with Gromacs 4.6.5[41,42] on machines with 8 CPU cores and one NVIDIA Tesla K20 GPU. Rather than running one long simulation, this aggregate simulation length was achieved by running 20 independent 500 ns simulations. Based on previous work, the analyses of these data should be equivalent to

Figure 1. Structures of ubiquitin and E. coli RNase H.

a single 10 ls simulation.[43] The single starting conformation used for all of these simulations was generated by solvating the crystallographic structure of ubiquitin (PDB ID: 1UBQ[44]) with TIP3P water[45] in a dodecahedron box that extended 1 nm beyond the protein in any dimension. This system was energy minimized with each of the following force fields: Amber03,[46] Amber99sb-ILDN,[47] and Charmm27 [48] (Charmm22 with the CMAP correction[49]). Energy minimization was carried out with the steepest descent algorithm until the maximum force fell below 1000 kJ/mol/min using a step size of 0.01 nm and a cut-off distance of 1.2 nm for the neighbor list, Coulomb interactions, and Van der Waals interactions. The solvent was then equilibrated in a 1 ns simulation with a position restraint on all protein heavy atoms. All bonds were constrained with the LINCS algorithm[50] and virtual sites[51] were used to allow a 4 fs time step. Cut-offs of 1.1, 0.9, and 0.9 nm were used for the neighbor list, Coulomb interactions, and Van der Waals interactions, respectively. The Verlet cut-off scheme was used for the neighbor list and particle mesh Ewald[52] was employed for the electrostatics. The v-rescale thermostat[53] was used to hold the temperature at 300 K and the Berendsen barostat[54] was used to bring the system to 1 bar pressure. For the production runs, the position restraint was removed and the Parrinello–Rahman barostat[55] was employed. Similar settings were used to run 100 ls of simulation of RNase H (PDB: 1F21[56]), as described previously.[10] Missing residues in the crystal structures of each protein were added with Modeller[57] and structures were drawn with PyMOL.[58] Calculation of order parameters Order parameters were calculated from these molecular dynamics simulations using three variants of the Lipari–Szabo model-free approach.[59] This approach connects the spectral densities measured via NMR with the autocorrelation function for a unit vector pointing along some bond vector as CðtÞ ¼ hP2 ðl½0l½t Þi where P2 is the second-order Legendre polynomial (P2 ðx Þ ¼ 3 2 1 2 x 2 2), l½t  is a unit vector along a particular bond at time t, and h. . .i denotes an equilibrium ensemble average. In relaxation-dispersion experiments, this correlation function is measured out to the molecular tumbling time (sm ), which is Journal of Computational Chemistry 2016, 37, 558–566

559

FULL PAPER

WWW.C-CHEM.ORG

typically on the order of a few nanoseconds. The molecular tumbling times of ubiquitin and RNase H are about 3.5[40] and 8.5 ns,[60] respectively. Therefore, one can calculate corresponding order parameters from molecular dynamics simulations by aligning all the structures to some common reference structure, calculating CðtÞ from t ¼ 0 to t ¼ sm , and fitting the resulting curve to a single exponential using   CðtÞ ¼ S2 1 12S2 e2t=se

and standard deviation of the correlation coefficient and RMSD across these 50 random sets of trajectories were calculated. When evaluating the performance of simulations shorter than 500 ns, the beginning of randomly selected trajectories were used to mimic the results one would obtain by starting short simulations from the crystallographic structure.

Results and Discussion

2

where S is the generalized order parameter for a particular residue and se is the effective correlation time for motions of that residue relative to the protein. This method for calculating order parameters will be referred to as the full exponential fit. There are also two approximations to the full single exponential fit: the truncated average and long-time limit approximations. In the truncated average approximation, one simply calculates the average of the autocorrelation function near the molecular tumbling time using 1 S ¼ sm -ss 2

sðm

hP2 ðl½0l½tÞidt

ss

where the start time (ss ) to sm is the interval of correlation times the average is taken over. For results reported in the rest of this article, ss ¼ sm -1 ns. The results are basically insensitive to changing ss by 500 ps. In the long-time limit approximation, one assumes S2 ¼ lim hP2 ðl½0l½t Þi t!1

which yields[61] 3 S2 ¼ 2 h i 1 2 2 hlx i 1hl2y i 2 1hl2z i 2 12hlx ly i 2 12hlx lz i 2 12hly lz i 2 2 2 where lx is the x component of the unit vector l. Since this long-time limit can be applied to simulations that are too short to provide adequate statistics for any of the other methods and this approximation has been reported to give equivalent results to the single exponential fit, it has often been used in the literature. A correction factor of 0.9 that has been used to account for zero-point averaging in past work[62,63] did not improve the results and is not employed in this study. Bootstrapping was used to explore the reproducibility of order parameters calculated from different amounts of simulation. For example, to calculate the typical level of agreement between 1 ls of simulation and experiment, 50 sets of trajectories were generated, each consisting of two randomly selected 500 ns simulations. For each set of trajectories, order parameters were calculated using one of the methods outlined above and then the Pearson r correlation coefficient and RMSD between these calculated order parameters and the experimental values were calculated. Finally, the average 560

Journal of Computational Chemistry 2016, 37, 558–566

Hundreds of nanoseconds of simulation give variable results I first sought to assess the reproducibility of order parameters calculated from hundreds of nanosecond of molecular dynamics simulation using the long-time limit approximation, because this protocol is a common practice in the literature. While one might expect 100 ns of simulation to be adequate to capture processes on ps-ns timescales, the results for both side-chain methyl groups and backbone amides are actually quite variable. For example, for the side-chains, the correlation coefficients between simulation and experiment for 100 simulations generated with the bootstrapping approach described above ranged from 0.70 to 0.90 (mean of 0.82 6 0.06) when the Amber03 force field was used. Interestingly, this same distribution is reflected in other published data sets. For example, Kasinath et al.[29] recently reported correlation coefficients between experimental methyl group order parameters and order parameters they calculated from 112 to 260 ns simulations with the long-time limit approximation. These correlation coefficients range from 0.69 to 0.92 (mean of 0.83 6 0.08). The similarity of their results to my own is particularly remarkable given that different force fields and simulation software were used. This level of variation is significant because past studies benchmarking simulations with experimental data have interpreted differences in correlation coefficients as small as 0.1–0.2 to support claims that one force field is preferable over another. While such differences are not the sole basis for these conclusions, greater reproducibility of calculated order parameters will allow more stringent comparisons between different force fields and simulation methods. Unfortunately, switching to a different method for calculating order parameters without changing the simulation length does not yield more precise results. For example, using the full single exponential fit actually increases the variability for 100 ns simulations, yielding correlation coefficients ranging from 0.63 to 0.93 (mean of 0.84 6 0.08) for the side chains. The truncated average method for calculating order parameters provides a moderate improvement, yielding correlation coefficients ranging from 0.83 to 0.93 (mean 0.88 6 0.03). In fact, this approximation gives equivalent results to the full exponential fit for long simulations and generally gives more robust results for shorter simulations without a significant loss of accuracy. This difference probably arises from sensitivity of the fitting procedure used for the full exponential fit. Therefore, I focus on the truncated average and long-time limit approximations for the rest of this article. WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

FULL PAPER

Figure 2. Correlation coefficients (r) and RMSDs between experimental order parameters for ubiquitin and those calculated from simulations with the Amber03 force field. Results for side-chain methyl groups are on the left (A and B) and results for backbone amides are on the right (C and D). Mean values from 50 bootstrapped samples are shown as a function of the simulation length for the long-time limit approximation (open diamonds) and from the truncated average approximation (closed circles). The error bars represent one standard deviation.

Microseconds of simulation provide greater accuracy and precision To explore the possibility that longer simulations might improve the reproducibility of calculated order parameters, distributions of correlation coefficients and RMSDs between simulation and experiment were calculated for simulations ranging from 10 ns to 10 ls in aggregate length. Figure 2 shows the results for both the long-time limit and truncated average approximations for ubiquitin’s side-chain methyl groups and backbone amides, using the Amber03 force field. The most important result of this comparison is that 10 ls of simulation provide significantly improved results for the

side chains, both in terms of accuracy and precision. For example, the correlation coefficient between experimental methyl group order parameters and those calculated from 10 ls simulations using the truncated average approximation ranges from 0.90 to 0.92 (mean of 0.91 6 0.004, Fig. 2A). In addition to having a much tighter distribution than the correlation coefficients for shorter simulations, this level of agreement is an improvement over a previously reported correlation coefficient of 0.81 from 50 ns of simulation of ubiquitin with the Amber99sb force field.[64] A similar improvement is observed in the distributions of RMSDs between methyl group order parameters derived from simulation and experiment (Fig. 2B). Journal of Computational Chemistry 2016, 37, 558–566

561

FULL PAPER

WWW.C-CHEM.ORG

Figure 3. Correlation coefficients (r) and RMSDs between experimental backbone order parameters for RNase H and those calculated from simulations with the Amber03 force field. Mean values from 50 bootstrapped samples are shown as a function of the simulation length for the long-time limit approximation (open diamonds) and from the truncated average approximation (closed circles). The error bars represent one standard deviation.

Backbone order parameters calculated from 10 ls of simulation are also more precise than those calculated from shorter simulations, though they are not significantly more accurate (Figs. 2C and 2D). Somewhat surprisingly, 100 ns simulations tend to yield the highest correlation coefficients with experiment, consistent with previous work comparing simulations with residual dipolar couplings.[22] However, the RMSD between simulation and experiment is almost constant (Fig. 2D). The variation in the correlation coefficient probably results from the fact that many of the residues have very similar backbone order parameters—the average backbone order parameter is 0.84 6 0.06—so the correlation coefficient is sensitive to small changes in the calculated order parameters. 562

Journal of Computational Chemistry 2016, 37, 558–566

Notably, the long-time limit approximation gives worse results for longer simulations (Fig. 2). The results for 10 and 100 ns simulations are reasonably similar for both the backbone and side chains, as judged by both the correlation coefficient and RMSD. However, the agreement between simulation and experiment decreases as the simulation length is extended from 100 ns to 1 ls. This is due to the fact that the long-time limit includes long correlation times that are not included in the full single exponential fit and truncated average, which are truncated at the molecular tumbling time. These slower processes should be captured in residual dipolar coupling experiments, so applying the long-time limit approximation to microsecond timescale simulations might yield agreement with such experiments.[24] However, reproducibly predicting order parameters for a microsecond–millisecond timescale process could require ms-s of simulation given the present results. Regardless, these results clearly demonstrate that applying the truncated average approximation to microsecond timescale simulations gives better results than the long-time limit approximation applied to simulations of any length. So, while the long-time limit was of great utility while microsecond timescale simulations were beyond reach, the truncated average or full exponential fit should now be used instead. To test the generality of these results, similar calculations were performed for RNase H. Figure 3 demonstrates that this analysis yields trends that are very similar to those observed for ubiquitin. For example, the accuracy and precision of order parameters calculated with the truncated average approximation increase with increasing simulation length, while the performance of the long-time limit degrades as more data are used. Calculating order parameters from the full 100 ls data set with the truncated average approximation yields a correlation coefficient of 0.90. This result demonstrates that the correlation coefficient continues to increase beyond the 10-ls scale shown in Figure 3. It also suggests unrealistic conformational changes are not occurring on longer timescales and that the results for ubiquitin might also improve with additional simulation time. The fact that the correlation coefficient between simulation and experiment is higher for RNase H’s backbone than for ubiquitin’s is probably the result of the grater spread in the experimental values for RNase H—0.90 6 0.14 for RNase H compared to 0.84 6 0.06 for ubiquitin. The greater spread makes the correlation coefficient less sensitive to small variations. These results present an interesting challenge for advanced sampling techniques. Many enhanced sampling algorithms work by modifying the underlying potential or temperature. While the simulation results can be reweighted to obtain correct thermodynamics, there are limited possibilities for obtaining accurate kinetics. Without accurate kinetics, the full single exponential fit and truncated average cannot be used to calculate order parameters. While the long-time limit approximation can be used, the fact that it does not perform as well as the other methods means that conclusions drawn using this approximation may not be reliable. Furthermore, without accurate kinetics, it is impossible to judge whether a set of WWW.CHEMISTRYVIEWS.COM

FULL PAPER

WWW.C-CHEM.ORG

Table 2. Averages and standard deviations of order parameters from experiments and each force field for both side-chain methyls and backbone amides. Source Experiment Amber03 Amber99sb-ILDN Charmm27

Figure 4. Experimental order parameters for ubiquitin and those calculated from 10 ls of simulation with each of the force fields considered using the truncated average approximation. The circles represent the actual data points and they are connected with lines to help highlight trends. The vertical lines for the side-chains result from residues with two methyl groups.

simulations is capturing a slow process, which will yield particularly poor results when the long-time limit approximation is applied. Three modern force fields give similar results The increased precision of order parameters calculated from microseconds of simulation provides the opportunity to compare the performance of different force fields. To make such a comparison, the calculations presented above were repeated with the Amber99sb-ILDN and Charmm27 force fields. Given the results above, 10 ls of simulation were run for each force field and order parameters were calculated using the truncated average approximation to obtain reproducible results. Comparing the side chain and backbone order parameters for each force field shows they all follow the same trends (Fig. 4). This conclusion is supported by the high correlation coefficients between order parameters calculated from the different force fields (Table 1). Furthermore, all three force fields tend to





0.49 6 0.21 0.61 6 0.22 0.63 6 0.21 0.63 6 0.24

0.84 6 0.06 0.84 6 0.16 0.85 6 0.16 0.86 6 0.14

overestimate the side chain order parameters. This trend is confirmed by the average order parameters in Table 2. There are two possible explanations for this trend. One possible explanation is that all three force fields are overly stiff. However, it is also possible that even longer simulations would discover new states and converge on order parameters that are in better agreement with experiment. To better understand the differences between these alternative force fields, each was compared with experiment as a function of the aggregate amount of simulation (Figs. 5 and 6). This analysis shows that Charmm27 gives somewhat more reproducible results than Amber03 for 100 ns simulations—the correlation coefficient with side-chain methyl group order parameters from experiment ranges from 0.79 to 0.91 (mean of 0.85 6 0.04), compared to 0.70–0.90 (mean of 0.82 6 0.06) for Amber03. As with Amber03, the results of longer simulations are less variable. However, the difference between the long-time limit and truncated average approximations is not as great. At microsecond timescales, the long-time limit even yields a higher correlation coefficient for the backbone. However, the truncated average approximation is probably still preferable since it gives less variable results, a lower RMSD to experiment, and performs better for ubiquitin’s side-chains and RNase H’s backbone. Charmm27 also yields higher correlation coefficients for the backbone. However, the RMSD between simulation and experiment is comparable to Amber03. Taken in combination with the similar trends observed in Fig. 4, I cannot conclude that either Amber03 or Charmm27 is clearly preferable over the other. Amber99sb-ILDN does not appear to offer an obvious advantage either. It offers higher correlations with experiment for the backbone than Amber03 but performs slightly less well than Amber03 or Charmm27 for the side chains. The RMSDs between simulation and experiment are also very similar to those for Amber03 and Charmm27.

Conclusions Table 1. Correlation coefficients between order parameters for both sidechain methyls and backbone amides calculated by applying the truncated average approximation to 10 ls simulations run with different force fields. Force fields Amber03/Amber99sb-ILDN Amber03/Charmm27 Amber99sb-ILDN/Charmm27

Side-chain

Backbone

0.91 0.85 0.92

0.96 0.94 0.97

The results presented above demonstrate that order parameters calculated from microseconds of simulation are more accurate and more precise than those calculated from 10 to 100 ns simulations, despite the fact that these order parameters capture ps-ns timescale motions. The fact that 100 ls of simulation provide even better results than 1–10 ls of simulation suggests that even longer simulations might provide an additional benefit. The results of shorter simulations are highly Journal of Computational Chemistry 2016, 37, 558–566

563

FULL PAPER

WWW.C-CHEM.ORG

Figure 5. Correlation coefficients (r) and RMSDs between experimental order parameters for ubiquitin and those calculated from simulations with the Charmm27 force field. Results for side-chain methyl groups are on the left (A and B) and results for backbone amides are on the right (C and D). Mean values from 50 bootstrapped samples are shown as a function of the simulation length for the long-time limit approximation (open diamonds) and from the truncated average approximation (closed circles). The error bars represent one standard deviation.

variable, which may explain some of the variability in agreement between simulation and experiment observed for different systems and different force fields reported in the literature. Realizing the improvement provided by longer simulations requires using the full exponential fit or truncated average methods rather than the long-time limit. Interestingly, the amount of simulation and method of calculating order parameters seem to impact the agreement between simulation and experiment more than the choice of force field. These results imply that much longer simulations are required in many other settings as well. For example, if tens of microseconds of simulation are required to precisely calculate processes that occur on ps-ns timescales, then a millisec564

Journal of Computational Chemistry 2016, 37, 558–566

ond of simulation is unlikely to be sufficient to describe a complicated process that occurs on a millisecond timescale, such as protein folding or conformational changes involved in signaling. These findings have a number of implications for advanced sampling techniques. First, modern force fields are accurate enough that there is definitely something to be gained by running molecular simulations for longer periods of time. However, obtaining accurate and precise order parameters requires having kinetic information, and many advanced sampling techniques sacrifice kinetics to achieve enhanced sampling. Therefore, we need new ways to extract accurate kinetics from these methods. WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

FULL PAPER

Figure 6. Correlation coefficients (r) and RMSDs between experimental order parameters for ubiquitin and those calculated from simulations with the Amber99sb-ILDN force field. Results for side-chain methyl groups are on the left (A and B) and results for backbone amides are on the right (C and D). Mean values from 50 bootstrapped samples are shown as a function of the simulation length for the long-time limit approximation (open diamonds) and from the truncated average approximation (closed circles). The error bars represent one standard deviation.

Acknowledgment I hold a Career Award at the Scientific Interface from the Burroughs Wellcome Fund and am also grateful to R.B. Fenwick, J.S. Fraser, K.M. Hart, and P.E. Wright for helpful comments. Keywords: conformational heterogeneity  molecular dynamics  order parameters  enhanced sampling  force fields

How to cite this article: G. R. Bowman. J. Comput. Chem. 2016, 37, 558–566. DOI: 10.1002/jcc.23973

[1] M. Shirts, V. S. Pande, Science 2000, 290, 1903. [2] D. Shaw, R. Dror, J. Salmon, J. P. Grossman, K. M. Mackenzie, J. A. Bank, C. Young, M. M. Deneroff, B. Batson, K. J. Bowers, M. P. Eastwood, D. J. Ierardi, J. L. Klepeis, J. S. Kuskin, R. H. Larson, K. Lindorff-Larsen, P. Maragakis, M. A. Moraes, S. Piana, Y. Shan, B. Towles, In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, Portland, Oregon, 2009. [3] L. C. T. Pierce, R. Salomon-Ferrer, C. A. F. de Oliveira, J. A. McCammon, R. C. Walker, J. Chem. Theory Comput. 2012, 8, 2997. [4] L. Zheng, M. Chen, W. Yang, Proc. Natl. Acad. Sci. USA 2008, 105, 20227. [5] V. A. Voelz, G. R. Bowman, K. Beauchamp, V. S. Pande, J. Am. Chem. Soc. 2010, 132, 1526. [6] G. R. Bowman, V. A. Voelz, V. S. Pande, J. Am. Chem. Soc. 2011, 133, 664.

Journal of Computational Chemistry 2016, 37, 558–566

565

FULL PAPER

WWW.C-CHEM.ORG

[7] K. Lindorff-Larsen, S. Piana, R. O. Dror, D. E. Shaw, Science 2011, 334, 517. [8] A. Ivetac, J. A. McCammon, Chem. Biol. Drug Des. 2010, 76, 201. [9] B. J. Grant, S. Lukman, H. J. Hocker, J. Sayyah, J. H. Brown, J. A. McCammon, A. A. Gorfe, PLoS ONE 2011, 6, e25711. [10] G. R. Bowman, P. L. Geissler, Proc. Natl. Acad. Sci. USA 2012, 109, 11681. [11] G. R. Bowman, E. R. Bolin, K. M. Hart, B. C. Maguire, S. Marqusee, Proc. Natl. Acad. Sci. USA 2015, 112, 2734. [12] D.-A. Silva, G. R. Bowman, A. Sosa-Peinado, X. Huang, PLoS Comput. Biol. 2011, 7, e1002054. [13] K. J. Kohlhoff, D. Shukla, M. Lawrenz, G. R. Bowman, D. E. Konerding, D. Belov, R. B. Altman, V. S. Pande, Nat. Chem. 2014, 6, 15. [14] R. Nygaard, Y. Zou, R. O. Dror, T. J. Mildorf, D. H. Arlow, A. Manglik, A. C. Pan, C. W. Liu, J. J. Fung, M. P. Bokoch, F. S. Thian, T. S. Kobilka, D. E. Shaw, L. Mueller, R. S. Prosser, B. K. Kobilka, Cell 2013, 152, 532. [15] J. J. Skinner, W. Yu, E. K. Gichana, M. C. Baxa, J. R. Hinshaw, K. F. Freed, T. R. Sosnick, Proc. Natl. Acad. Sci. USA 2014, 111, 15975. [16] S. Piana, K. Lindorff-Larsen, D. E. Shaw, Biophys. J. 2011, 100, L47. [17] R. B. Fenwick, H. van den Bedem, J. S. Fraser, P. E. Wright, Proc. Natl. Acad. Sci. USA 2014, 111, E445. [18] J. R. Tolman, K. Ruan, Chem. Rev. 2006, 106, 1720. [19] O. F. Lange, N.-A. Lakomek, C. Fare`s, G. F. Schr€ oder, K. F. A. Walter, S. Becker, J. Meiler, H. Grubm€ uller, C. Griesinger, B. L. de Groot, Science 2008, 320, 1471. [20] J. Xia, N.-J. Deng, R. M. Levy, J. Phys. Chem. B 2013, 117, 6625. [21] P. Maragakis, K. Lindorff-Larsen, M. P. Eastwood, R. O. Dror, J. L. Klepeis, I. T. Arkin, M. Ø. Jensen, H. Xu, N. Trbovic, R. A. Friesner, A. G. P. Iii, D. E. Shaw, J. Phys. Chem. B 2008, 112, 6155. [22] O. F. Lange, D. van der Spoel, B. L. de Groot, Biophys. J., 2010, 99, 647. [23] K. Lindorff-Larsen, P. Maragakis, S. Piana, M. P. Eastwood, R. O. Dror, D. E. Shaw, PLoS ONE 2012, 7, e32131. [24] G. R. Bowman, P. L. Geissler, J. Phys. Chem. B 2014, 118, 6417. [25] P. R. L. Markwick, G. Bouvignies, M. Blackledge, J. Am. Chem. Soc. 2007, 129, 4724. [26] K. Lindorff-Larsen, R. B. Best, M. A. DePristo, C. M. Dobson, M. Vendruscolo, Nature 2005, 433, 128. [27] K. H. DuBay, G. R. Bowman, P. L. Geissler, Acc. Chem. Res. 2015, 48, 1098. [28] M. S. Marlow, J. Dogan, K. K. Frederick, K. G. Valentine, A. J. Wand, Nat. Chem. Biol. 2010, 6, 352. [29] V. Kasinath, K. A. Sharp, A. J. Wand, J. Am. Chem. Soc. 2013, 135, 15092. [30] K. A. Henzler-Wildman, M. Lei, V. Thai, S. J. Kerns, M. Karplus, D. Kern, Nature 2007, 450, 913. [31] J. Chen, C. L. Brooks, P. E. Wright, J. Biomol. NMR 2004, 29, 243. [32] J. L. Radkiewicz and C. L Brooks, J. Am. Chem. Soc. 1999, 122, 225. [33] K. A. Stafford, A. G. Palmer, III, F1000Res 2014, 3, 67. [34] S. Lindert, P. M. Kekenes-Huskey, G. Huber, L. Pierce, J. A. McCammon, J. Phys. Chem. B 2012, 116, 8449. [35] B. Fuglestad, P. M. Gasper, J. A. McCammon, P. R. L. Markwick, E. A. Komives, J. Phys. Chem. B 2013, 117, 12857. [36] V. Hornak, R. Abel, A. Okur, B. Strockbine, A. Roitberg, C. Simmerling, Proteins 2006, 65, 712. [37] D. Long, D.-W. Li, K. F. A. Walter, C. Griesinger, R. Br€ uschweiler, Biophys. J. 2011, 101, 910.

566

Journal of Computational Chemistry 2016, 37, 558–566

[38] R. D. Malmstrom, C. T. Lee, A. T. Van Wart, R. E. Amaro, J. Chem. Theory Comput. 2014, 10, 2648. [39] N. Tjandra, S. E. Feller, R. W. Pastor, A. Bax, J. Am. Chem. Soc. 1995, 117, 12562. [40] A. L Lee, A. P. F Flynn, A. J. Wand, J. Am. Chem. Soc. 1999, 121, 2891. [41] D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, H. J. C. Berendsen, J. Comput. Chem. 2005, 26, 1701. [42] S. Pronk, S. Pall, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, E. Lindahl, Bioinformatics 2013, 29, 845. [43] G. R. Bowman, D. Ensign, V. Pande, J. Chem. Theory Comput. 2010, 6, 787. [44] S. Vijay-Kumar, C. E. Bugg, W. J. Cook, J. Mol. Biol. 1987, 194, 531. [45] W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, M. L. Klein, J. Chem. Phys. 1983, 79, 926. [46] Y. Duan, C. Wu, S. Chowdhury, M. C. Lee, J. Comput. Chem. 2003, 24, 1999. [47] K. Lindorff-Larsen, S. Piana, K. Palmo, P. Maragakis, J. L. Klepeis, R. O. Dror, D. E. Shaw, Proteins 2010, 78, 1950. [48] A. D. MacKerell, D. Bashford, M. Bellott, R. L. Dunbrack, J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wi orkiewicz-Kuczera, D. Yin, M. Karplus, J. Phys. Chem. B 1998, 102, 3586. [49] A. D. Mackerell, M. Feig, C. L. Brooks, J. Comput. Chem. 2004, 25, 1400. [50] B. Hess, J. Chem. Theory Comput. 2007, 4, 116. [51] K. A. Feenstra, B. Hess, H. J. C. Berendsen, J. Comput. Chem. 1999, 20, 786. [52] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, L. G. Pedersen, J. Chem. Phys. 1995, 103, 8577. [53] G. Bussi, D. Donadio, M. Parrinello, J. Chem. Phys. 2007, 126, 014101. [54] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak, J. Chem. Phys. 1984, 81, 3684. [55] M. Parrinello, A. Rahman, J. Appl. Phys. 1981, 52, 7182. [56] E. R. Goedken, J. L. Keck, J. M. Berger, S. Marqusee, Protein Sci. 2000, 9, 1914. [57] B. Webb, A. Sali, Curr. Protoc. Bioinformatics 2014, 47, 5.6.1. [58] W. L. DeLano, The PyMOL Molecular Graphics System, Version 1.6, Schr€ odinger, LLC, 2002. [59] G. Lipari, A. Szabo, J. Am. Chem. Soc. 1982, 104, 4546. [60] A. M. Mandel, M. Akke, A. G. Palmer, III, J. Mol. Biol. 1995, 246, 144. [61] D. C. Chatfield, A. Szabo, B. R. Brooks, J. Am. Chem. Soc. 1998, 120, 5301. [62] D. A. Case, J. Biomol. NMR 1999, 15, 95. [63] G. M. Clore, C. D. Schwieters, Biochemistry 2004, 43, 10678. [64] S. A. Showalter, E. Johnson, M. Rance, R. Br€ uschweiler, J. Am. Chem. Soc. 2007, 129, 14146.

Received: 16 March 2015 Revised: 3 May 2015 Accepted: 24 May 2015 Published online on 16 June 2015

WWW.CHEMISTRYVIEWS.COM

Accurately modeling nanosecond protein dynamics requires at least microseconds of simulation.

Advances in hardware and algorithms have greatly extended the timescales accessible to molecular simulation. This article assesses whether such long t...
790KB Sizes 1 Downloads 8 Views