An accurate and efficient computation method of the hydration free energy of a large, complex molecule Takashi Yoshidome, Toru Ekimoto, Nobuyuki Matubayasi, Yuichi Harano, Masahiro Kinoshita, and Mitsunori Ikeguchi Citation: The Journal of Chemical Physics 142, 175101 (2015); doi: 10.1063/1.4919636 View online: http://dx.doi.org/10.1063/1.4919636 View Table of Contents: http://scitation.aip.org/content/aip/journal/jcp/142/17?ver=pdfcov Published by the AIP Publishing Articles you may be interested in Combine umbrella sampling with integrated tempering method for efficient and accurate calculation of free energy changes of complex energy surface J. Chem. Phys. 141, 044108 (2014); 10.1063/1.4887340 Computation of the free energy due to electron density fluctuation of a solute in solution: A QM/MM method with perturbation approach combined with a theory of solutions J. Chem. Phys. 140, 134111 (2014); 10.1063/1.4870037 Comparison of volume and surface area nonpolar solvation free energy terms for implicit solvent simulations J. Chem. Phys. 139, 044119 (2013); 10.1063/1.4816641 Communication: Free-energy analysis of hydration effect on protein with explicit solvent: Equilibrium fluctuation of cytochrome c J. Chem. Phys. 134, 041105 (2011); 10.1063/1.3535560 Revisiting the finite temperature string method for the calculation of reaction tubes and free energies J. Chem. Phys. 130, 194103 (2009); 10.1063/1.3130083

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

THE JOURNAL OF CHEMICAL PHYSICS 142, 175101 (2015)

An accurate and efficient computation method of the hydration free energy of a large, complex molecule Takashi Yoshidome,1,a) Toru Ekimoto,1 Nobuyuki Matubayasi,2,3 Yuichi Harano,4 Masahiro Kinoshita,5 and Mitsunori Ikeguchi1,b) 1

Graduate School of Medical Life Science, Yokohama City University, 1-7-29, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan 2 Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan 3 Elements Strategy Initiative for Catalysts and Batteries, Kyoto University, Katsura, Kyoto 615-8520, Japan 4 Faculty of Pharmaceutical Sciences, Himeji Dokkyo University, 7-2-1 Kamino, Himeji-shi, Hyogo 670-8524, Japan 5 Institute of Advanced Energy, Kyoto University, Uji, Kyoto 611-0011, Japan

(Received 7 January 2015; accepted 21 April 2015; published online 7 May 2015) The hydration free energy (HFE) is a crucially important physical quantity to discuss various chemical processes in aqueous solutions. Although an explicit-solvent computation with molecular dynamics (MD) simulations is a preferable treatment of the HFE, huge computational load has been inevitable for large, complex solutes like proteins. In the present paper, we propose an efficient computation method for the HFE. In our method, the HFE is computed as a sum of ⟨UUV⟩/2 (⟨UUV⟩ is the ensemble average of the sum of pair interaction energy between solute and water molecule) and the water reorganization term mainly reflecting the excluded volume effect. Since ⟨UUV⟩ can readily be computed through a MD of the system composed of solute and water, an efficient computation of the latter term leads to a reduction of computational load. We demonstrate that the water reorganization term can quantitatively be calculated using the morphometric approach (MA) which expresses the term as the linear combinations of the four geometric measures of a solute and the corresponding coefficients determined with the energy representation (ER) method. Since the MA enables us to finish the computation of the solvent reorganization term in less than 0.1 s once the coefficients are determined, the use of the MA enables us to provide an efficient computation of the HFE even for large, complex solutes. Through the applications, we find that our method has almost the same quantitative performance as the ER method with substantial reduction of the computational load. C 2015 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4919636]

I. INTRODUCTION

Hydration is an imperative factor to discuss a variety of chemical processes in aqueous solutions such as micelle formation, protein folding and aggregation, and binding of drug molecules. In the thermodynamic point of view, it is characterized by the hydration free energy (HFE). Constructing a computation method of the HFE is, thus, a fundamental issue to discuss the chemical processes. In addition to quantitative performance, the method must provide an efficient way for computing the HFE because computation of the HFE for large, complex solutes such as proteins and their aggregations is required for discussing the chemical processes. Although implicit solvent models enable us to compute the HFE with minor effort, they do not take into explicit account the solvent structures around solute and the excluded volume (EV) effect.1 They employ simplified calculations with empirical parameters2–4 and are at the stage a)Present address: Department of Applied Physics, Graduate School

of Engineering, Tohoku University, 6-6-05, Aoba, Aramaki, Aoba-ku, Sendai 980-8579, Japan.

b)Author to whom correspondence should be addressed. Electronic mail:

[email protected] 0021-9606/2015/142(17)/175101/11/$30.00

of active development toward improving the quantitative performance.5 On the other hand, explicit solvent models can naturally incorporate the atomic details and the EV effect in the HFE computation. The thermodynamic integration (TI) and free-energy perturbation (FEP) methods6 are the standard methods for computing the HFE using molecular simulations. In these methods, the HFE can be calculated exactly within a force field used. However, molecular simulations of many intermediate states connecting the initial (reference system containing only water molecules) and final (solution system consisting of water and solute) states are required to obtain the HFE. This leads to too much computational demand for large, complex solutes like proteins. The reference interaction site model (RISM) and related theories7 are approximated statistical thermodynamics theories of molecular liquids. In these theories, the HFE can be calculated only with the distribution functions of solution and reference systems. They do not require any molecular simulations. In spite of their efficiency, their quantitative performance is poor3,8 due to the approximations employed in the theories. Thus, a novel approach is required to construct a computational method for the HFE, which satisfies both of efficiency and quantitative performance.

142, 175101-1

© 2015 AIP Publishing LLC

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-2

Yoshidome et al.

J. Chem. Phys. 142, 175101 (2015)

Recently, two fascinating methods for computing the hydration thermodynamic quantities have been proposed. One is the method of energy representation (ER method).9–12 In the method, computation of the HFE is possible using the data obtained through the two molecular dynamics (MD) simulations of the solution and reference systems. Since no intermediate states are required, a large reduction of computation is achieved. The ER method exploits the following approximated free-energy functional to compute the HFE, ∆µ:  ∆µ = duUVuUV ρe (uUV) ′ − F[ρe (uUV), ρe0 (uUV), χ0(uUV,uUV )],

(1)

where F is an approximated functional whose form is described in Refs. 10–13, uUV is the pair interaction energy between solute and water molecule, ρe (uUV) and ρe0 (uUV) are the energy distribution functions of solution and reference ′ systems, respectively, and χ0(uUV,uUV ) is the correlation matrix of the reference system. The energy distribution function and the correlation matrix of the reference system are obtained through the test-particle insertion of the solute into the reference system. It has shown that the ER method can quantitatively reproduce exact values of the HFE of the amino-acid analogs calculated by the TI.3 Thus the ER method provides more efficient computation than the TI, and almost the same accuracy as it. The method has also been applied to investigations of the HFE of proteins.4,13,14 To make the energetics right, however, the number of water molecules needs to be large enough to account well for the long-range interaction. The typical number of water in a simulation box is ∼105 for large, complex solutes like proteins, which demands heavy computational load for energetics including the HFE calculation in the ER method. Thus, improvements would be desirable to apply the ER method to large, complex solutes. The other is the morphometric approach (MA) which enables us to calculate the thermodynamic quantities that are primarily governed by the geometry of solute.15,16 One of them is the hydration entropy (HE). It has shown that the HE is almost independent of the solute-solvent interactions,17,18 indicating that it is almost determined by the geometry of the solute. In the MA, such a quantity is expressed using only four geometric measures with a fixed structure and corresponding coefficients. The resultant form of the quantity Z is Z = C1Vex + C2 A + C3 X + C4Y.

(2)

Here, Vex is the EV, A is the water-accessible surface area (ASA), and X and Y are the integrated mean and Gaussian curvatures of the accessible surface, respectively, and they form the four geometric measures. The accessible surface is the surface that is accessible to the centers of water molecules.19 The EV is the volume that is enclosed by the accessible surface.19 We note that the solute shape enters Z only via the four geometric measures. Thus, once the coefficients (C1, C2, C3, and C4) in Eq. (2) are determined, Z of a solute with any structure is obtained only from the four geometric measures. In previous studies, the four coefficients are determined using such data as the values of Z for hard-sphere solutes with various sizes, which are obtained

through the integral equation theory or its angle-dependent version,16,20 and experimental data of small solutes.21 It has shown that the experimentally measured changes in the thermodynamic quantities upon folding of apoplastocyanin can well be reproduced20 using a hybrid of the angle-dependent integral equation theory8 and the MA. Since the computation of the solvation entropy of one structure of protein G is finished in less than 0.1 s on a small personal computer,16 the MA can be applied to very large proteins such as F1-ATPase.22 Thus, fast and accurate computation of the HE is possible using the MA. The application to the computation of the HFE of hydrophobic solutes is also possible.23,24 However, the MA cannot be applied to the HFE of a solute with long-range solute-water pair interactions because the hydration energy strongly depends on the interactions.18 In the present study, we propose a novel computation method of the HFE using a hybrid of the ER method and the MA. In our method, ∆µ is expressed as the sum of two terms: one of them, referred to as ∆µDirect, is ⟨UUV⟩/2 where ⟨UUV⟩ is the ensemble average of the sum of pair interaction energy between solute and water molecule at a fixed conformation of the solute; and the other term, referred to as ∆µIndirect, is primarily determined by the geometry of a solute. In order to reduce the computational load of ∆µ, we propose that ∆µIndirect is calculated by the MA. The coefficients are determined using the data of ∆µIndirect obtained through the ER method. Since the computation of ∆µIndirect is finished in less than 0.1 s once the coefficients are determined, the use of the MA leads to a reduction of computation load of ∆µ compared to the TI, the FEP, and the ER methods and to applications of our method to large, complex solutes. Thus, our method resolves the issues of both of the ER method and the MA: it enables us to apply the MA to the computation of ∆µ and computational load of ∆µ of our method is smaller than that of the ER method. We show that our method has almost the same quantitative performance as the ER method by comparing the results with those obtained through the ER method. For example, we calculate the values of the HFE of the protein conformations. The deviation of the values of the HFE obtained by our method from those by the ER method is found to be less than 2%. Thus our method satisfies both of efficiency and quantitative performance. The efficiency enables us to calculate the values of the HFE for a set of protein conformations. In order to demonstrate this, we apply our method to the discrimination of the native structure of a protein from misfolded decoy structures. We find that the discrimination is successful.

II. STRATEGY FOR CALCULATION OF THE HYDRATION FREE ENERGY

Our strategy of computing the HFE is based on the results obtained by Karino and Matubayasi.14 Using all-atom MD simulations coupled with the ER method, they calculated ∆µ for a set of conformations of cytochrome c sampled in the course of equilibrium fluctuation. They found the following relation: ∆µ ≃

⟨UUV⟩ + constant, 2

(3)

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-3

Yoshidome et al.

J. Chem. Phys. 142, 175101 (2015)

where the constant is almost invariable over the set of conformations examined. The parallelism between ∆µ and ⟨UUV⟩/2 can be understood as follows. ⟨UUV⟩ is composed of the electrostatic and van der Waals energies between protein and water. We focus on the former energy in the discussion because it is the principal component of ⟨UUV⟩. In the linearresponse-type approximation,25,26 the free-energy change for placing charge q into a solute, ∆µq , is a half of the change in the solute-water interaction energy upon placing q, ∆UUV,q , ∆µq =

∆UUV,q . 2

(4)

∆µq can also be written as ∆µq = ∆UUV,q + ∆UVV,q − T∆SV,q ,

(5)

where ∆UVV,q and ∆SV,q are the water-water energy change and water-entropy change upon placing q, respectively, and T is the absolute temperature. Equations (4) and (5) yield ∆UUV,q = −(∆UVV,q − T∆SV,q ). (6) 2 We note that the right-hand side of Eq. (6) represents the free-energy change arising from the water reorganization accompanying the insertion of charge q into a solute. We can thus conclude that the compensation between a half of ∆uUV,q and the free-energy change arising from the water reorganization leads to Eq. (4). From the discussion, ⟨UUV⟩/2 in Eq. (3) can be the free energy arising from the direct interactions between protein and water. We denote this free energy as ∆µDirect hereafter. From the discussion above, the difference ∆µ − ∆µDirect is essentially independent of the protein-water interaction potentials and is mainly determined by the geometry of the protein. In the difference, indirect interactions such as the EV effect1 are involved. We thus denote the difference as ∆µIndirect hereafter. Since ∆µDirect can readily be computed through an MD of a solution system, a computation of ∆µIndirect with minor effort leads to a reduction of the computation load of ∆µ. In the present study, we assume that it is described by the MA,15,16 ∆µIndirect = C1Vex + C2 A + C3 X + C4Y.

(7)

The four coefficients are determined as follows; the values of ∆µIndirect for solutes with various sizes are obtained using MD simulations coupled with the ER method (see Eq. (10) in Sec. III A); we calculate the geometric measures for each of the solutes by means of the extension16 of Connolly’s algorithm;27,28 and the four coefficients are determined using the least squares fitting to Eq. (7). As discussed in Secs. III and IV, the values of ∆µIndirect obtained through the ER method can well be described by the MA. This indicates that ∆µIndirect for an arbitrary solute involving large, complex solutes can be obtained only through calculating the four geometric measures of the solute, once the four coefficients are determined. Since the computation time is less than 0.1 s, a substantial reduction of computation load is achieved. With the four coefficients of ∆µIndirect in the MA, the HFE of a solute is obtained in the following procedure. We first perform an MD simulation of a solution system with the

structure of the solute fixed. ⟨UUV⟩ is then exactly obtained through  ′ ′ ′ ⟨UUV⟩ = duUV uUV ρe (uUV ). (8) The four geometric measures are also calculated with the solute structure used in the MD simulation. Finally, ∆µ is obtained as ∆µ = ∆µDirect + ∆µIndirect.

(9)

We note that ∆µIndirect is computed within 0.1 s. Further, ∆µDirect is also readily computed from an MD of the solution system. Thus, our method provides an efficient computation of the HFE for large, complex solutes like proteins compared to the other methods coupled with MD simulations. Compared with the other methods combined with MD simulations, our method uses only MD data of solution system once the four coefficients in the MA are given. This suggests that if an MD of solution system was already conducted for such purposes as elucidation of mechanism of protein function, only additional computations of ⟨UUV⟩ and ∆µIndirect provide ∆µ. On the other hand, two MD for the ER method and many MD for the TI and FEP methods are required to obtain ∆µ. For example, in the ER method, MD data of reference system are also required to obtain ∆µ. As discussed in Sec. III, such the difference is also a reason that our method provides more efficient computation of ∆µ than the ER method. Through the performance tests using the protein conformations that are not used for the determination of the four coefficients, we demonstrate in Secs. III and IV that the present method has almost the same quantitative performance of ∆µ as the ER method. Thus, we succeed in reducing the computational demand with keeping quantitative performance. We refer our method to “ER-MA method,” hereafter, because it is a hybrid of the ER method and the MA. Finally, we note the difference of our decomposition scheme of ∆µ into ∆µDirect and ∆µIndirect from the one based on the Weeks-Chandler-Anderson (WCA) scheme.29 Since UUV is composed of the electrostatic and van der Waals (LennardJones (LJ)) interactions and the latter can be decomposed into purely attractive and repulsive interactions according to the WCA scheme, ∆µ can be decomposed into three components arising from the attractive and repulsive interactions of the LJ interaction and from the electrostatic interactions.29 In contrast, our decomposition scheme is based on the results obtained by Karino and Matubayasi:14 ∆µDirect is ⟨UUV⟩/2, and we define the difference ∆µ − ∆µDirect as ∆µIndirect. Since ⟨UUV⟩ involves both of attractive and repulsive interactions, our decomposition scheme is not based on the WCA scheme.

III. RESULTS

In this section, we perform two computations: one of them is to determine the four coefficients in the MA, and the other is to test quantitative performance of the ER-MA method. To concentrate on the discussion of usefulness of the ER-MA method, we only describe a brief summary of the computations, and their details are in the supplementary material.30

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-4

Yoshidome et al.

In all the computations, we employ solutes with total charge zero for the following reason. When ∆µ is computed using a method coupled with MD simulations, the systemsize dependence is essentially absent for the solute with no total charge,31 while the solute charge non-neutrality leads to a system-size dependence of ∆µ31–33 due to the periodic boundary conditions applied in MD simulations. Since the ER method, employed for determining the four coefficients in the MA, exploits the data of MD simulations, it is no exception. System-size dependence of ∆µ implies that the coefficients in the MA applied to ∆µIndirect would depend on the system size when solutes with nonzero total charge are employed for the determination. We therefore focus on the solutes with total charge zero in the present study. It should be noted that the present focus is arisen not from issues of the MA and the ER method but from those of MD simulations. A. Determination of the four coefficients of ∆µIndirect in the morphometric approach

To determine the four coefficients, we first obtain ∆µIndirect of 76 solutes using the ER method. The solutes have various sizes from small solutes like amino-acid analogs to large solutes like proteins. For the computation of ∆µIndirect using the ER method, MD simulations of solution and reference

J. Chem. Phys. 142, 175101 (2015)

systems are required. The MD of the solution systems are performed with the solute structures fixed. Namely, only the water molecules move. ∆µIndirect is obtained through ∆µIndirect = ∆µ −

⟨UUV⟩ . 2

(10)

We also calculate the four geometric measures of each of the solutes. In calculating them, each solute is modeled as a set of fused hard spheres. The diameter of each atom is set at the σ-value of the Lennard-Jones potential parameters of CHARMM22.34 That of water molecule is set at 2.8 Å in accordance with the previous studies focusing on the waterentropy effect.16,20,22,35,36 Finally, the four coefficients are determined by the least-square fitting applied to Eq. (7). The values of the thermodynamic quantities and the four geometric measures are summarized in the supplementary material.30 To confirm the performance of the fitting to the data for ∆µIndirect, we compare the values of ∆µIndirect obtained through the ER method and the MA. The comparison is illustrated in Figures 1(a) and 1(b). Figure 1(b) enlarges near the origin of Figure 1(a). We denote the values calculated from the ER method and the MA as ∆µIndirect,ER and ∆µIndirect,MA, respectively. Solute size becomes larger as the value of ∆µIndirect increases. Surprisingly, the agreement between the ER method and the MA is extremely good even for the pro-

FIG. 1. (a) Comparison between ∆µ Indirect,ER and ∆µ Indirect,MA for 76 solutes used for the determination of the four coefficients in the MA. (b) Magnification at around the origin of (a). The dotted line represents ∆µ Indirect,ER = ∆µ Indirect,MA. The values of the difference between ∆µ Indirect,ER and ∆µ Indirect,MA as a function of ∆µ Indirect,ER is illustrated in (c) (the proteins) and (d) (the other solutes), respectively.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-5

Yoshidome et al.

J. Chem. Phys. 142, 175101 (2015)

teins. Standard error of the fitting is 7.4 kcal/mol. Figures 1(c) and 1(d) show the values of the difference between ∆µIndirect,ER and ∆µIndirect,MA. In the figures, “Difference” is defined by ∆µIndirect,ER − ∆µIndirect,MA. The values of “Difference” for the proteins are in Figure 1(c), and the other solutes are in Figure 1(d), respectively. It follows from Figure 1(d) that most of the values are smaller than the standard error. As for the proteins, some of the values are larger than the standard error. However, the values of “Difference” are much smaller than those of ∆µIndirect,ER because the values of the deviation DIndirect defined by DIndirect ≡ 100 ×

∆µIndirect,ER − ∆µIndirect,MA ∆µIndirect,ER

(%)

(11)

are less than 6%. The present results strongly suggest that ∆µIndirect can well be described by the MA. They are surprising results because the theoretical basis is different between the MA and the ER method. Furthermore, computation of ∆µIndirect using the MA is finished in less than 0.1 s even for large, complex solutes like proteins once the four coefficients in the MA are given. Thus, the MA provides a more efficient computation of ∆µIndirect than the ER method.

that were slightly optimized in an implicit solvent in our previous study.35 Among the decoy structures, we find a decoy structure for which the value of the protein intramolecular energy, EC, is smaller than that for the native structure. We refer this decoy structure to “EC-min structure” hereafter. In addition to the native and EC-min structures, we also select the decoy structures whose root mean square displacement (RMSD) for Cα atoms from the native structure is closest to 4, 6, 8, 10, 12, 14, 16, 18, and 22 Å in accordance with the previous study.13 We name these “C4,” “C6,” “C8,” “C10,” “C12,” “C14,” “C16,” “C18,” and “C22” structures hereafter. Furthermore, we generate three decoy structures that have smaller values of EC than the native structure by minimizing the decoy structures in vacuum. The minimization is performed using the CHARMM biomolecular simulation program38 through the Multi-scale Modeling Tools in Structural Biology (MMTSB) program.39 Values of the Cα -RMSD for the generated structures from the native structure are 9.16 Å, 11.6 Å, and 13.7 Å, respectively. They are referred to as “EM9,”“EM11,” and “EM13” structures hereafter. In summary, the native and 13 decoy structures are employed for the performance tests.

B. Performance tests of the ER-MA method

1. Accuracy of the hydration free energy for the protein structures

We demonstrate usefulness of the ER-MA method through the two performance tests. As one of the tests, we calculate ∆µIndirect and ∆µ of the protein structures by the ERMA and ER methods to check whether the ER-MA method can quantitatively reproduce the values of ∆µIndirect and ∆µ obtained through the ER method. We also test whether the native structure of the protein can be discriminated from the misfolded decoy structures using the ER-MA method. Finally, computation time for ∆µ of a protein structure is compared between the ER-MA and ER methods. The protein we use for the two performance tests is transcription factor LFB1 (PDB code: 1lfb). We note that it is not used for the determination of the four coefficients of ∆µIndirect in the MA. Its native and 999 misfolded decoy structures are in the rosetta decoy set.37 We use the structures

Comparison of the values of ∆µIndirect and ∆µ obtained through the ER-MA and ER methods is discussed here. In the computation using the ER-MA method, MD simulations of the solution systems are performed with the protein structures fixed. Using the data obtained through the MD simulations, ∆µ is computed. In the computation of ∆µ using the ER method, we use the same MD data of the solution systems, and an MD of reference system is also performed. ∆µIndirect,ER is also computed through Eq. (10). Note that the four coefficients in Eq. (7) used here for 1lfb are taken from the fitting done for other solutes in Subsection III A; no information about 1lfb is employed in the calculation of ∆µIndirect,MA. The results are illustrated in Figures 2(a) (∆µIndirect) and 2(b) (∆µ), respectively. In the figures, we plot DIndirect (Eq. (11)) and DHFE defined by

FIG. 2. (a) DIndirect plotted as a function of ∆µ Indirect,ER and (b) DHFE plotted as a function of ∆µ ER. The protein is 1lfb.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-6

Yoshidome et al.

DHFE ≡ 100 ×

∆µER-MA − ∆µER (%), ∆µER

J. Chem. Phys. 142, 175101 (2015)

(12)

as a function of ∆µIndirect,ER and ∆µER, respectively. Here ∆µER-MA and ∆µER are the HFE obtained through the ER-MA and ER methods, respectively. The values of |DIndirect| and |DHFE| are less than 3.5% and 2%, respectively. The former values are less than the values of |DIndirect| of the proteins used for the fitting (≤6%), indicating the accuracy of the MA applied to ∆µIndirect. Thus, the ER-MA method has almost the same quantitative performance as the ER method. This is very encouraging because the ER-MA method enables us to achieve a reduction of the computation load with keeping quantitative performance.

2. Application to the discrimination of the native structure from misfolded decoy structures

The results of discriminating the native structure of 1lfb from the selected misfolded decoy structures are described here. We first discuss the free-energy function for the discrimination. The free-energy change upon inserting the protein with a fixed structure into water, ∆F, is ∆F = EC + ∆µ.

(13)

∆F should take the smallest value for the native structure. We do not take the protein conformational entropy, SC, into consideration in the following reason:36 Since both the native and decoy structures are very compact, it can be assumed that the difference in SC between the native structure and any decoy structure is much smaller than that in the other components. We use ∆F for the discrimination. The values of ∆µ obtained through the ER-MA method in Subsection III B 1 are used. Figure 3(a) shows the values of EC plotted as a function of Cα -RMSD from the native structure. It follows that EC is smaller for EC-min, EM9, EM11, and EM13 structures than for the native structure (denoted as “N” in the figure). This suggests that the factor that stabilizes the native structure is the HFE. The plot of ∆µER-MA is shown in Figure 3(b). ∆µER-MA is higher for EC-min, EM9, EM11, and EM13 structures than for the native structure. The values of ∆F are shown in Figure 3(c). It takes the lowest value for the native structure. Thus the native structure is successfully discriminated from the decoy structures used. We note that the differences of ∆F between the native structure and the decoy structures are larger than the standard error of ∆µIndirect, ∼7 kcal/mol. As shown in Figure 3(a), the discrimination fails if only EC is used because EC-min, EM9, EM11, and EM13 structures have lower values of EC than the native structure. The factor that stabilizes the native structure is ∆µER-MA (see

FIG. 3. The values of E C (a), ∆µ ER-MA (b), and ∆F (c) as a function of the Cα -RMSD from the native structure. The protein is 1lfb.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-7

Yoshidome et al.

Figure 3(b)): its value is smaller for the native structure than for the four structures above. However, ∆µER-MA is also not the smallest for the native structure. We can thus conclude that both of the intramolecular energy and the HFE are required for the discrimination of the native structure. 3. Comparison of the computational time of the hydration free energy between the ER-MA and ER methods

Finally, computational time for ∆µ of 1lfb is compared among the ER-MA and ER methods. To this end, we first describe the procedures for computing ∆µ. To obtain ∆µ using the ER method, the following computations are required in addition to an MD of solution system: 1. computation of ρe (uUV), 2. an MD of reference system, ′ 3. computation of ρe0 (uUV) and χ0(uUV,uUV ). ′ Once ρe (uUV), ρe0 (uUV), and χ0(uUV,uUV ) are obtained, ∆µ is computed with minor effort using the free-energy functional.10,12,13 On the other hand, using MD data of solution system and the four coefficients of ∆µIndirect in the MA, the ER-MA method provides ∆µ through the following procedure:

(i) computation of ρe (uUV), (ii) computation of ⟨UUV⟩ through Eq. (8), (iii) computation of ∆µIndirect using the MA (Eq. (7)). We note that the computation times for procedures (ii) and (iii) are much smaller than the computation time for procedure (i). In both of the methods, most of the computation time for ∆µ is devoted to MD of solution system. However, since it is common between the two methods, we compare the computation times for the other procedures. Namely, the computation time for the other procedures in the ER method, which is a sum of the computation times for procedures 1, 2, and 3, is compared with that in the ER-MA method, which is almost the same as the computation time for procedure (i). Surprisingly, we find that the computation time for the other procedures in the ER-MA method is about one-fifth of that in the ER method. We can conclude that the ER-MA method provides much more efficient computation of ∆µ than the ER method when the computation time for the solution MD is not counted. The reasons for reduction of the computation time are the following two. By comparing the procedures between the ERMA and ER methods, the procedures 2 and 3 in the ER method are not necessary for the ER-MA method because only MD data of solution system are required in computing ∆µ using the ER-MA method. This is one of the reasons for reduction of the computation time. Another reason is the use of the MA, which enables us to finish the computation of ∆µIndirect in less than 0.1 s. The procedure (i) in the ER-MA method can be eliminated as follows. Since UUV can easily be computed using a snapshot of the MD of solution system, its computation can be performed in parallel with the MD. After the MD, ⟨UUV⟩ is obtained through the ensemble averaging of UUV, which can be performed with minor effort. Thus, simultaneous conduction

J. Chem. Phys. 142, 175101 (2015)

of MD and computation of ⟨UUV⟩ eliminates the procedure (i) and leads to a further reduction of computation time for ∆µ.

IV. DISCUSSION A. Importance of the excluded volume term in ∆µIndirect

Since the effects of solute-water pair interactions are completely involved in ∆µDirect, ∆µIndirect can be the free energy arising from the cavity formation. The computation of the free energy has ever been performed within a simple assumption. Carlson and Jorgensen40 have shown that the values of the HFE for small solutes can well be fitted by the equation similar to Eq. (3). They performed three types of the fitting to the free energy arising from the cavity formation by assuming that it is proportional to molecular volume, molecular surface area, or ASA of a solute. Although it is unclear whether such proportional relation is also true for large, complex solutes like proteins, in most of the studies of the HFE of proteins, the free energy arising from the cavity formation is estimated by assuming that it is scaled by ASA. Here, we show in our fitting that EV term in Eq. (7) is indispensable to quantitatively describe the behavior of ∆µIndirect,ER of solutes with various sizes from small solutes like amino-acid analogs to large solutes like proteins. To this end, we determine the coefficients of ∆µIndirect by fitting the following two forms in addition to Eq. (7): ∆µIndirect,ASA = C2 A, ∆µIndirect,EVASA = C1Vex + C2 A.

(14) (15)

Figure 4 shows the comparison between ∆µIndirect,ER and three types of fitting forms of ∆µIndirect (∆µIndirect,MA (black points), ∆µIndirect,ASA (blue points), and ∆µIndirect,EVASA (red points), respectively) for 76 solutes used for the determination of the coefficients. Black points are the same as those in Figure 1(a). Standard errors of the fitting are 78 kcal/mol for

FIG. 4. The values of ∆µ Indirect,ER for 76 solutes are compared with those of ∆µ Indirect,MA (black points), ∆µ Indirect,ASA (blue points), and ∆µ Indirect,EVASA (red points), respectively.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-8

Yoshidome et al.

J. Chem. Phys. 142, 175101 (2015)

TABLE I. The values of the coefficients of ∆µ Indirect. “SE” is the standard error of fitting.

(coefficients)A (coefficients)L (coefficients)S (coefficients)R1 (coefficients)R2 (coefficients)R3 (coefficients)R4

C 1 ((kcal/mol)/Å3)

C 2 ((kcal/mol)/Å2)

0.048 142 0.048 074 0.054 669 0.046 962 0.047 753 0.048 762 0.048 547

−0.034 343 −0.034 282 −0.059 657 −0.030 357 −0.032 662 −0.036 637 −0.036 123

∆µIndirect,ASA and 7.4 kcal/mol for ∆µIndirect,EVASA, respectively. It follows that only the ASA term cannot reproduce the values of ∆µIndirect,ER. On the other hand, ∆µIndirect,EVASA has almost the same quantitative performance as ∆µIndirect,MA. We can thus conclude that the EV term is important for the quantitative performance of ∆µIndirect over a wide range of solutes from amino-acid analogs to proteins. The present result indicates that the curvature term (C3 X + C4Y in Eq. (7)) has only a minor contribution to ∆µIndirect,MA. However, we found in previous studies that the contribution becomes larger when T becomes lower.23,24 It also strongly affects the temperature dependence of the hydration thermodynamic quantities.23,24 In addition, it has been proposed for spherical hydrophobic solutes that at 1 atm, the HFE for small solutes primarily depends on the volume of the solute and that it is scaled by γ A for large solutes41 where γ is the liquid-vapour surface tension. We discussed in the previous papers23,24 that the weakening of the hydrophobicity at low temperatures, leading to cold denaturation of proteins, cannot be explained if the hydrophobicity of the proteins is the same as that of large solutes. This is because γ is a decreasing function of the temperature. Note that C1 and C2 are not equal to the pressure and the surface tension.

C 3 ((kcal/mol)/Å) C 4 (kcal/mol) SE (kcal/mol) −0.086 142 −0.055 317 1.117 7 −0.261 17 −0.043 582 −0.068 764 −0.049 463

0.096 946 0.117 79 −0.566 41 −0.079 971 −0.925 51 1.748 4 −0.629 26

7.4 11 3.1 7.6 7.3 6.0 8.0

material,30 there is no redundancy of the removing 19 solutes among the groups. Each group is composed of 57 solutes with various sizes. The four coefficients are then determined using the solutes in each group. Hereafter, they are denoted as (coefficients)R1, (coefficients)R2, (coefficients)R3, (coefficients)R4, respectively. The determined four coefficients are summarized in Table I. The coefficients determined in Sec. III A are denoted as (coefficients)A, for which all of the 76 solutes were used for the fitting. It follows that the values of C1 and C2 are almost the same among the coefficients except for (coefficients)S. Although those of C3 and C4 are largely different among them, the discussion in Subsection IV A suggests that such difference would have only minor effects on the performance of the MA applied to ∆µIndirect. On the other hand, the values of C1 and |C2| for (coefficients)S are larger than those for the other coefficients. Thus, the values of C1 and C2 are fairly independent of the selection of solutes when proteins are contained. The standard error of the fitting is the smallest for (coefficients)S (3.1 kcal/mol) and the largest for (coefficients)L (11 kcal/mol), respectively, implying that the value of the standard error of the fitting for (coefficients)A (7.4 kcal/mol) is the result of the fitting using both of small and large solutes.

B. Dependence of the coefficients of ∆µIndirect in the morphometric approach on the selection of solutes

In Sec. III A, the four coefficients of ∆µIndirect are determined using 76 solutes with various sizes from small solutes like amino-acid analogs to large solutes like proteins. Thirty-one solutes are proteins, and the other 45 solutes have much smaller sizes than proteins. Here, we discuss how the values of the coefficients depend on the choice of solutes. In the present study, three types of determination are performed. In two of them, we first divide the 76 solutes into two groups: Group L composes of 31 proteins and the other solutes are involved in group S. The four coefficients are then determined using the solutes in each group. From the determined coefficients, denoted by (coefficients)L and (coefficients)S, respectively, we can discuss whether the values of the four coefficients depend on the size of the solutes. Through the other type of the determination illustrated hereafter, we also discuss how much the coefficients change if the number of the solutes is reduced. To this end, we first divide the 76 solutes into four groups, groups R1, R2, R3, and R4, each of which is prepared by removing 19 solutes from the 76 solutes. As shown in Tables S6 and S7 in the supplementary

FIG. 5. DIndirect is plotted against ∆µ Indirect,ER. The native and 13 decoy structures of 1lfb are used. Black points are the results obtained using (coefficients)A and are the same as those in Figure 2(a). The other points are the results obtained using (coefficients)L (red), (coefficients)S (blue), (coefficients)R1 (green), (coefficients)R2 (cyan), (coefficients)R3 (magenta), and (coefficients)R4 (orange), respectively. The red points are almost overlapped with the black points.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-9

Yoshidome et al.

J. Chem. Phys. 142, 175101 (2015) TABLE II. The values of the coefficients in Eq. (16). α

β

0.506 62 0.570 31

C 1 ((kcal/mol)/Å3) C 3 ((kcal/mol)/Å) C 4 (kcal/mol) 0.041 32

−2.0361

1.0417

amino-acid analogs and water, the free-energy change upon placing charge was shown to be deviated from Eq. (4).42,43 Thus, it is unclear why the linear-response-type form of ∆µDirect works well. To answer this, we fit ∆µ using the following form: ∆µ = α⟨UUV,elec⟩ + β⟨UUV,LJ⟩ + C1Vex + C3 X + C4Y,

FIG. 6. A is plotted against ⟨UUV,LJ⟩.

We test the performance of each of the coefficients with the native and 13 decoy structures of 1lfb used in Sec. III. To this end, we calculate the values of ∆µIndirect using the ER and ER-MA methods, and then DIndirect is obtained. As illustrated in Fig. 5, the values of |DIndirect| become larger for (coefficients)S than for (coefficients)A, indicating that the performance worsens. On the other hand, the performance of the other coefficients is essentially the same as that of (coefficients)A. Thus, in order for the good performance, proteins should be selected when we determine the four coefficients. C. Discussion of a linear-response-type form of ∆µDirect

The results shown in Sec. III validate the linear-responsetype form of ∆µDirect as well as the morphometric form of ∆µIndirect. However, ⟨UUV⟩ is a sum of the electrostatic and van der Waals energies between solute and water, and the freeenergy change upon turning on the van der Waals interaction between solute and water is unlikely to obey the linearresponse-type form. Furthermore, for small solutes such as

(16)

where ⟨UUV,elec⟩ and ⟨UUV,LJ⟩ are the electrostatic and van der Waals components of ⟨UUV⟩, respectively. C2 A is not included for the fitting because of the strong correlation between A and ⟨UUV,LJ⟩ as shown in Fig. 6. The values of the coefficients in Eq. (16) are summarized in Table II. It follows that the value of α is ∼0.5, indicating that the linear-response-type form is valid for the electrostatic interaction between solute and water. By contrast, the value of β is deviated from 0.5. It is likely that the deviation is absorbed into ∆µIndirect. Actually, the values of C3 and C4 are changed from those in ∆µIndirect, while the value of C1 is essentially unchanged. The change in values of C3 and C4 and the correlation between ⟨UUV,LJ⟩ and A suggest that the deviation is absorbed into C2 A + C3 X + C4Y of ∆µIndirect. D. Behavior of the hydration free energy upon fluctuation of the native structure of a protein

We have tested so far the performance of the ER-MA method using the solutes with fixed structures. However, the structural flexibility of solute would induce fluctuation of the value of ∆µ. Here, we investigate how ∆µ, ∆µDirect, ∆µIndirect, and EC change upon small fluctuation of the native structure of 1lfb. To sample a set of structures in the course of equilibrium fluctuation, we first perform a 30-ns MD simulation of the flexible protein molecule. Then, 20 structures are saved every 1 ns between 11 and 30 ns. ∆µ for each of the structures is

FIG. 7. (a) The values of ∆µ (red), E C (blue), and ∆µ + E C (black) plotted against the simulation time. Those of ∆µ Direct (orange) and ∆µ Indirect (magenta) are plotted in (b).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-10

Yoshidome et al.

computed using the ER-MA method. Details are described in the supplementary material.30 Figures 7(a) and 7(b) show the values of ∆µ, ∆µDirect, ∆µIndirect, EC, and ∆µ + EC as functions of simulation time, t, in 30-ns MD. In the figures, the values of ∆X(t) ≡ X(t) − X(t = 11 ns) where X(t) is a quantity at t are plotted. It follows that while ∆µ and EC fluctuate significantly, the compensation of the fluctuations of these two quantities leads to a much smaller fluctuation of ∆EC + ∆µ. When we look at the values of ∆µDirect and ∆µIndirect, ∆µIndirect is almost constant, indicating that the fluctuation of ∆µ arises from that of ∆µDirect. Actually, it follows from Figure 7(b) that ∆µDirect and EC are compensating. We also compare the values of ∆µIndirect and ∆µ obtained using the ER and ER-MA methods. We find that the averages of |DIndirect| and |DHFE| among the 20 structures are 1.3% and 0.9%, respectively, indicating that the ER-MA method has almost the same quantitative performance as the ER method in the case of equilibrium fluctuation. The result also validates the linear-response-type form of ∆µDirect in the present case. This is consistent with the result by Karino and Matubayasi.14 V. CONCLUSION

In the present paper, we have proposed an accurate and efficient computational method of the HFE, ER-MA method, by combining the ER method with the MA. In the method, the HFE is expressed as ∆µ = ∆µDirect + ∆µIndirect. Here, ∆µDirect is a half of ⟨UUV⟩ and can be the free energy arising from the direct interactions between solute and water. ∆µIndirect mainly reflects indirect interactions such as the EV effect. We have thus assumed that it is described by the MA, Eq. (7). This description provides a computation of ∆µIndirect with minor effort because ∆µIndirect can be computed in less than 0.1 s once the four coefficients in the MA are determined. Furthermore, ⟨UUV⟩ leading to ∆µDirect can readily be computed using the data obtained through an MD simulation of a solution system. These enable us to perform a more efficient computation of the HFE than the other methods coupled with MD simulations and to apply the ER-MA method to large, complex solutes like proteins. In order to validate the assumption of using the MA for ∆µIndirect, we have determined the four coefficients as follows. First, the values of ∆µIndirect for 76 solutes with total charge zero have been computed using the ER method. The solute sizes have ranged from small solutes such as amino-acid analogs to large solutes such as proteins. During the MD simulations required for the computation of ∆µIndirect using the ER method, the structures of the solutes are fixed at a certain conformation and only the water molecules are moved. After the four geometric measures of each of the solutes have been calculated, the four coefficients have been determined using the least-square fitting applied to Eq. (7). We have found that the values of ∆µIndirect obtained through the ER method can quantitatively be described by the MA, validating the assumption. We have also discussed that the values of C1 and C2 are almost independent of selection of solutes when proteins are contained. A linear-response-type form of ∆µDirect has also

J. Chem. Phys. 142, 175101 (2015)

been validated by fitting ∆µ using Eq. (16): The coefficient of ⟨UUV,elec⟩ is ∼0.5, and although that of ⟨UUV,LJ⟩ deviates from 0.5, we have discussed that the deviation would be absorbed into C2 A + C3 X + C4Y in ∆µIndirect. We have calculated ∆µ for a set of conformations of a protein, which are not used for the determination of the four coefficients in the MA. We have found the excellent agreement between the ER-MA method and the ER method: The absolute values of the deviations are less than 2%. Thus, the ER-MA method has almost the same accuracy as the ER method, while the former provides more efficient computation of ∆µ than the latter. Improvements of the performance of the ER-MA method would be possible by increasing the number of solutes used for the determination of the four coefficients of ∆µIndirect and by improving the form of ∆µDirect. These improvements are in progress. We have applied the ER-MA method to the discrimination of the native structure of a protein from the misfolded decoy structures. The discrimination has been found to be successful. We have also discussed the changes in the thermodynamic quantities upon equilibrium fluctuation of the native structure of a protein. Although fluctuations of ∆µ and EC are significantly large, compensation among them leads to a small fluctuation of ∆µ + EC. By decomposing ∆µ into ∆µDirect and ∆µIndirect, we have found that the fluctuation of ∆µ arises from the former. We have also shown that the linearresponse-type form of ∆µDirect is also valid in the case of equilibrium fluctuation. Other applications to such various phenomena as ligand binding are in progress. The application to solutes with nonzero total charge is also in progress. In calculating the HFE for such solutes, due to the finite size of solution system and the periodic boundary conditions, long-range interactions between the solution system and its own periodic images strongly affect the value of the HFE,31–33 although such interactions do not exist at infinite dilution corresponding to experimental situations (it is empirically known that the affection is negligible for the solutes with zero total charge31). In order for the application, thus, it is necessary to perform corrections for the systemsize dependence of ∆µ before the coefficients of ∆µIndirect are determined. Several correction schemes have been proposed for small solutes,31–33 and recently some of us found a suitable correction for ∆µ of proteins.44 With the correction schemes, we are now extending the ER-MA method to solutes with nonzero total charge and will report in the subsequent publications. ACKNOWLEDGMENTS

The computer program for the morphometric approach was developed with Roland Roth. This work was supported by X-Ray Free Electron Laser Priority Strategy Program from the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) to M.I.; by the Platform for Drug Design, Informatics and Structural Life Sciences to M.I.; by JSPS (Japan Society for the Promotion of Science) Grantin-Aid for Scientific Research (Nos. 23651202 and 26240045 to N.M., 25291035 to M.K., and 25291036 to M.I.); by the Elements Strategy Initiative for Catalysts and Batteries

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

175101-11

Yoshidome et al.

(MEXT) to N.M.; by Computational Materials Science Initiative, Theoretical and Computational Chemistry Initiative, and HPCI System Research Project (Project IDs: hp120093, hp130022, hp140156, and hp140214) of the Next-Generation Supercomputing Project to N.M.; by HPCI Strategic Programs for Innovative Research, Computational Life Science and Application in Drug Discovery and Medical Development (Project IDs: hp120309, hp130003, and hp140229) (MEXT) to M.I. 1M.

Kinoshita, Biophys. Rev. 5, 283 (2013). Roux, Biophys. Chem. 78, 1 (1999). 3Y. Karino, M. V. Fedorov, and N. Matubayasi, Chem. Phys. Lett. 496, 351 (2010). 4H. Saito, N. Matubayasi, K. Nishikawa, and H. Nagao, Chem. Phys. Lett. 497, 218 (2010). 5F. Zeller and M. Zacharias, J. Phys. Chem. B 118, 7467 (2014). 6M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Oxford University Press, Oxford, 1987). 7F. Hirata, Molecular Theory of Solvation (Kluwer, Dordrecht, 2003). 8M. Kinoshita, J. Chem. Phys. 128, 024507 (2008). 9N. Matubayasi and M. Nakahara, J. Chem. Phys. 113, 6070 (2000). 10N. Matubayasi and M. Nakahara, J. Chem. Phys. 117, 3605 (2002); Erratum, 118, 2446 (2003) 11N. Matubayasi and M. Nakahara, J. Chem. Phys. 119, 9686 (2003). 12S. Sakuraba and N. Matubayasi, J. Comput. Chem. 35, 1592 (2014). 13K. Takemura, H. Guo, S. Sakuraba, N. Matubayasi, and A. Kitao, J. Chem. Phys. 137, 215105 (2012). 14Y. Karino and N. Matubayasi, J. Chem. Phys. 134, 041105 (2011). 15P. M. König, R. Roth, and K. R. Mecke, Phys. Rev. Lett. 93, 160601 (2004). 16R. Roth, Y. Harano, and M. Kinoshita, Phys. Rev. Lett. 97, 078101 (2006). 17T. Imai, Y. Harano, M. Kinoshita, A. Kovalenko, and F. Hirata, J. Chem. Phys. 125, 024911 (2006). 18S. Yasuda, T. Yoshidome, H. Oshima, R. Kodama, Y. Harano, and M. Kinoshita, J. Chem. Phys. 132, 065105 (2010). 19B. Lee and F. M. Richards, J. Mol. Biol. 55, 379 (1971). 20T. Yoshidome, M. Kinoshita, S. Hirota, N. Baden, and M. Terazima, J. Chem. Phys. 128, 225104 (2008). 21Y. Harano, R. Roth, and S. Chiba, J. Comput. Chem. 34, 1969 (2013). 2B.

J. Chem. Phys. 142, 175101 (2015) 22T.

Yoshidome, Y. Ito, M. Ikeguchi, and M. Kinoshita, J. Am. Chem. Soc. 133, 4030 (2011). 23T. Yoshidome and M. Kinoshita, Phys. Rev. E 79, 030905(R) (2009). 24T. Yoshidome and M. Kinoshita, Phys. Chem. Chem. Phys. 14, 14554 (2012). 25R. M. Levy, M. Belhadj, and D. B. Kitchen, J. Chem. Phys. 95, 3627 (1991). 26J. Åqvist, C. Medina, and J.-E. Samuelsson, Protein Eng. 7, 385 (1994). 27M. L. Connolly, J. Appl. Crystallogr. 16, 548 (1983). 28M. L. Connolly, J. Am. Chem. Soc. 107, 1118 (1985). 29Y. Deng and B. Roux, J. Phys. Chem. B 113, 2234 (2009). 30See supplementary material at http://dx.doi.org/10.1063/1.4919636 for the solutes used, details of the numerical methods, convergence behavior of ∆µ Direct and ∆µ Indirect obtained through the ER method, and the list of seventy-six solutes and their values of the thermodynamic quantities and the four geometric measures. 31G. Hummer, L. R. Pratt, and A. E. García, J. Phys. Chem. A 102, 7885 (1998). 32G. Hummer, L. R. Pratt, and A. E. García, J. Phys. Chem. 100, 1206 (1996). 33G. Hummer, L. R. Pratt, and A. E. García, J. Chem. Phys. 107, 9275 (1997). 34A. D. MacKerell, Jr., D. Bashford, M. Bellott, R. L. Dunbrack, Jr., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. JosephMcCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher III, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, J. Phys. Chem. B 102, 3586 (1998). 35S. Yasuda, T. Yoshidome, Y. Harano, R. Roth, H. Oshima, K. Oda, Y. Sugita, M. Ikeguchi, and M. Kinoshita, Proteins: Struct., Funct., Genet. 79, 2161 (2011). 36T. Yoshidome, K. Oda, Y. Harano, R. Roth, Y. Sugita, M. Ikeguchi, and M. Kinoshita, Proteins: Struct., Funct., Genet. 77, 950 (2009). 37K. T. Simons, R. Bonneau, I. Ruczinski, and D. Baker, Proteins: Struct., Funct., Genet. 37, 171 (1999). 38B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J. Comput. Chem. 4, 187 (1983). 39M. Feig, J. Karanicolas, and C. L. Brooks III, J. Mol. Graphics 22, 377 (2004). 40H. A. Carlson and W. L. Jorgensen, J. Phys. Chem. 95, 10667 (1995). 41D. Chandler, Nature 437, 640 (2005). 42G. Hummer, L. R. Pratt, and A. E. García, J. Phys. Chem. 99, 14188 (1995). 43B. Lin and B. M. Pettitt, J. Comput. Chem. 32, 878 (2010). 44T. Ekimoto, N. Matubayasi, and M. Ikeguchi, J. Chem. Theory Comput. 11, 215 (2015).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 134.124.28.17 On: Tue, 11 Aug 2015 03:22:44

An accurate and efficient computation method of the hydration free energy of a large, complex molecule.

The hydration free energy (HFE) is a crucially important physical quantity to discuss various chemical processes in aqueous solutions. Although an exp...
998KB Sizes 1 Downloads 8 Views