FULL PAPER

WWW.C-CHEM.ORG

Virtual-System-Coupled Adaptive Umbrella Sampling to Compute Free-Energy Landscape for Flexible Molecular Docking Junichi Higo,*[a] Bhaskar Dasgupta,[a] Tadaaki Mashimo,[b,c] Kota Kasahara,[a] Yoshifumi Fukunishi,[d] and Haruki Nakamura[a] A novel enhanced conformational sampling method, virtualsystem-coupled adaptive umbrella sampling (V-AUS), was proposed to compute 300-K free-energy landscape for flexible molecular docking, where a virtual degrees of freedom was introduced to control the sampling. This degree of freedom interacts with the biomolecular system. V-AUS was applied to complex formation of two disordered amyloid-b (Ab30–35) peptides in a periodic box filled by an explicit solvent. An interpeptide distance was defined as the reaction coordinate, along which sampling was enhanced. A uniform conformational distribution was obtained covering a wide interpeptide

distance ranging from the bound to unbound states. The 300K free-energy landscape was characterized by thermodynamically stable basins of antiparallel and parallel b-sheet complexes and some other complex forms. Helices were frequently observed, when the two peptides contacted loosely or fluctuated freely without interpeptide contacts. We observed that V-AUS converged to uniform distribution more effectively C 2015 Wiley Periodicals, than conventional AUS sampling did. V Inc.

Introduction

section. The energy enhancement, however, tends to overlook minor basins. Suppose that two basins denoted as B1 and B2 exist at an energy level that corresponds to a room temperature Troom (Scheme 1). The numbers of microscopic states for B1 and B2 in an energy slice ½E; E1dE are referred to as nB1 ðEÞdE and nB2 ðEÞdE, respectively. We furthermore assume that B1 and B2 are major and minor basins, respectively: nB1 ðEÞ  nB2 ðEÞ. Probability of existence partitioned to B1 and Ð 21 B2 at Troom is, respectively, PB1 5Zentire nB1 ðEÞexp½2broom EdE Ð 21 and PB2 5Zentire nB2 ðEÞexp½2broom EdE, where R is a gas con21 stant, broom 5ðRTroom Þ21 , and Zentire is a parameter called partition function. The free-energy difference DF between the basins (i.e., the free energy of B2 measured from that of B1 ) is

To reveal the free energy landscape of a biomolecular system, generalized ensemble methods are useful, because those methods can accelerate conformational sampling[1,2] One of the generalized ensemble methods, multicanonical Monte– Carlo sampling, was developed to study a two-dimensional (2D) Potts model on lattices,[3] and applied to biological systems.[4–7] Nakajima et al. extended the multicanonical Monte– Carlo sampling to Cartesian-coordinate-based molecular dynamics,[8] and this method was abbreviated to McMD. Thanks to adaptation of the Cartesian coordinates, solvent was treated readily by an explicit model, which was a technically important progress for computing a biomolecular system under a realistic condition.[9] Partial-enhancement technique[10,11] and multicomponent enhancement technique[12,13] have been proposed based on the methodological superiority of multicanonical sampling. Suppose that a biologically important structure of the biomolecule exists in a basin of a conformational space and that high-energy barriers surround the basin. The multicanonical method enhances sampling along an energy axis (energy enhancement). Thus, energy is the reaction coordinate for the enhancement. Then, a simulation trajectory, starting from a conformation far away from the basin, reaches the basin overcoming the barriers in an executable computation time. We note that multicanonical sampling enhances the structural changes indirectly via the energy enhancement. Namely, the energy enhancement brings the conformation to highenergy regions in the conformational space, and then transitions among energy basins are induced. We explain the mathematical details of the multicanonical sampling in the Methods

DOI: 10.1002/jcc.23948

[a] J. Higo, B. Dasgupta, K. Kasahara, H. Nakamura Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565–0871, Japan E-mail: [email protected] [b] T. Mashimo Technology Research Association for Next Generation Natural Products Chemistry, 2–3-26 Aomi, Koto-Ku, Tokyo 135-0064, Japan [c] T. Mashimo Information, Mathematical Science and Bioinformatics Co., Ltd., 4–21-1, Higashiikebukuro, Toshima-ku, Tokyo 170-0013, Japan [d] Y. Fukunishi Molecular Profiling Research Center for Drug Discovery (molprof), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan Contract grant sponsor: Ministry of Economy, Trade, and Industry (METI) Japan (J.H., B.D., T.M., Y.F., and H.N.); Contract grant sponsor: MEXT Japan (H.N. and K.K); Contract grant number: 24118001; Contract grant sponsor: The Ministry of Education, Culture, Sports, Science and Technology (MEXT) (J.H.) (Grant-in-Aid for Scientific Research on Innovative Areas); Contract grant number: 21113006 C 2015 Wiley Periodicals, Inc. V

Journal of Computational Chemistry 2015, 36, 1489–1501

1489

FULL PAPER

WWW.C-CHEM.ORG

Scheme 1. Schematic potential-energy surface. The x- and y-axes represent conformation (or configuration) of the system and the potential energy E, respectively. Shaded rectangular is the energy range ETroom, which corresponds to a room temperature Troom. Two energy basins B1 and B2 exist in the room-temperature energy range. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

DF52RTroom ln½PB2 PB1 21 . A long multicanonical simulation produces a conformational ensemble, where PB1 and PB2 satisfy a relation PB2 PB1 21 5exp½2broom DF. Then the ratio PB2 PB1 21 decreases rapidly with increasing DF as: PB2 PB1 21 56:731023 for DF53:0 kcal mol21, PB2 PB1 21 52:431024 for DF55:0 kcal mol21, and PB2 PB1 21 55:631028 for DF510:0 kcal mol21. This means that only one snapshot is sampled for B2, when 1:73 107 snapshots are sampled for B1 if DF53:0 kcal mol21. Consequently, the statistical error in PB2 rapidly increases with the decreasing basin width of B2 for a finite length of simulation. Usually this statistical error is trivial because the minor basins are not important to understand biophysical behavior of the system. However, there are cases where small-probability regions have an essentially important role for the behavior. For instance, in molecular binding where biomolecules are immersed in a solvent of finite size, the probability assigned to the unbound state may be considerably smaller than that to the bound state. Then, the statistical weight assigned to the unbound state may involve a large error in the multicanonical sampling, which results in that the free-energy landscape for molecular binding is inaccurate. A method called “adaptive umbrella sampling (AUS)” enhances structural motions directly along a structural parameter.[14,15] Now, we suppose that both B1 and B2 are biophysically important basins and that pathways connecting B1 and B2 are under consideration. If variation of the structural parameter can trace well with the interbasin motion, the parameter can be an effective reaction coordinate for sampling. As explained later, AUS equally samples both the basins as well as the interbasin regions along the reaction coordinate. In either multicanonical molecular dynamics (McMD) or AUS, a bias is introduced to enhance sampling along the reaction coordinate. However, the bias function is unknown a priori. Then, there are some procedures to determine the bias. In McMD, simulation is iterated where the bias function is updated.[2,8] Wang and Landau proposed another procedure where the bias function is updated step by step of simulation.[16] Thus, the Wang–Landau method is an automated refinement procedure without the iterative simulations. Meta1490

Journal of Computational Chemistry 2015, 36, 1489–1501

dynamics is also an iteration-free method for AUS.[17] Kim et al. have proposed another automated method, called forcebiased multicanonical sampling,[18] where a single simulation run is divided into several blocks and the bias function is updated block by block. Thus, this block-by-block method is a blend of the iteration-by-iteration and the step-by-step updating. Recently, we proposed an McMD-based method named a virtual-system-coupled McMD (V-McMD) to further achieve energy enhancement.[19] In this method, we introduced a virtual degree of freedom, which interacts with the bio-molecular system (real system) in an arbitrary way without breaking detailed balance condition. This virtual degree of freedom can be regarded as a virtual system. V-McMD was applied to a biomolecular dimer formation and produced a free-energy landscape for binding process. Although V-McMD sampled both the bound and unbound states, the bound state was predominantly sampled in the room-temperature energy range. Then, a new method is required to evenly sample both states at the room temperature. To surmount the problem mentioned above, in this study, we developed a variant of AUS, a “virtual-system-coupled AUS” (V-AUS), where the virtual system interacts with the biomolecular system. We applied this method to molecular docking of two flexible peptide chains, amyloid-b (Ab30–35) peptides, in a periodic boundary box filled by an explicit solvent. The resultant free-energy landscape at 300 K comprised various complex structures, such as parallel and antiparallel b-sheets, a-helix, and unbound conformations. We have performed conventional AUS of this system and shown that convergence of a distribution from the V-AUS is more rapid that that from the conventional AUS.

Materials and Methods Brief explanation of McMD Before proceeding to the currently proposed method, V-AUS, we explain McMD briefly because McMD and AUS have a similarity. In multicanonical sampling, an effective potential energy function EMcMD is introduced as: EMcMD 5E1RTln½Pene cano ðE; TÞ

(1)

where E and T are the original potential energy of the system and the simulation temperature, respectively. The function Pene energy distribution at T formally cano ðE; TÞ is a canonical Ð given as: Pene nðEÞexp½2bEdE, where Ac is a norðE; TÞ5A c cano malization factor (or the inverse of the partition function), b5ðRTÞ21 , and nðEÞ is a density of microscopic states in the energy slice ½E; E1dE wherever the system is in the conformational space. The potential energy E is a function of the conformation r of the system: E5EðrÞ. The r is a high-dimensional vector consisting of all of the atomic coordinates: r5½r1 ; r2 ; :::; rN , where ri is the Cartesian coordinate of an atom i (i.e., ri 5½xi ; yi ; zi ) and N the number of constituent atoms of the system. The force acting on the atom i is computed as F McMD 52ri EMcMD . An MD simulation at T with using i WWW.CHEMISTRYVIEWS.COM

FULL PAPER

WWW.C-CHEM.ORG

F McMD produces a flat energy distribution Pene McMD ðE; TÞ in a wide i energy range, when Pene ðE; TÞ is given accurately in the cano Ð range:[3,8] Pene ðE; TÞ / nðEÞexp½2bE dE5const. This McMD McMD simulation, called McMD, guarantees a high sampling efficiency in the wide potential energy range if the simulation is long enough. Note that the function form of Pene cano ðE; TÞ is unknown to start McMD. Therefore, we perform the McMD simulations iteratively, where Pene cano ðE; TÞ is gradually refined and Pene ðE; TÞ reaches a flat distribution. Detailed technique McMD for the iteration was given elsewhere.[2]

(

F i 52ri E    Bi 52RTri ln Pstr cano ðk; TÞ

(7)

The term F i is the original force derived from the potential energy E, and Bi acts as a bias to produce Pstr AUS ðk; TÞ5const. Note that the derivative is taken for Pstr cano ðk; TÞ, whose value varies with k, the bias Bi does not vanish in eq. (7). The flat distribution assures that sampling is enhanced along the k axis. The parameter k is called usually a “reaction coordinate,” along which the conformational variation is monitored. The temperature T is set to 300 K in this work because we are interested in the conformational distribution at the room temperature.

Adaptive umbrella sampling In statistical mechanics, a canonical configurational distribution (i.e., probability to detect the system at r) at T is given formally as: qcano ðr; TÞ / exp½2bEðrÞ. Here, we introduce a structural parameter k expressed by a function of r: k5aðrÞ. Then a distribution function along k is calculated by contracting (or projecting) qcano ðr; TÞ to the k axis as: ð Pstr ðk; TÞ5 dðaðrÞ2kÞ qcano ðr; TÞdr; cano

(2)

where dðaðrÞ2kÞ is the Dirac delta function. Now, we introduce an energy function EAUS by replacing Pene cano ðE; TÞ by Pstr ðk; TÞ in eq. (1): cano EAUS 5E1RTln½Pstr cano ðk; TÞ

(3)

The second term is the so-called potential of mean force (PMF). A canonical simulation using EAUS at T is generally called AUS. This simulation produces the following configurational distribution: qAUS ðr; TÞ

5AAUS exp½2bEAUS ðrÞ   5AAUS exp½2bE exp 2ln½Pstr cano ðk; TÞ

(4)

qcano ðr; TÞ 5A AUS str Pcano ðk; TÞ 0

where AAUS is a normalization factor and A0 AUS is a constant. Contracting qAUS ðr; TÞ to the k axis, we obtain a flat function along the k axis: Pstr AUS ðk; TÞ

ð 5Ak dðaðrÞ2kÞqAUS ðr; TÞ dr 5

ð Ak dðaðrÞ2kÞexp½2bE dr Pstr cano ðk; TÞ

(5)

Pstr ðk; TÞ 5Ak cano Pstr cano ðk; TÞ 5const where Ak is a constant. The force acting on the atom i is defined as: F AUS 52ri EAUS 5F i 1Bi i where

(6)

Virtual-system-coupled AUS The virtual system was first introduced for accelerating McMD of biomolecular system.[19] Here, we introduce a virtual degree of freedom to AUS to increase its sampling efficiency and assume that this degree of freedom represents the state of a virtual system, which interacts with the molecular system (real system). Thus, we termed this method “V-AUS.” The virtual degree of freedom is referred to as v, and its variation is discretized as: v5fv1 ; v2 ; :::; vnV g, where vj represents the j-th virtual state and nV is the number of the virtual states (nV 57, for this study). Because the entire system comprised the real molecular system and the virtual system, the state parameters identifying the entire system are k and v. We denote the entire sampling range for k as ½kmin ; kmax , where kmin and kmax are the lower and upper bounds, respectively. Up to here, a definite form for relation (interactions) between the virtual system and molecular system is not given. Suppose that the virtual system is in the j-th state in a time interval ½t; t1Dt, during which the real system is restrained so that k up fluctuates in a zone zj 5½klow j ; kj  (Scheme 2a). This restraint on k is the only interaction between the virtual and real systems. The procedure for restraining k within the zone is given elsewhere.[2] The zone zj overlaps to zj21 and zj11 , although zj21 and zj11 do not overlap to each other as shown in Scheme 2a. The actual zone partitioning is shown later. As mentioned above, k is confined in zj during the time period ½t; t1Dt. Then, the virtual state may transition to vj21 or vj11 at the end of the period (time5t1Dt): If up klow j21  kðt1DtÞ  kj21 , the virtual state switches to vj21 with a transition probability of pj!j21 (0 < pj!j21  1) as shown by the open circle in Scheme 2b. Once the transition has taken place actually, the conformation is restrained in zj21 in the next period ½t1Dt; t12Dt. If the virtual state has not transitioned to zj21 , the conformation remains in the zone zj . Similarly, the virtual state may transition to vj11 with a transition probability of up pj!j11 (0 < pj!j11  1) if klow j11  kðt1DtÞ  kj11 as shown by the filled circle in Scheme 2b. This procedure continues with executing virtual-state transitions at time t1nDt (n51; 2; 3; :::). The virtual state v1 or vnV can switch only to v2 or vnV 21 if up k satisfies the inequalities: klow or 2  kðt1nDtÞ  k2 up low knV 21  kðt1nDtÞ  knV 21 . Importantly, this interstate transition satisfies the detailed balance condition.[19] The virtual system is a virtual object literally. Thus, its physical properties are set Journal of Computational Chemistry 2015, 36, 1489–1501

1491

FULL PAPER

WWW.C-CHEM.ORG

The summations are taken over atoms involved in groups gA and gB , which consist of atoms in peptide A and B, respectively, and NA and NB are numbers of atoms in gA and gB , respectively. Each group needs not contain all of the atoms in each peptide. In fact, we selected some of peptide main-chain atoms for gA and gB as shown later. The bias Bi acting on the atom i is expressed as: Scheme 2. a) Plane of structural parameter k (reaction coordinate) and virtual state v. Four virtual states, v1 ; :::; v4 , are shown for simplicity, although seven virtual states are introduced in actual sampling. Thick line at virtual up state vj shows accessible zone of k in the virtual state: zj 5½klow j ; kj . A black thick line is a zone that overlaps the adjacent zones, whereas a gray min thick line is one that does not overlap. The klow and kup 4 coincide with k 1 and kmax , respectively (see main text). b) Transitions from vj to the adjacent virtual states vj21 or zj11. Conformation represented by the filled black circle in vj may transition to vj11 because the conformation can be involved in zone zj11 . Similarly, Conformation represented by the open circle in vj may transition to vj21 because the conformation can be involved in zone zj21 . A conformation in vj cannot transition to either adjacent virtual state because zone zj21 and zj11 do not overlap to each other.

conveniently for sampling. Then, one can set arbitrarily the intervirtual state transition probability. In this study, the transition probability is set to 1.0. So, a transition occurs unconditionally. str The structural distributions Pstr cano and PAUS are now characterized not only by k but also by v: (

str Pstr cano ðk; TÞ ! Pcano ðk; v; TÞ str Pstr AUS ðk; TÞ ! PAUS ðk; v; TÞ

(8)

Then, EAUS is given as: EAUS 5E1RTln½Pstr cano ðk; v; TÞ;

(9)

and the force acting on an atom i is:    52ri E1RTln Pstr F AUS i cano ðk; vk ; TÞ :

(10)

Up to here, the structural parameter k is general, and many types of k are possible in actual application. We apply the current method to molecular docking of two flexible peptides (say peptide A and B) in this study. A natural selection of k is a molecular distance rAB of the two peptides. k 5rAB h i1=2 5 ðrA 2rB Þ2 ;

(11)

  8 RT dln½Pstr > cano ðk; v; TÞ r A 2r B > B 52 > i < NA dk rAB   str > RT dln½Pcano ðk; v; TÞ rB 2rA > > : Bi 52 NB dk rAB

ði 2 gA Þ :

(13)

ði 2 gB Þ

The bias is not applied to atoms excluded from gA or gB. Note that Xbias satisfies an action–reaction law: X NA NB B 52 B . i i i2gA i2gB Note that V-AUS with a single virtual-state (nV 51) is equivalent to a conventional AUS. We performed single virtual-state AUS and seven virtual-state AUS, and compared the convergence of Pstr AUS ðk; v; TÞ to a flat distribution. Trivial trajectory parallelization To further enhance the sampling, we used a trivial trajectory parallelization (TTP) method,[20–22] where a number of independent runs are performed using the same potential energy (EMcMD for McMD and EAUS for AUS) but starting from different initial conformations. The resultant trajectories are simply connected in an arbitrary order. Imagine that the last snapshot of a trajectory, denoted as trjA , is connected to the first snapshot of another trajectory, denoted as trjB . Then, the energy variation DE5Efirst ðtrjB Þ2Elast ðtrjA Þ is usually large in an all-atom model. One may consider that the canonical transition probability for this connection is frequently very small if DE > 0: Ptrans ðtrjA ! trjB Þ5exp½2bDE  1. However, we have proved that the transition probability for this connection is always one regardless of a large DE:Ptrans ðtrjA ! trjB Þ51.[22] Consequently, the connected trajectories are equivalent to a single trajectory obtained from a single simulation. Consequently, TTP generates a statistically acceptable ensemble, a canonical ensemble, by simply connecting short simulation trajectories. Advantageously, conformational jumps at the trajectory connection points increase sampling efficiency considerably. The resultant ensemble for the currently studied two-peptide system demonstrated a good convergence as shown in the Results section. Actual procedure for V-AUS

where rA and rB are the representative positions of peptides A and B, respectively: 8 NA 1 X > > > r 5 ri A > > NA i2g < A > NB > 1 X > > > r 5 ri B : N

:

B i2g B

1492

Journal of Computational Chemistry 2015, 36, 1489–1501

(12)

The distribution function Pstr cano ðk; v; TÞ is unknown a priori. Starting from an inaccurate Pstr cano ðk; v; TÞ function, we perform the V-AUS simulation iteratively, through which the distribution converges to an accurate one.[2] The iterative procedure is strðk21Þ done as follows: suppose that Pcano ðk; v; TÞ is the inaccurate distribution estimated from the ðk21Þ-th iteration. Then, the ðkÞ strðk21Þ potential energy is defined as: EAUS 5E1RTln½Pcano ðk; v; TÞ. ðkÞ The k-th iteration is performed using EAUS , and a distribution WWW.CHEMISTRYVIEWS.COM

FULL PAPER

WWW.C-CHEM.ORG

strðkÞ

function PAUS ðk; v; TÞ is computed numerically. Now, eq. (5) is strðkÞ rewritten using the numerical PAUS as: strðkÞ PAUS ðk; v; TÞ

ð

h i ðkÞ 5Ak dðaðrÞ2kÞexp 2bEAUS dr ð Ak dðaðrÞ2kÞexp½2bE dr 5

strðk21Þ

Pcano 5

Ak Pstr cano ðk; v; TÞ strðk21Þ

Pcano

(14)

ðk; v; TÞ

ðk; v; TÞ

:

strðk21Þ

As we have assumed that Pcano ðk; v; TÞ is inaccurate, Pstr cano strðk21Þ ðk; v; TÞ in eq. (14) do not converge to Pcano ðk; v; TÞ, and strðkÞ then PAUS ðk; v; TÞ is not a flat function. Equation (14) is rewritten as: strðkÞ

21 Pstr cano ðk; v; TÞ 5 Ak PAUS ðk; v; TÞ strðk-1Þ Pcano ðk; v; TÞ:

3 strðkÞ

(15)

strðk21Þ

As PAUS ðk; v; TÞ and Pcano ðk; v; TÞ are known after the kth iteration, the left-hand side of eq. (15) is a given function. strðkÞ Then, Pcano ðk; v; TÞ is defined as follows: strðkÞ

strðkÞ

Pcano ðk; v; TÞ5Ak 21 PAUS ðk; v; TÞ 3

strðk -1Þ Pcano ðk; v; TÞ

(16)

and this function is used to update the potential energy for ðk11Þ strðkÞ the ðk11Þ-th iteration as: EAUS 5E1RTln½Pcano ðk; v; TÞ. One strðkÞ may raise an objection that Pcano ðk; v; TÞ is not uniquely defined because the parameter Ak is not determined uniquely. However, forces acting on atoms [eq. (13)] are determined strð0Þ uniquely without specifying Ak . We set as Pcano ðk; v; TÞ51 to initiate the first iteration. This means that the first iteration ð1Þ was a canonical MD simulation (i.e., EAUS 5E) coupled with the virtual system. The final snapshot from the k-th iteration was used for the initial conformation for the ðk11Þ-th iteration. Generation of the initial conformations for the first iteration is explained later. We repeat the iteration until a flat distribution is obtained: Pstr AUS ðk; v; TÞ5const. The production run is the simulation with the flat distribution. Simulation protocol The V-AUS algorithm was implemented in a computer program myPresto/psygene-G,[23] where the electrostatic interactions are computed by a non-Ewald method, a zero-dipole summation method.[24,25] The simulation was performed with following condition: the time step 5 1 fs, a SHAKE algorithm to constrain the covalent bonds between heavy atoms and hydrogen atoms,[26] and temperature control by a constanttemperature method.[27] The force field for the peptides was an AMBER-based hybrid force field[28] defined as: E ðwÞ5ð12w ÞE94 1wE96 , where E94 and E96 are the AMBER

parm9429 and parm9630 force fields, respectively, and w is the mixing rate. We have assessed EðwÞ by comparing the freeenergy landscapes of short peptides between McMD simulations and quantum chemical calculations, and found that w  0:75 is optimal.[28] Besides, with setting as w50:75, a peptide with a helical propensity folded into a a-helix, while a peptide with a b-hairpin propensity did into a b-hairpin.[28] Thus, we used Eð0:75Þ in this study. The force field for a water molecule is a TIP3P water model.[31] V-AUS was executed by the TSUBAME 2.5 supercomputer at the Tokyo Institute of Technology, where each run uses eight cores of the CPU (Intel Xeon X5670) and eight GPUs (NVIDIA Tesla K20X). Molecular system computed We prepared a system consisting of two identical peptides whose amino-acid sequence is Ace-1Ala-2Ile-3Ile-4Gly-5Leu6 Met-Nme, where Ace and Nme are the acetyl and N-methyl groups, respectively, introduced to cap the N- and C-termini of the peptides. This peptide, denoted as Ab30–35, is a segment (residues 30–35) of the amyloid-b peptide (Ab1–42), whose original length is 42. Interestingly, Ab30–35 itself has an ability to form amyloid fibrils (PDB ID 5 2y3j), where parallel b-sheets are stacked in layers.[32] It is unclear whether a parallel b-sheet is the most thermodynamically stable for the two-chain system. However, it is likely that the parallel b-sheet is one of the stable states. To generate the initial conformation of simulation, we prepared two Ab30–35 chains, whose conformations and orientations are random. Then, the chains apart from each other were put in a periodic box (dimensions: 40340340 A˚3) and filled the box with water molecules using a tool of computer program myPresto.[33] No ions were introduced because the net charge of Ab30–35 is zero. The system consisted of 6206 atoms (10332 peptide atoms and 2000 water molecules). Next, we performed a constant-pressure (1.0 atm) and constant-temperature (300 K) MD simulation (NPT) to adjust the box size. The resultant box size was 39:31339:31339:31 ˚ 3. The final snapshot from the NPT simulation is shown in A Figure 1a. We, furthermore, randomized this conformation at a ˚ 3), high temperature (700 K) at the constant volume (39.313 A and picked conformations randomly from the trajectory. The ensemble of picked conformations was denoted as Qrand . As mentioned before, we performed multiple runs for the TTP. Then, the first V-AUS iteration was initiated from conformations in Qrand . As explained later, we repeated V-AUS iterations for 18 times to refine Pstr AUS ðk; v; TÞ and the 19-th iteration was used for the production run. The number of runs and simulation length in the iteration are listed in Table 1. ˚ and kmax 520:0 A˚ in this study. In the We set kmin 54:7 A crystal structure, the k (5rAB ) value is 5.0 A˚ for two contacting b-strands in the parallel b-sheet, which is slightly larger than kmin . The value of kmax is approximately a half of the side length of periodic box, and is large enough so that the two chains can change their mutual orientations freely as shown later. We introduced seven virtual states. Table 2 lists the Journal of Computational Chemistry 2015, 36, 1489–1501

1493

FULL PAPER

WWW.C-CHEM.ORG

Figure 1. a) Final snapshot from the NPT simulation at 300 K, which is used to initiate V-AUS. b) The eA and eB (red arrows) are unit vectors parallel to broken lines, which connect the Ca atoms from 2Ile to 5Leu. As the two chains are identical, identifiers “A” and “B” are not assigned to the chains.

partitioned zones zj (j51; :::; 7). We defined the atomic groups gA and gB in eq. (12) by N, CA, and C atoms of 1Ala, . . ., 6Met for either peptide. Thus, each group involved 18 mainchain atoms.

Results and Discussion In this study, we have introduced a novel variant of AUS, where virtual system is coupled with the bio-molecular system. In this section, we analyze not only statistical validity of the method but also some aspects of the molecular complex formation of the amyloid-b (Ab30–35) peptides, because this system has a biophysical and medical significance. Flat distribution function Figure 2 depicts fluctuations of k of four trajectories picked from the 150 production runs, each of which illustrates typical moves of k. The red-colored line trajectory exemplifies that a

stable complex was formed in a former period until 5:53106 steps, whereas the complex was broken after that. In the magenta-colored trajectory, the two chains moved freely until 63106 steps, and a stable complex was formed from 63106 to 93106 steps although the complex was broken finally. The green line shows that a relatively stable complex was formed from 23106 to 43106 steps, and the blue line does that the two chains explored various configurations without forming a stable complex structure. One may consider from Figure 2 that the sampling would be insufficient because most of trajectory shuttled, at most, once or twice in the k axis during the 103106 steps. However, the trajectories were integrated with satisfying the detailed balance condition according to the TTP procedure, which has been proven to be a physically correct acceleration method for structural sampling.[22] We show later that the set of trajectories yielded a well-converged conformational distribution, which spread widely in the conformational space. strðkÞ The distribution function PAUS ðk; v; TÞ converged with repeating the AUS sampling. The first iteration produced a distribution deviated considerably from flatness (Fig. 3a). As

Table 1. Number of multiple runs and simulation length for AUS iteration. Table 2. Virtual-state zones. Iteration no. #1–#7 #8–#16 #17–#1 8 #19[d]

Number of runs[a]

Simulation length (3 106 steps)[b]

30 60

10 10

120 150

5[c] 10

[a] All of the last snapshots of the preceding iteration were used as the initial conformations for the following iteration. When increasing the number of runs to start a new iteration, insufficient initial conformations were taken from the random-conformation pool Qrand. See main text for definition of Qrand. [b] Listed values are simulation steps for each of multiple runs. [c] Number of runs increased twice from iteration 16 to 17. To maintain the total computational time, we reduced the simulation length by half. [d] Iteration 19 is the production run. Thirty conformations were taken from Qrand to compensate insufficient initial conformations, and those conformations were equilibrated for 5 3 106 steps prior to the production run.

1494

Journal of Computational Chemistry 2015, 36, 1489–1501

Virtual state no. #1 #2 #3 #4 #5 #6 #7

up [a] ½klow j ; kj 

up [a] ½slow j ; sj 

Dsj [b]

[4.700,6.230] [5.465,7.760] [6.230,9.290] [7.760,11.585] [9.290,13.880] [11.585,16.940] [13.880,20.000]

[0.00,0.10] [0.05,0.20] [0.10,0.30] [0.20,0.45] [0.30,0.60] [0.45,0.80] [0.60,1.00]

0.10 0.15 0.20 0.25 0.30 0.35 0.40

up [a] Zone for a virtual state j is represented in two ways: ½klow j ; kj  and up ; s . The former is the original form defined in text, and the latter ½slow j j min low is a normalized form defined as klow and j 5sj Dk1k up min , where Dk5kmax 2kmin , kmin 54:7 A˚, and kmax 520:0 A˚. kup j 5sj Dk1k low [b] Zone width Dsj 5sup increases monotonically with increasing j 2sj the virtual-state no. This is because the conformation fluctuates more rapidly with increasing k (see Fig. 2). Similar zone partitioning was used for a V-McMD study.[19]

WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

Figure 2. Fluctuations of k. Four trajectories from the production run (the 19-th iteration) are shown as a function of time. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

repeating the iteration, the distribution function became smoother (Figs. 3b and 3c), and the 19-th iteration yielded a considerably flat distribution (Fig. 3d). Thus, the iteration 19 was used for the production run with storing snapshots every 5000 steps, and 300,000 snapshots were stored in total. Figure strð19Þ 4 is an integrated distribution PAUS ðk; TÞ of the seven virtualstrð19Þ state distributions PAUS ðk; vi ; TÞ of Figure 3d. The integration

FULL PAPER

method for the virtual-system-coupled computation was given elsewhere.[19] We have performed conventional AUS (i.e., V-AUS with a single virtual state in the current theoretical scheme; nV 51) with the same number of multiple runs, the same simulation length, and the same k range (kmin 54:7 A˚ and kmax 520:0) as those for the V-AUS. The initial conformations for the first iteration were the same as those for the V-AUS. Figure 5 plots the strðkÞ distributions PAUS from the two simulations. In the first iteration, the V-AUS produced a more uneven distribution than the conventional AUS. However, the V-AUS distribution flattened gradually in proceeding with the iteration, whereas the conventional-AUS distribution showed knobs in a small k range. We quit the conventional AUS at the eighth iteration because the knobs did not diminish with changing the positions of knobs in the k axis. Note that we cannot decide when we start the production run unless a flat distribution is obtained. We compared variation of k between the V-AUS and the conventional AUS for each of 60 multiple runs. The variation for a single run was defined as: Dk5k1 2k2 , where k1 and k2 are the maximum and minimum values of k during the run, respectively. Figure 6a plots Dk for the eighth iteration. The behavior of Dk was similar up to the 49-th run between the two simulations. Above the 50-th run, however, Dk from the conventional AUS decreased more rapidly than that from the V-AUS. Figures 6b and 6c illustrate trajectories of k with the smallest to fourth smallest Dk of the eighth iteration for the VAUS and conventional AUS simulations, respectively. The V-

strðkÞ

Figure 3. Structural distribution functions PAUS ðk; v; TÞ computed from iterative V-AUS simulations. The distribution is shown in log scale. The x-axis is the strð1Þ strð10Þ strð15Þ strð19Þ intermolecular distance k (5rAB ). a) PAUS ðk; v; TÞ from the first iteration, b) PAUS ðk; v; TÞ from the 10th, c) PAUS ðk; v; TÞ from the 15th, and d) PAUS ðk; v; TÞ from the 19th (i.e., the production run). Seven curves in each panel are distributions from seven different virtual states ðv1 ; :::; v7 Þ. Fluctuations of k. Four trajectories from the production run (the 19th iteration) are shown as a function of time. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Journal of Computational Chemistry 2015, 36, 1489–1501

1495

FULL PAPER

WWW.C-CHEM.ORG

strð19Þ

Figure 4. Distribution PAUS ðk; TÞ (T5300 K) integrated from seven virtualstrð19Þ state distributions PAUS ðk; vi ; TÞ (i51; :::; 7) of Figure 3d.

AUS provided more dispersed k than the conventional AUS did. The narrow variations of k from the conventional AUS (Fig. 6c) indicate that the conformations were trapped in the small k range, which caused the knobs of the broken line in Figure 5d. The smallest and the second smallest Dk for the eighth iteration of the V-AUS were 2.82 A˚ (red-colored line in ˚ (green), respectively. The values of Dk of Fig. 6b) and 2.86 A ˚ , respecthese two trajectories increased to 5.03 and 14.53 A tively, in the ninth iteration.

resenting the molecular orientation of each molecule. Figure 1b explains detailed definition for eA and eB . To compute this 2D distribution, all of the stored snapshots were projected on the 2D plane with the same weight without reweighting. The distribution was uneven for small k, where some particular complex structures were selectively formed. In contrast, the distribution became plain with increasing k, which indicates that the chains less interfered the mutual molecular orientations because of increment of the interchain contacts. To analyze the interchain contacts quantitatively, we computed the minimum heavy atomic distance between the chains for all of the stored snapshots, and assigned a contact when the mini˚ . Figure 7b depicts the mum distance was smaller than 5.5 A ratio of snapshots with no interchain contact as a function of ˚ , and reaches 90% at k. The ratio rapidly increased for k > 14 A max ˚ ˚ ) and the periodick520 A. Therefore, the value of k (20.0 A box size were sufficiently large to sample all of the complex forms. strð19Þ The canonical structural distribution Pcano ðk; v; TÞ from the production run is plotted in form of PMF (Fig. 8). The distributions from different virtual states overlapped well to each other. The probability increased with decreasing k, and a ˚ . Then, the probability again shoulder appeared at k55:5 A ˚ ). In contrast, increased up to the lower bound (i.e., kmin 54:7 A strð19Þ ˚ Pcano ðk; v; TÞ was plateau for k > 17 A.

Two-dimensional conformational distribution Figure 7a demonstrates a 2D conformational distribution P2D AUS ð k; eA  eB Þ constructed on a plane by the molecular distance k (5rAB ) and a mutual molecular orientation eA  eB . The quantity eA  eB is a scalar product of unit vectors eA and eB , each rep-

Two-dimensional free-energy landscape The free-energy landscape (PMF) on the 2D space at 300 K is strð19Þ computed from P2D AUS ðk; eA  eB Þ (Fig. 7a) and Pcano ðk; v; TÞ (Fig. 8) as:

strðkÞ

Figure 5. Distributions PAUS ðk; TÞ from V-AUS (solid line) and conventional AUS (broken line) for a) the first (k51), b) third (k53), c) sixth (k56), and d) strðkÞ eighth (k58) iterations. The V-AUS distributions are integrated from seven virtual-state distributions PAUS ðk; vi ; TÞ as shown in Figure 4.

1496

Journal of Computational Chemistry 2015, 36, 1489–1501

WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

FULL PAPER

Figure 6. a) Variations Dk from iteration 8 of V-AUS (red-colored circle) and conventional AUS (blue-colored circle). Sixty multiple runs were sorted with descending order of Dk. Definition for Dk is given in the main text. b) Trajectories of k from V-AUS with the smallest Dk (red-colored line), second smallest (green), third smallest (blue), and fourth smallest (magenta). c) Those from the conventional AUS. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

PMFðk; eA  eB ; TÞ5 2RTln½P2D AUS ðk; eA 

strð19Þ eB Þ3Pcano ðk; v; TÞ:

(17)

Figure 9 demonstrates the free-energy landscape. We also display tertiary structures of peptides picked at some locations in the 2D landscape. The lowest free-energy basin was

assigned to a wide region [k  6 A˚, eA  eB  2 0:5]. In fact, antiparallel b-sheets (see conformation c1 in Fig. 9) dominated ˚ , eA  eB  2 0:8] of this lowest free-energy a part [k  6 A basin. However, this lowest free-energy basin involved conformations deviated from the antiparallel b-sheets as suggested by the inequality eA  eB  2 0:5, whose angle between eA and

Figure 7. a) Conformational distribution in 2D space constructed by intermolecular distance k (5rAB ) and mutual molecular orientation eA  eB . See Figure 1b for more detailed definition of eA  eB . All of the stored snapshots were projected on the 2D plane with the same weight without reweighting. The distribution is normalized so that the largest distribution is 1.0. b) Ratio of conformations with no interchain contact as a function of k. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Journal of Computational Chemistry 2015, 36, 1489–1501

1497

FULL PAPER

WWW.C-CHEM.ORG

Figure 8. Canonical structural distribution in form of PMF defined as strð19Þ PMF52RTln½Pcano ðk; v; TÞ, where T5300 K. Different curves correspond strð19Þ to distributions from different virtual states: Pcano ðk; vj ; TÞ where j51; :::; 7. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

2eB is p=4. Besides, the strands bended frequently and interstrand hydrogen-bond patterns had some varieties. The peptide A b30–35 has a glycine residue at the center of the peptide (see Molecular System Computed subsection). It is likely that this glycine brought distortion in the strands. A semistable free-energy basin was found in a parallel strand region [5:5  k  6:5 A˚, eA  eB  0:8], from which a parallel b-sheet is shown as c2 in Figure 9. Again, the peptide structures in this range were not uniform because of some distortion of strands and varieties in the hydrogen-bond patterns. The red-filled circle in Figure 9 represents the position of the parallel b-sheet in the crystal structure. Thus, the sampled parallel strands have more loose packing than that the crystal structure does. We presume that the interstrand packing becomes tighter as

an aggregate grows, where bending strands are rearranged and the genuine hydrogen bonds are formed. Structural polymorphism is a well-known character of the amyloid-b peptide and other amyloid fibrils,[32,34–38] although an atomistic mechanism for the polymorphism has not yet been understood well. Another segment Ab16–21 (residues 16– 21) from Ab16–21 folds into either the parallel or antiparallel bsheet fibrils.[32] Molecular simulation is a useful technique to study early events in fibril formation, which is undetectable experimentally.[39–43] Our study suggests that Ab30–35 may have polymorphism only in an early stage of fibril formation where both the parallel and antiparallel b-sheets are allowed. Referring to the experimental fact that the parallel b-sheets are stacked in the fibril,[32] a population selection mechanism[44,45] may work for the fibril formation with discarding the antiparallel b-sheet. ˚, We found another semistable region [k  5:5 A 0:2 < eA  eB < 0:4], where cross-structures (c3 in Fig. 9) were frequently formed. Probably, the cross-structures are not an ingredient for building up a fibril. In other words, those complexes are on an off-pathway for the fibril formation existing only in the early stage of aggregation. An experiment of hen egg white lysozyme has revealed that non-b-sheet oligomers exist as off-pathway aggregates.[37] Figure 9 indicates that the structural contributors for the ˚ ) of Pstrð19Þ shoulder (k55:5 A cano ðk; v; TÞ in Figure 8 are the antiparallel b-sheets, parallel b-sheets, and the cross-structures. strð19Þ Note that Pcano ðk; v; TÞ is the equilibrated distribution for the two-chain system. Thus, this distribution may emerge in the early stage of a multichain system as a transitional state, where the system is approximated by a quasiequilibrium. We also found b-helical structures for range of k > 7 A˚ (c4 and c5 in Fig. 9), although those helical structures did not

Figure 9. Free-energy landscape PMFðk; eA  eB ; TÞ at 300 K on 2D space constructed by k and eA  eB . See main text for the definition of PMF. The lowest PMF is set to 0.0 kcal mol21. Tertiary structures picked from the landscape are labeled as “ci .” Red-filled circle represents position of the parallel b-sheet in the crystal structure, whose complex structure is named “native.”[32]

1498

Journal of Computational Chemistry 2015, 36, 1489–1501

WWW.CHEMISTRYVIEWS.COM

FULL PAPER

WWW.C-CHEM.ORG

Figure 10. Dependency of helical hydrogen bonds (H-bonds) Na and interpeptide H-bonds Nb on the molecular distance k. See main text for detailed definition of Na and Nb .

form a particular free-energy basin. In fact, the helical structures were widely spread in the 2D landscape. Those helical structures were mixed in random complexes (c6 and c7 ) or in unbound forms (c8 and c9 ). Then, we analyze dependency of the helix formation on k. We assigned a helical hydrogen bond (H-bond) to a peptide when the distance between the carbonyl oxygen of the i-th residue and the amido nitrogen of ˚ . The number of the ði11Þ-th residue was smaller than 2.0 A helical H-bonds is denoted as Na . We also computed the number of intermain-chain H-bonds (denoted as Nb ), which contribute to the b-sheet formation because the side-chains of Ab30–35 do not contribute to an interpeptide H-bond. Figure 10 plots the dependence of Na and Nb on k, which demon-

˚ ). strates that helix is formed in a wide range of k (7  k  20A In contrast, the interpeptide H-bonds replaced the helical H˚ . This is natural because strands can be bonds for k < 7 A closer to each other in a b-sheet than in a helix. Last in this subsection, we consider a structural contributor ˚ ) of Pstrð19Þ for the shoulder (k55:5 A cano ðk; v; TÞ in Figure 8. Figure 9 indicates that the contributor is not single but three: the antiparallel b-sheets (the lowest free-energy basin), parallel bsheets (a semistable basin), and the cross-structures (another strð19Þ semistable basin). Note that Pcano ðk; v; TÞ is the equilibrated distribution for the two-chain system. Thus, this distribution may emerge in the early stage of a multichain system as a transitional state.

Convergence of distribution We checked the statistical significance of the obtained ensemble. First, we divided randomly the entire (150) trajectories of the production runs into two halves, each of which consisted of 75 trajectories, and computed the free-energy landscape at 300 K using each of halves. Figure 11 demonstrates that the halves produce similar PMF profiles. More quantitatively, we calculated cross-correlations among the PMFs as follows: First, we denote the two halves and the entire trajectories as H1, H2, and EN. Then PMFs are denoted as PMFðaÞ ðk; eA  eB ; TÞ, where a5H1; H2; or EN, respectively. The cross-correlation between two PMFs is defined as: ða;bÞ Ccorr 5

where

8 > > > > > > > > > > > > > >
> > Dk3DðeA  eB Þ > > > ðð > > > > > PMFðaÞ PMFðbÞ dk dðeA  eB Þ > > > : crða;bÞ 5 Dk3DðeA  eB Þ

Figure 11. Free-energy landscapes computed from a) a half (75 trajectories) of the entire (150) trajectories and b) the other half. Division into the halves was done randomly. See caption of Figure 9 for the meaning of xand y-axes. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

(18)

;

:

(19)

The two quantities Dk ð5kmax 2kmin Þ and DðeA  eB Þ ð5maxð ˚ and 2.0, respectively. The eA  eB Þ2minðeA  eB ÞÞ are 15.3 A computed cross-correlation coefficients were very high: ðEN;H1Þ ðEN;H2Þ ðH1;H2Þ Ccorr 50:994, Ccorr 50:995, and Ccorr 50:979. Furthermore, we computed the cross-correlation coefficients for the 2D canonical distribution with replacing PMFðk; eA  eB ; TÞ by strð19Þ P2D AUS ðk; eA  eB Þ3Pcano ðk; v; TÞ in eq. (19). The resultant crosscorrelation coefficients were 0.97 between H1 and EN, 0.97 between H2 and EN, and 0.88 between H1 and H2, which also indicate high correlation. Note that trajectories in H1 and H2 are from independent simulation runs. Thus, the resultant conformational ensemble is considered to be a good approximation for the equilibrated ensemble. The results have indicated that relatively short trajectories, where each run shuttled once or twice in the k axis during the run, converge to an Journal of Computational Chemistry 2015, 36, 1489–1501

1499

FULL PAPER

WWW.C-CHEM.ORG

ensemble, when the trajectory connection satisfies the detailed balance condition.

Conclusion strðkÞ

The structural distribution, PAUS ðk; v; TÞ, obtained from V-AUS converged to a flat function more rapidly than the conventional AUS did. Furthermore, the snapshots obtained from the TTPconverged to a unique ensemble. Thus, the entire trajectories generated an ensemble statistically enough, although each trajectory was short. Unlike multicanonical sampling, VAUS sampled both the small (bound) and large (unbound) basins equally along the reaction coordinate axis. Therefore, VAUS is an effective sampling tool when the small basin has an important biological role in the networks among the basins. Early attempt to find a reaction coordinate in protein systems began with using less computer resources than available today,[46] where the reaction coordinate was defined as a low potential energy path. In AUS, various reaction coordinates are possible such as single-pathway-specific, multipathway-specific, and pathway-ensemble reaction coordinates. The current reaction coordinate k5rAB belongs to the pathway-ensemble one. V-AUS is also applicable to single-molecular conformational changes with introducing a reaction coordinate that specifies the conformational changes of the molecule effectively. Fukunishi et al. proposed a filling potential (FP) method and a smooth reaction-path generation (SRPG) method, which compute a binding free energy of protein–ligand docking along a protein–ligand dissociation path.[33,47] In the FP method, the ligand position is restricted on the dissociation path, and the free energy along the path is computed by a histogram method. In the SRPG method, after setting a number of sites in the path, sampling is performed at each site, and an average force is computed at those sites. The binding free energy is related to a line-integration of the average force along the path. As the dissociation path is defined as a 1D line in the real 3D space, the computational cost for these methods is less than that for the currently used V-AUS method where the ligand moves freely in the 3D space. However, the FP and SRPG methods have no ability to sample conformations out of the defined path. Therefore, the FP and SRPG methods are useful when the bound state is well known in advance. Contrarily, V-AUS is useful for sampling the conformational space vastly, where various types of complexes exist. Generally, AUS requires accurate estimation of EAUS (or Pstr cano ), and EAUS is updated through iterative simulations. Metadynamics is another method to update EAUS ,[17] where Pstr cano is updated at each step of a simulation. Therefore, metadynamics is a method suitable for automatization. However, because EAUS varies during the single simulation, this method is categorized into nonequilibrium sampling. If the dynamics of a system is a mixture of fast and slow processes and only the slow process leads the system to a major basin, nonequilibrium sampling may raise a significant error in the resultant free energy for the major basin.[2] strð19Þ The canonical distribution Pcano ðk; v; TÞ reached plateau for ˚ k > 17 A (Fig. 8). This is not necessarily means that the pep1500

Journal of Computational Chemistry 2015, 36, 1489–1501

˚ . The function Pstrð19Þ tides are in the bulk state for k > 17 A cano ðk; v; TÞ is not a radial distribution but a distance distribution strð19Þ function. Thus, Pcano ðk; v; TÞ should increase with increasing k in the bulk state. To estimate the binding free energy, which involves contribution from a cratic entropy[48–51] in the completely unbound state, the periodic box size should be larger than the currently used one.

Acknowledgments The simulation was performed in TSUBAME 2.5 Supercomputer in the Tokyo Institute of Technology, Japan, provided through the HPCI System Research Project (Project ID: hp140032). Keywords: virtual system  adaptive umbrella  multicanonical  free-energy landscape  flexible docking

How to cite this article: J. Higo, B. Dasgupta, T. Mashimo, K. Kasahara, Y. Fukunishi, H. Nakamura. J. Comput. Chem. 2015, 36, 1489–1501. DOI: 10.1002/jcc.23948

[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29]

A. Mitsutake, Y. Sugita, Y. Okamoto, Biopolymers 2001, 60, 96. J. Higo, J. Ikebe, N. Kamiya, H. Nakamura, Biophys. Rev. 2012, 4, 27. B. A. Berg, T. Neuhaus, Phys. Rev. Lett. 1992, 68, 9. U. H. E. Hansmann, Y. Okamoto, F. Eisenmenger, Chem. Phys. Lett. 1996, 259, 321. A. Kidera, Proc. Natl. Acad. Sci. USA 1995, 92, 9886. Y. Iba, G. Chikenji, M. Kikuchi, J. Phys. Soc. Jpn. 1998, 67, 3327. N. Hori, G. Chikenji, R. Berry, S. Takada, Proc. Natl. Acad. Sci. USA 2009, 106, 73. N. Nakajima, H. Nakamura, A. Kidera, J. Phys. Chem. B 1997, 101, 817. N. Nakajima, J. Higo, A. Kiedra, H. Nakamura, J. Mol. Biol. 2000, 296, 197. N. Nakajima, Chem. Phys. Lett. 1998, 288, 319. J. Ikebe, S. Sakuraba, H. Kono, J. Comput. Chem. 2014, 35, 39. J. Higo, N. Nakajima, H. Shirai, A. Kidera, H. Nakamura, J. Comput. Chem. 1997, 18, 2086. H. Okumura, Y. Okamoto, Bull. Chem. Soc. Jpn. 2007, 80, 1114. G. H. Paine, H. A. Scheraga, Biopolymers 1985, 24, 1391. M. Mezei, J. Comput. Phys. 1987, 68, 237. F. Wang, D. P. Landau, Phys. Rev. E 2001, 64, 056101. A. Laio, M. Parrinello, Proc. Natl. Acad. Sci. USA 2002, 99, 12562. J. G. Kim, Y. Fukunishi, A. Kidera, H. Nakamura, Phys. Rev. E 2004, 70, 057103. J. Higo, K. Umezawa, H. Nakamura, J. Chem. Phys. 2013, 138, 184106. J. Higo, N. Kamiya, T. Sugihara, Y. Yonezawa, H. Nakamura, Chem. Phys. Lett. 2009, 473, 326. T. Sugihara, J. Higo, H Nakamura, J. Phys. Soc. Jpn. 2009, 78, 074003. J. Ikebe, K. Umezawa, N. Kamiya, T. Sugihara, Y. Yonezawa, Y. Takano, H. Nakamura, J Higo, J. Comput. Chem. 2011, 32, 1286. T. Mashimo, Y Fukunishi, N. Kamiya, Y. Takano, I. Fukuda, H. Nakamura, J. Chem. Theory Comput. 2013, 9, 5599. I. Fukuda, Y. Yonezawa, H. Nakamura, J. Chem. Phys. 2011, 134, 164107. I. Fukuda, N. Kamiya, Y. Yonezawa, H. Nakamura, J. Chem. Phys. 2012, 137, 054314. J. Ryckaert, G. Ciccotti, H Berendsen, J. Comput. Phys. 1977, 23, 327. D. J. Evans, G. P. Morriss, Phys. Lett. A 1983, 98, 433. N. Kamiya, Y. S. Watanabe, S. Ono, J. Higo, Chem. Phys. Lett. 2005, 401, 312. W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell, P. A. Kollman, J. Am. Chem. Soc. 1995, 117, 5179.

WWW.CHEMISTRYVIEWS.COM

FULL PAPER

WWW.C-CHEM.ORG

[30] P. Kollman, R. Dixon, W. Cornell, T. Fox, C. Chipot, A. Pohorille, Comput. Simul. Biomol. Syst. 1997, 3, 83. [31] W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, M. L. Klein, J. Chem. Phys. 1983, 79, 926. [32] J.-P. Colletier, A. Laganowsky, M. Landau, M. Zhao, A. B. Soriaga, L. Goldschmidt, D. Flot, D. Cascio, M. R. Sawaya, D. Eisenberg, Proc. Natl. Acad. Aci. USA 2011, 108, 16938. [33] Y. Fukunishi, Y. Mikami, H. Nakamura, J. Phys. Chem. B 2003, 107, 13201. [34] A. T. Petkova, R. D. Leapman, Z. Guo, W.-M. Yau, M. P. Mattson, R. Tycko, Science 2005, 307, 262. [35] M. F€andrich, J. Meinhardt, N. Grigorieff, Prion 2009, 3, 89. [36] I. Usov, J. Adamcik, R. Mezzenga, ACS Nano 2013, 7, 10465. [37] Y. Zou, Y. Li, W. Hao, X. Hu, G. Ma, J. Phys. Chem. B 2013, 117, 4003. [38] K. C. Stein, H. L. True, PLOS Pathog. 2014, 10, e1004328. [39] W. Hwang, S. Zhang, M. Karplus, Proc. Natl. Acad. Sci. USA 2004, 101, 12916. [40] G. Wei, J.-E. Shea, Biophys J. 2006, 91, 1638. [41] T. Takeda, D. K. Klimov, Biophys. J. 2009, 96, 442.

[42] [43] [44] [45] [46] [47] [48] [49] [50] [51]

L. Larini, J.-E. Shea, Biophys J. 2012, 103, 576. Z. Su, C. L. Dias, J. Chem. Phys. B 2014, 118, 10830. J. Foote, C. Milstein, Proc. Natl. Acad. Sci. USA 1994, 91, 10370. B. Ma, M. Shatsky, H. J. Wolfson, R. Nussinov, Protein Sci. 2002, 11, 184. R. Elber, M. Karplus, Chem. Phys. Lett. 1987, 139, 375. Y. Fukunishi, D. Mitomo, H. Nakamura, Method. J. Chem. Inf. Model 2009, 49, 1944. W. Kauzmann, Adv. Protein Chem. 1959, 14, 1. J. Janin, Proteins 1996, 24, i. J. Hermans, L. Wang, J. Am. Chem. Soc. 1997, 119, 2707. J. Villa, M.  Strajbl, T. M. Glennon, Y. Y. Sham, Z. T. Chu, A. Warshel, Proc. Natl. Acad. Aci. USA 2000, 97, 11899.

Received: 19 March 2015 Revised: 21 April 2015 Accepted: 24 April 2015 Published online on 4 June 2015

Journal of Computational Chemistry 2015, 36, 1489–1501

1501

Virtual-system-coupled adaptive umbrella sampling to compute free-energy landscape for flexible molecular docking.

A novel enhanced conformational sampling method, virtual-system-coupled adaptive umbrella sampling (V-AUS), was proposed to compute 300-K free-energy ...
1MB Sizes 1 Downloads 7 Views