Efficient calculation of rate constants: Downhill versus uphill sampling Konstantin V. Klenin Citation: The Journal of Chemical Physics 141, 074103 (2014); doi: 10.1063/1.4892565 View online: http://dx.doi.org/10.1063/1.4892565 View Table of Contents: http://scitation.aip.org/content/aip/journal/jcp/141/7?ver=pdfcov Published by the AIP Publishing Articles you may be interested in Combine umbrella sampling with integrated tempering method for efficient and accurate calculation of free energy changes of complex energy surface J. Chem. Phys. 141, 044108 (2014); 10.1063/1.4887340 BDflex: A method for efficient treatment of molecular flexibility in calculating protein-ligand binding rate constants from Brownian dynamics simulations J. Chem. Phys. 137, 135105 (2012); 10.1063/1.4756913 Comparison of enveloping distribution sampling and thermodynamic integration to calculate binding free energies of phenylethanolamine N-methyltransferase inhibitors J. Chem. Phys. 135, 024105 (2011); 10.1063/1.3604534 Kinetics and reaction coordinate for the isomerization of alanine dipeptide by a forward flux sampling protocol J. Chem. Phys. 130, 225101 (2009); 10.1063/1.3147465 Multiple state transition path sampling J. Chem. Phys. 129, 224107 (2008); 10.1063/1.3029696

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

THE JOURNAL OF CHEMICAL PHYSICS 141, 074103 (2014)

Efficient calculation of rate constants: Downhill versus uphill sampling Konstantin V. Klenina) Steinbuch Centre for Computing, Karlsruhe Institute of Technology, P.O. Box 3640, D-76021 Karlsruhe, Germany

(Received 17 April 2014; accepted 28 July 2014; published online 18 August 2014) The classical transition state theory (TST), together with the notion of transmission coefficient, provides a useful tool for calculation of rate constants for rare events. However, in complex biomolecular reactions, such as protein folding, it is difficult to find a good reaction coordinate, so the transition state is ill-defined. In this case, other approaches are more popular, such as the transition interface sampling (TIS) and the forward flux sampling (FFS). Here, we show that the algorithms developed in the frames of TIS and FFS can be successfully applied, after a modification, for calculation of the transmission coefficient. The new procedure (which we call “downhill sampling”) is more efficient in comparison with the traditional TIS and FFS (“uphill sampling”) even if the reaction coordinate is bad. We also propose a new computational scheme that combines the advantages of TST, TIS, and FFS. © 2014 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4892565] time can be calculated as

I. INTRODUCTION

It is well known that the simulation times required for modeling of rare transition events can be essentially smaller than the mean transition time.1–6 Consider a system consisting of two metastable states, A and B, separated by a free energy barrier. We assume that the states can be distinguished by some order parameter λ, so that λ < λTS if the system is in state A and λ > λTS if it is in state B, where λTS is a certain threshold. The transition state formally corresponds to λ = λTS . The mean period τ TS of random oscillation between states A and B is equal to the inverse flux through the hypersurface λ = λTS and can be calculated, for example, by biased molecular dynamics (MD) simulations. Let us introduce two more thresholds: λA in region A and λB in region B. These thresholds are located at arbitrary points that can be well sampled in unbiased simulations, but not too far from the barrier. We can now define the following historydependent, “primed,” counterparts of states A and B: the system is in state A if λ ≤ λA and it remains in this state until reaching λB , in which case it changes to B . Similarly, the system remains in state B until reaching λA . The mean life-time of state A is τA→B = pA τ  , where τ  is the mean period of oscillations between states A and B , and pA is the probability of state A . The quantity τ A→B is practically equivalent to the transition time from A to B, i.e., the inverse rate constant. The transmission coefficient is defined as κ TS = τ TS /τ  . Suppose that, from the biased MD simulations, one has an equilibrium ensemble of single MD steps crossing the hypersurface λ = λTS in the direction from A to B. Then κ TS is the probability that such a step, being continued backwards and forwards in time until reaching either λA or λB , will generate a path satisfying the following two conditions: (i) it starts at λA and ends at λB ; (ii) the first passing through λTS occurs at the “seed” step used for the path generation. Thus, the transition

a) Electronic mail: [email protected]

0021-9606/2014/141(7)/074103/6/$30.00

τA→B = pA τTS /κTS ,

(1)

where we approximated pA by the corresponding unprimed value, pA , which is the probability of the “unprimed” state A. The computation is performed in two stages: (i) from the biased MD simulations, one obtains pA τ TS as well as the set of the “seed” MD steps to be used in (ii) the unbiased simulations that provide the coefficient κ TS and the ensemble of the transition paths. This approach is based on the transition state theory7–13 and is sometimes called reactive flux method (RFM).14 Its advantage is that the computational effort does not depend on the transition times. However, for large biomolecules like proteins, the transmission coefficient proved to be too small.1, 14 To overcome this difficulty, other methods were developed, such as the transition interface sampling (TIS)15–17 and the forward flux sampling (FFS).18 In the above considerations, the exact position of the threshold λTS does not play principal role: it can be placed anywhere between λA and λB . In particular, it can coincide with λA . Then the expression for the transition time becomes τA→B = pA τA /κA ,

(2)

where τ A is the inverse outgoing flux crossing λA , and the new transmission coefficient κ A is the probability that, after such a crossing, the system will reach λB before λA . The τ A value results from averaging over long observation time comprising many transition events. But if during the observation time the system always remains in state A , then the inverse outgoing flux becomes pA τA . It is the same combination that is present in Eq. (2). Its value is available from direct MD simulations of state A. Although the new transmission coefficient, κ A , is even smaller than the former one, κ TS , there are special procedures that allow its efficient computation. The space between the thresholds λA and λB can be divided into k relatively thin slices by interfaces λ0 , λ1 , . . . , λk , so that λA = λ0 < λ1 < . . . < λk = λB . Then the coefficient κ A (also called crossing

141, 074103-1

© 2014 AIP Publishing LLC

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

074103-2

Konstantin V. Klenin

J. Chem. Phys. 141, 074103 (2014)

probability) is factorized as κA = PA (λ0 → λ1 )PA (λ1 → λ2 ) . . . PA (λk−1 → λk ),

(3)

where the multipliers PA (λi → λi+1 ), i = 0, 1, . . . , k − 1, have the following meaning. Consider an ensemble of the paths that pass successively through the interfaces λA and λi . Then PA (λi → λi+1 ) is the probability that such a path reaches the interface λi+1 before λA . TIS and FFS provide different algorithms of generation of the required path ensembles. II. THEORY

Assuming that the threshold λTS coincides with the interface λm , we can factorize the classical transmission coefficient κ TS as κTS = PTS (λm→λ0 ) · PA (λm→λk ) = PTS (λm→λm−1 )PTS (λm−1→λm−2 ) . . . PTS (λ1→λ0 ) ×PA (λm→λm+1 )PA (λm+1→λm+2 ) . . . PA (λk−1→λk ). (4) The transmission coefficient κ TS is usually larger than the crossing probability κ A and, hence, can be calculated more efficiently. We will refer to the simulation procedures based on Eq. (4) as the downhill sampling, because the sampling is performed in the direction from the top of the barrier down to either of the metastable states. In contrast, Eq. (3) implies that the sampling in the opposite, uphill, direction is unavoidable. Below we demonstrate that the downhill sampling is more efficient than the uphill sampling. It should be noted that a combination of TIS and RFM was first used by Juraszek et al.19 in their approximate method of partial path sampling (see the Appendix for details). Equation (4) is exact. III. METHODS

We use a computational scheme that generalize TIS and FFS. First, consider the uphill sampling (Eq. (3)). We assume, as before, that the space between the thresholds λA and λB is divided into k thin slices by the interfaces λ0 , λ1 , . . . , λk . We also assume that we are initially given an equilibrium set of n00 “seed” MD steps that pass through the threshold λ0 in the outgoing direction (from A to B). The computation of κ A is performed as follows (Fig. 1). Each “seed” step is stochastically continued forwards in time until reaching λB (transition paths) or returning to λA (recurrent paths). After completion, the recurrent paths are reversed in time. For each of the new paths, we register the first outgoing crossing with each of the interfaces λi (i = 1, 2, . . . , k − 1). Let n0i be the total number of the registered crossings with the interface λi . With an increase of the index i, the value of n0i can become very small or even equal to zero. Now, we are going to generate additional paths, so that the number of independent paths passing through each interface will be at least equal to a certain number N, which is a parameter of the method (N ≤ n00 ). Let λj be the first interface for which the number of the registered crossings is less than N:

FIG. 1. Illustration of the sampling scheme. Left panel: During preliminary MD simulations of state A, the MD steps that cross the interface λ0 = λA in the outgoing direction are registered (filled circles). Their total number is denoted by n00 (here n00 = 4). Central panel: The earlier registered steps are stochastically continued forwards in time until reaching λB (transition paths, not shown) or returning to λA (recurrent paths). After completion, the recurrent paths are reversed in time. For all new paths, the first outgoing crossings with each of the interfaces λi (i ≥ 1) are registered (filled circles). The total number of the crossings registered for the interface λi is denoted by n0i (here, n01 = 2, n02 = 1, n03 = 0). The first interface, for which this number is less than a preselected minimum N, is denoted by λj (here, N = 2, j = 2). Right panel: Construction of an additional path. An earlier registered crossing with the interface λj is used as a “seed” for a new path that is continued until reaching λB (not shown) or λA (dashed line). If the new path is recurrent, it is reversed in time. The first crossing of the new path with the interface λj is used as another “seed.” This procedure is repeated M more times (here M = 1) until the final path (solid line) is generated. For all new paths, the first outgoing crossings with each of the interfaces λi (i ≥ j) (open circles) are taken into account in the calculation of the numbers nji of the paths that reach the interface λi after starting from a “seed” at the interface λj (here, n22 = 2, n23 = 2, n24 = 0).

n0(j−1) ≥ N > n0j . The missing (N − n0j ) paths are constructed by the Monte Carlo (MC) technique similar to the shooting algorithm.20 The MC move is performed as follows. Suppose, we are given a path that starts at the interface λ0 and proceeds to λj . Its first step passing through λj serves as a “seed” for the new stochastic path that is continued forwards in time until reaching either λB or λA . If the new path is recurrent, it is reversed in time, so that the last crossing with λj becomes the first one. Every move is accepted unconditionally. The initial path is constructed from a randomly chosen crossing with λj that was registered earlier. In total, we perform (N − n0j ) MC runs with different initial paths, each run consisting of M moves. The M value is another parameter of the method. The first crossings of the final paths with each of the following interfaces (λj+1 , λj+2 , and so on) are registered to be possibly used later as “seeds” for the new paths. All paths generated during the MC runs, including intermediate ones, are taken into account in the calculation of the numbers nji of the paths that reach the interface λi (i = j, j + 1, . . . , k) after starting from a “seed” at the interface λj . The similar procedure is repeated iteratively for the rest of the interfaces (λj+1 , λj+2 , . . . , λk − 1 ) until all the numbers nli , defined in a similar way, are known (0 ≤ l ≤ i ≤ k), so that all the multipliers in Eq. (3) can be calculated PA (λi → λi+1 ) =

n0(i+1) + n1(i+1) + . . . + ni(i+1) n0i + n1i + . . . + nii

.

(5)

Note that, if the parameter M, which determines the length of the MC runs, is equal to zero, then this algorithm is practically identical to FFS, whereas with an increase of M it becomes similar to TIS. In the case when N/n00 ≤ κ A , the procedure reduces to the direct sampling without additional paths.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

074103-3

Konstantin V. Klenin

J. Chem. Phys. 141, 074103 (2014)

In the downhill sampling based on Eq. (4), we start from an equilibrium set of “seed” MD steps that pass through λTS in the direction to state B. This set is available from biased MD simulations. The same computational scheme is used twice. First, the time is reversed and the crossing probability PTS (λTS → λA ) to reach the interface λA starting from the interface λTS is calculated. Second, the direction of time is restored and the previous transition paths are continued forwards until reaching λB or λA . That reproduces the situation of the uphill sampling (starting from the interface λTS ) and makes it possible to calculate the probability PA (λTS → λB ) in exactly the same way. The unbiased inverse flux τ TS through the hypersurface (b) λ = λTS can be obtained from the biased one, τTS , as τTS (b) = τTS ρ(λTS )/ρb (λTS ), where ρ b (λ) and ρ(λ) are the biased and unbiased probability densities of λ, respectively: ρ b (λ) ∼ ρ(λ)exp [ − Ub (λ)/kB T], with Ub (λ) being the bias potential chosen in such a way that ρb (λ) ≈ const (kB is the Boltzmann constant, T is the temperature). As a measure for the efficiency, we use the “relative cost” ε = δ · S1/2 , where δ is the relative statistical error of the computed value and S is the number of MD steps. It is easy to show that, if two values X1 and X2 are characterized by the costs ε1 and ε2 , respectively, then the product X = X1 X2 is characterized by the cost ε = ε1 + ε2 , provided that the computational effort is distributed between X1 and X2 in the optimal way, which is the case when S1 /ε1 = S2 /ε2 . IV. ILLUSTRATIVE EXAMPLES

We illustrate the advantage of the downhill sampling by two simple model systems with ill-defined transition states. The first system consists of a particle that moves in the twodimensional space according to the overdamped Langevin dynamics xi (t + t) = xi (t) + (Fi /kB T )Dt + (2Dt)1/2 ξi ,

(6)

where x = (x1 , x2 ) is the position, t is the time, t = 1 is the time step, D = 0.0005 is the diffusion coefficient, and Fi is the force. The vector components ξ i are uncorrelated random variables with the standard normal distribution. The geometry of the system is shown in Fig. 2(a). The numerical results together with the relative costs for the uphill and downhill sampling are presented in Table I. Figures 2(b)–2(d) show the dependence of the relative cost ε of the uphill sampling on the following parameters of the method: (b) the distance λ between neighboring interfaces, (c) the number M of MC steps used for construction of additional paths, and (d) the minimal number N of paths at each interface divided by the number of the initial “seeds” n00 . Note that small M values turn out to be more favorable, so the value M = 0 is used below, which makes our procedure similar to FFS. The relative costs of the downhill sampling, plotted in Fig. 2(e) as functions of N/n00 , are shown separately for the two multipliers of the transmission coefficient, PTS (λTS → λA ) and PA (λTS → λB ). The leftmost point of each plot corresponds to the direct sampling without additional paths.

FIG. 2. Model system 1. (a) The position x is restricted to the rectangle with reflecting boundaries. State A corresponds to λ < 0, state B to λ > 0, λ = x1 . The -shaped potential barrier between the states is placed at the angle of 45◦ . The force F/kB T is constant on each slope, being equal to (−5, −5) on the left and to (5, 5) on the right. The height of the barrier is U0 /kB T = 10. (b)-(e) Dependence of the relative cost ε on the parameters of the method: (b)-(d) for the crossing probability κ A (uphill sampling), (e) for the multipliers PTS (λTS → λA ) (triangles) and PA (λTS → λB ) (circles) of the transmission coefficient κ TS (downhill sampling).

The quality of the parameter λ considered as a reaction coordinate can be estimated by the probability PA (λTS → λB ) ≈ 0.06, which is far below the ideal value of 0.5. Thus, our model corresponds to the case of a bad reaction coordinate. At the same time, this probability is not so small as to cause difficulty for the classical RFM. In fact, the increase of N only leads to a loss of efficiency, as follows from Fig. 2(e). The computation of κ TS can be performed approximately 50 times faster than the computation of κ A : this ratio can be estimated from the optimal ε values as [ε(κ A )/ε(κ TS )]2 , where ε(κ TS ) = ε[PTS (λTS → λA )] + ε[PA (λTS → λB )]. This is in striking contrast to the analytical study of van Erp14 who considered a system with a similar potential barrier (but different dynamic behavior) and predicted that TIS should be considerably more efficient than RFM. One has to take into account, TABLE I. Numerical results obtained for the two model systems by the uphill and downhill samplings together with the optimal relative cost ε. System 1

 τ pA A κA τ A → B (uphill)

pA τ TS PTS (λTS → λA ) PA (λTS → λB ) κ TS τ A → B (downhill)

System 2

Value

ε

Value

ε

1.13 × 102 9.21 × 10−6 1.23 × 107

40 1340 1380

2.20 × 101 7.05 × 10−9 3.11 × 109

30 1050 1080

7.12 × 104 9.25 × 10−2 6.32 × 10−2 5.85 × 10−3 1.22 × 107

350 40 140 180 530

6.70 × 105 4.25 × 10−4 5.01 × 10−1 2.13 × 10−4 3.15 × 109

350 80 200 280 630

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

074103-4

Konstantin V. Klenin

J. Chem. Phys. 141, 074103 (2014)

= λTS . Consider the paths that start at the interface λi−1 and end (without returning back) just after the first crossing with the interface λi . Let us denote by ai+ the set of the final points of these paths. Similarly, let ai− be the ensemble of the final points of the paths that start at λi+1 and end after the first crossing with λi . The ensembles ai+ (i = 1, 2, . . . , k) and ai− (i = 0, 1, . . . , k − 1) can be considered as nodes in a transition network with the following non-zero transition probabilities (i = 1, 2, . . . , k − 1):

FIG. 3. Model system 2. (a) The potential energy U(x). (b) and (c) Relative cost ε versus parameter N/n00 for: (b) the crossing probability κ A (uphill sampling), (c) the multipliers PTS (λTS → λA ) (triangles) and PA (λTS → λB ) (circles) of the coefficient κ TS (downhill sampling).

however, that the preliminary MD simulations for the downhill sampling are more time consuming than for the uphill sampling (see Table I, lines pA τA and pA τ TS ). It reduces the advantage of the downhill sampling but does not eliminate it. The overall ε2 -ratio is [ε(pA τA /κA )/ε(pA τTS /κTS )]2 ≈ 7. In the second illustrative example we consider the case when the reaction coordinate is perfect but there is an intermediate on top of the barrier. The particle moves in the onedimensional potential shown in Fig. 3(a). The dynamics is defined by the one-dimensional analogue of Eq. (6). The results are presented in Table I. The relative costs are displayed in Figs. 3(b) and 3(c). For this system, the probability PTS (λTS → λA ) is small (≈4 × 10−4 ), so the performance can be improved by adjusting the parameter N (Fig. 3(c)). As before, the downhill sampling is more efficient than the uphill sampling: the ratio of the optimal ε2 values is [ε(κ A )/ε(κ TS )]2 ≈ 14. V. CONCLUSIONS

We showed that the computational procedure based on the classical transition state theory can be very efficient even in the case when the true transition state ensemble is not available. The transmission coefficient can be calculated by the same algorithms as were developed in the frame of the transition interface sampling and the forward flux sampling. APPENDIX: COMPARISON WITH THE PARTIAL PATH SAMPLING

The transition interface technique was first used for calculation of the classical transmission coefficient κ TS by Juraszek et al.19 who proposed the method called “transition state partial path TIS” (TS-PPTIS). In this appendix, we compare our approach with TS-PPTIS in order to reveal similarities and differences. The main idea of the partial path sampling21 can be described as follows. We assume, again, that the space between the thresholds λA and λB is divided into k slices by the interfaces λ0 , λ1 , . . . , λk , so that λ0 = λA , λk = λB , and λm

+ ) = Pi−1 (λi → λi+1 ), wi++ = w(ai+ → ai+1

(A1a)

− wi+− = w(ai+ → ai−1 ) = 1 − wi++ ,

(A1b)

− wi−− = w(ai− → ai−1 ) = Pi+1 (λi → λi−1 ),

(A1c)

+ wi−+ = w(ai− → ai+1 ) = 1 − wi−− .

(A1d)

Let us define the committors qAB (ai+ ) and qAB (ai− ) as the probabilities that the system, starting from the node ai+ (or, respectively, ai− ), will reach the state aB+ = ak+ before the state aA− = a0− . These quantities can be found from the system of linear equations22 + − qAB (ai+ ) = wi++ qAB (ai+1 ) + wi+− qAB (ai−1 ),

(A2a)

− + qAB (ai− ) = wi−− qAB (ai−1 ) + wi−+ qAB (ai+1 ),

(A2b)

with ak+ = 1 and a0− = 0. Then, the crossing probability is (PPTIS21 ) κA = PA (λ0 → λ1 )qAB (a1+ ).

(A3)

A In a similar way, one can define the committors qTS (ai− ) and + B qTS (ai ) and express through them the transmission coefficient as − A + (am−1 )qAB (am ) κTS = PTS (λm → λm−1 )qTS

(A4a)

+ B − κTS = PTS (λm → λm+1 )qTS (am+1 )qBA (am ),

(A4b)

or as

or, for better efficiency, as an average value of these two expressions (TS-PPTIS19 ). The advantage of this sampling method is that the required MD paths are always short: there is no need for paths that would extend for more than three consecutive interfaces. However, the method is only approximate, as it is based on the so called “memory loss” assumption: all the points belonging to the same ai+ (or ai− ) ensemble are treated on the same basis, regardless of their actual position. The validity of this assumption depends on the peculiarities of the system and on the distances between the interfaces. In particular, the PPTIS approach is not suitable for multi-channel reactions (at least, when the channels are not known a priori). The sampling scheme is based on the shooting algorithm,20 which, being a sort of a MC move, allows construction of a new path by modification of the old one. A simplified version of the shooting algorithm is as follows (cf. Sec. III). Consider three consecutive interfaces λi−1 , λi , and λi+1 and an initial path that starts either at λi−1 or λi+1 , passes

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

074103-5

Konstantin V. Klenin

through λi , and ends either at λi−1 or λi+1 . The new path coincides with the old one until the first crossing with the interface λi and then it is continued independently and stochastically until reaching λi−1 or λi+1 . After completion, the new path is reversed in time and is accepted unconditionally. After a large number of such MC moves, one obtains an equilibrium ensemble of paths that are confined between λi−1 and λi+1 and pass through λi . This ensemble allows computation of the transition probabilities wi++ , wi+− , wi−− , and wi−+ , as well as the values of Pi (λi → λi+1 ) and Pi (λi → λi−1 ). Thus, all the quantities required for calculation of κ A and κ TS become available. In the frame of PPTIS, the sampling is always performed in the both directions simultaneously (from A to B and from B to A). Hence, one cannot choose between uphill and downhill sampling. The advantage of TS-PPTIS [Eq. (A4)] over PPTIS [Eq. (A3)] is of different nature. The crossing probability κ A [Eq. (A3)] strongly depends on the position of λA , which should be chosen before sampling. In contrast, the dependence of the transmission coefficient κ TS on λA is weak. One can gain in efficiency without considerable loss in accuracy by the following substitutions of the committor indices in Eq. (A4): A → m−n, B → m+n, where the optimal value of n is to be found by the enumerative search starting from 1. The approach proposed in this study differs from TSPPTIS in three essential points. (1) Our approach is mathematically exact. The “memory loss” assumption is not used. In particular, our approach remains valid in the case of multichannel reactions. (2) The efficiency of our approach is due to sampling in the downhill direction, whereas the efficiency of TS-PPTIS is achieved by a shorter length of the sampling paths and by the optimal choice of the sampling area. (3) For construction of the sampling paths, we use, whenever possible, the “seed” MD steps available from the previous stages (or iterations) of the computational process. In particular, the initial MD steps that cross the hypersurface λ = λTS are taken from the global MD simulations of the whole system with a biased potential. This procedure is more reliable than the shooting algorithm, which is vulnerable to local traps. This is another reason why our approach is suitable for multi-channel reactions, while TS-PPTIS is not. To complete the comparison between the two approaches, we calculated the transmission coefficient κ TS of System 1 [Fig. 2(a)] by the TS-PPTIS method. Because of the restrictions imposed by the “memory loss” approximation, the space between the thresholds λA and λB could not be divided into more than k = 4 slices. The ensembles of partial paths, which contained an equal number of elements, were generated by the simplified version of the shooting algorithm described above. We found that, for k = 4, the relative cost ε was ∼330, which is noticeably above the optimal ε value of 180 reported in Table I. The systematic error (due to the approximate nature of the method) was ∼1%. It is interesting to note that at k = 2 (with λ0 = λA , λ1 = λTS , λ2 = λB ) the TSPPTIS method becomes mathematically exact and the sampling is performed only in the downhill direction. The ε value in this case was found to be ∼240. Thus, the gain from downhill sampling overweighed the loss from larger lengths of the paths.

J. Chem. Phys. 141, 074103 (2014)

One should also mention the possibility of hybrid computational algorithms that combine the features of TS-PPTIS and our approach. For example, the two space slices between the interfaces λm−1 and λm+1 can be sampled by the shooting algorithm like in TS-PPTIS. In this way, one obtains the ensemble of the partial transition paths that connect the interfaces λm−1 and λm+1 . Then, the starting and the final points of these paths are used as “seeds” for the downhill sampling of the both sides of the barrier. This scheme is mathematically exact and corresponds to the following factorization of the transmission coefficient: κTS = [PTS (λm → λm−1 )Pm−1 (λm → λm+1 )] ×[Pm+1 (λm−1 → λA )PA (λm+1 → λB )].

(A5)

In principle, the ensembles of partial paths can be obtained during the preliminary biased MD simulations of the whole system, provided that the bias potential Ub (λ) is flat within the corresponding space slices. For example, if Ub (λ) = const for λm−1 < λ < λm+1 , then the transition time can be found as τA→B =

pA τ (λm−1 , λm+1 ) , [Pm+1 (λm−1 → λA )PA (λm+1 → λB )]

(A6)

where τ (λm−1 , λm+1 ) is the mean time between occurrences of the transitions λm−1 → λm+1 renormalized for the unbiased system. In conclusion, we will show how our computational scheme (presented in Sec. III) can be optimized in the case when the “memory loss” assumption holds true. We do not impose any restrictions on the distances between the interfaces: these distances can be arbitrary small as before. Instead of this, we assume that the “memory loss” takes place within s space slices. Suppose that we perform sampling from A to B and we are given a “seed” MD step at the interface λi . Using this “seed,” we start a new path and continue it until reaching λB or λi−s . In the latter case, we continue the path further until reaching λi−s−1 or returning back to λi . Constructing other paths in a similar way, we obtain the following transition probabilities [cf. Eq. (A1)]: + ) = Pi−s (λi → λi+1 ), wi++ = w(ai+ → ai+1

(A7a)

− ) = 1 − wi++ , wi+− = w(ai+ → ai−s

(A7b)

−− − − = w(ai−s → ai−s−1 ) wi−s

= Pi (λi−s → λi−s−1 ),

i > s,

−+ − = w(ai−s → ai+ ) = 1 − wi−− , wi−s

i > s.

(A7c) (A7d)

Here, i is in the range from 1 to k−1 and the index i−s in the first two equations should be substituted by 0, if i < s. The notation ai+ stays for the set of the final points of the paths that start at λi−s and end just after the first crossing of − represents the final points of the paths that λi . Similarly, ai−s start at λi and end after crossing λi−s . As soon as the transition probabilities w are known, we can calculate the committors q and the crossing probability κ A .

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

074103-6

Konstantin V. Klenin

The optimized computational scheme differs from the original one (as described in Sec. III) only in one respect: some paths are finished earlier than before. The unidirectional character of sampling is preserved. Hence, the optimized scheme can also be used to increase efficiency of the downhill sampling, whenever the “memory loss” assumption is applicable. 1 P. G. Bolhuis, D. Chandler, C. Dellago, and P. L. Geissler, Annu. Rev. Phys.

Chem. 53, 291 (2002). K. Faradjian and R. Elber, J. Chem. Phys. 120, 10880 (2004). 3 D. J. Wales, Int. Rev. Phys. Chem. 25, 237 (2006). 4 S. Yang, J. N. Onuchic, A. E. García, and H. Levine, J. Mol. Biol. 372, 756 (2007). 5 N.-V. Buchete and G. Hummer, Phys. Rev. E 77, 030902 (2008). 6 F. Noé and S. Fischer, Curr. Opin. Chem. Biol. 18, 154 (2008). 7 H. Eyring, J. Chem. Phys. 3, 107 (1935). 8 M. G. Evans and M. Polanyi, Trans. Faraday Soc. 31, 875 (1935). 9 J. Keck, Discuss. Faraday Soc. 33, 173 (1962). 2 A.

J. Chem. Phys. 141, 074103 (2014) 10 C. H. Bennett, in Algorithms for Chemical Computations, ACS Symposium

Series Vol. 46, edited by R. E. Christoffersen (American Chemical Society (ACS), Washington, DC, 1977), pp. 63–97. 11 D. Chandler, J. Chem. Phys. 68, 2959 (1978). 12 J. B. Anderson, Adv. Chem. Phys. 91, 381 (1995). 13 D. G. Truhlar, B. C. Garrett, and S. J. Klippenstein, J. Phys. Chem. 100, 12771 (1996). 14 T. S. van Erp, J. Chem. Phys. 125, 174106 (2006). 15 T. S. van Erp, D. Moroni, and P. G. Bolhuis, J. Chem. Phys. 118, 7762 (2003). 16 T. S. van Erp, Phys. Rev. Lett. 98, 268301 (2007). 17 W.-N. Du and P. G. Bolhuis, J. Chem. Phys. 139, 044105 (2013). 18 R. J. Allen, P. B. Warren, and P. R. ten Wolde, Phys. Rev. Lett. 94, 018104 (2005). 19 J. Juraszek, G. Saladino, T. S. van Erp, and F. L. Gervasio, Phys. Rev. Lett. 110, 108106 (2013). 20 C. Dellago, P. G. Bolhuis, and D. Chandler, J. Chem. Phys. 108, 9236 (1998). 21 D. Moroni, P. G. Bolhuis, and T. S. van Erp, J. Chem. Phys. 120, 4055 (2004). 22 F. Noé, C. Schütte, E. Vanden-Eijnden, L. Reich, and T. R. Weikl, Proc. Natl. Acad. Sci. U.S.A. 106, 19011 (2009).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 137.149.200.5 On: Sat, 22 Nov 2014 22:39:20

Efficient calculation of rate constants: downhill versus uphill sampling.

The classical transition state theory (TST), together with the notion of transmission coefficient, provides a useful tool for calculation of rate cons...
529KB Sizes 0 Downloads 6 Views