Multiple time step molecular dynamics in the optimized isokinetic ensemble steered with the molecular theory of solvation: Accelerating with advanced extrapolation of effective solvation forces Igor Omelyan and Andriy Kovalenko Citation: The Journal of Chemical Physics 139, 244106 (2013); doi: 10.1063/1.4848716 View online: http://dx.doi.org/10.1063/1.4848716 View Table of Contents: http://scitation.aip.org/content/aip/journal/jcp/139/24?ver=pdfcov Published by the AIP Publishing Articles you may be interested in Multiple branched adaptive steered molecular dynamics J. Chem. Phys. 141, 064101 (2014); 10.1063/1.4891807 Polar solvation dynamics of lysozyme from molecular dynamics studies J. Chem. Phys. 136, 185102 (2012); 10.1063/1.4712036 High-pressure effect on the dynamics of solvated peptides J. Chem. Phys. 136, 145103 (2012); 10.1063/1.3700183 Structural aspects of the solvation shell of lysine and acetylated lysine: A Car–Parrinello and classical molecular dynamics investigation J. Chem. Phys. 131, 225103 (2009); 10.1063/1.3268703 Bounding the electrostatic free energies associated with linear continuum models of molecular solvation J. Chem. Phys. 130, 104108 (2009); 10.1063/1.3081148

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

THE JOURNAL OF CHEMICAL PHYSICS 139, 244106 (2013)

Multiple time step molecular dynamics in the optimized isokinetic ensemble steered with the molecular theory of solvation: Accelerating with advanced extrapolation of effective solvation forces Igor Omelyan1,2,3,a) and Andriy Kovalenko1,2,b) 1

National Institute for Nanotechnology, 11421 Saskatchewan Drive, Edmonton, Alberta T6G 2M9, Canada Department of Mechanical Engineering, University of Alberta, Edmonton, Alberta T6G 2G8, Canada 3 Institute for Condensed Matter Physics, National Academy of Sciences of Ukraine, 1 Svientsitskii Street, Lviv 79011, Ukraine 2

(Received 21 June 2013; accepted 2 December 2013; published online 27 December 2013) We develop efficient handling of solvation forces in the multiscale method of multiple time step molecular dynamics (MTS-MD) of a biomolecule steered by the solvation free energy (effective solvation forces) obtained from the 3D-RISM-KH molecular theory of solvation (three-dimensional reference interaction site model complemented with the Kovalenko-Hirata closure approximation). To reduce the computational expenses, we calculate the effective solvation forces acting on the biomolecule by using advanced solvation force extrapolation (ASFE) at inner time steps while converging the 3D-RISM-KH integral equations only at large outer time steps. The idea of ASFE consists in developing a discrete non-Eckart rotational transformation of atomic coordinates that minimizes the distances between the atomic positions of the biomolecule at different time moments. The effective solvation forces for the biomolecule in a current conformation at an inner time step are then extrapolated in the transformed subspace of those at outer time steps by using a modified least square fit approach applied to a relatively small number of the best force-coordinate pairs. The latter are selected from an extended set collecting the effective solvation forces obtained from 3D-RISM-KH at outer time steps over a broad time interval. The MTS-MD integration with effective solvation forces obtained by converging 3D-RISM-KH at outer time steps and applying ASFE at inner time steps is stabilized by employing the optimized isokinetic Nosé-Hoover chain (OIN) ensemble. Compared to the previous extrapolation schemes used in combination with the Langevin thermostat, the ASFE approach substantially improves the accuracy of evaluation of effective solvation forces and in combination with the OIN thermostat enables a dramatic increase of outer time steps. We demonstrate on a fully flexible model of alanine dipeptide in aqueous solution that the MTS-MD/OIN/ASFE/3DRISM-KH multiscale method of molecular dynamics steered by effective solvation forces allows huge outer time steps up to tens of picoseconds without affecting the equilibrium and conformational properties, and thus provides a 100- to 500-fold effective speedup in comparison to conventional MD with explicit solvent. With the statistical-mechanical 3D-RISM-KH account for effective solvation forces, the method provides efficient sampling of biomolecular processes with slow and/or rare solvation events such as conformational transitions of hydrated alanine dipeptide with the mean life times ranging from 30 ps up to 10 ns for “flip-flop” conformations, and is particularly beneficial for biomolecular systems with exchange and localization of solvent and ions, ligand binding, and molecular recognition. © 2013 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4848716] I. INTRODUCTION

Dynamical processes involving conformational and folding equilibria in biomolecular systems with proteins occur on a wide range of time scales from picoseconds up to minutes.1–6 Molecular dynamics (MD) simulations become an increasingly important tool7–10 not only for interpreting experimental data but also for predicting conformational properties and understanding folding mechanisms in such systems. The pioneering work in this direction dated back 35 years performed MD simulation of proteins on time length of several a) Electronic addresses: [email protected] and [email protected] b) Electronic mail: [email protected]

0021-9606/2013/139(24)/244106/23/$30.00

picoseconds.1 Later on, the observation length steadily increased, reaching 1 μs (Ref. 11) and then 10 μs.12 Recently, a barrier of 1 ms has been overcome using the new specialpurpose supercomputing system for MD simulation of proteins and other biomolecules running computations entirely on massively parallel application-specific integrated circuits interconnected in a high-speed three-dimensional torus network, called Anton.13–15 However, the investigations were targeted at relatively simple biomolecular systems, and even this time scale is not adequate for processes with slow statistics such as protein functioning, aggregation, misfolding, bioadsorption, drug binding in nanomachines, etc. One-millisecond MD simulation of relatively simple biosystems becomes feasible also for cutting-edge large supercomputing clusters,

139, 244106-1

© 2013 AIP Publishing LLC

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-2

I. Omelyan and A. Kovalenko

particularly with graphic processing unit (GPU) accelerators combined with conventional CPU chips.16 At the same time, modern capabilities of conventional high performance computing systems allow MD simulation of complex proteins on time scales typically limited to tens or hundreds nanoseconds.17 This is, of course, not sufficient to provide an adequate picture of conformational and folding behavior. Time steps in MD simulations should be rather small to avoid numerical instabilities and achieve a desired accuracy. This imposes severe limitations on the efficiency of MD. Larger time steps are desirable to decrease the number of discretization points and reduce the computational costs when integrating the equations of motion. In a multiple time stepping (MTS) concept, faster components of motion are integrated with smaller (inner) time steps while the slowest components are handled with the largest (outer) time step to reduce computational effort. Many MTS MD techniques have been developed over the last decades to enlarge time steps of the MD integration and thus improve the efficiency of MD simulations. These include the generalized Verlet integrator,18 reversible reference system propagator algorithm (RESPA),19–21 normal mode theories,22–25 mollified impulse schemes,26–28 Langevin dynamics approaches,29–31 and canonical Nosé-Hoover like32–35 and isokinetic36 thermostatting versions of RESPA. In these techniques, each component of motion is integrated on its own physical time scale, which significantly speeds up the computations since the costly long-range interactions can be sampled less frequently than the cheap short-range strong forces. However, outer time steps in the microcanonical MTS integrators18–21 (e.g., RESPA) cannot be made very large by simply increasing the number of inner time steps, even though the distant interactions are sufficiently small. A rapid energy growth occurs when the time interval between the weak force updates exceeds half of the period related to the fastest motion,37–39 which is well-known as resonance instabilities.40–42 A more recent development coupled the canonical chain method32 with the isokinetic dynamics36 into the impulsive isokinetic Nosé-Hoover chain RESPA (INR) algorithm which demonstrated MD simulation of water with outer time steps on the order of 100 fs.43, 44 An efficient way to accelerate molecular simulation is to analytically account for solvation effects by running MD of the biomolecule on a solvation free energy surface which is obtained from integral equation theory of liquids based on the first principles of statistical mechanics. A methodology that has shown substantial success for a number of systems in solution is Ornstein-Zernike type45 integral equation theory of molecular liquids, also known as reference interaction site model (RISM) molecular theory of solvation.46 As distinct from MD simulations which sample the phase space directly by evolving trajectories of a set of explicit solvent molecules, RISM theory operates with spatial correlation functions of molecular interaction sites statistically mechanically averaged over the entire ensemble. It begins with a force field of interaction potentials between molecular interaction sites and uses a diagrammatic analysis of the solvation free energy to construct a site-site Ornstein-Zernike, or RISM, integral equation for the total and direct correlations

J. Chem. Phys. 139, 244106 (2013)

between interaction sites of molecules in solution.47 Converging the RISM integral equation complemented with an appropriate closure relation between the total and direct correlation functions yields the solvation structure, and the solvation thermodynamics is then calculated at once as a simple integral of the correlation functions which is obtained by performing thermodynamic integration analytically.45, 46 The three-dimensional (3D) extension of this approach, 3DRISM theory, gives 3D maps of density distributions of solvent interaction sites around a solute macromolecule (or supramolecule) of arbitrary shape.48–57 An important component of 3D-RISM theory has been the closure relation proposed by Kovalenko and Hirata (KH approximation).51, 55 For simple and complex solvents and solutions of a given composition, including buffers, salts, polymers, ligands and other components at a finite concentration, the 3D-RISMKH molecular theory of solvation properly accounts for chemical functionalities by representing in a single formalism both electrostatic and non-polar features of solvation, such as hydrogen bonding, structural solvent molecules, salt bridges, solvophobicity, and other electrochemical, associative, and steric effects. For real systems, solving the 3DRISM-KH integral equations is far less computationally expensive than running MD simulations which must be long enough to sample all relevant exchange and binding events. This enables handling complex systems and processes occurring on large space and time scales, frequently not even feasible for MD. The 3D-RISM-KH theory describes both simple and complex associating solvents,51–65 including ionic liquids,65 in a wide range of thermodynamic conditions58, 59 and composition,60, 61 and a variety of local and confined environments in different systems such as soft matter,60, 61, 66 solidliquid interfaces,51, 55, 67, 68 carbon nanotubes,63 synthetic organic supramolecular nanoarchitectures,69–73 and biomolecular systems.66, 74–88 The RISM/3D-RISM-KH approach provided an insight into a number of experimentally observed phenomena in soft matter systems, including the structural transitions and related thermodynamic anomalies for the formation of micromicelles and the transition from the tetrahedral to zigzag hydrogen bonding network in water-alcohol mixtures in the whole range of concentrations,60, 61 the phase transitions58, 59 and microscopic structure of interfaces of nonpolar and polar molecular liquids,89, 90 and the microscopic structure of gels formed by oligomeric polyelectrolyte gelators in different solvents.66 The 3D-RISM-KH theory readily produces potentials of mean force in solution,52–55, 57 which in particular was used to predict self-assembly forces in chemical supramolecular nanoarchitectures,69–72 selective ion binding,85, 86 proteinligand binding,66, 74–78 protein-protein interaction,78–82 and aggregation of cellulose nanocrystallites in hemicellulose hydrogel.88 In particular, the 3D-RISM-KH theory yields effective solvation forces91, 92 acting on each atom of the solute macromolecule (or supramolecule) immersed in solvent which is considered to be quasi-equilibrated at every current conformation of the solute. The statistically averaged solvation forces so obtained from 3D-RISM-KH can be then used together with the direct solute-solute interaction potentials in MD simulations to integrate the equations of motion.91, 92

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-3

I. Omelyan and A. Kovalenko

This allows one to contract the degrees of freedom of solvent molecules and perform MD evolution of just the solute molecule steered by 3D-RISM-KH effective solvation forces on a multidimensional landscape of a quasi-equilibrium solvation free energy. This is similar to MD in implicit solvent, e.g., using the Poisson-Boltzmann or Generalized Born model with empiric terms of solvent accessible surface area representing non-electrostatic effects (PBSA/GBSA).93 As distinct, 3D-RISM-KH effective solvation forces obtained from the first principles of statistical mechanics provide full molecular picture of solvation structure, as if full-scale equilibrated MD of all-atom explicit solvent were run at each conformation of the solute molecule. While decoupling the dynamics of the solute from that of solvent and contracting the latter, the MTS-MD/3D-RISM-KH description readily accounts for statistics of slow and/or rare solvation events, such as localization and exchange of structural solvent molecules and ions in pockets of proteins and other function-related confined spaces of biomolecules, protein-ligand binding, protein-protein interaction and aggregation. This can provide fast and accurate access to structural and thermodynamic properties of complex chemical and biomolecular systems in solution. The first hybrid MTS-MD/3D-RISM-KH simulation was carried out by Miyata and Hirata for a flexible model of the acetylacetone molecule in water.91 The consideration was limited to RESPA19–21 in the microcanonical ensemble, applied without any extrapolation of the solvation forces. This made impossible to apply outer time steps larger than 5 fs. The reason is the presence of MTS resonance instabilities24, 38–42 which arise in conventional microcanonical MD simulations, and so in hybrid MTS-MD/3D-RISM-KH, due to a subtle interplay between the strong intramolecular solute-solute interactions (fast motion) and relatively weak intermolecular solute-solvent forces (slow motion). The accuracy of conventional MTS-MD simulations can be improved despite the instabilities by performing processed phase-space transformations.94, 95 Combining these transformations with an energy-constrained scheme, we demonstrated in MTSMD simulations of water that outer time steps up to 16 fs are possible.96 However, such outer steps cannot exceed the theoretical limit of 20 fs inherent in the microcanonical description. In hybrid MTS-MD/3D-RISM-KH without solvation force extrapolation the procedure of converging the 3D-RISM-KH integral equations to calculate effective solvation forces has to be repeated too frequently (every 5 fs in Ref. 91), slowing down the simulation drastically. Luchko et al.92 extended the MTS-MD/3D-RISM-KH approach to the canonical ensemble by employing the Langevin thermostat to damp the MTS instabilities. Furthermore, they proposed the solvation force-coordinate extrapolation (SFCE) to calculate effective solvation forces at inner time steps, based on those obtained from 3D-RISM-KH at outer time steps. This enabled larger outer time steps up to 20 fs, as was shown on the example of alanine dipeptide in aqueous solution. Nevertheless, it is smaller than 100 fs outer time steps workable in standard MTS-MD simulations with the best previously known integrator, isokinetic NoséHoover chain RESPA (INR).43, 44 Mention that INR couples each degree of freedom in the system to an individual heat

J. Chem. Phys. 139, 244106 (2013)

bath to eliminate MTS instabilities. However, such a strong coupling can impede transitions between different conformational states of the solute macromolecule, lowering the statistical convergence. On the other hand, Langevin dynamics incorporates artificial friction and random forces to stabilize the solutions.25, 28–31 Although these forces satisfy the fluctuationdissipative theorem, the target temperature is not guarantied. In addition, employing a large viscosity coefficient (to ensure the stability) can deviate the system from the true canonical distribution. Quite recently, we introduced an optimized isokinetic Nosé-Hoover chain (OIN) canonical ensemble for efficient elimination of MTS instabilities in MD simulations.97 It generalizes the INR method43, 44 and other canonical-isokinetic schemes36, 98, 99 so that each own set of Nosé-Hoover chain thermostats is coupled with some optimal number of degrees of freedom in the system. We applied the OIN algorithm to conventional MTS-MD simulations of the rigid and flexible models of water to demonstrate its advantage over the standard canonical, isokinetic, and canonical-isokinetic schemes. Together with some modification of the original SFCE scheme of solvation force extrapolation proposed in Ref. 92, we further incorporated the OIN integrator in a MTSMD/OIN/SFCE/3D-RISM-KH multiscale approach and examined it on a fully flexible model of alanine dipeptide in aqueous solution.97 We demonstrated in all these cases that the OIN ensemble significantly overcomes the limitations on outer time step size. It particular, large outer time steps from several hundreds femtoseconds up to one picosecond can be employed to predict spatial distribution functions with a high accuracy. This provided an effective speedup of hybrid MTSMD/OIN/SFCE/3D-RISM-KH simulation by a factor of 1520 compared to conventional MD with explicit solvent.97 Strictly speaking, the particle quasi-dynamics generated in hybrid MTS-MD/OIN/3D-RISM-KH simulation differs from conventional MD, as the solvent dynamics is decoupled from the solute one and is contracted by 3D-RISMKH. In particular, the solute quasi-dynamics steered by effective solvation forces does not obey the Maxwell velocity distribution and cannot yield real-time correlation functions. However, we rigorously proved that the configurational part of the extended partition function obtained in this hybrid description does correspond to the true canonical distribution of the physical system at a given temperature in coordinate space.97 This is a very important feature which allows us to readily reproduce the correct conformational properties, including 3D spatial distribution functions of solvent around the biomolecule. The statistical-mechanical 3DRISM-KH account for effective solvation forces provides efficient sampling of slow and/or rare solvation events in function-related confined spaces, and is particularly beneficial for biomolecular systems with exchange and localization of solvent and ions, ligand binding, and molecular recognition. Such a quasi-dynamical sampling of quasi-equilibrium conformational properties and solvation structure appears to be much more efficient than “real-time” microcanonical or canonical simulations. In the present paper, we further develop the MTSMD/OIN/3D-RISM-KH multiscale method by proposing an

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-4

I. Omelyan and A. Kovalenko

advanced approach to extrapolation of solvation forces. Rather than dealing with usual atomic positions as in the previous extrapolation scheme,92, 97 we introduce a discrete nonEckart transformation of coordinate space which exactly accounts for rotations of the solute molecule. As a result, the transformed 3D-RISM-KH effective solvation forces become particularly smooth, since they vary mainly due to internal vibrational motion of atoms which have a relatively small magnitude. The effect of torsional motion is diminished as well by storing solute coordinates and corresponding effective solvation forces calculated by converging the 3D-RISM integral equations at successive outer configurations over an extended time interval. The best basis knots that provide the smoothest coordinate dependence of the solvation forces is then selected recurrently among the extended set. Finally, the extrapolated forces are obtained as a sum over their values in the basis set by using a modified least square fit approach with a balanced norm for the expansion coefficients, complemented with the inverse non-Eckart transformation. The resulting approach of advanced solvation force extrapolation (ASFE) allows us to substantially improve the extrapolation accuracy with practically no increase in computational cost. On the other hand, at the same overall precision the new ASFE method together with the OIN ensemble makes possible to apply outer time steps much larger than for SFCE used previously, significantly reducing the frequency of the computationally expensive procedure of converging the 3D-RISM-KH integral equations to obtain the effective solvation forces. The paper is structured as follows. Sec. II briefly describes the 3D-RISM-KH molecular theory of solvation, as well as MTS-MD simulation steered with 3D-RISM-KH effective solvation forces. Sec. III introduces the ASFE method and compares it with the previous extrapolation schemes. Section IV incorporates ASFE into the MTS-MD/3D-RISMKH equations of motion in the OIN ensemble, and derives the MTS-MD/OIN/ASFE/3D-RISM-KH integrator algorithm. The latter is applied to and validated on alanine dipeptide in aqueous solution in Sec. V. Concluding comments are given in Sec. VI. II. MULTIPLE TIME STEP MOLECULAR DYNAMICS STEERED BY EFFECTIVE SOLVATION FORCES OBTAINED FROM THE 3D-RISM-KH MOLECULAR THEORY OF SOLVATION

Consider a liquid system composed of solute and solvent molecules. In many biomolecular cases the number of solvent molecules is much larger than that of solute ones. This significantly complicates conventional MD simulations because a large portion of computational time is spent on calculating solute-solvent and particularly solvent-solvent forces. Solvent related computational expenses increase dramatically in the cases of small solute concentrations and for events with rare statistics such as slow exchange and binding of solvent in protein pockets. MD simulation can be accelerated by employing the 3D-RISM-KH molecular theory of solvation to contract detailed information on molecular trajectories and dynamics of solvent.91, 92, 97 The 3D-RISM-KH theory operates

J. Chem. Phys. 139, 244106 (2013)

with statistically-mechanically averaged solvation structure in terms of 3D spatial correlation functions of interaction sites of solvent molecules, in particular, 3D site density distributions. Beginning with intermolecular interaction potentials between the solute macromolecule and solvent molecules, that is a force field, the 3D site correlation functions are calculated numerically by converging the 3D-RISM-KH integral equations derived analytically from the first principles of statistical mechanics. The thermodynamic integration is performed analytically, and the solvation free energy is obtained in a closed analytical form as a simple integral of the correlation functions. This gives the potentials of mean force and thus the effective solvation forces acting on each solute atom, statistically mechanically averaged over the whole ensemble of solvent molecules. The effective solvation forces so obtained are then used together with the direct solute-solute atomic interaction potentials to integrate the equations of motion in the combined MTS-MD/3D-RISM-KH simulations. Below we briefly summarize the main constituents of this coupling.

A. 3D-RISM-KH molecular theory of solvation

The 3D-RISM-KH theory51–57 starts with an interaction potential uuv α (r) between the solute which can be a supramolecule as well as macromolecule and interaction site α of solvent molecules defined by a molecular force field, and yields the solvation structure in terms of the probability density ραv gαuv (r). The latter corresponds to finding interaction site α of solvent molecules at 3D space position r around the solute which is factorized into the average number density ραv and the 3D site distribution function gαuv (r). Here, superscripts “u” and “v” denote solute and solvent species, respectively. Note that non-bonded potentials of a molecular force field are typically represented with Coulomb and Lennard-Jones pairwise terms; however, this is not required in general, as the 3D-RISM-KH theory operates with 3D solute-solvent site interaction potentials tabulated on a 3D grid in a rectangular box containing the solute molecule and specified analytically in the whole space outside the box,53–57, 100 like in the selfconsistent-field multiscale coupling of 3D-RISM-KH with electronic structure theories.51, 55, 57, 62–64 The 3D-RISM integral equation48–51, 55 can be derived from the 6-dimensional, molecular Ornstein-Zernike integral equation45 by averaging over orientations of solvent molecules around each interaction site α to contract orientational degrees of freedom of solvent,51, 55  uv vv  hγ (r) = (r ) , (1) dr cαuv (r − r )χαγ α

huv α (r)

where is the 3D site total correlation function related to the 3D site distribution function gαuv (r) = 1 + huv α (r) which can be probed in scattering experiment. cαuv (r) is the 3D site direct correlation function which represents effective pairwise (“direct”) interactions which lead through all chains of subsequent two-body interactions in the Ornstein-Zernike type construct (1) to the effective interactions and distributions gαuv (r) in dense liquid phase; the values of cαuv (r) at long range are related to the interaction potential uuv α (r), while those inside

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-5

I. Omelyan and A. Kovalenko

the solute-solvent repulsive core substantially contribute to the solvation thermodynamics, namely, the solvation free energy and the partial molar volume of the solute molecule immersed in the solvent. χγvvα (r) is the site-site susceptibility of pure solvent. To determine the correlation functions, the 3D-RISM integral equation (1) has to be complemented with another relation between the two functions hα (r) and cα (r) which is called a closure and also involves the interaction potential uuv α (r) between the solute molecule and interaction sites α of solvent molecules. The exact closure has a nonlocal functional form which can be expressed as an infinite diagrammatic series in terms of multiple integrals of the total correlation function.45 However, these diagrams are cumbersome and the series suffers from the standard drawback of poor convergence, which makes useless to truncate the series at reasonably low-order diagrams and thus renders the exact closure functional computationally intractable. Therefore, it is replaced in practice with amenable approximations with analytical features that properly represent physical characteristics of the system, such as the long-range asymptotics of the correlation functions related to electrostatic forces and their short-range features related to the solvation structure and thermodynamics. Examples of closure relations to Ornstein-Zernike type integral equations (including RISM) suitable for liquids of polar and charged species are the so-called hypernetted chain (HNC) closure and the mean spherical approximation (MSA) to the Ornstein-Zernike integral equation,45 since both they enforce the long-range asymptotics of the electrostatic potentials of site charges, which is critical for systems with polar and ionic species. Other well-known approximations such as the Percus-Yevick, Modified Verlet, Martynov-Sarkisov, and Ballone-Pastore-Galli-Gazzillo closures45 well reproduce solvation features for species with short-range repulsion but do not properly account for electrostatics in the interaction potential. By generalization, the 3D-HNC and 3D-MSA closures can be constructed to complement the 3D-RISM integral equation (1).48–50, 55 While enforcing the electrostatic asymptotics, the 3D-HNC closure strongly overestimates association effects and therefore the 3D-RISM-HNC equations diverge for macromolecules with considerable site charges solvated in polar solvents or electrolyte solutions, which is almost always the case for biomolecules in aqueous solution. The 3D-MSA closure is free from that shortcoming but suffers from another serious deficiency of producing nonphysical negative values of the distribution function in the areas adjacent to the associative peaks. Those drawbacks are overcome in the approximation proposed by Kovalenko and Hirata (KH closure), the 3D version of which reads51, 55  uv ⎧ uα (r) uv uv uv ⎪ ⎪ ⎨exp − k T + hα (r) − cα (r) for gα (r)  1 B uv , gα (r) = uv ⎪ (r) u ⎪ α uv uv uv ⎩1 − + hα (r) − cα (r) for gα (r) > 1 kB T (2) where kB T is the Boltzmann constant times the solution temperature. The 3D-KH closure (2) couples in a nontrivial way the HNC approximation and the MSA linearization, the latter automatically applied to spatial regions of solvent den-

J. Chem. Phys. 139, 244106 (2013)

sity enrichment gαuv (r) > 1 such as strong peaks of association and long-range tails of near-critical fluid phases, and the former to spatial regions of density depletion gαuv (r) < 1 (including the repulsive core), while keeping the long-range asymptotics peculiar to both the HNC and MSA. The distribution function and its first derivative are continuous at the joint boundary gαuv (r) = 1 by construct. The KH closure approximation consistently accounts for both electrostatic and non-polar features of solvation (associative and steric effects) for macromolecular and supramolecular solutes in simple and complex liquids, ionic liquids, non-electrolyte and electrolyte solutions in various chemical systems,51–68 supramolecular nanoarchitechires,69–73 and biomolecular systems.66, 74–88 The 3D-KH closure underestimates the height of strong associative peaks of the 3D site distribution functions because of the MSA linearization applied to them.51, 100 However, it somewhat widens the peaks and so 3D-RISM-KH quite accurately reproduces the coordination numbers of solvation structure in different systems, including microheterogeneities (micromicellar aggregates) in water-alcohol solutions,57, 60 solvation shells of a metal-water51, 55 and metal oxide-water67, 68 interfaces, and structural water solvent localized in biomolecular confinement.74, 87 For example, the coordination numbers of water strongly bound to the MgO surface are calculated from the 3D-RISM-KH theory with a 90% accuracy and the peak positions within a 0.5 Å deviation, compared to MD simulations.67 The 3D solvation map gαuv (r) of function-related structural water in the GroEL chaperon complex (shown to be strongly correlated to the rate of protein folding inside the chaperon cavity) obtained in an expensive MD simulation with explicit solvent involving ∼1 million atoms is reproduced from the 3D-RISM-KH theory in a relatively short calculation on a workstation with an accuracy of over 90% correlation for the 3D density map and about 98% correlation for the 3D positions of density maxima.87 Note that different approximate 3D bridge functions, or bridge corrections, to the 3D-HNC approximation can be constructed, such as the 3D-HNCB closure which reproduced the MD simulation results for the height of the 3D distribution peaks of water solvent around the bovine pancreatic trypsin inhibitor (BPTI) protein.100 A related promising approach is the partial series expansions of order n (PSE-n) of the HNC closure.101 The PSE-n closures interpolate between the KH and HNC approximations and therefore combine numerical stability with enhanced accuracy, as demonstrated for aqueous solutions of alkali halide ions.101, 102 However, a particular appeal of the 3D-KH closure is that it provides adequate accuracy and existence of solutions if expected from physical considerations for a wide class of solution systems “from the wild,” including various chemical and biomolecular solutes in different solvents, co-solvents, and electrolytes,51–92 and moreover, with ligand molecular fragments considered as part of solvent for efficient 3D mapping of binding affinities in such problems as molecular recognition and fragment based drug design.74, 75, 77 Construction of bridge functions for such systems with different solvent components other than water constitutes a significant challenge not addressed so far, while the 3D-KH closure constitutes a reliable choice for a given system.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-6

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

vv The solvent site-site susceptibility function χαγ (r) determines the response of the solvent to the insertion of the solute molecule. It breaks down into the intra- and intermolecular terms, vv vv (r) = ωαγ (r) + ραv hvv χαγ αγ (r).

(3)

vv ωαγ (r)

The intramolecular correlation function normal

vv ized as dr ωαγ (r) = 1 represents the geometry of solvent vv (r) = 0 for interaction sites α and γ on differmolecules. [ωαγ vv it has ent species.] For rigid species with site separations lαγ vv vv vv 2 the form ωαγ (r) = δ(r − lαγ )/(4π (lαγ ) ) which is specified vv vv (k) = j0 (klαγ ), in calculations in the reciprocal k-space as ωαγ where j0 (x) is the zeroth-order spherical Bessel function. The site-site total correlation function hvv αγ (r) is obtained in advance to the 3D-RISM-KH calculation from the dielectrically consistent RISM theory (DRISM)103 coupled with the KH closure (DRISM-KH approach55 ) which can be applied to bulk solution of a given composition, including polar solvent, co-solvent, electrolyte, and ligands at a given concentration. The 3D-KH closure approximation (2) to the 3D-RISM equation (1) possesses an exact free energy differential, which allows Kirkwood’s thermodynamic integration to be performed analytically and yields the solvation free energy of the solute macromolecule (supramolecule) in a closed form in terms of the 3D site correlation functions,51, 55  

1 uv 2 v hα (r) − huv μsolv = kB T ρα dr α (r) 2 α  1 uv uv uv − cα (r) − hα (r)cα (r) , (4) 2 where (x) is the Heaviside function. To properly treat electrostatic forces in electrolyte solution with polar molecular solvent and ionic species, the longrange electrostatic asymptotics of all the correlation functions are separated out and handled analytically.53–56, 64, 81, 100 The spatial convolution in the 3D-RISM integral equation (1) is calculated by means of 3D fast Fourier transform (3D-FFT). The DRISM-KH equations are discretized on a uniform radial grid with resolution 0.01-0.1 Å, and the 3D-RISM-KH equations are on a uniform 3D rectangular grid with spacing 0.1-0.5 Å in a 3D box of size including 2-3 solvation shells around the solute macromolecule (supramolecule). The analytical forms for the non-periodic electrostatic asymptotics in the direct and reciprocal space are separated out from all the correlation functions before and then added back after doing the transform. Note that even though the solvent susvv (r) has a long-range electrostatic part, no aliasceptibility χαγ ing occurs in the backward 3D-FFT of the short-range part of huv α (k) on the 3D box supercell since the short-range part of the convolution product huv α (r) contains merely 2-3 oscillations (solvation shells) and vanishes outside the 3D box for the physical reason.56 Accordingly, the electrostatic asymptotics terms in the thermodynamic integral (4) are handled analytically and reduced to one-dimensional integrals easy to compute.53–56, 64, 81 The 3D-RISM integral equation (1) with the 3D-KH closure (2) are converged typically to a root mean square accu-

racy of 10−5 –10−8 and the DRISM-KH equation to an accuracy of 10−8 –10−12 by using the modified direct inversion in the iterative subspace (MDIIS) numerical solver.52–56, 104 MDIIS accelerates convergence of integral equations of liquid state theory by optimizing each iterative solution in a Krylov subspace of typically last 10–20 successive iterations and then making the next iterative guess by mixing the optimized solution with the approximated optimized residual. The computational expenses of converging the 3D-RISM-KH equations are further substantially reduced with several strategies, including a high-quality initial guess to the direct correlation functions cαuv (r), pre- and post-processing of the 3D solutesolvent potentials, long-range asymptotics, and forces, several cutoff schemes, an adaptive solvation box,92 as well as the core-shell-asymptotics treatment of solvation shells decreasing memory and corresponding CPU load in MDIIS by an order of magnitude.56 B. Coupling 3D-RISM-KH with MTS-MD

Rather than explicitly considering individual solvent molecules as in conventional MD simulations, the combined MTS-MD/3D-RISM-KH approach91, 92, 97 deals with 3D solute-solvent site spatial distribution functions gαuv (r) statistically averaged at each solute conformation. The 3D distributions are obtained from the first principles of statistical mechanics by converging the 3D-RISM-KH integral equations (1) and (2). Accordingly, detailed interactions with solvent molecules are mapped onto effective solvation forces acting on each atom of the solute macromolecule (or supramolecule) by averaging over the solvent distributions. This property can be obtained by taking a derivative of the solvation free energy with respect to solute atomic coordinates,  ∂μsolv = ραv ∂ri α



∂uuv iα (r − ri ) , ∂ri (5) where the 3D solute-solvent site interaction potential is assumed to be composed of pairwise interaction potentials uuv iγ (r) between solute atoms i located at ri and solvent interaction site α at r,  uuv uuv α (r) = iα (r − ri ). fi ≡ f(ri ) = −

dr gαuv (r)

i∈u

Equation (5) readily follows from the form (4) for the 3D-KH closure (2) as well as from that for the 3D-HNC closure91 but holds for any closure satisfying the general condition of pathindependence of the solvation free energy (including the exact closure).92 With the direct interaction potentials and corresponding forces between solute atoms evaluated in a standard way, the MTS-MD/3D-RISM-KH equations of motion are solved only for solute atoms, while solvent effect is accounted for in terms of the 3D solute-solvent site distributions and effective solvation forces. When handling the 3D-RISM-KH integral equations, the solute macromolecule (supramolecule) is placed in a rectangular box of suitable size which can adapt in course of simulation to solute conformational changes to minimize

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-7

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

computational expenses. This is unlike conventional MD simulation which usually involves a cubic box with periodical boundary conditions and the cumbersome Ewald summation techniques105, 106 to calculate electrostatic interactions.

III. ADVANCED METHOD FOR EXTRAPOLATION OF EFFECTIVE SOLVATION FORCES

As distinct from MD simulations with explicit solvent which evaluate short-range forces between molecular repulsive cores steeply changing in space and time with movement of solvent molecules, effective solvation forces obtained from the 3D-RISM-KH molecular theory of solvation are statistically mechanically averaged over the 3D spatial distributions of solvent molecules and therefore are much smoother in space. This allows extrapolation of the effective solvation forces for a large number of inner time steps so as to reduce the frequency of converging the 3D-RISM-KH integral equations and thus to dramatically decrease the computational cost of the MTS-MD/3D-RISM-KH multiscale simulation method.

A. Previous extrapolation scheme

The extrapolation can be done with the solvation force-coordinate extrapolation (SFCE) approach first introduced in the context of combined MTS-MD/3D-RISM-KH simulation.92 It is based on the above-mentioned fact that effective solvation forces vary smoothly in space and in time with conformational evolution. Let fi,k be the effective solvation force (5) acting on atom i = 1, . . . , M of the solute macromolecule at previous outer time step k = 1, . . . , N at which the 3D-RISM-KH integral equations were solved. Note that each 3D vector fi,k depends on the multidimensional position {r1,k , r2,k , . . . , rM,k } of all M atoms. The force vectors fi,k and position vectors ri,k for given i are ordered in such a way that larger values of k correspond to earlier time moments tk , i.e., t N < t N − 1 < · · · < t2 < t 1 . In the previous extrapolation scheme,92 the solvation forces are handled as follows. First, the actual solute atom coordinates r(t) ≡ {r1 (t), r2 (t), . . . , rM (t)} at each inner point t of the next outer time interval ]t1 , t0 [ are virtually extrapolated by a linear combination of their previous outer values as r˜ i (t) =

N 

ak (t)ri,k .

(6)

k=1

The expansion coefficients ak (t) in Eq. (6) can then be obtained as the best representation of the solute atom coordinates r(t) at time t in terms of their projections onto the basis of N previous outer positions ri,k by minimizing the norm of the difference between ri (t) and their approximated counterparts r˜ i (t). This leads to the following least-square problem: 2 M N  1  ak (t)ri,k = min . ri (t) − M i=1 k=1

(7)

Having ri,k and ri (t), Eq. (7) can be solved for ak (t) using the well-developed procedures,107, 108 such as for instance the QR-factorization or normal equation method. Now, the desired unknown forces fi (t) at t ∈ ]t1 , t0 [ can be extrapolated on the basis of their known previous outer values by employing a linear expansion procedure which is quite similar to Eq. (6) for the coordinates ri (t). This yields ˜fi (t) =

N 

ak (t)fi,k ,

(8)

k=1

where i = 1, 2, . . . , M and the weighting coefficients ak (t) in Eq. (8) are the same as those in Eq. (6). This is justified by the fact that fi (t) is a function of only r(t). Thus, a better representation of the coordinates by r˜ (t) should provide a more accurate mapping of the interactions, expecting a small difference between the actual forces fi (t) and their approximated values ˜fi (t) at each inner time step. We see that the coordinate mapping is imaginary in the sense that ri (t) are never replaced with r˜ i (t), and it is necessary to find just the weights for the actual force extrapolation. Note also that the coefficients ak (t) do not depend on atomic site i. Therefore, the molecular net force is exactly equal to zero despite the approximation be ˜fi (t) = 0 if cause, as follows from Eq. (8), we have that M i=1 M i=1 fi,k = 0. B. Improvement by normalization

The original extrapolation approach described in Sec. III A is relatively simple but has a lot of disadvantages. First of all, it is not invariant with respect to the translational transformation. In other words, the result should not depend on the choice of the origin of the laboratory system of coordinates. Indeed, in view of Eqs. (6)–(8), adding any constant vector r to the positions ri (t) and ri,k , we come in general to different values of ˜fi (t). To some extent this negative feature of the original extrapolation can be diminished by considering the atomic coordinates with respect to the center of mass of the solute molecule and always putting it at the origin with the zeroth components (0, 0, 0) [this origin must coincides with the geometric center of the 3D-RISM solvation box]. However, such a manipulation does not reproduce some important limiting cases. The most notorious example is that the previous SFCE approach does not lead to exact results for any N even if the forces are constant. In this context it is worth mentioning that the functions fi (t) = fi (r(t)) can be expanded in a power series of a deviation of the current coordinate vector r from the closest basis point ri,K ∈ ri,k at each i = 1, 2, . . . , M as  M  ∂fi  fi (r) = fi ({ri,K }) + (rj − rj,K ), (9) ∂rj,K ri,K j =1 where ∂fi /∂rj,K is the Hessian (3M × 3M) matrix, and the second- as well as higher-order spatial inhomogeneities O[(rj − rj,K )2 ] have been neglected. We see that fi (r) has the constant zeroth-order part representing by the first term in the right-hand side of Eq. (9). It immediately follows from Eq. (8) that this term can be reproduced exactly by imposing the

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-8

I. Omelyan and A. Kovalenko

following normalizing coefficients:

condition N 

J. Chem. Phys. 139, 244106 (2013)

on

the

ak = 1 .

weighting

(10)

k=1

Note that putting N = 1 reduces the normalizing scheme, Eqs. (6)–(8) and (10), to constant-force extrapolation. The second term in the right-hand side of Eq. (9) is linear in coordinates. That is why it can be extrapolated using the dual linear expansions (6) and (8). By requiring a minimum for the squared coordinate norm, Eq. (7), we try in fact to map numerically the complex Hessian matrix in Eq. (9) by using a finite number N of knots ri,k . For instance, in the case of an one-dimensional particle in an external force field f (x), it can be readily shown with Eqs. (6)–(8) and (10) that for N = 2 we have the equality f (x) = f (x1 ) + (x − x1 )(f (x2 ) − f (x1 ))/(x2 − x1 ). As seen in the limit (x2 − x1 ) → 0, this equality represents a truncated Taylor series because then the finite difference ( f (x2 ) − f (x1 ))/(x2 − x1 ) coincides with the spatial derivative ∂f/∂x of f and the approximation becomes exact. A similar force extrapolation for a one-particle system in the case of N = 2 was proposed earlier in the context of a surgery simulator.109 Other types of extrapolation were also considered110, 111 but for conventional MD simulations. In our multi-dimensional case, each of the (3M × 3M  1) elements of the Hessian matrix is mapped in a very complex way using a large number (N  1) of basis points ri,k . Note that the normalized scheme is already translationally invariant because, according to Eq. (10), addition of any constant coordinate vector r to ri and ri,k in Eq. (7) does not change the squared norm. Moreover, the presence of the normalization condition (10) can be easily handled within both the QR-factorization and normal equation methods. In  the factorization approach, the existence of the constraint N k=1 ak = 1 can be taken into account by shifting all coordinates and forces by the values corresponding to the first (or any other) basis pair, for instance, to ri,1 and fi,1 . This reduces the number of basis knots from N to N − 1. In other words, we should first replace ri with ri − ri,1 and ri,k with ri,k − ri,1 when solving the least-square problem (7) for ak . Note that now k = 2, . . . , N and the coefficients ak obtained differ from those in Eq. (7). Then the force extrapolation (8) modifies to ˜fi (t) = fi,1 +

N 

ak (t)(fi,k − fi,1 ).

(11)

k=2

Equation (11) can be formally obtained from Eq. (8) by re placing a1 with a1 = 1 − N k=2 ak to automatically satisfy the normalizing condition. A more elegant way is to deal with N initial pairs and to use the normal equation method. The existence of the constraint converses the least-square problem to finding the following conditional minimum N  2 M N   1  ak ri,k + 2λ ak − 1 = min, ri − M i=1 k=1 k=1 (12)

where λ is a Lagrangian multiplier to be determined together with N unknown coefficients ak . Differentiating Eq. (12) with respect to ak and λ as well as taking into account that ∂min /∂ak = ∂min /∂λ = 0 lead in view of Eq. (10) to the following set of N + 1 linear equations: ⎛

g11

⎜g ⎜ 21 ⎜ ⎜ .. ⎜ . ⎜ ⎜ ⎝ gN1

g12

...

g1N

g22 .. .

... .. .

g2N .. .

gN,2

...

gNN

1

...

1

1

1

⎞⎛

a1





g1



⎜ ⎟ ⎜ ⎟ 1⎟ ⎟ ⎜ a2 ⎟ ⎜ g2 ⎟ ⎟⎜ ⎟ ⎜ ⎟ .. ⎟ ⎜ .. ⎟ = ⎜ .. ⎟ ⎜ ⎟ ⎜ ⎟ .⎟ ⎟⎜ . ⎟ ⎜ . ⎟ ⎟⎜ ⎟ ⎜ ⎟ 1 ⎠ ⎝ aN ⎠ ⎝ gN ⎠ 0

λ

(13)

1

for the same number of unknowns ak at k = 1, 2, . . . , N and λ. The matrix elements in Eq. (13) represent the scalar product of the multidimensional coordinate vectors of all solute atoms, gkl =

M 1  ri,k ·ri,l , M i=1

gk =

M 1  ri,k ·ri . M i=1

(14)

Note that the expansion coefficients ak found with the normal equation method, Eqs. (12) and (13), in general differ from those obtained from the shifted summation (11) and, of course, from the usual least-square procedure (7). Thus, at each inner step δt inside the outer time interval t ∈ ]t1 , t0 [ of length h, the system of linear equations (13) is being solved h/δt times by standard numerical methods and the force extrapolation (8) is being applied. After the next 3D-RISM-KH calculation, the basis set with N pairs is updated by the new vector of forces and coordinates while the oldest one is discarded. With an appropriate choice of N and h  δt, the SFCE can significantly speed up the combined MTS-MD/3D-RISM-KH simulations. The reason is that solving the relatively small set of linear equations is much faster than converging the 3D-RISM-KH nonlinear integral equations, and is either faster than summing up the solute-solvent and solvent-solvent interactions for a large number of explicit solvent molecules in conventional MD simulations.

C. Advanced extrapolation approach

Our previous investigations97 showed that the use of the above-described normalized force extrapolation together with the OIN thermostat allows efficient acceleration of MTSMD/3D-RISM-KH computations by a factor of 15–20, compared to conventional MD simulations with explicit solvent. This acceleration has been achieved by increasing the maximally possible length of the outer time step h from 20 fs (inherent in the original extrapolation scheme with the Langevin dynamics92 ) to 200 fs.97 Unfortunately, further increase of h appreciably reduces the accuracy of such an extrapolation and thus significantly distorts true results. The only way to overcome the above limitations on the efficiency is to develop a better extrapolation scheme. We will now show how the outer time step can be increased in our new approach up to tens of picoseconds without loss of accuracy.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-9

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

1. Non-Eckart rotational transformation

The first major idea of the new approach of advanced extrapolation of effective solvation forces is to find such a transformation R = Sr of coordinate space from r to new variables R that renders the solvation forces f(R) ≡ F into the smoothest behavior. Since we deal with a discrete set (k = 1, 2, . . . , N) of the basis coordinate knots ri,k , the desired transformation Ri,k = Sk ri,k can be determined by minimizing the distances between the transformed outer coordinates Ri,k in the 3M-dimensional space as M 1  (Sk ri,k − ri,1 )2 = min . M i=1

(15)

Note that in Eq. (15) the most recent, first point ri,1 is considered as an origin, that is Ri,1 = S1 ri,1 = ri,1 and so k = 2, . . . , N. The current inner coordinate ri (t) should also be transformed by Ri (t) = S(t)ri (t) with 1 M

M 

(S(t)ri (t) − ri,1 )2 = min .

(16)

i=1

Having the transformed coordinates Ri,k and Ri (t), where Ri,k = Sk ri,k ,

Ri (t) = S(t)ri (t) ,

(17)

we can act similarly to Subsections III A and III B by substituting them instead of ri,k and ri (t). As a result, the transformed forces Fi (t) at t ∈ ]t1 , t0 [ can be extrapolated on the basis of their previous values Fi,k = f({Ri,k }) by using the linear approximation F˜ i (t) =

N 

Ak (t)Fi,k .

(18)

k=1

The weights Ak (t) in Eq. (18) are obtained according to Eq. (12) from the following conditional least-square scheme N  2 M N   1  Ak Ri,k + 2 Ak − 1 = min . Ri − M i=1 k=1 k=1 (19) The unknown coefficients Ak and are then found by solving the set of linear equations which can be represented analogously to Eqs. (13) and (14) by formally replacing ak , λ, ri,k , and ri with Ak , , Ri,k , and Ri . A problem now arises how to obtain Fi,k from fi,k without direct calculation of Fi,k = f({Ri,k }), and in which way to return back from F˜ i to the desired extrapolated forces ˜fi in the usual coordinates. This issue can be solved by taking into account that the original solvation forces fi are not only translationally invariant but also satisfy the following orientational condition: f(Sr) = Sf(r) ,

(20)

where S is an arbitrary 3 × 3 rotational matrix. Equation (20) merely states that if the solute molecule is rotated as a whole, the total solvation forces acting on each atom of this molecule is transformed with the same rotation. Remember that all the atomic coordinates are defined with respect to the center of mass of the molecule, and so the rotations should be performed around this point.

Thus, taking into account that Sk and S(t) in Eqs. (15) and (16) are rotational transformations, we obtain from Eq. (20) that Fi,k = f({Ri,k }) = f({Sk ri,k }) = Sk fi,k

(21)

and Fi (t) = f({Ri (t)}) = f({S(t)ri (t)}) = S(t)fi (t).

(22)

In view of Eqs. (21) and (22), no additional direct recalculations are needed and the desired approximated forces in the usual coordinate space at each inner point t can be readily reproduced from Eq. (18) using the inverse rotational transformation ˜fi (t) = S−1 (t)F˜ i (t) = S−1 (t)

N 

Ak (t)Sk fi,k .

(23)

k=1

Moreover, the inverse matrix can be easily evaluated taking into account that the rotational transformation is orthogonal, i.e., S−1 = S+ , where S+ denotes the transposed matrix. We see, therefore, that the proposed transformation completely excludes solute molecule rotations which can be large enough due to the interactions with the solvent and thermostats. This minimizes the distances between outer and inner position points in view of Eqs. (15) and (16). Obviously, a better accuracy of the force extrapolation should then be expected. In other words, the differences between the approximated values ˜fi (t) and their original counterparts fi (t) must decrease. In particular, the new extrapolation scheme (23) is exact, i.e., ˜fi (t) = fi (t), already at N = 1 for the case of rigid models, where the transformed forces F are constant. It should also be precise for flexible molecules in the absence of torsion movements, since the magnitude of atomic vibrations is small. The simplest way to obtain explicit expressions for the elements of the rotational matrix Sk or S(t) is to first represent them in terms of the four components quaternion q = {χ , η, ξ, ζ } as112 ⎞ ⎛ 2 2 2 2(−χ ζ +ηξ ) 2(χ ξ +ηζ ) χ +η −ξ −ζ 2 ⎟ ⎜ 2(−χ η+ξ ζ ) ⎠, χ 2 +ξ 2 −η2 −ζ 2 S =⎝ 2(χ ζ +ηξ ) 2(−χ ξ +ηζ ) 2(χ η+ξ ζ ) χ 2 +ζ 2 −η2 −ξ 2 (24) where q2 ≡ q+ q = χ 2 + η2 + ξ 2 + ζ 2 = 1. Note that in a more familiar description, the quaternion can be cast in the form q = {cos(φ/2), n sin(φ/2)} and so according to Eq. (24) the matrix S rotates the molecule by angle φ along the unit vector n passing through the center of mass. Inserting now the quaternion form (24) into the rotational superposition equation (15) or (16) yields in the presence of the constraint q+ q = 1 that 1 + 1 (25) q q − (q+ q − 1) = min, 2 2 M where  is the Lagrange multiplier,  = M1 i=1 i , and ⎛  ⎞ (ri − ri,1 )2 2(ri ×ri,1 )+ ⎠ i = ⎝ + 2(ri ×ri,1 ) 1(ri + ri,1 )2 − 2(ri r+ + r r ) i,1 i i,1 (26)

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-10

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

are the symmetric 4 × 4 matrices with ri equal to either ri,k or ri (t) for the cases Sk or S(t), respectively. In our notations, 1 is the identity 3 × 3 matrix, × designates the vector prod+ uct, while ri r+ i,1 and ri,1 ri are tensors. Finally, differentiating Eq. (25) with respect to all the four components of q leads to the eigenvalue problem q = q.

(27)

Since the right-hand side of Eqs. (15) and (16) are always greater than or equal to zero, the matrix  is positive semidefinite, having four (j = 1, 2, 3, 4) eigenvectors qj and the same number of non-negative associated eigenvalues  j ≥ 0. The latter can be sorted in ascending order, such that  1 is the smallest eigenvalue. It coincides with the global minimum in Eqs. (15), (16), and (25), since for any normalized eigenvectors the following equality takes place: q+ j qj = j . The normalized eigenvector q1 corresponding to the smallest eigenvalue  1 is thus the quaternion describing the optimal transformation by rotational matrix S, Eq. (24). This realizes the advanced force-coordinate extrapolation (23). Note that many other rotational transformations were tried until the optimal one has been found. For instance, we could treat the rotations S as those which always align the principal (or geometrical) axes of inertia of the solute molecule along the axes of the fixed laboratory system of coordinates. However, this results in a singularity whenever any two of the tree moments of inertia become very close between themselves during the conformational dynamics. The reason is that the principal or geometrical axes cannot then be determined uniquely. Moreover, such an approach does not lead to a global minimum in the coordinate deviations (15) and (16), making the scheme not optimal. The rotational superposition method proposed above is free from all these drawbacks and is the only one that provides the best transformation. It is worth pointing out also that the rotational superposition introduced for ASFE looks somewhat similar to the non-Eckart scheme originally derived in Ref. 113 in the context of analyzing complex structures of biomolecules encountered in MD simulations or experiment. Note that there is no unique general approach to separate translational, angular, and internal motion of these molecules. Within the well-known Eckart scheme,114–117 such separation can be unambiguously carried out only for molecular structures with one equilibrium state. It does not work for more complex molecules where two or more local equilibrium states exist. This problem was solved in Ref. 113 by exploiting Gauss’ principle of least constraint and minimizing the squared norm

2 M = min, where mi is the i=1 mi δS(t, δt)ri (t + δt) − ri (t) mass of the ith atom and δt is the time step. For instance, in the limit of δt → 0, this allows one to uniquely define the angular velocity (t) = n(t) limδt→0 δφ/δt of an arbitrary molecule at any time t, where δS(t, δt) ≡ S(q(t)) with q(t) = {cos(δφ/2), n(t) sin(δφ/2)}. The standard Eckart method appears as a particular case of the non-Eckart approach when the number of local equilibrium states is equal to one. Moreover, in the absence of internal degrees of freedom, the nonEckart angular velocity coincides exactly with that following from the well-known definition for rigid bodies.

Our non-Eckart like superposition approach differs in several aspects from the original non-Eckart scheme.113 First, it is expanded to a discrete set with arbitrary number N of outer coordinate points. Second, no mass weighting is applied when building the coordinate residuals for the minimization. Finally, our new scheme is aimed at optimizing performance of MD simulations rather than analyzing data from experiment or computer modeling.113, 118–120

2. Extending the basis set and selecting the best subset

Evidently, the accuracy of effective solvation force extrapolation should improve with the number N of basis pairs. However, this increases the number of linear equations, too, which have to be solved frequently, namely, h/δt  1 times at each inner point inside the outer interval h  δt to minimize the coordinate residuals, Eq. (19). As a result, at large N  1 the computational overhead can become unacceptably high, reducing the efficiency of MTS-MD/3D-RISM-KH simulation. The way to remedy this situation constitutes the second major idea of the new ASFE approach. We can extend the basis set from a relatively small number of N  100 to a larger value N  N by collecting the force-coordinate pairs during a broad time interval H = N h  Nh. Then the squared distances R2k (t) =

M 1  (Ri (t) − Ri,k )2 M i=1

(28)

in the 3M-dimensional space between the current inner transformed position Ri (t) and the transformed basis outer coordinates Ri,k can be readily calculated for each k = 1, 2, . . . , N . These distances can now be sorted in an ascending order, and the first N points with k = 1, 2, . . . , N can be selected among the extended set to satisfy the condition R1 < R2 < · · · RN . Forces have to be sorted simultaneously with coordinates to form the subset of best pairs. The latter should then be used in the non-Eckart-like extrapolation scheme described in Subsection III C 1. The selection procedure can further significantly improve the quality of the extrapolation, especially for N  N. The reason is that the choice of outer pairs closest to the current inner point in the transformed space additionally reduces the coordinate region in which the extrapolation is performed. This decreases the coordinate residuals and as a consequence increases the accuracy. In fact, such additional reduction minimizes the change in solvation forces due to torsion movements of the solute molecule. Note that such movements characterized with large amplitude are responsible for transitions of the solute molecule from one to another equilibrium conformational state in which the torsion potential has local minima. Thus, an optimal value for the expanded time interval H = N h should be larger than the maximum among all the mean life times in different equilibrium states. Whenever a transition to another conformation occurs, we can quickly reselect the subset to fit the basis outer points to the current

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-11

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

configuration. The accuracy of such fitting is especially high if the molecule had been in this conformation previously. It is worth emphasizing that the use of the abovedescribed extension and selection allows us to achieve almost the same quality of extrapolation at N N as at N = N . As distinct, the selection procedure at N N requires only little extra computational effort, even for quite large N of order of several thousands. Indeed, both the 4 × 4 eigenvalue problem involved in the transformation and the selection procedure scale linearly with N (with a small prefactor). On the other hand, the overhead on the least-square problem scales as N3 and becomes substantial for N = N  1.

The dual (virtual coordinate and actual force) extrapolation used guesses that lowering the coordinate residuals should immediately lead to a decrease in the deviations between the approximated and original forces. This is not always the case, especially when N is relatively large and approaches the number of internal degrees of freedom of the solute molecule. Then the least-square solvers to Eq. (19) will try to reduce the coordinate residuals to the global (zeroth) minimum with no regard for the values of weighting coefficients Ak which are exploited in both the coordinate and force extrapolations. As a result, a lot of these weights may assume large negative and positive values, despite the presence of the linear normalizing condition Ak = 1 .

(29)

k=1

It is well known from the general theory of extrapolative and quadrature formulas that the presence of weights with large magnitude appreciably increases the uncertainties and decreases the stability range, making extrapolation outside this range useless. The third idea of ASFE is to effectively balance the two kinds of extrapolation by imposing the additional (nonlinear) constraint N 

A2k = min

(30)

k=1

which should minimize the squared norm of the weight coefficients in the dual extrapolation. In view of Eq. (30), the least-square problem (19) is modified to N  2 M N   1  Ak Ri,k + 2 Ak − 1 Ri − M i=1 k=1 k=1 + ε2

N 

A2k = min,

(31)

k=1

where ε ≥ 0 is the balancing parameter. Of course, ε2 cannot be chosen too large because then the main effort is directed to minimize the squared norm (30) rather than the coordinate residuals. This parameter should be treated as a small quantity 2



Gε11

⎜G ⎜ 21 ⎜ ⎜ .. ⎜ . ⎜ ⎜ ⎝ GN1 1

G12

...

G1N

Gε22 .. .

... .. .

G2N .. .

GN,2

...

GεNN

1

...

1

1

⎞⎛

A1





G1



⎜ ⎟ ⎜ ⎟ 1⎟ ⎟ ⎜ A2 ⎟ ⎜ G2 ⎟ ⎟ ⎜ ⎟ ⎟⎜ .. ⎟ ⎜ .. ⎟ = ⎜ .. ⎟ ⎜ ⎜ ⎟ ⎟ . ⎟⎜ . ⎟ ⎜ . ⎟ ⎟ ⎟ ⎜ ⎟ ⎟⎜ 1 ⎠ ⎝ AN ⎠ ⎝ GN ⎠ 0



1

(32) for the same number of unknowns Ak at k = 1, 2, . . . , N and , where Gεkk = Gkk + ε2 with

3. Balancing the equations to minimize the weights norm

N 

aiming at improving quality of the force extrapolation. Optimal values of ε2 can be chosen for the best accuracy of actual simulations. Differentiating Eq. (31) with respect to Ak and leads to the set of N + 1 linear equations,

Gkl =

M 1  Ri,k ·Ri,l , M i=1

Gk =

M 1  Ri,k ·Ri , M i=1

(33)

and l = 1, 2, . . . , N. Note that the (N + 1) × (N + 1) square matrix in Eq. (32) remains symmetrical, since the ε2 correction concerns only N first diagonal elements. In such a way, performing the non-Eckart rotational transformations (21) and (22) by solving the eigenvalue problem (27) for the extended set with N outer pairs and selecting the first nearest N N points to the current inner transformed coordinate (28), the desired solvation forces can be extrapolated as the weighted sum of their N previous outer transformed values, followed by the inverse transformation (23). The weights are found from the balanced least-square problem (31) by solving the set of N + 1 linear equations (32). This extrapolation procedure is applied h/δt times to achieve the next outer point. At that point, the solvation forces are calculated explicitly by converging the 3D-RISM-KH integral equations. The extended N -set is then updated by the new outer force-coordinate pair, while the oldest one is discarded. All these actions are repeated H/h times for the next outer intervals until the desired simulation time length H is achieved. This completes derivation of the ASFE approach. IV. INCORPORATING ASFE INTO MTS-MD/3D-RISM-KH SIMULATION IN THE CANONICAL OIN ENSEMBLE

The equations of motion for atoms of the solute molecule in hybrid MTS-MD/3D-RISM-KH simulation in the canonical-isokinetic OIN ensemble with solvation force extrapolation can be compactly cast as97 d = L(t) . dt

(34)

Here  = {r, v; ς , w} is the extended phase space and L is the Liouville operator. The extended space, apart from the full set of coordinates r ≡ {ri } and velocities v ≡ {vi } of all atoms of the solute molecule, includes also all thermostat frequencies w ≡ {wl,i } and their conjugated dynamical latter are introduced by means of variables ς ≡ {ςi }. The  2 w2,i − Ll=2 wl,i ), where L is the number of dςi /dt = (τi2 w1,i

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-12

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

chains per thermostat. The Liouvillian can be split as L=

M 

(Ai + Bi + Cv,w,i + Cw,i + Cς,i )

(35)

i=1

into the kinetic term Ai = vi ·∂/∂ri , the potential part  fi vi · f i ∂ vi · f i ∂ − vi − w1,i , Bi = · mi 2Ti ∂vi 2Ti ∂w1,i and the chain-thermostat one Cv,w,i =

2 τi2 w1,i

4

∂ w2,i vi · + ∂vi



2 τi2 w1,i

4

 − 1 w1,i w2,i

(36)

∂ , ∂w1,i (37)

where Cw,i

 L   ∂ 1 2 wl−1,i − 2 − wl+1,i wl,i = ∂w τ l,i i l=2

and

Cς,i = −

2 τi2 w1,i w2,i



L  l=2

 wl,i

∂ . ∂ςi

(38)

(39)

Note that in the canonical OIN ensemble97 each atom is coupled with its own thermostat by imposing the constraint Ti = 3kB T/2, where 2 mi v2i 3kB T τi2 w1,i + (40) 2 4 2 is the full kinetic energy of the ith subsystem. The quantity τ i is related to the relaxation time, determining the strength of coupling of atom i with its thermostat. The total forces f i = fi(f) + fi are now divided into the fast solute-solute part fi(f) and slow 3D-RISM-KH solutesolvent component fi . In view of Eq. (36), this results in the corresponding splitting of the potential operator as Bi ({ f i }) = Bi ({fi(f) }) + Bi ({fi }) ≡ Bf + Bs . Note that the solute atom-atom interactions could be separated further into the strong bonded and weaker non-bonded forces. (To simplify notations, we do not do that here.) Remember also that the solute-solute forces fi(f) are always calculated directly, while the 3D-RISM-KH solute-solvent ones fi are either evaluated explicitly [Eq. (5)] or approximated with ˜fi [Eq. (23)] as described in Sec. III. Then Bs ({fi }) transforms to Bs ({˜fi }) ≡ B˜ s . Thus, using the MTS decomposition method,96, 98, 99, 117 the solution (h) = eLh (0) to Eq. (34) over the outer time interval h from the initial state (0) can be presented97 as the following product of exponential operators:

Ti =

(h) =

n 

δt

(n ) δt 2

eC 2 eBfs

(n ) δt 2

eAδt eBfs

δt

eC 2 (0) + O(δt 2 ) . (41)

n =1

Here, n = h/δt  1 is the total number of inner steps of time length δt each, C = Cv,w + Cw + Cς , ⎧ δt t ⎪ eBf 2 eBs 2 , only once per h when n = 1 ⎪ ⎨  (n ) δt t δt eBfs 2 = eBf 2 eB˜ s 2 , every t/δt inner step n ⎪ ⎪ ⎩ Bf δt2 e , for all other n (42)

is the generalized velocity propagator, t is the intermediate (δt < t < h) time step, O(δt 2 ) sets the decomposition accuracy, and the subscript i is omitted for simplicity. Note that before changing the coordinates of all particles ri by eAδt we should first update the complete set of velocities vi and frequencies wl,i by eCδt/2 and eBδt/2 for all atoms i = 1, 2, . . . , M and thermostat chains l = 1, . . . , L. A very nice feature of the OIN decomposition is that the action of all the single-exponential operators arising in Eqs. (41) and (42) on  can be handled analytically using elementary functions.97 Therefore, the propagation (t) = [(h)]t/ h of dynamical variables from their initial values (0) to arbitrary future time t can be performed by consecutively applying single exponential transformations of an instantaneous phase space point  in the order defined by Eqs. (41) and (42). As can be seen, the fastest Bf -component of motion is integrated most frequently, namely, n = h/δt times per outer interval h with the smallest (inner) time step δt, while the slow 3D-RISM-KH forces (both original and approximated) are applied impulsively only every t/δt inner step, that is h/t < n times. Note that almost all these impulses (when t h) are obtained by employing the approximated 3D-RISM-KH forces [Eq. (23)] in terms of the operator B˜ s , while the explicit 3D-RISM-KH calculations via Bs [Eq. (5)] are used only once per outer time interval h. Taking into account that solute-solute forces are much cheaper to evaluate than effective solvation ones, the obvious benefits in speedup is guaranteed, compared to single time step propagation (n = 1, δt = h) without extrapolation (t = h). Moreover, the introduction of (impulsive) intermediate time step t > δt gives a possibility to reduce the number of both extrapolative and direct 3D-RISM-KH effective solvation force evaluations from h/δt to h/t. Finally, applying the ASFE approach allows one to further improve the overall efficiency significantly, since the most expensive explicit 3D-RISM-KH recalculations are performed just once per outer step h. It is worth emphasizing that the presence of the OIN thermostatting terms in B and C [see Eqs. (36)–(39)] eliminates MTS resonance instabilities. Note that the latter appear in conventional MD simulations (both in the microcanonical and canonical ensembles) already at relatively small outer time steps (see Sec. I). In hybrid MTS-MD/3D-RISM-KH simulation, an additional source of instability can be uncertainties of effective solvation force extrapolation. The canonicalisokinetic OIN propagator (41) efficiently eliminates both the types of instability by imposing the individual isokinetic constraint (40) on each atom of the solute molecule. Moreover, compared to the previous schemes, applying the accurate ASFE approach enables considerable increase of intermediate and especially outer time steps. In view of the results obtained in Secs. III and IV, we can state that the following hierarchy of time steps, δt t h N h N  h = H ,

(43)

should take place in order to achieve the optimal performance during the MTS-MD/OIN/3D-RISM-KH simulations with extrapolation of effective solvation forces. We thus have five time scales. This completes incorporation of the ASFE approach into a combined method which will be referred to as

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-13

I. Omelyan and A. Kovalenko

MTS-MD/OIN/ASFE/3D-RISM-KH, or just OIN/ASFE/3DRISM-KH. Strictly speaking, the particle dynamics generated in OIN/ASFE/3D-RISM-KH simulation differs from conventional MD. In particular, such a quasi-dynamics does not obey the Maxwell velocity distribution and thus, unlike microcanonical MD, cannot get real-time correlation functions. However, as has been rigorously proven in our previous paper,97 the configurational part of the extended partition function obtained in the extrapolative OIN/ASFE/3D-RISMKH description does correspond to the true canonical distribution of the physical system at temperature T in coordinate space. This is a very important feature which allows us to readily reproduce the correct conformational properties, including 3D spatial distribution functions of solvent interaction sites around the biomolecule. Such a quasi-dynamical sampling appears to be much more efficient than “real-time” (microcanonical or canonical) simulations. V. APPLICATION OF MTS-MD/OIN/ASFE/3D-RISM-KH TO HYDRATED ALANINE DIPEPTIDE A. Numerical details

The OIN/ASFE/3D-RISM-KH approach will now be examined in actual simulations. The system considered was a fully flexible model of alanine dipeptide (M = 22 interaction sites) immersed in water. The chemical structure and geometry of the alanine dipeptide molecule is shown in Fig. 1. Note that alanine represents one of the 20 proteinogenic amino acids, and the methyl group coming from alanine constitutes the simplest case of a side chain which has the largest characteristic life time in “flip-flop” conformations of about 10 ns.121 The Amber 03 force field was used to describe the solute-solute interactions.122 Water was represented by the modified cSPC/E potentials.55, 91, 92 The cutoff radius of the solute-solvent interactions was set to rc = 14 Å. To improve the performance, we used an adaptive solvation box with solvation buffer length rb = 10 Å rather than the box of fixed size 32 Å × 32 Å × 32 Å. The geometric center of the adaptive box was always brought to the center of mass of the solute molecule. The 3D grid resolution was set to δr = 0.5 Å. The 3D-RISM-KH integral equations were converged each time to a root mean square tolerance of δ = 10−4

FIG. 1. Schematic structure and geometry of the alanine dipeptide molecule with its two backbone dihedral angles  and . The total number of atoms is M = 22.

J. Chem. Phys. 139, 244106 (2013)

in 60–80 iterations by using the MDIIS accelerated numerical solver.52–55, 104 Further increase of the cutoff rc and the solvation buffer size rb , and refinement of the grid resolution δr and the convergence tolerance δ did not affect the results noticeably. The inner and intermediate time steps were always chosen to be the same, δt = 1 fs and t = 8 fs, respectively. To investigate the conformational properties (Sec. V C), ten main OIN/ASFE/3D-RISM-KH runs were carried out with ten different values of the outer time step: h = 24; 96; 200; 400 fs; and 1; 2; 4; 8; 16; 32 ps. In these simulations, the number of points from the basis and extended sets in the advanced solvation force extrapolation was equal to N = 32 and N = 4000. Many auxiliary runs with various numbers 1 ≤ N ≤ 500 and N ≤ N ≤ 4000 were performed as well to obtain the whole pattern for the extrapolation accuracy (Sec. V B). For comparison, the previous extrapolation scheme (with normalization but without transformation) was used, too. We employed three chains (L = 3) per atom, and the correlation times of all the OIN thermostats were set to the same value τ i ≡ τ = 100 fs. Note that the inverse of τ can be associated to some extent with the friction viscosity γ inherent in the Langevin dynamics, that is γ ∼ 1/τ . Therefore, τ = 100 fs corresponds to a sufficiently small value of γ = 10 ps−1 . To implement the OIN/ASFE/3D-RISM-KH approach in the compiled module SANDER.RISM.MPI for parallel runs, six source code files from the Amber 11 package123 were modified: SANDER.F, RUNMD.F, MDREAD.F, MD.H, AMBER_RISM_INTERFACE.F, and FCE_C.F. In the first two Fortran files, solving the canonical-isokinetic OIN equations of motion was implemented by adapting the velocity-Verletlike version of the decomposition integration [Eq. (41)] to its leapfrog-like counterpart. (The differences between the standard velocity-Verlet and leapfrog schemes can be found, for instance, in Refs. 112 and 124.) The third and fourth files were altered to organize the input/output for new parameters and observable quantities. The fifth and sixth files were rewritten to implement the ASFE procedure. For comparison, conventional canonical MD simulations of a semi-flexible alanine dipeptide molecule immersed in explicit SPC/E water125 were also carried out using the Langevin (LN) thermostat126 (and SHAKE127, 128 to fix all bonds involving hydrogens). The conventional MD dealt with 1263 water molecules, a basis time step of δt = t = 2 fs, and a friction viscosity of γ = 1 ps−1 . The LN/3D-RISM-KH dynamics of the fully flexible alanine dipeptide in cSPC/E water solvent92 at γ = 5 ps−1 , δt = 1 fs (no SHAKE) and h = t = 4 fs (no extrapolation) was carried out, too, to produce “exact” results (see Sec. V C). These two MD runs done in the original Amber 11 package will be referred to as LN with explicit solvent (LNE) and LN/3D-RISM-KH without solvation force extrapolation. For comparison, the LN/SFCE/3D-RISM-KH simulations employing the previous procedure of the solvation force-coordinate extrapolation (SFCE) at N = 32, δt = 1 fs, γ = 20 ps−1 , and h = 96 fs were carried out as well. For the same reason, SFCE/3D-RISM-KH was used also with the INR algorithm43, 44 at h ≤ 1000 fs for L = 3 and τ = 2 fs. All the runs were started from the same, well equilibrated initial configuration and then carried out at temperature

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-14

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

T = 300 K and solvent density ρ = 0.99705 g/cm3 . The total duration of the MD simulations was equal to 21 and 51 ns for the LN/3D-RISM-KH and LNE schemes, respectively, and up to 100 ns for the INR/SFCE/3D-RISM-KH and OIN/ASFE/3D-RISM-KH approaches. Extremely long OIN/ASFE/3D-RISM-KH simulations of 1 μs length were performed as well. B. Estimation of the extrapolation accuracy

The accuracy of the extrapolation was estimated by measuring the relative root mean square deviations 1/2  M ˜ 2 ( f − f ) i i i=1 1 (44) =  1/2 2 M 2 i=1 fi of the extrapolated effective solvation forces ˜fi from their original values fi calculated directly by converging the 3DRISM-KH integral equations at each outer time step, where . . . denotes statistical averaging along the whole simulation duration. Note that the deviations for subsequent outer time steps increase from zero (when the inner coordinates coincide with those of the first basis point) to maximal values at the end of the current outer time interval, and so introducing the factor 1/2 in Eq. (44) gives mean values. Note also that such an estimation scheme does not require any extra computational efforts, since it involves outer forces which are already known when updating the whole pair set. Fig. 2 makes a comparison between the relative root mean square deviation  observed in course of the OIN/3DRISM-KH simulations using three different extrapolation approaches: the previous SFCE scheme with normalization (Secs. III A and III B) as well as our new ASFE method

FIG. 2. Uncertainty  in course of the OIN/3D-RIM-KH simulations of hydrated alanine dipeptide with three effective solvation force extrapolation approaches: SFCE, ASFE with balancing, and ASFE without balancing (green, red, and blue curves, respectively).  against the number N of the basis points at two typical outer time steps h = 1 and 8 ps for N = N [parts (a) and (b)] and for N = 4000 [parts (c) and (d), respectively].

FIG. 3. Uncertainty  of the OIN/ASFE/3D-RISM-KH simulations using the ASFE approach with balancing versus the number of points N in the extended basis set, observed at two typical outer time steps h = 1 and 8 ps [parts (a) and (b), respectively] and different initial set length N ≤ N , namely, N = 1; 10; 20; 30; 100 (cyan, blue, green, red, magenta curves, respectively). The upper black curves correspond to the constant force approximation.

(Sec. III C) without balancing (ε2 = 0) and with balancing (ε2 = 0.005 Å2 ). The uncertainty  is plotted against the number of extrapolation basis points N at most characteristic outer time steps h = 1 and 8 ps. In order to demonstrate the importance of the extension-selection technique, we present the results for the initial (N = N) and expanded (N = 4000  N) basis sets (left- and right-hand parts of Fig. 2, respectively). The levels of N = 32 and (N = 32) are marked with dashed vertical and horizontal lines. Fig. 3 shows a more detailed dependence of  on N ≥ N at different fixed N for the ASFE approach (with ε2 = 0.005 Å2 ) at two outer time steps h = 1 and 8 ps. Similar patterns are observed for other values of h. Fig. 4 plots the dependences of (h) on outer time step size in the whole range of h ∈ [12 fs, 32 ps] for the standard SFCE scheme as well as for the new ASFE approach (with ε2 = 0.005 Å2 ) at N = 32 for both N = N and N = 4000. For comparison, the result of the constant force extrapolation (CFE), that is SFCE at N = N = 1, is included as well. Note that both axes in Figs. 2–4 are in logarithmic scales for convenience of presentation. As is evident from Figs. 2–4, the new ASFE approach is superior to the previous SFCE scheme. Indeed, ASFE provides a much better accuracy  for all values of N, N , and h, especially at large outer time steps. Taking into account that conformational properties are very sensitive to any uncertainties (in particular, in solute and solvent models129, 130 ) and thus to inaccuracy of the extrapolation, a severe criterion of   5% must be chosen to accurately reproduce

FIG. 4. Relative deviation  of extrapolated solvation forces versus the outer time step h in the OIN/3D-RISM-KH simulations using the previous SFCE (green) and new ASFE (red) approaches at N = 32 with basis extension (N = 4000, solid curves) and without basis extension (N = N, dashed curves). The upper black solid curve corresponds to the constant force approximation (N = N = 1).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-15

I. Omelyan and A. Kovalenko

observable quantities. (The level of  = 5% is marked in Figs. 3 and 4 with horizontal dashed lines.) Within the previous scheme, this criterion can be satisfied only at small outer steps of h  12 fs (look at the intersection of the dashed green curve with the horizontal dashed line in Fig. 4). For larger h, the original SFCE scheme reproduces the solvation forces too inaccurately (Fig. 4), even with the maximal numbers N = 500 and N = 4000 considered (see Figs. 2 and 3). At the same time, the new ASFE approach using the balancing and extension with selection succeeds in satisfying the condition   5% already at N ∼ 32 up to huge outer steps of h ∼ 32 ps! The results in Fig. 2 allow us to state that the balancing is necessary to minimize the uncertainties at intermediate values of N. Without it, after achieving a minimum for  at N ∼ 32 the previous and advanced schemes suddenly exhibit a resonance-like behavior with a maxima when the number N of basis points is approximately equal to the number Nf of the degrees of freedom of the solute molecule (Nf = 3M = 66 for alanine dipeptide). As mentioned in Subsection III C 3, such a resonance is explained by an uncontrolled increase of the norm of the extrapolating coefficients at N ∼ Nf . The balancing technique (ε2 = 0) efficiently eliminates the maxima in  by imposing the minimum norm constraint in the normal equation method, and so in this case the function  decreases monotonously with increasing N (and N ). Note that at ε2 = 0, we employed the QR-factorization method to solve the least-square problem. Unlike the normal equations, the QR-factorization starts minimizing the norm of solutions only when N > Nf , and at large N ∼ 500 both the approaches lead to nearly the same results. However, the preference is given to the normal equations with balancing, since such large N ∼ 500 requires too much computational effort. Therefore, an optimal value N = 32 can be recommended in the advanced extrapolation method when solving the system of balanced linear equations in actual OIN/ASFE/3D-RISMKH simulations. We see from Figs. 2–4 that the basis extension (N > N with further selection of the best N basis points) constitutes another important component of the ASFE technique. Indeed, the uncertainty  decreases significantly with increase of N from N to 4000 at any values of N ≤ 500 considered. Choosing N = 4000 allows us to provide a high extrapolation accuracy   5% already at a relatively small number N = 32 not only for small h ∼ 20 fs but also for large h ∼ 1 ps and even huge h ∼ 8 − 32 ps (see the intersection of the solid red curve with the horizontal dashed line in Fig. 4). For example, at h = 8 ps, while the previous SFCE approach (with N = N = 32) and the constant-force approximation (N = N = 1) yield an unacceptably large inaccuracy  ∼ 40% and 60%, respectively, the advanced extrapolation at N = 32 and N = 4000 still provides a high precision of  ∼ 5%, thus decreasing the uncertainties more than ten-fold with practically no additional computational costs. At intermediate h  100 fs, we can use smaller N < 4000 to obtain the same  ∼ 5%. Note that all curves in Fig. 4 should start from the same point (0, 0) because by definition  = 0 if h = 0. However, for h > 0 their behavior is quite different. A nice property of the advanced extrapolation is that the uncertainty 

J. Chem. Phys. 139, 244106 (2013)

increases slightly with the outer time step for h  100 fs but does not change noticeably at larger h. This is contrary to the previous approach in which  increases in the whole range of varying h. That can be explained by the fact that at N  N (when N h ≡ H exceeds the maximum among all the mean life times in different local equilibrium states) the extended basis set, after passing some initial simulation length t  H, already contains all characteristic points in the multidimensional force-coordinate space. Whenever a rapid transition to another local equilibrium conformation occurs thus introducing major uncertainties in the extrapolation, the selection procedure quickly refits the adaptive basis set (with N points) to the current configuration. Moreover, at t  H the accuracy of such fitting is almost independent on h H. It is worth noting also that at the very beginning of simulation, the 3D-RISM-KH integral equations are solved after each t of the N first intermediate steps with no extrapolation to fill out the basis set. Then, the extrapolation starts with N points and the extended set gets completed subsequently step by step in the integration process. Since h can be much larger than t, we cannot put the outer step to be immediately equal to h  t. The reason is that the extrapolation then skews because of the significant non-uniformity of the time intervals between the points from the set. The ASFE approach remedies this issue in such a way that the outer time interval is smoothly increased after each intermediate step from t to h with an increment of t.

C. Evaluation of conformational properties

Our accuracy estimates in Sec. V B will now be confirmed with actual OIN/ASFE/3D-RISM-KH simulations of the conformational properties of hydrated alanine dipeptide. In particular, we study the dipole moment distribution function Q(p) and the Ramachandran free energy plot E = −kB Tln [ϒ(, )].Here, p = |p| is the magnitude of the dipole moment p = M i=1 qi ri of the solute molecule with on atoms i (satisfying the electro-neutrality partial charges q  i condition i qi = 0), and  and  are the torsion angles (see Fig. 1). Such angles (or order parameters for amino acid residues) are of key importance when describing a configurational behavior of alanine dipeptide as well as large proteins. The one- and two-dimensional distribution functions Q(p) and ϒ(, )

are assumed to be normalized, that is dp Q(p) = 1 and dd ϒ(, ) = 1. They represent the probability of the system remaining in a state with dipole moment p, or with order parameters  and . Such a probability is very sensitive to a choice of solute and solvent models,129, 130 as well as to any uncertainties in force evaluation. Thus, a comparison of Q(p) and E(, ) with their “exact” counterparts is a good idea when testing any new approaches. The “exact” (or rather “expected”) values of Q(p) and E(, ) were obtained using the LN run (mentioned in Sec. V A) with small time steps δt = 1 fs and t = 4 fs, a small friction coefficient γ = 5 fs, and no extrapolation (h = t) to minimize the impact of all possible uncertainties on the results.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-16

I. Omelyan and A. Kovalenko

FIG. 5. The instantaneous values of the dipole moment of the alanine dipeptide molecule in aqueous solution, obtained in the MTS-MD simulations using the LNE, LN/3D-RISM-KH, INR/SFCE/3D-RISM-KH, and OIN/ASFE/3D-RISM-KH approaches [parts (a)–(d), respectively].

Figs. 5(a)–5(d) show the instantaneous values p of the dipole moment obtained for the hydrated alanine dipeptide molecule in the conventional LNE simulation as well as in the combined MTS-MD/3D-RISM-KH approaches using the LN, INR, and OIN thermostats versus the observation time t during the first 10 ns (so as to make the changes in p with varying t visible). The INR data for h = 1 ps and τ = 2 fs were taken from Ref. 97. A comparison of the time evolution of the dipole moment p(t) in the LNE approach with that in the LN scheme (h = 4 fs, no extrapolation, γ = 5 ps−1 ) clearly shows that despite the use of larger γ the combined MTS-MD/3D-RISMKH simulations significantly increase (by a factor of 5) the frequency of transitions between conformational states with different values of the mean dipole moment of the hydrated alanine dipeptide molecule. Applying the non-optimal INR thermostat [Fig. 5(c)] slows this frequency down almost to the same level as in conventional MD. As is mentioned in the Introduction, this drag of INR is caused by the too strong coupling with heat baths (each degree of freedom is constrained). On the other hand, no such drag is observed in the optimal OIN ensemble [see Fig. 5(d), where the results relate to h = 1 ps]. Indeed, the transition frequency in OIN/ASFE/3D-RISMKH at τ = 100 fs practically coincides with that in LN/3DRISM-KH at γ = 5 ps−1 . Although smaller τ and larger γ are acceptable as well, it is not recommended to radically change them since this can make the energy exchange between the system and thermostats too intense and thus decrease the transition frequency. The above frequency acceleration is explained by the fact that during the MTS-MD/3D-RISM-KH evolution of the solute molecule steered by effective solvation forces, the solvent always remains quasi-equilibrated. In other words, solvent reequilibration processes caused by conformational changes of the solute molecule in explicit MD are merely contracted with 3D-RISM-KH statistical-mechanical averaging. As a conse-

J. Chem. Phys. 139, 244106 (2013)

quence, the MTS-MD/3D-RISM-KH quasi-dynamics implicitly speeds up the simulation due to a much faster convergence with observation time. This substantially decreases the simulation length necessary to achieve the required accuracy of statistical sampling. The decrease becomes dramatic for complex biomolecular systems with slow and rare solvation events such as exchange and localization of solvent molecules and ions in function-related confined spaces of biomolecules, ligand binding, and molecular recognition. In conventional MTS-MD the solvent enters such pockets or inner cavities of the protein through its conformational changes. This is a very slow process with rare statistics which is as difficult to model explicitly as protein folding. As distinct, the 3D-RISMKH molecular theory of solvation yields the solvent distribution in the protein cavity at once for its “final” conformation, bypassing the “transition” states. Such solvent distribution is statistically mechanically averaged over exchange of solvent molecules and/or ligands (which can be a part of solvent) between the protein cavity and bulk solvent, and is accounted for in 3D-RISM-KH by satisfying the condition of chemical balance in the system analytically. Thus, the OIN/ASFE/3D-RISM-KH approach can efficiently sample phase space for essential solvation and binding events with rare statistics. This is very distinct from conventional MTS-MD with explicit solvent which requires enormous computational time and number of steps in such cases. The use of the 3D-RISM-KH molecular theory of solvation dramatically enhances phase space sampling and statistical convergence, and in this context somewhat resembles such techniques as umbrella sampling, weighted histogram, replica exchange, parallel tempering, and other biasing methods that were designed to alleviate convergence problems of conventional MD and/or Monte Carlo simulations (see, for instance, Ref. 10 and references therein).

Figs. 6(a)–6(d) present the values p = pdp Q(p) of the dipole moment averaged over all conformations of the hydrated alanine dipeptide molecule as a function of the simulation duration for the MTS-MD/3D-RISM-KH with OIN/ASFE and INR/SFCE extrapolation schemes as well as for LN/3D-RISM-KH, versus the LNE approach. It is evident from Fig. 6(a) that for the LNE simulations the integration length t = 51 ns is rather insufficient to obtain reliable results. Indeed, the oscillations of p are significant here even at the end of the MD run. This is in agreement with the fact that for hydrated alanine dipeptide there are several transitions between different conformational states characterized by the mean life time varying from 30 ps, through 250 ps, and up to 10 ns.121 Thus, even for such a relatively simple organic molecule as alanine dipeptide, a long observation time of at least a few hundreds nanoseconds is necessary to suppress statistical noise in explicit solvent MD simulations. On the other hand, the noise oscillations decay much more rapidly in the MTS-MD/3D-RISM-KH simulations due to the quasi-dynamics acceleration described above. Indeed, the blue, red, green, and cyan curves in Figs. 6(b)–6(d) show that we can talk about trusted results for LN/3DRISM-KH and OIN/ASFE/3D-RISM-KH already at a simulation duration of order of 20 ns [the “expected” limiting value limt → ∞ p (t) is marked in Figs. 6(b)–6(d) with the

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-17

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

FIG. 7. Dipole moment distribution of alanine dipeptide in aqueous solution, obtained in MTS-MD/3D-RISM-KH simulations with the LN/SFCE, OIN/CFE, OIN/SFCE, and INR/SFCE integrators at the outer time steps h = 96; 200; 400; 1000 fs, respectively, versus the “expected” result from LN/3D-RISM-KH at h = 4 fs.

FIG. 6. (a)–(d) The averaged value of the dipole moment of the hydrated alanine dipeptide molecule versus the duration of the MD simulations with the LNE, LN/3D-RISM-KH, LN/SFCE/3D-RISM-KH, INR/SFCE/3D-RISMKH, and OIN/ASFE/3D-RISM-KH integrators at different parameters and outer time steps (see the text for details).

horizontal dashed blue line]. The convergence becomes worse for the INR/SFCE/3D-RISM-KH integrator requiring at least 100 ns for correct statistical sampling [magenta in Fig. 6(c)], just as in conventional MD. Note that in Fig. 6(d), except the LN/SFCE/3D-RISM-KH curve at h = 96 fs, we included the OIN/ASFE/3D-RISM-KH results for the very large outer time steps h = 2, 4, and 8 ps. In all the cases, a good stability of the simulations ia observed. This is indicative of a proper choice for the thermostatting correlation time τ = 100 fs in the OIN ensemble and for the friction coefficient γ = 20 ps−1 in the Langevin dynamics. The choice N = 4000 can also be considered as optimal. The reason is that then, for instance, at h = 2 ps the extended time interval H = N h = 8 ns appears to be comparable with the longest mean life time of 10 ns in the conformational dynamics of hydrated alanine dipeptide.121 Thus, further increase of N does not really improve the accuracy of the extrapolation. Figs. 7 and 8 show the dipole moment distribution function Q(p) of the alanine dipeptide molecule in cSPC/E water obtained in the MTS-MD/3D-RISM-KH simulations with different ensembles, extrapolation schemes, and outer time steps. For comparison, the “exact” values calculated with the LN/3D-RISM-KH algorithm (at h = t = 4 fs, that is, without extrapolation) are presented as well. In the “exact” distribution function, we can point out two clear peaks at p ≈ 3 D and ≈7.5 D as well as an enhancement in the intermediate region p ≈ 5.5 D, corresponding to different conformational states of the solute molecule [see also Figs. 9 and 10]. Note that a similar behavior of Q(p) was theoretically observed earlier92, 129 for various force fields and water solvent models, and was related to experimental (infrared spectroscopy) results for alanine dipeptide in aqueous solution.129 To demonstrate the high sensitivity of Q(p) to uncertainties of the calculations, we first plot the results of the previous

approaches in Fig. 7. These include the LN, INR, and OIN thermostats in the previous extrapolation scheme (SFCE at N = 32), as well as the constant force approximation (CFE, N = 1) in OIN. We see that already at relatively small outer time steps h = 96; 200; 400 fs the LN/SFCE, OIN/SFCE, and especially OIN/CFE curves significantly deviate from the “exact” data. Although INR can produce rather good results, it requires a much longer simulation (by a factor of 5; see the discussion on Figs. 5 and 6 above), which is undesirable. On the other hand, the use of the advanced extrapolation approach in the OIN ensemble leads to the best results which remain of high quality even at extremely large outer time steps. Indeed, it is readily seen in Fig. 8 that the OIN/ASFE/3D-RISM-KH curves related to h = 96 fs (green), 1 ps (magenta), 2 ps (cyan), and 8 ps (red) are practically indistinguishable from the “exact” function (blue). (Some slight uncertainties can be attributed to statistical noise.) Only at h = 32 ps the OIN deviations become visible and achieve a level of order of 10% which is still tolerable. Thus, outer time steps of order of tens of picoseconds are applicable, indeed, being in an excellent agreement with the theoretical predictions made in Sec. V B. Note that the maximal acceptable outer time steps reported earlier were h = 20 fs in the LN/SFCE/3D-RISM-KH approach92 and 200 fs in OIN/SFCE/3D-RISM-KH97 using the previous extrapolation scheme, SFCE. (The value h = 1 ps achieved in the INR/SFCE/3D-RISM-KH approach97 is somewhat artificial

FIG. 8. Dipole moment distribution function of the alanine dipeptide in aqueous solution obtained in the OIN/ASFE/3D-RISM-KH simulations at h = 96 fs; 1; 2; 8; 32 ps. Shown for comparison is the “exact” result from the LN/3D-RISM-KH approach at h = 4 fs.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-18

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

FIG. 9. Ramachandran free energy surface E(, ) of the hydrated alanine dipeptide molecule, obtained in conventional MD simulation with explicit solvent [part (a)], LN/3D-RISM-KH dynamics at h = 4 fs with no force extrapolation [part (b), “exact” result], as well as in the OIN/ASFE/3D-RISM-KH advanced extrapolative approach at the outer time steps h = 1 and 8 ps [parts (c) and (d), respectively]. The histogram channel size for  and  is 3.6◦ . Energy and angle units are kcal/mol and degrees.

because of the slowdown of the simulations by a factor of 5.) Therefore, the canonical-isokinetic OIN ensemble coupled with the 3D-RISM-KH molecular theory of solvation and the advanced extrapolation approach, ASFE, enables us to increase the maximal outer time step allowed for a high accuracy of 5% in equilibrium properties to at least h = 8 ps, that is 40- to 400-fold! The Ramachandran free energy E(, ) obtained for the hydrated alanine dipeptide molecule in four different MD runs is plotted in Fig. 9 using color image representation as a function of the backbone dihedral angles  and . The runs include the conventional MD (at δt = t = h = 2 fs) with explicit solvent, the Langevin dynamics (LN/3D-RISMKH at δt = 1 fs and t = h = 4 fs, no force extrapolation), as well as the advanced extrapolative OIN/ASFE/3D-RISMKH approach at the outer time steps h = 1 and 8 ps [parts (a)–(d), respectively]. As is well known for the hydrated alanine dipeptide,121, 129–140 there exist two major minima in E(, ), namely, α R at ( = −60◦ to −90◦ ,  = −50◦ to −20◦ ) in the α basin and polyproline (PP) II/C5 ( = −60◦

to −170◦ ,  = 120◦ to 170◦ ) in the β basin. They correspond to α-helical and extended β-strand/β-sheet secondary structures, respectively. The latter itself contains two local minima related to the PPII ( = −70◦ ,  = 160◦ ) and C5 ( = −160◦ ,  = 165◦ ) conformations, resulting in three bright yellow areas. We see from Fig. 9 that all these characteristic features are reproduced not only quantitatively but also quantitatively. In particular, the differences between the OIN/ASFE/3DRISM-KH results (at h = 1 and 8 ps) and the “exact” data (LN/3D-RISM-KH, h = 4 fs) are small and not exceeding a few percent [look at parts (b)–(d)]. This confirms once again that huge outer time steps of order of 10 ps can be reliably used with the proposed extrapolative approach of OIN/ASFE/3D-RISM-KH. A comparison of the MTS-MD/3D-RISM-KH results with those of explicit solvent MD simulations is also important, as it gives a possibility of factoring out the accuracy of the 3D-RISM-KH integral equation theory. As is seen, the differences in this case are somewhat larger. For example, the 3D-RISM-KH

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-19

I. Omelyan and A. Kovalenko

J. Chem. Phys. 139, 244106 (2013)

FIG. 10. Mean dipole moment p(, ) [in Debye (D), part (a)] and solvation free energy E(, ) [in kcal/mol, part (b)] of the hydrated alanine dipeptide molecule as functions of torsion angles  and  (degrees), obtained with the OIN/ASFE/3D-RISM-KH approach at h = 8 ps in an extremely long simulation of 1 μs duration. The histogram channel size for  and  is 1.8◦ .

theory slightly underestimates the free energy near the C5 minimum which appears to be too deep compared to that from explicit solvent MD. Also, it overestimates E(, ) in the whole right-sided region (  −45◦ ). However, the uncertainties never exceed 10%, which is quite acceptable. Moreover, this region does not contain major minima and is characterized with low probabilities of states, and so the deviations here might be caused by an insufficient statistics due to the relatively short length of the explicit solvent MD simulation. Note that the one- and two-dimensional dependences Q(p) and ϒ(, ) are closely related to the full M-body distribution function ({r}) ≡ (r1 , r2 , . . . rM ) of the solute macromolecule comprising M interaction sites. Taking into account that the instantaneous dipole moment p({r}) as well as the torsion angles ({r}) and ({r}) depend on the current conformation {r}, we obtain  Q(p) = d{r}({r})δ[p − p({r})],  (45) ϒ(, ) = d{r}({r})δ[ − ({r})]δ[ − ({r})] ,

Figs. 10(a) and 10(b) plot the two-dimensional functions p(, ) and E(, ) obtained with OIN/ASFE/3D-RISMKH at h = 8 ps in an extremely long, 1 μs simulation. The necessity of increasing the observation length from 100 ns to 1 μs is mainly to better describe the minor minima in the conformational states with low probability ϒ (high energy E) by improving statistics [cf. Fig. 10(b) with Fig. 9(d)]. The dependence p(, ) is also informative since it gives access to mean dipole moments in each (, )-state. A comparison of Fig. 10(a) with Fig. 10(b) shows that in the conformational dynamics there are transitions between α R , β-PPII, and β-C5 conformations characterized with high (∼7.5 D), medium (∼5.5 D), and small (∼3D) values of the average dipole moment of the alanine dipeptide molecule in aqueous solution. The maximal free energy barrier E = Emax − Emin can amount to 5.5 kcal/mol, which is almost ten times larger than the kinetic energy scale kB T ≈ 0.6 kcal/mol at T = 300 K. Thus, the transitions probability exp ( − E/kB T) can be quite tiny, which explains the large mean life times up to 10 ns in some conformational states of alanine dipeptide.121

where the integration is carried out over the whole confor

mational space, and the normalization condition d{r}({r}) = 1 has been applied. In view of Eq. (45), the mean dipole moment of the molecule in the state (, ) can be calculated as  p(, ) = d{r}p({r})({r})δ[ − ({r})]δ[ −({r})].

D. Speedup of MTS-MD by OIN/ASFE/3D-RISM-KH

(46)

Remember that the dipole moment values p = pdpQ(p) shown previously in Fig. 6 correspond to averaging over all conformations, and so we now can equivalently rewrite that

p = ddp(, )ϒ(, ).

To cover 100 ns observation length in the combined LN/3D-RISM-KH simulation of hydrated alanine dipeptide at δt = 1 fs and t = 4 fs with no extrapolation (h = t) requires 8 months of real time on a modern high performance workstation with three CPU cores. The wall time decreases to 16 days for conventional MTS-MD simulations with 1263 explicit water molecules using RESPA algorithm at the same δt = 1 fs and t = 4 fs. Taking into account the 5-fold acceleration of solute evolution with 3D-RISM-KH quasi-dynamics, we obtain that the non-extrapolative LN/3D-RISM-KH approach is, nevertheless, three times slower (8 months/16 days/5

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-20

I. Omelyan and A. Kovalenko

= 3) than explicit solvent MD. This is because in the case of h = t, the 3D-RISM-KH effective solvation forces are calculated directly by the computationally expensive procedure of converging the 3D-RISM-KH integral equations at every intermediate step t. With the capability of larger outer time steps of h = 20 fs in the previous extrapolative LN/SFCE/3DRISM-KH scheme, the simulation can be accelerated additionally by about h/t = 5 times, making it already faster than explicit solvent MTS-MD by a factor of 5/3. What then about the OIN/ASFE/3D-RISM-KH advanced extrapolative approach? In this case, a further substantial relative speedup is achieved by applying huge outer time steps up to h ∼ 8 and 32 ps. The possibility of using such gigantic values of h in this approach follows from the accuracy analysis in Sec. V B and is confirmed with the actual measurement of observable quantities in Sec. V C. Computations show that the simulations of 100 ns length at h = 8 and 32 ps in OIN/ASFE/3D-RISM-KH require about 6 and 4 h of wall time, respectively (on the same workstation with three CPU cores). Therefore, the apparent speedup compared to conventional MD (5 × 16 days/6 h to 5 × 16 days/4 h) reaches a factor of ∼300–500! Computational wall time can be substantially reduced on a supercomputing system with parallelization techniques applied to the linear algebra and 3D-FFT routines used in converging the 3D-RISM-KH integral equations with the MDIIS solver as well as to the OIN integrator and the ASFE scheme. Fig. 11 presents the speedup D of the OIN/ASFE/3DRISM-KH advanced extrapolative approach compared to conventional MD with explicit solvent as a function of outer time step in the whole range from h = 4 fs to 32 ps (red solid curve labeled for simplicity as OIN/ASFE). Shown are also the previous LN/SFCE and OIN/SFCE extrapolative schemes ending at the maximally allowed outer time steps h = 24 fs and 200 fs (green and blue dashed curves, respectively). As distinct, the curve for the OIN/ASFE advanced extrapolative scheme lasts up to h = 32 ps. The corresponding speedup limits for these approaches are 2, 15, and 500, respectively. (The three maximal outer time steps and corresponding speedup limits are marked with dashed vertical and horizontal lines.) Note that compared to the previous LN/SFCE and OIN/SFCE schemes, the total additional overhead in the OIN/ASFE ad-

OIN/ASFE

OIN/SFCE LN/SFCE

J. Chem. Phys. 139, 244106 (2013)

vanced extrapolative scheme constitutes only 10%. (This includes extra computational costs for solving the square-least and eigenvalue problems, the OIN equations, and handling the extension-selection procedures.) This shifts the LN/SFCE and OIN/SFCE curves slightly upwards with respect to OIN. The gigantic 500-fold simulation speedup achieved in the OIN/ASFE/3D-RISM-KH approach shows its advantage over conventional MD with explicit solvent. This speedup allowed us, in particular, to perform a 1-μs long run equivalent to 5 μs of conventional MD simulation with explicit solvent, actually spending only 60 h with three CPU cores on a standard workstation! Note that D takes into account the 5-fold intrinsic acceleration due to the quasi-dynamic character of OIN/ASFE/3D-RISM-KH. Even without it, the apparent 100fold speedup is a great deal. A significantly higher speedup should be expected with the OIN/ASFE/3D-RISM-KH integrator for complex biomolecular systems, including proteins. Further to the above acceleration of solute time evolution in a plain step-tostep comparison to explicit solvent MTS-MD, the 3D-RISMKH molecular theory of solvation produces 3D density distributions and effective forces of solvent quasi-equilibrated for a current solute conformation and thus samples over an ensemble of solvent configurations around the solute molecule as if a multitude of explicit solvent MD simulation were run simultaneously. This provides a dramatic acceleration of sampling over slow and/or rare solvation events such as exchange, localization, and binding of solvent, ions, and ligands in pockets and other function related inner spaces of biomolecules. We see in Fig. 11 that the time gain increases almost linearly with h, provided h  2 ps. For larger outer time steps the function D(h) grows not so quickly, achieving a saturation regime at h  32 ps. This behavior is explained by the fact (Sec. V B) that in OIN/ASFE/3D-RISM-KH the outer time step rises smoothly from t to h with an increment of t. Then, for example, to reach h = 16 ps at t = 8 fs requires to pass 2001 × 2000/2 intermediate time steps, which constitutes a simulation length of 2001 × 1000 × 8 fs = 16 ns or even 64 ns for h = 32 ps. Such length is already comparable to 100 ns used when evaluating D. Moreover, then the extended basis interval H = N h at N = 4000 and h = 16 ps or 32 ps can achieve a value of H = N h = 64 or 128 ns, exceeding the total simulation length H = 100 ns. This means that for h  32 ps and H = 100 ns the extended set will never be filled out completely, reducing the extrapolation accuracy. In addition, at h ∼ 32 ps the relative total overhead of the advanced extrapolation technique becomes comparable with those spent on converging the 3D-RISM-KH integral equations, making no sense to use larger h. In view of that, values of order 32 ps should be considered as an upper limit for the outer time step in the OIN/ASFE/3D-RISM method. VI. CONCLUSION

FIG. 11. Speedup of MTS-MD/3D-RISM-KH with OIN/ASFE (solid red curve), OIN/SFCE (blue dashed curve), and LN/SFCE (green dashed curve) with respect to conventional MD with explicit solvent for alanine dipeptide in aqueous solution as a function of outer time step h.

We have proposed a new approach of ASFE for efficient sampling of biomolecular systems in the multiscale method of coupled MTS-MD/OIN/ASFE/3D-RISM-KH simulations. The ASFE approach comprises a number of techniques we have developed and applied together for the first time:

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-21

I. Omelyan and A. Kovalenko

non-Eckart transformation of coordinate space, modified least-square minimization of the residuals between the extrapolated and original coordinates/forces in the transformed space, maintaining an extended force-coordinate pair set while selecting the best subset for the smallest force residual minimum, balancing the normal equations, and incremental increase of the extrapolation interval. Complemented with the OIN thermostat, these techniques have allowed us to increase the outer time step by about three orders of magnitude compared to the previously used LN/SFCE/3D-RISM-KH simulation method, and thus to dramatically improve the efficiency of OIN/ASFE/3D-RISM-KH simulation. We have demonstrated on a fully flexible model of alanine dipeptide in aqueous solution that the hybrid MTSMD/OIN/ASFE/3D-RISM-KH integrator is able to handle effective solvation forces in huge outer time steps up to tens of picoseconds without affecting the equilibrium and conformational properties. This significantly overcomes the efficiency limitations inherent in LN/SFCE/3D-RISM-KH with the previous extrapolation scheme as well as in conventional MTS-MD, and provides a speedup by a factor of 100– 500 with respect to explicit solvent MD simulations. Further to the acceleration of solute time evolution in a plain step-to-step comparison of OIN/ASFE/3D-RISMKH to explicit solvent MD, the 3D-RISM-KH molecular theory of solvation produces 3D density distributions and effective forces of solvent quasi-equilibrated for a current solute conformation and thus samples over an ensemble of solvent configurations around the solute molecule as if a multitude of explicit solvent MD simulations were run simultaneously. The use of 3D-RISM-KH can provide a dramatic acceleration of sampling over slow and rare solvation events in complex supramolecular and biomolecular systems such as exchange and localization of solvent and ions in pockets and other function-related inner spaces,57, 66, 70, 72, 85–88 and in various processes of molecular recognition,69–78 including protein-ligand binding,66, 74–78 selective ion binding85, 86 and gating,66, 76 association and salt-induced conformational transitions of DNA,83–85 proteinprotein interaction,78–82 and nanoparticles aggregation.88 In particular, the hybrid OIN/ASFE/3D-RISM-KH method is well suitable to evaluate effective interactions (potentials of mean force) between biomolecules and nanoparticles in solution, which are computationally very expensive to get from MD simulations141 but are efficiently calculated from the 3D-RISM-KH theory.78–82, 88 Generalization of the ASFE approach to proteins and other biomolecules in solution will be presented in our subsequent publication. ACKNOWLEDGMENTS

This work was supported by ArboraNano – the Canadian Forest NanoProducts Network, and by the National Institute for Nanotechnology – a joint initiative of the National Research Council of Canada, the University of Alberta, and the Government of Alberta. The computations were carried out on the high performance computing resources provided by WestGrid – Compute/Calcul Canada. I.O. is grateful for the hospitality during his stay at the University of Alberta and

J. Chem. Phys. 139, 244106 (2013)

the National Institute for Nanotechnology. We thank Dr. Tyler Luchko, Dr. Nikolay Blinov, and Dr. Sergey Gusarov for helpful discussions. 1 J. A. McCammon, B. R. Gelin, and M. Karplus, Nature (London) 267, 585

(1977). Brooks, M. Karplus, and B. Petitt, Adv. Chem. Phys. 71, 1 (1988). 3 A. Rojnuckarin, S. Kim, and S. Subramaniam, Proc. Natl. Acad. Sci. U.S.A. 95, 4288 (1998). 4 G. Hernández, F. E. Jenney, Jr., M. W. W. Adams, and D. M. LeMaster, Proc. Natl. Acad. Sci. U.S.A. 97, 3166 (2000). 5 M. Karplus and J. A. McCammon, Nat. Struct. Biol. 9, 646 (2002). 6 Y. Zhang, M. H. Peters, and Y. Li, Proteins: Struct., Funct., Genet. 52, 339 (2003). 7 M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Clarendon, Oxford, 1987). 8 D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications (Academic Press, New York, 1996). 9 B. Leimkuhler and S. Reich, Simulating Hamiltonian Dynamics (Cambridge University Press, Cambridge, 2005). 10 M. E. Tuckerman, Statistical Mechanics: Theory and Molecular Simulation (Oxford University Press, New York, 2010). 11 Y. Duan and A. Kollman, Science 282, 740 (1998). 12 P. L. Freddolino, F. Liu, M. Gruebele, and K. Schulten, Biophys. J. 94, L75 (2008). 13 J. L. Klepeis, K. Lindorff-Larsen, R. O. Dror, and D. E. Shaw, Curr. Opin. Struct. Biol. 19, 120 (2009). 14 R. F. Service, Science 330, 308 (2010). 15 D. E. Shaw, P. Maragakis, K. Lindorff-Larsen, S. Piana, R. O. Dror, M. P. Eastwood, J. A. Bank, J. M. Jumper, J. K. Salmon, Y. Shan, and W. Wriggers, Science 330, 341 (2010). 16 J. E. Stone, D. J. Hardy, I. S. Ufimtsev, and K. Schulten, J. Mol. Graphics Modell. 29, 116 (2010). 17 S. Genheden and U. Ryde, Phys. Chem. Chem. Phys. 14, 8662 (2012). 18 H. Grubmüller, H. Heller, A. Windemuth, and K. Schulten, Mol. Simul. 6, 121 (1991). 19 M. E. Tuckerman, B. J. Berne, and G. J. Martyna, J. Chem. Phys. 97, 1990 (1992). 20 S. J. Stuart, R. Zhou, and B. J. Berne, J. Chem. Phys. 105, 1426 (1996). 21 A. Kopf, W. Paul, and B. Dünweg, Comput. Phys. Commun. 101, 1 (1997). 22 G. Zhang and T. Schlick, J. Comput. Chem. 14, 1212 (1993). 23 G. Zhang and T. Schlick, J. Chem. Phys. 101, 4995 (1994). 24 T. Schlick, E. Barth, and M. Mandziuk, Annu. Rev. Biophys. Biomol. Struct. 26, 181 (1997). 25 E. Barth and T. Schlick, J. Chem. Phys. 109, 1617 (1998). 26 B. Garcia-Archilla, J. M. Sanz-Serna, and R. D. Skeel, SIAM J. Sci. Comput. 20, 930 (1998). 27 J. A. Izaguirre, S. Reich, and R. D. Skeel, J. Chem. Phys. 110, 9853 (1999). 28 Q. Ma and J. A. Izaguirre, Multiscale Model. Simul. 2, 1 (2003). 29 J. A. Izaguirre, D. P. Catarello, J. M. Wozniak, and R. D. Skeel, J. Chem. Phys. 114, 2090 (2001). 30 R. D. Skeel and J. A. Izaguirre, Mol. Phys. 100, 3885 (2002). 31 S. Melchionna, J. Chem. Phys. 127, 044108 (2007). 32 G. J. Martyna, M. E. Tuckerman, D. J. Tobias, and M. L. Klein, Mol. Phys. 87, 1117 (1996). 33 A. Cheng and K. M. Merz, Jr., J. Phys. Chem. B 103, 5396 (1999). 34 J. Komeiji, J. Mol. Struct.: THEOCHEM 530, 237 (2000). 35 W. Shinoda and M. Mikami, J. Comput. Chem. 24, 920 (2003). 36 P. Minary, G. J. Martyna, and M. E. Tuckerman, J. Chem. Phys. 118, 2510 (2003). 37 R. Zhou and B. J. Berne, J. Chem. Phys. 103, 9444 (1995). 38 M. Watanabe and M. Karplus, J. Phys. Chem. 99, 5680 (1995). 39 E. Barth and T. Schlick, J. Chem. Phys. 109, 1633 (1998). 40 M. Mandziuk and T. Schlick, Chem. Phys. Lett. 237, 525 (1995). 41 T. Schlick, M. Mandziuk, R. D. Skeel, and K. Srinivas, J. Comput. Phys. 140, 1 (1998). 42 Q. Ma, J. A. Izaguirre, and R. D. Skeel, SIAM J. Sci. Comput. 24, 1951 (2003). 43 P. Minary, M. E. Tuckerman, and G. J. Martyna, Phys. Rev. Lett. 93, 150201 (2004). 44 J. B. Abrams, M. E. Tuckerman, and G. J. Martyna, Computer Simulations in Condensed Matter Systems: From Materials to Chemical Biology (Springer-Verlag, Berlin, 2006), Vol. 1; Lect. Notes Phys. 703, 139 (2006). 2 C.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-22 45 J-P.

I. Omelyan and A. Kovalenko

Hansen and I. McDonald, Theory of Simple Liquids, 3rd ed. (The Netherlands, Amsterdam, 2006). 46 Molecular Theory of Solvation, Understanding Chemical Reactivity, edited by F. Hirata and P. G. Mezey (Kluwer Academic Publishers, Dordrecht, The Netherlands, 2003), Vol. 24. 47 D. Chandler and H. C. Andersen, J. Chem. Phys. 57, 1930 (1972). 48 D. Chandler, J. McCoy, and S. Singer, J. Chem. Phys. 85, 5971 (1986); J. Chem. Phys. 85, 5977 (1986). 49 D. Beglov and B. Roux, J. Phys. Chem. B 101, 7821 (1997). 50 A. Kovalenko and F. Hirata, Chem. Phys. Lett. 290, 237 (1998). 51 A. Kovalenko and F. Hirata, J. Chem. Phys. 110, 10095 (1999). 52 A. Kovalenko and F. Hirata, J. Phys. Chem. B 103, 7942 (1999). 53 A. Kovalenko and F. Hirata, J. Chem. Phys. 112, 10391 (2000). 54 A. Kovalenko and F. Hirata, J. Chem. Phys. 112, 10403 (2000). 55 A. Kovalenko, “Three-dimensional RISM theory for molecular liquids and solid-liquid interfaces,” in Molecular Theory of Solvation, Understanding Chemical Reactivity, edited by F. Hirata and P. G. Mezey (Kluwer Academic Publishers, Dordrecht, The Netherlands, 2003), Vol. 24, pp. 169–275. 56 S. Gusarov, B. S. Pujari, and A. Kovalenko, J. Comput. Chem. 33, 1478 (2012). 57 A. Kovalenko, Pure Appl. Chem. 85, 159 (2013). 58 A. Kovalenko and F. Hirata, Chem. Phys. Lett. 349, 496 (2001). 59 A. Kovalenko and F. Hirata, J. Theor. Comput. Chem. 1, 381 (2002). 60 K. Yoshida, T. Yamaguchi, A. Kovalenko, and F. Hirata, J. Phys. Chem. B 106, 5042 (2002). 61 I. Omelyan, A. Kovalenko, and F. Hirata, J. Theor. Comput. Chem. 2, 193 (2003). 62 S. Gusarov, T. Ziegler, and A. Kovalenko, J. Phys. Chem. A 110, 6083 (2006). 63 D. Casanova, S. Gusarov, A. Kovalenko, and T. Ziegler, J. Chem. Theory Comput. 3, 458 (2007). 64 J. W. Kaminski, S. Gusarov, T. A. Wesolowski, and A. Kovalenko, J. Phys. Chem. A 114, 6082 (2010). 65 M. Malvaldi, S. Bruzzone, C. Chiappe, S. Gusarov, and A. Kovalenko, J. Phys. Chem. B 113, 3536 (2009). 66 A. Kovalenko, A. E. Kobryn, S. Gusarov, O. Lyubimova, X. Liu, N. Blinov, and M. Yoshida, Soft Matter 8, 1508 (2012). 67 V. Shapovalov, T. N. Truong, A. Kovalenko, and F. Hirata, Chem. Phys. Lett. 320, 186 (2000). 68 S. R. Stoyanov, S. Gusarov, and A. Kovalenko, “Multiscale modeling of the adsorption interaction between bitumen model compounds and zeolite nanoparticles in gas and liquid phase,” in Industrial Applications of Molecular Simulations, edited by M. Meunier (CRC Press, Boca Raton, 2011), pp. 203–230. 69 J. G. Moralez, J. Raez, T. Yamazaki, R. K. Motkuri, A. Kovalenko, and H. Fenniri, J. Am. Chem. Soc. 127, 8307 (2005). 70 R. S. Johnson, T. Yamazaki, A. Kovalenko, and H. Fenniri, J. Am. Chem. Soc. 129, 5735 (2007). 71 G. Tikhomirov, T. Yamazaki, A. Kovalenko, and H. Fenniri, Langmuir 24, 4447 (2008). 72 T. Yamazaki, H. Fenniri, and A. Kovalenko, ChemPhysChem 11, 361 (2010). 73 R. Chhabra, J. Moralez, J. Raez, T. Yamazaki, J.-Y. Cho, A. Myles, A. Kovalenko, and H. Fenniri, J. Am. Chem. Soc. 132, 32 (2010). 74 N. Yoshida, T. Imai, S. Phongphanphanee, A. Kovalenko, and F. Hirata, J. Phys. Chem. B 113, 873 (2009). 75 T. Imai, N. Miyashita, Y. Sugita, A. Kovalenko, F. Hirata, and A. Kidera, J. Phys. Chem. B 115, 8288 (2011). 76 A. Kovalenko and N. Blinov, J. Mol. Liq. 164, 101 (2011). 77 D. Nikolic, N. Blinov, D. Wishart, and A. Kovalenko, J. Chem. Theory Comput. 8, 3356 (2012). 78 N. Blinov, L. Dorosh, D. Wishart, and A. Kovalenko, Mol. Simul. 37, 718 (2011). 79 P. Drabik, S. Gusarov, and A. Kovalenko, Biophys. J. 92, 394 (2007). 80 T. Yamazaki, N. Blinov, D. Wishart, and A. Kovalenko, Biophys. J. 95, 4540 (2008). 81 S. Genheden, T. Luchko, S. Gusarov, A. Kovalenko, and U. Ryde, J. Phys. Chem. B 114, 8505 (2010). 82 N. Blinov, L. Dorosh, D. Wishart, and A. Kovalenko, Biophys. J. 98, 282 (2010). 83 Y. Yonetani, Y. Maruyama, F. Hirata, and H. Kono, J. Chem. Phys. 128, 185102 (2008).

J. Chem. Phys. 139, 244106 (2013) 84 Y.

Maruyama, N. Yoshida, and F. Hirata, J. Phys. Chem. B 114, 6464 (2010). 85 Y. Maruyama, N. Yoshida, and F. Hirata, Interdiscip. Sci. Comput. Life Sci. 3, 290 (2011). 86 N. Yoshida, S. Phongphanphanee, and F. Hirata, J. Phys. Chem. B 111, 4588 (2007). 87 M. C. Stumpe, N. Blinov, D. Wishart, A. Kovalenko, and V. S. Pande, J. Phys. Chem. B 115, 319 (2011). 88 R. L. Silveira, S. R. Stoyanov, S. Gusarov, M. S. Skaf, and A. Kovalenko, “Plant Biomass Recalcitrance: Effect of Hemicellulose Composition on Nanoscale Forces that Control Cell Wall Strength,” J. Am. Chem. Soc. (published online). 89 A. Kovalenko and F. Hirata, Phys. Chem. Chem. Phys. 7, 1785 (2005). 90 A. Kovalenko and F. Hirata, “A molecular theory of solutions at liquid interfaces,” in Interfacial Nanochemistry: Molecular Science and Engineering at Liquid-Liquid Interfaces, Nanostructure Science and Technology, edited by H. Watarai and D. J. Lockwood (Springer, 2005), pp. 97–125. 91 T. Miyata and F. Hirata, J. Comput. Chem. 29, 871 (2008). 92 T. Luchko, S. Gusarov, D. R. Roe, C. Simmerling, D. A. Case, J. Tuszynski, and A. Kovalenko, J. Chem. Theory Comput. 6, 607 (2010). 93 A. Onufriev, “Continuum electrostatics solvent modeling with the generalized born model,” in Modeling Solvent Environments, edited by M. Feig (Wiley, 2010), pp. 127–165. 94 I. P. Omelyan, Phys. Rev. E 78, 026702 (2008). 95 I. P. Omelyan, J. Chem. Phys. 131, 104101 (2009). 96 I. P. Omelyan and A. Kovalenko, J. Chem. Phys. 135, 114110 (2011). 97 I. Omelyan and A. Kovalenko, Mol. Simul. 39, 25 (2013). 98 I. P. Omelyan and A. Kovalenko, J. Chem. Phys. 135, 234107 (2011). 99 I. P. Omelyan and A. Kovalenko, J. Chem. Theory Comput. 8, 6 (2012). 100 J. S. Perkyns, G. C. Lynch, J. J. Howard, and B. M. Pettitt, J. Chem. Phys. 132, 064106 (2010). 101 S. M. Kast and T. Kloss, J. Chem. Phys. 129, 236101 (2008). 102 I. S. Joung, T. Luchko, and D. A. Case, J. Chem. Phys. 138, 044103 (2013). 103 J. S. Perkyns and B. M. Pettitt, J. Chem. Phys. 97, 7656 (1992). 104 A. Kovalenko, S. Ten-no, and F. Hirata, J. Comput. Chem. 20, 928 (1999). 105 U. Essmann, L. Perera, M. Berkowitz, T. Darden, H. Lee, and L. Pedersen, J. Chem. Phys. 103, 8577 (1995). 106 I. P. Omelyan, Comput. Phys. Commun. 107, 113 (1997). 107 C. L. Lawson and R. J. Hanson, Solving Least Squares Problems (PrenticeHall, Englewood Cliffs, NJ, 1974). 108 G. Quintana-Ortí, E. S. Quintana-Ortí, and A. Petitet, SIAM J. Sci. Comput. 20, 1155 (1999). 109 G. Picinbono, J.-C. Lombardo, H. Delingette, and N. Ayache, J. Visual. Comput. Animat. 13, 147 (2002). 110 A. Sandu and T. Schlick, J. Comput. Phys. 151, 74 (1999). 111 B. Brutovsky, T. Muëlders, and G. R. Kneller, J. Chem. Phys. 118, 6179 (2003). 112 I. P. Omelyan, Mol. Simul. 22, 213 (1999). 113 G. R. Kneller, J. Chem. Phys. 128, 194101 (2008). 114 C. Eckart, Phys. Rev. 47, 552 (1935). 115 J. D. Louck and H. W. Galbraith, Rev. Mod. Phys. 48, 69 (1976). 116 D. Janežiˇ c, M. Praprotnik, and F. Merzel, J. Chem. Phys. 122, 174101– 174103 (2005). 117 I. Omelyan and A. Kovalenko, Phys. Rev. E 85, 026706 (2012). 118 E. A. Coutsias, C. Seok, and K. A. Dill, J. Comput. Chem. 25, 1849 (2004). 119 P. Liu, D. K. Agrafiotis, and D. L. Theobald, J. Comput. Chem. 31, 1561 (2010). 120 G. Chevrot, P. Calligari, K. Hinsen, and G. Kneller, J. Chem. Phys. 135, 084110 (2011). 121 D. S. Chekmarev, T. Ishida, and R. M. Levy, J. Phys. Chem. B 108, 19487 (2004). 122 Y. Duan, C. Wu, S. Chowdhury, M. C. Lee, G. Xiong, W. Zhang, R. Yang, P. Cieplak, R. Luo, T. Lee, J. Caldwell, J. Wang, and P. Kollman, J. Comput. Chem. 24, 1999 (2003). 123 D. A. Case, T. A. Darden, T. E. Cheatham III, C. L. Simmerling, J. Wang, R. E. Duke, R. Luo, R. C. Walker, W. Zhang, K. M. Merz, B. P. Roberts, B. Wang, S. Hayik, A. Roitberg, G. Seabra, I. Kolossvary, K. F. Wong, F. Paesani, J. Vanicek, J. Liu, X. Wu, S. R. Brozell, T. Steinbrecher, H. Gohlke, Q. Cai, X. Ye, J. Wang, M.-J. Hsieh, G. Cui, D. R. Roe, D. H.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

244106-23

I. Omelyan and A. Kovalenko

Mathews, M. G. Seetin, C. Sagui, V. Babin, T. Luchko, S. Gusarov, A. Kovalenko, and P. A. Kollman, AMBER 11, University of California, San Francisco (2010). 124 I. P. Omelyan, I. M. Mryglod, and R. Folk, Phys. Rev. E 65, 056706 (2002). 125 H. Berendsen, J. Grigera, and T. Straatsma, J. Phys. Chem. 91, 6269 (1987). 126 R. Loncharich, B. Brooks, and R. Pastor, Biopolymers 32, 523 (1992). 127 J. Ryckaert, G. Ciccotti, and H. Berendsen, J. Comput. Phys. 23, 327 (1977). 128 G. Ciccotti, J. P. Ryckaert, and M. Ferrario, Mol. Phys. 47, 1253 (1982). 129 K. Kwac, K.-K. Lee, J. B. Han, K.-I. Oh, and M. Cho, J. Chem. Phys. 128, 105106 (2008). 130 R. Ishizuka, G. A. Huber, and J. A. McCammon, J. Phys. Chem. Lett. 1, 2279 (2010). 131 D. J. Tobias and C. L. Brooks III, J. Phys. Chem. 96, 3864 (1992).

J. Chem. Phys. 139, 244106 (2013) 132 R.

Vargas, J. Garza, B. P. Hay, and D. A. Dixon, J. Phys. Chem. A 106, 3213 (2002). 133 M. Feig, J. Chem. Theory Comput. 4, 1555 (2008). 134 H. Okumura, Adv. Nat. Sci.: Nanosci. Nanotechnol. 1, 033002 (2010). 135 A. L. Ferguson, A. Z. Panagiotopoulos, P. G. Debenedetti, and I. G. Kevrekidis, J. Chem. Phys. 134, 135103 (2011). 136 F. F. García-Prieto, I. Fdez. Galván, M. A. Aguilar, and M. E. Martín, J. Chem. Phys. 135, 194502 (2011). 137 J. Hermans, Proc. Natl. Acad. Sci. U.S.A. 108, 3095 (2011). 138 Y. Yonezawa, I. Fukuda, N. Kamiya, H. Shimoyama, and H. Nakamura, J. Chem. Theory Comput. 7, 1484 (2011). 139 W.-Na Du, K. A. Marino, and P. G. Bolhuisa, J. Chem. Phys. 135, 145102 (2011). 140 R. Chelli and G. F. Signorini, J. Chem. Theory Comput. 8, 830 (2012). 141 G. Pellicane, G. Smith, and L. Sarkisov, Phys. Rev. Lett. 101, 248102 (2008).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 141.214.17.222 On: Thu, 25 Dec 2014 04:55:22

Multiple time step molecular dynamics in the optimized isokinetic ensemble steered with the molecular theory of solvation: accelerating with advanced extrapolation of effective solvation forces.

We develop efficient handling of solvation forces in the multiscale method of multiple time step molecular dynamics (MTS-MD) of a biomolecule steered ...
2MB Sizes 0 Downloads 0 Views