THE JOURNAL OF CHEMICAL PHYSICS 142, 024114 (2015)
A pure-sampling quantum Monte Carlo algorithm Egor Ospadov and Stuart M. Rothsteina) Departments of Chemistry and Physics, Brock University, St. Catharines, Ontario L2S 3A1, Canada
(Received 27 August 2014; accepted 28 December 2014; published online 13 January 2015) The objective of pure-sampling quantum Monte Carlo is to calculate physical properties that are independent of the importance sampling function being employed in the calculation, save for the mismatch of its nodal hypersurface with that of the exact wave function. To achieve this objective, we report a pure-sampling algorithm that combines features of forward walking methods of pure-sampling and reptation quantum Monte Carlo (RQMC). The new algorithm accurately samples properties from the mixed and pure distributions simultaneously in runs performed at a single set of time-steps, over which extrapolation to zero time-step is performed. In a detailed comparison, we found RQMC to be less efficient. It requires different sets of time-steps to accurately determine the energy and other properties, such as the dipole moment. We implement our algorithm by systematically increasing an algorithmic parameter until the properties converge to statistically equivalent values. As a proof in principle, we calculated the fixed-node energy, static α polarizability, and other one-electron expectation values for the ground-states of LiH and water molecules. These quantities are free from importance sampling bias, population control bias, time-step bias, extrapolation-model bias, and the finite-field approximation. We found excellent agreement with the accepted values for the energy and a variety of other properties for those systems. C 2015 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4905664]
I. INTRODUCTION
For over fifty years, it has been known that given a wave function with an O(ϵ) error, an error of the same magnitude persists when calculating most properties.1 Unlike the energy, where the error is O(ϵ 2), properties of more general interest are of this type. It is therefore especially important for the electron distribution generated by Monte Carlo methods to be as free from bias as possible when calculating expectation values of operators that do not commute with the Hamiltonian operator. Our paper reports a pure-sampling quantum Monte Carlo (PSQMC) algorithm that achieves this objective within the framework of the fixed-node approximation. It is important to understand how electron distributions derived from “pure-sampling” quantum Monte Carlo differ from those obtained by “mixed-sampling.” The latter are typically generated via the fixed-node diffusion Monte Carlo (FNDMC)2,3 method, the most commonly used variant of quantum Monte Carlo; recent reviews: Refs. 4–7. With FN-DMC, one samples physical properties from the so-called “mixed distribution,” Φ0Ψ. Here, Ψ is an inputted importance sampling function, and Φ0 is the unknown, “exact” lowest energy solution to the Schrödinger equation. In general, Φ0 is not truly exact due to Ψ’s inexact nodal hypersurface. This error biases the simulated electron distribution and physical properties derived from it, giving rise to what is typically called the “fixed-node error.” The electron distribution is further biased by the timestep, a small increment of imaginary time used to propagate a)Author to whom correspondence should be addressed. Electronic mail:
[email protected]. 0021-9606/2015/142(2)/024114/10/$30.00
electron configurations. To remove this bias, one performs the runs at a very small time-step value, after assuring oneself that the time-step bias is negligible with respect to the associated statistical error. Alternatively, one performs runs at a set of different time-steps and then extrapolates the results to zero time-step by polynomial regression. In both approaches, time-step bias is replaced by an extrapolation-model bias, which should be estimated and included in the statistical error.8 Most properties of interest to us, such as the dipole moment, are represented by an operator, A, that does not commute with the Hamiltonian. Its expectation value when sampling from the mixed distribution, ⟨A⟩m = ⟨Ψ|A|Φ0⟩/⟨Ψ|Φ0⟩
(1)
exhibits additional biases due to its explicit dependence on Ψ. (An exception is the local energy, Eq. (9).) These are removed upon sampling from the pure distribution, Φ20, ⟨A⟩ p = ⟨Φ0|A|Φ0⟩/⟨Φ0|Φ0⟩
(2)
without its explicit dependence on Ψ. Over the years, a number of “pure-sampling” schemes have been proposed and tested; recent review: Ref. 9. The earliest is an extrapolated estimate that is biased proportional to the square of the error in the importance sampling function10 ⟨A⟩ p ≈ 2⟨Φ0|A|Ψ⟩/⟨Ψ|Φ0⟩ − ⟨Ψ|A|Ψ⟩/⟨Ψ|Ψ⟩.
(3)
This bias is often hard to assess, and a trial wave function, Ψ, that accurately describes the property of interest is required. Others involve some form of “forward walking” or “sidewalks.”11–13 In this vein, tagging algorithms have been proposed to count the asymptotic number of descendants arising from
142, 024114-1
© 2015 AIP Publishing LLC
024114-2
E. Ospadov and S. M. Rothstein
the branching term in FN-DMC. This quantity, proportional to Φ0/Ψ, achieves pure-sampling when used as a statistical weight in mixed-sampling algorithms. Unfortunately, these approaches are unstable due to the manner in which the electron configurations explicitly or implicitly undergo “branching.” Attempts to reduce their bias by increasing the lengths of the side-walks lead to an increase in the variance of the calculated quantities. Their signal-to-noise ratio goes to zero in the asymptotic limit.14 These issues with descendent counting have been addressed by a scheme incorporating the iteration-value of A in a manner that automatically generates statistical weights that are proportional to its asymptotic number of descendants.15,16 Pure-sampling is also achieved by reptation quantum Monte Carlo (RQMC)17,18 (recent review: Ref. 19), bilinear diffusion quantum Monte Carlo,14,20 and path integral ground state (PIGS) Monte Carlo.21 RQMC is a stable algorithm that has been widely applied in the chemical physics and condensed matter physics literature. In a review article, published in 2006, RQMC was cited as being “probably the best scheme for the moment” to achieve pure-sampling.22 In RQMC, a path (or “reptile”) is generated by a series of Markovian moves of the electron configurations. Reptation is performed by removing configurations from a randomly chosen end of the reptile and replacing them with the same number of new configurations at its opposite end, creating a new reptile. After evoking a Metropolis decision based on branching factors accumulated over the reptiles, “Metropolis branching,”9 one decides whether to accept the new reptile or retain the original one. One samples from the pure distribution at the middle of the chosen reptile. RQMC is an expensive algorithm—the reptation process introduces serial correlation when properties are sampled at the middle of the path, adversely affecting the algorithm’s efficiency. Furthermore, the number of electron configurations employed in reptation affects the sampling and is a difficult algorithmic parameter to optimize. In this paper, we describe a pure-sampling quantum Monte Carlo algorithm that has the conceptual simplicity of “sidewalk” methods and the computational stability of RQMC. Our algorithm generates a main path that consists of several electron configurations, connected by a series of drift and diffusion moves. Sub-paths emanate from the main path and there is no reptation. These sub-paths are either accepted or rejected via a Metropolis decision based on their accumulated branching factors, a feature of RQMC. It is the Metropolis decision that stabilizes the sampling in our approach. The main path provides an imaginary time separation of the electron configurations in which the pure distribution is being sampled. This reduces the serial correlation in expectation values, in contrast with RQMC, where electron configurations may be re-used several times due to its reptation process. As a proof in principle, we calculate the fixed-node energies and other properties for LiH and water molecules. Our approach employs a single-reference importance sampling function with a Slater basis set of atomic orbitals. Our calculations of electronic properties for these small molecules are comparable in accuracy to their other high-level determinations in the literature. Jastrow-Slater importance sampling and/or a recent multi-reference approach23 are not precluded.
J. Chem. Phys. 142, 024114 (2015)
This paper is organized as follows. Section II is devoted to a detailed description of our approach to pure-sampling. In Sec. II A, we describe the Langevin diffusion used to generate the main path Z and two sub-paths, X and Y , as outlined in Sec. II B. In Sec. II C, we derive a Metropolis decision with which we either accept and sample properties from path Y or reject it and sample from path X instead. Although Y is not generated from X by a reptation process, the Metropolis decision is identical to that used in RQMC, Eq. (18). The way in which we sample from the pure and mixed distributions is also the same as in RQMC; illustrated in Fig. 2. In Sec. III, we perform pure-sampling of a host of electronic properties of LiH and water molecules. These quantities exhibit a time-step bias due to the inexactness of the Green’s function governing Langevin diffusion, Eq. (7), necessitating extrapolation to zero-time-step by way of polynomial regression (linear or quadratic). In turn, this introduces an “extrapolation-model bias.” We eliminate this bias by parameterizing the main and sub-path lengths, L, as a function of time-step. The value of L at any given time-step is calculated from its value at the largest time-step, L 0, given by Eq. (21). L 0 values are systematically increased in multiple runs until consistent zero-time-step intercepts (within statistical error) are seen for a number of consecutive choices of L 0; for example, Sec. III, Table I. The final determination of a property value is obtained from a weighted average of consistent intercepts. In light of the algorithm’s similarities to RQMC, in Sec. IV, we compare its efficiency relative to RQMC. By using LiH as an example, we demonstrate that our algorithm is capable of doing accurate mixed- and pure-sampling, simultaneously within a single set of time-steps. The algorithm is almost a factor of two more computationally efficient than RQMC, where runs must be taken at different sets of time-steps in order to achieve both accurate mixed- and pure-sampling.
TABLE I. Ground-state energy and dipole moment of LiH and water molecules in the limit of zero time-step and as a function of path-length parameter, L 0. LiH
Water
L0
E0
µz
E0
µz
11 21 31 41 51 101 151 201 251 301
−8.06090(64)a −8.06643(83)a −8.06896(31)a −8.07011(96) −8.07060(57) −8.07043(119) −8.07093(90) −8.07085(38) −8.07033(140) −8.07074(152)
2.3498(21)b 2.3515(36)b 2.3501(16)b 2.3373(63)a 2.3433(11)a 2.3299(30)a 2.3279(17)a 2.3096(42) 2.3058(39) 2.3055(33)
··· ··· ··· ··· ··· −76.4213(37)a −76.4256(53) −76.4255(66) −76.4259(99) −76.4260(50)
··· ··· ··· ··· ··· ··· 0.7506(25) 0.7474(100) 0.7453(78) 0.7435(38)
2.3067(22)
−76.4257(30)
0.7482(20)
Averagec −8.07070(27) a Not
included in the weighted average due to a lack of convergence with the rest of the data. b Not included in the weighted average because the ground-state energy at this pathlength did not converge. c Variance-weighted average of the property for all included path-lengths. These properties are quoted in Tables II and III.
024114-3
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015)
The paper ends with Sec. V, which is devoted to a summary and conclusions from our work.
is the sum of local energies over the path. The individual local energies are calculated as follows: Eloc(x i ) = HΨ(x i )/Ψ(x i ).
II. A PURE-SAMPLING QUANTUM MONTE CARLO METHOD A. Langevin diffusion
In our approach, as well as in other methods such as FNDMC and RQMC, Langevin diffusion is the process responsible for the evolution of electron configurations through imaginary time. These are mathematical moves of the electrons that have no underlying physical significance. The propagators (Green’s functions) associated with Langevin diffusion are known: Eq. (7), albeit approximately. The process itself consists of two components, drift and diffusion, and can be mathematically described by ∇Ψ(x i ) √ x i+1 = x i + τ + τ χ. Ψ(x i )
(4)
In this context, x i represents 3n coordinates of n electrons located in the ith configuration; τ is the time-step, a small fraction of the imaginary time at which √ the calculation is carried i) out; τ ∇Ψ(x τ χ is the diffusion term. Ψ(x i ) is the drift term; and Finally, Ψ is the guiding or importance sampling function, and χ is a random number generated from a standard 3ndimensional normal distribution.
Langevin diffusion generates variationally distributed electron configurations, so that the 3n-dimensional marginal distribution at the midpoint of path X is the variational electron density, and not the desired pure electron density. In order to sample from the pure electron density, we propose to generate a new path Y from X via an intermediate path Z. The importance of this intermediate path will be discussed later. Our procedure of generating Y is as follows (see Figure 1). Using x 0 as our starting point, do L drift and diffusion moves to generate Z, defined as Z = {z1, z2, ..., z L−1, z L }.
(10)
Next, starting at z L , generate y0 with a single drift and diffusion move. Finally, starting from y0, do two L drift and diffusion moves in order to generate the rest of path Y , defined as Y = { y−L , y−L+1, ..., y−1, y0, y1, ..., y L−1, y L }.
(11)
Similarly to X, our new path has the following (2L +1)3ndimensional probability density: ˆ ) ∝ G( y−L+1 → y−L ;τ)...G( y−1 → y−2;τ) Π(Y × G( y0 → y−1;τ)Ψ2( y0) × G( y0 → y1;τ)G( y1 → y2;τ)...
B. Generating the paths
× G( y L−1 → y L ;τ)e−S(Y ).
Starting with an electron configuration x 0, by using these drift and diffusion moves, we generate a path, X, defined by X = {x −L , x −L+1, ..., x −1, x 0, x 1, ..., x L−1, x L }
(9)
(5)
with x 0 being the middle. The path is generated starting from x 0 and by doing two L drift and diffusion moves, such that the total length of X is 2L + 1. The (2L + 1)3n-dimensional probability density for this path is known to be ˆ Π(X) ∝ G(x −L+1 → x −L ;τ)...G(x −1 → x −2;τ)
(12)
Now that we have both X and Y , we can attempt to sample from the pure distribution at the mid-path configuration of one of these paths and mixed distribution at the ends. This step entails Metropolis branching. C. Metropolis branching
Based on their respective accumulated branching factors, Metropolis branching is the process of deciding whether to sample from path Y or reject it and sample from path X instead.
× G(x 0 → x −1;τ)Ψ2(x 0) × G(x 0 → x 1;τ)G(x 1 → x 2;τ)... × G(x L−1 → x L ;τ)e−S(X ),
(6)
where 2 −1 ∇Ψ(x) (7) G(x → x ′;τ) ∝ exp x ′ − x − τ Ψ(x) 2τ is the Green’s function associated with the drift and diffusion moves, e−S(X ) is the product of branching factors accumulated along the path, and L−1 1 S(X) = τ Eloc(x −L ) + Eloc(x i ) 2 i=−L+1 1 + Eloc(x 0) + Eloc(x L ) 2
(8)
FIG. 1. Visual representation of the process used to generate X and Y in our methodology. Path X is generated from x 0. The intermediate path Z is generated starting from x 0 as well. Finally, path Y is generated starting from z L . All the moves are done using drift and diffusion (Langevin diffusion). For the sake of illustration, the paths are shown running in either the vertical or horizontal direction. They can actually go in any direction and may even cross each other.
024114-4
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015)
× G( y0 → y−1;τ)G( y−1 → y−2;τ)... × G( y−L+1 → y−L ;τ) × G( y0 → y1;τ)G( y1 → y2;τ)... FIG. 2. Schematic representation of an electron distribution along a path accepted by Metropolis branching. The algorithm generates a pure distribution, Φ20, at the middle of the path, and mixed distributions, ΨΦ0, at the ends.
The Metropolis decision, written in its most general form, is as follows: ˆ )W (Y → X) Π(Y . (13) A(X → Y ) = min 1, ˆ Π(X)W (X → Y ) Since the path densities are known, we now need to write down the proposal densities. The one for moving from X to Y can be written as W (X → Y ) = G(x 0 → z1;τ)G(z1 → z2;τ)...
× G( y L−1 → y L ;τ).
(14)
Similarly, its analog for moving from Y to X can be expressed as W (Y → X) = G( y0 → z L ;τ)G(z L → z L−1;τ)... × G(z2 → z1;τ)G(z1 → x 0;τ) × G(x 0 → x −1;τ)G(x −1 → x −2;τ)... × G(x −L+1 → x −L ;τ) × G(x 0 → x 1;τ)G(x 1 → x 2;τ)... × G(x L−1 → x L ;τ).
(15)
Substituting these expressions into Eq. (13), the Metropolis decision takes the following form:
× G(z L−1 → z L ;τ)G(z L → y0;τ)
Ψ2( y0)G( y0 → z L ;τ)G(z L → z L−1;τ)...G(z2 → z1;τ)G(z1 → x 0;τ)e−S(Y ) A(X → Y ) = min 1, 2 . Ψ (x 0)G(x 0 → z1;τ)G(z1 → z2;τ)...G(z L−1 → z L ;τ)G(z L → y0;τ)e−S(X )
Although some of the Green’s functions cancelled upon substitution, thereby reducing the complexity of the Metropolis decision, some have remained. We can further reduce the complexity of Eq. (16) by evoking microscopic reversibility along path Z. (Microscopic reversibility is valid only in the limit of vanishing time-step, as are our calculations of the energy and other properties.) Ψ2(x 0)G(x 0 → z1;τ) = Ψ2(z1)G(z1 → x 0;τ), Ψ2(z1)G(z1 → z2;τ) = Ψ2(z2)G(z2 → z1;τ), ···, Ψ (z L )G(z L → y0;τ) = Ψ2( y0)G( y0 → z L ;τ). 2
Thereby, we derive the following expression: Ψ2(x 0)G(x 0 → z1;τ)G(z1 → z2;τ)...
(16)
location. This is shown in Figure 2, where for convenience we are illustrating the case where proposed path Y is rejected and path X is retained. Finally, let us discuss the importance of the intermediate path Z that links X and Y . Although its electron configurations do not appear in the final Metropolis decision, it serves two vital purposes. First, it allows us to reduce the Metropolis decision to its final and simple form, given by Eq. (18), and second, it introduces sufficient separation in imaginary time between the paths to reduce the serial correlation of sampled properties whenever path Y is accepted. Furthermore, by setting the path-length for Z equal to that of X and Y , we have a single algorithmic parameter that is easily optimized.
× G(z L−1 → z L ;τ)G(z L → y0;τ)
III. RESULTS AND DISCUSSION
= Ψ2( y0)G( y0 → z L ;τ)G(z L → z L−1;τ)...
As a proof in principle, we applied our PSQMC to calculate a number of ground-state properties of LiH and water, at their experimental geometries: Refs. 24 and 25, respectively. Using ADF software,26–28 we generated a single determinant importance sampling function with a QZ4P STO basis set. We do not employ a Jastrow function to account for the electronelectron and nuclear-electron cusp conditions. Instead, to avoid crazy moves and wild values of the local energy, we truncate components of the drift velocity and the local energy in a time-step-dependent manner, such that their associated biases vanish as the time-step, τ, approaches zero.8 Timestep biases in Green’s function (7) and expectation values calculated from PSQMC also vanish in that limit.
× G(z2 → z1;τ)G(z1 → x 0;τ).
(17)
Substituting this result into Eq. (16), we obtain the simplified expression for our Metropolis decision in its final form A(X → Y ) = min 1, e S(X )−S(Y ) . (18) Metropolis branching provides a statistical weight proportional to Φ0/Ψ to the variationally distributed ends of the chosen path, Ψ2, yielding the mixed distribution, (Φ0/Ψ)Ψ2, at those locations. Similarly, it weights the variational distribution at the middle of the chosen path by the factor proportional to (Φ0/Ψ)2, yielding the pure distribution, (Φ0/Ψ)2Ψ2, at that
024114-5
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015)
The respective truncations are given as follows: if |Fi | ≤ 1/τ, Fi Fi = sign[1/τ, Fi ] otherwise
(19)
and (Eloc − ET ) (Eloc − ET ) = sign[1/τ, (Eloc − ET )]
if |Eloc − ET | ≤ 1.0/τ, otherwise, (20)
where ET was set equal to −8.0 Eh and −76.0 Eh for LiH and water, respectively. The energy cutoff is not sensitive to the choice of ET . In practice, these cutoffs are rarely evoked. The path-lengths also depend on the time-step given by L(τ) = L 0(τ0/τ)3/2,
(21)
where L 0 is the path-length at τ0, the largest time-step.8 Note that L increases dramatically as the time-step decreases from the τ0 value. This has the effect of giving rise to nearly constant error-bars on the properties for the entire range of time-steps. L is infinite in the limit of zero τ. The following estimator for the α polarizability is calculated from the total serial correlation of the dipole moment:29,30 α i j = τ cov({ µi }, µ j (x 0)),
(22)
where L { µi } = µi (x l ) + µi (x 0). (23) l=−L This estimator does not rest on the finite-field approximation. PSQMC runs are embarrassingly parallel and require a small amount of both CPU time and memory to function. We employed 100 independent runs per time-step for each molecule to accumulate the error bars. The runs were carried out on machines equipped with Intel Xeon 2.2 GHz CPUs, and each one required less than 1 GB of RAM. For LiH, the runs consumed 3.47 and 66.4 CPU hours per core for the smallest and largest L 0, respectively. For water, the runs consumed 35.5 and 175.3 CPU hours per core for the smallest and largest L 0, respectively. Figures 3 and 4 display plots of the ground-state energy of LiH and water molecules, respectively, as a function of timestep and path-length parameter, L 0. The path length at any time-step is related to L 0 by Eq. (21). These plots were generated as follows: as L 0 is systematically increased, a regression model is chosen such that its τ = 0 intercept (varianceweighted regression) is as consistent as possible with that of the previous L 0. As the path-length increases and passes a critical value, L 0 = 41 for LiH and L 0 = 151 for water, the ground-state energy converges to a consistent value. Any further increase of the path-length beyond the critical value yields equivalent results, albeit with an increase in computational cost associated with an increased number of electron configurations being generated along the paths. While at first, it may seem redundant to perform multiple runs with different path-lengths beyond the critical value, averaging the converged results provides us our final estimate of the property. This determination is free from time-step
FIG. 3. Ground-state energy of LiH as a function of time-step and pathlength parameter, L 0. Path-lengths increase as time-steps decrease; Eq. (21). Convergence occurs for L 0 = 41, . . . , 301.
and regression-model biases. Furthermore, the convergence displayed here also applies to all the ground-state electronic properties that we report in this paper. To our knowledge, this feature of our approach is not present in any other quantum Monte Carlo method. The ground-state energy of LiH was calculated by (variance-weighted) averaging converged results from eight separate values of the path-length parameter, L 0 = 41, ..., 301 [50]. Similarly, the other properties were obtained by averaging the results from L 0 = 201, ..., 301 [50]. For the ground-state energy and other properties of water, L 0 = 151, ..., 301 [50]. See Figures 3–6 and Table I for the cases of the energy and dipole moment. Table II contains our results for LiH. The ground-state energy is in excellent agreement with the literature and agrees exactly with Nemec et al.,31 who used an identical trial wave function. The dipole, quadrupole, and octupole moments also agree with literature. Note that the dipole moment also
FIG. 4. Ground-state energy of water as a function of time-step and pathlength parameter, L 0. Path-lengths increase as time-steps decrease; Eq. (21). Convergence occurs for L 0 = 151, . . . , 301.
024114-6
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015) TABLE II. Ground-state energy and electronic properties of LiH molecule.
FIG. 5. Dipole moment of LiH as a function of time-step and path-length parameter, L 0. Path-lengths increase as time-steps decrease; Eq. (21). Note the fine vertical scale. Convergence occurs for L 0 = 201, . . . , 301.
agrees with an experimentally determined value. All the Hellmann-Feynman forces agree with their theoretical values, albeit with an exception of the z component on the lithium nucleus, which may reflect the fact that the experimental geometry was used rather than the geometry that minimizes the Hartree-Fock energy of trial wave function. The charges on the nuclei are in excellent agreement with both theoretical and experimental literature. Finally, our static polarizabilities are likewise in excellent agreement with high-level theory. Our results for the water molecule are enclosed in Table III. The ground-state energy and electronic properties were obtained by averaging results from L 0 = 151, ... , 301 [50]. The ground-state energy is in excellent agreement with singledeterminant literature results and agrees exactly with Nemec et al. When compared to methods that use a multi-reference approach, the energy is slightly more positive. Nevertheless, this can be easily ameliorated by using multi-determinants, albeit with an increase in computational cost.
FIG. 6. Dipole moment of water as a function of time-step and path-length parameter, L 0. Path-lengths increase as time-steps decrease; Eq. (21). Note the fine vertical scale. Convergence occurs for L 0 = 151, . . . , 301.
Property
This work
RQMC-MHa
Literature
E0 µz Θx x Θz z Ωx x z Ωz z z ⟨1/r Li⟩e ⟨1/r H ⟩e ⟨r 2⟩e 3⟩ ⟨q x/r Li 3⟩ ⟨q y/r Li 3 ⟨qz/r Li⟩ 3 ⟩ ⟨q x/r H 3 ⟩ ⟨q y/r H 3 ⟩ ⟨qz/r H q Li(x x) q Li(z z) q H (x x) q H (z z) αxx αz z
−8.0707(3) 2.307(2) 1.559(4) −3.11(1) −2.80(2) 5.61(2) 6.085(3) 2.2366(8) 22.300(6) −0.002(2) −0.002(2) 0.026(2) −0.0010(8) −0.0006(4) 0.003(1) 0.021(3) −0.040(3) −0.025(1) 0.052(2) 29.6(2) 24.5(7)
−8.0701(4) 2.291(6) −1.55(2) −3.10(3) −2.75(7) 5.50(14) 5.912(4) 2.231(2) 22.28(4) −0.0002(3) −0.0001(3) 0.0010(8) 0.0002(4) −0.0003(4) 0.018(3) 0.0202(5) −0.0404(7) −0.024(2) 0.050(4) 27.92(4) 23.6(2)
−8.0704(2),b −8.070 553(5)c 2.2939,d 2.3140,e 2.314(1)f 1.5485,d 1.554g −3.097,d −3.108g −2.84(2),h −2.9031i 5.68(2),h 5.8062i 6.08j 2.73j 22.596k 0.0l 0.0l 0.0l 0.0l 0.0l 0.0l 0.020 17,m 0.0204(1)n −0.040 34,m −0.0408(1)n −0.026 11,m −0.0246(1)n 0.052 22,m 0.0491(1)n 29.57,d 30.9(4),h 29.76i 25.79,d 24.6(4),h 26.36i
a Reptation
quantum Monte Carlo, Ref. 30. quantum Monte Carlo, Ref. 31. c Explicitly correlated Gaussian (ECG), Ref. 32. d Explicitly correlated coupled cluster (CCSD(T)-R12), Ref. 33. The value for Θ x x is calculated from Θ z z . e Explicitly correlated Gaussian (ECG), Ref. 34. f Experiment, Ref. 35. g Doubly substituted coupled cluster (CCD/BO), Ref. 36. The value for Θ x x is calculated from Θ z z . h Diffusion quantum Monte Carlo, Ref. 29. The value for Ω x x z is calculated from Ω z z z . i MC SCF, Ref. 37. The value for Ω x x z is calculated from Ω z z z . j SCF, Ref. 38. k SCF, Ref. 39. l Exact value. mSCF, Ref. 40. n Experiment, Refs. 35 and 41. b Diffusion
The dipole, quadrupole, and octupole moments are in excellent agreement with the literature, but there is a nominal deviation in the Hellmann-Feynman forces on the oxygen nucleus, which may reflect the fact that the experimental geometry was used rather than the geometry that minimizes the Hartree-Fock energy of trial wave function. The charges on the nuclei are in excellent agreement with both theoretical and experimental literature. Finally, all the components of the static polarizability are also in agreement with the literature. For several cases, our previous results obtained with RQMC underestimate both PSQMC and corresponding literature values for LiH and water. In general, when it comes to pure-sampling of electronic properties for these molecules, PSQMC is more accurate and in many cases more precise than RQMC. However, it would be incorrect to conclude that pure-sampling by PSQMC is inherently more accurate than by RQMC. This is still an open question. The pathlength parameter has been optimized for PSQMC, while algorithmic parameters may not have been the optimal choice for RQMC. There is a limit to how much one can increase L 0 beyond its critical value. Beyond that limit, no regression model can
024114-7
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015)
TABLE III. Ground-state energy and electronic properties of water molecule.
Property E0
µz Θx x Θy y Θz z Ωx x z Ωy yz Ωz z z ⟨1/r O ⟩e ⟨1/r H ⟩e ⟨r 2⟩e 3 ⟩ ⟨q x/r O 3 ⟩ ⟨q y/r O 3 ⟩ ⟨qz/r O 3 ⟩ ⟨q x/r H 3 ⟩ ⟨q y/r H 3 ⟩ ⟨qz/r H q O (x x) q O (y y) q O (z z) q H (y y) q H (aa) q H (bb) α(◦) αxx αyy αz z
This work (QZ4P)
RQMC (TZP)a
−76.426(3)
−76.342(2)
0.748(2) 1.906(2) −1.800(5) −0.109(4) 3.21(2) −1.32(2) −1.89(2) 23.462(6) 5.7902(8) 19.40(1) −0.024(8) −0.019(4) 0.08(2) 0.000(1) −0.0030(9) 0.0010(8) 1.44(5) −1.61(3) 0.20(5) −0.262(9) 0.482(6) −0.198(5) 1.16(2) 9.7(3) 8.8(3) 9.4(1)
0.758(4) 1.918(8) −1.878(7) −0.091(6) 3.62(1) −1.51(2) −2.13(2) 23.27(1) 5.763(2) 19.653(6) ... ... −0.01(1) ... ... ... 1.86(5) −1.8(1) 0.35(6) −0.30(2) 0.52(2) −0.22(2) 0.93(2) ... ... ...
Literature −76.424(03),b −76.4236(2),c −76.4368(4),d −76.437(3),e −76.4388f 0.7238,g 0.7268,h 0.7306i 1.912,g 1.96(2)j −1.804,g −1.86(2)j −0.108,g −0.10(3)j 3.25,g 3.206k −1.34,g −1.332k −1.91,g −1.874k 23.46k 5.773k 19.73,k 18(2)l 0.0 0.0 0.0 0.0 0.0 0.0 1.59,m 1.455(5)n −1.79,m −1.66(1)n 0.20,m 0.21(1)n −0.2932,m −0.2588(3)n 0.5175,m 0.4559(2)n −0.2243,m −0.1971(2)n 1.031,m 1.27n 9.944,o9.88,p9.74,q9.18r 9.052,o8.92,p9.15,q7.90r 9.398,o9.52,p9.36,q8.52r
FIG. 7. Dipole moment of LiH as a function of time-step and path-length parameter, L 0. Path-lengths increase as time-steps decrease; Eq. (21). Convergence occurs for L 0 = 201, . . . , 301. No regression model can adequately fit the data for L 0 = 351: — linear fit; - - - quadratic fit; · · · cubic fit.
Casulleras and Boronat have addressed a similar instability problem with a method related to a tagging algorithm, but where the configuration weights are generated automatically.15,16
a Reptation
quantum Monte Carlo, Ref. 42; Table 3. quantum Monte Carlo, Ref. 31. c Single-determinant diffusion Monte Carlo, Ref. 23. d Multi-determinant diffusion Monte Carlo, Ref. 23. e Explicitly correlated coupled cluster (CCSDTQ-R12), Ref. 43. f Experiment, Ref. 44. g SCF with electron correlation correction from CCSD(T), Ref. 45. h Experiment, Ref. 46. i Experiment, Ref. 47. j Experiment, Ref. 48. k CBS FCI, Ref. 49. The value for Ω x x z is calculated from Ω y y z and Ω z z z . l Experiment, Ref. 50. mCISD, Ref. 51. n Experiment, Ref. 52. o CCSD, Ref. 53. p DFT, Ref. 54. q MC SCF, Ref. 55. r SCF, Ref. 45. b Diffusion
be found to adequately fit the data. We attribute this instability to doing Metropolis branching with branching factors accumulated over a path that is too long. This is analogous to doing weight branching in diffusion Monte Carlo with too many accumulated branching factors.56 An example is L 0 equal to 351 for calculating the dipole moment of LiH. Here, L(τ = 0.002) = 5159 and the Metropolis decision is based upon accumulating 10 320 branching factors; see Eqs. (21) and (8). The resulting numerical instability is obvious upon inspection of Figure 7, where the data for L 0 equal to 351 cannot be adequately fit by weighted regression.
FIG. 8. Electronic properties of LiH calculated from RQMC with reptilelength parameter L 0 = 121. Reptile-lengths increase as time-steps decrease; Eq. (21). Here, the energy is being accurately sampled, but not the other properties at the chosen set of time-steps. E 0 = −8.0701(4); µ z = 2.234(7); Θ z z = −2.87(4); Ω z z z = 5.27(18). All quantities are expressed in atomic units.
024114-8
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015)
IV. EFFICIENCY AND METROPOLIS ACCEPTANCE RATES
We performed 100 independent runs on LiH for RQMC with algorithmic parameters reported in Ref. 30: path-length (total length of the reptile) at the largest time-step (L 0) equal to 121, chop-size at the largest time-step (M0) equal to 20, and 400 000 iterations per run. The reptile-length L at a given timestep is related to L 0 by Eq. (21). The energy is being accurately sampled in Figure 8, but not the other properties at the chosen set of time-steps. The reverse is true for the time-steps shown in Figure 9. Here, there is too much time-step bias for the energy to be accurately determined. Its value in the limit of zero timestep violates the energy variational upper bound. Another set of 100 independent runs on LiH were performed with PSQMC. The data are shown in Figure 10, where L 0 = 201 and 20 000 iterations. Here, L 0 is the length of paths X and Y , emanating from their middles to their ends, and the total length of path Z, each at the largest time-step. See Figure 1. Again, the path-length L at a given time-step is related to L 0 by Eq. (21). In this case, the energy and other properties are accurately sampled simultaneously at the chosen set of timesteps. (We alert the reader to the fact that property values quoted in Figure 10 are based on runs at a single choice of L 0, while those reported in Table II are a weighted average over multiple L 0’s.) In Figure 11, we display the CPU time per run for RQMC and PSQMC performed on Intel Xeon 2.2 GHz CPU at each time-step, for the cases where accurate sampling is observed.
FIG. 10. Electronic properties of LiH calculated from PSQMC with pathlength parameter L 0 = 201. The total length of the sampled path, either X or Y , at the largest time-step is 403. Path-lengths increase as time-steps decrease; Eq. (21). Here, the energy and other properties are accurately sampled simultaneously at the chosen set of time-steps. E 0 = −8.0709(4); µ z = 2.310(4); Θ z z = −3.13(2); Ω z z z = 5.62(8). All quantities are expressed in atomic units.
PSQMC is seen to be more computationally efficient than RQMC by nearly a factor of two. In order to achieve the same efficiency as PSQMC, one could reduce the number of RQMC iterations. However, this
FIG. 9. Electronic properties of LiH calculated from RQMC with reptilelength parameter L 0 = 121. Reptile-lengths increase as time-steps decrease; Eq. (21). Here, the energy value is nonsensical, it disobeys the variational bound, although the other properties are being accurately sampled at the chosen set of time-steps. E 0 = −8.082(3); µ z = 2.291(6); Θz z = −3.10(3); Ω z z z = 5.50(14). All quantities are expressed in atomic units.
FIG. 11. CPU time versus time-step for the RQMC and PSQMC runs shown in Figures 8–10. RQMC reptile-length and PSQMC path-lengths at the corresponding values of the time-steps are shown above the bars. τ A, . . . , τ F are 0.002, . . . , 0.012[0.002] for PSQMC (energy and other properties) and RQMC (energy). τ A, . . . , τ F are 0.02, . . . , 0.12[0.02] for RQMC (other properties).
024114-9
E. Ospadov and S. M. Rothstein
J. Chem. Phys. 142, 024114 (2015)
properties from both the mixed and pure electron distributions, while maintaining the simplicity of fixed-node diffusion Monte Carlo, which samples only from the mixed distribution. In common with RQMC, the heart of the new algorithm is Metropolis branching. This ensures that it is free of population control bias and provides stability when employed with large path-lengths. For the purpose of sampling both the mixed and pure distributions simultaneously, PSQMC is more efficient than RQMC, but one expects the Metropolis acceptance rates for PSQMC to be less than those for RQMC. For the small molecules considered here, the Metropolis acceptance rates are still sufficiently large to properly sample configuration space. Work is in progress in our laboratory to explore this issue in applications to large molecules. Our implementation of the new algorithm converges to a threshold value of the path-length parameter. Averaging results taken at the threshold value and beyond provides estimates of properties that are free from time-step and regression-model bias. We found excellent agreement with the accepted values for the energy and a variety of other properties for LiH and water molecules. ACKNOWLEDGMENTS
We thank SHARCNET for access to their high-performance computing facilities. We thank the referees for their helpful suggestions on how to improve our contribution. Egor Ospadov gratefully acknowledges financial support from Brock University. 1J.
Goodisman, J. Chem. Phys. 38, 304 (1963). B. Anderson, J. Chem. Phys. 63, 1499 (1975). 3P. J. Reynolds, D. M. Ceperley, B. J. Alder, and W. A. Lester, Jr., J. Chem. Phys. 77, 5593 (1982). 4A. Lüchow, WIREs Comput. Mol. Sci. 1, 388 (2011). 5J. Kolorenˇ c and L. Mitas, Rep. Prog. Phys. 74, 026502 (2011). 6B. M. Austin, D. Y. Zubarev, and W. A. Lester, Jr., Chem. Rev. 112, 263 (2012). 7Advances in Quantum Monte Carlo, edited by S. Tanaka, S. M. Rothstein, and W. A. Lester, Jr. (American Chemical Society, Washington, DC, 2012). 8M. F. DePasquale, S. M. Rothstein, and J. Vrbik, J. Chem. Phys. 89, 3629 (1988). 9S. M. Rothstein, Can. J. Chem. 91, 505 (2013). 10P. A. Whitlock, D. M. Ceperley, G. V. Chester, and M. H. Kalos, Phys. Rev. B 19, 5598 (1979). 11P. Reynolds, R. Barnett, B. Hammond, and W. A. Lester, Jr., J. Stat. Phys. 43, 1017 (1986). 12R. Barnett, P. Reynolds, and W. A. Lester, Jr., J. Comput. Phys. 96, 258 (1991). 13K. J. Runge, Phys. Rev. B 45, 7229 (1992). 14S. Zhang and M. Kalos, J. Stat. Phys. 70, 515 (1993). 15J. Casulleras and J. Boronat, Phys. Rev. B 52, 3654 (1995). 16A. Sarsa, J. Boronat, and J. Casulleras, J. Chem. Phys. 116, 5956 (2002). 17S. Baroni and S. Moroni, Phys. Rev. Lett. 82, 4745 (1999). 18S. Baroni and S. Moroni, in Quantum Monte Carlo Methods in Physics and Chemistry, Proceedings of the NATO Advanced Study Institute Vol. 525, edited by M. Nightingale and C. Umrigar, (Springer Science & Business Media, Ithaca, New York, 1999), pp. 313–341. 19W. K. Yuen and S. M. Rothstein, in Advances in the Theory of Quantum Systems in Chemistry and Physics, Progress in Theoretical Chemistry and Physics Vol. 22, edited by P. Hoggan, E. Brändas, J. Maruani, P. Piecuch, and G. Delgado-Barrio (Springer, 2012), pp. 327–342. 20F. Arias de Saavedra and M. H. Kalos, Phys. Rev. E 67, 026708 (2003). 21A. Sarsa, K. E. Schmidt, and W. R. Magro, J. Chem. Phys. 113, 1366 (2000). 22M. D. Towler, Phys. Status Solidi B 243, 2573 (2006). 23B. K. Clark, M. A. Morales, J. McMinis, J. Kim, and G. E. Scuseria, J. Chem. Phys. 135, 244105 (2011). 2J.
FIG. 12. Metropolis acceptance rates versus time-steps, with corresponding path-lengths for runs shown in Figure 11.
would increase the RQMC error bars that are already larger than those for PSQMC. The method used to construct the paths in PSQMC has the beneficial effect of reducing the serial correlation for puresampling, but at the cost of reduced acceptance rates relative to those of RQMC. A comparison is made for LiH in Figure 12. As expected, the acceptance rates decrease as the path-lengths increase, but the acceptance rate for PSQMC at the smallest time-step (largest path-length) is still large enough to sample configuration space effectively. The same holds true for water: PSQMC acceptance rates at the smallest time-step range from 58% to 48% for runs with L 0 = 201–301.
V. SUMMARY AND CONCLUSIONS
We described a pure-sampling algorithm (PSQMC) that shares some important features of RQMC, but not its use of reptation to generate paths. Both algorithms allow for sampling
024114-10 24G.
E. Ospadov and S. M. Rothstein
Herzberg and K. Huber, Molecular Spectra and Molecular Structure IV. Constants of Diatomic Molecules (Van Nostrand, New York, 1979). 25G. Herzberg, Molecular Spectra and Molecular Structure II. Infrared and Raman Spectra of Polyatomic Molecules (Van Nostrand, New York, 1945). 26G. te Velde, F. M. Bickelhaupt, E. J. Baerends, C. Fonseca Guerra, S. J. A. van Gisbergen, J. G. Snijders, and T. Ziegler, J. Comput. Chem. 22, 931 (2001). 27C. Fonseca Guerra, J. G. Snijders, G. te Velde, and E. J. Baerends, Theor. Chem. Acc. 99, 391 (1998). 28ADF2010, SCM, Theoretical Chemistry, Vrije Universiteit, Amsterdam, The Netherlands, 2010 http://www.scm.com. 29J. Vrbik, D. A. Legare, and S. M. Rothstein, J. Chem. Phys. 92, 1221 (1990). 30E. Ospadov, D. G. Oblinsky, and S. M. Rothstein, Phys. Chem. Chem. Phys. 13, 8031 (2011). 31N. Nemec, M. D. Towler, and R. J. Needs, J. Chem. Phys. 132, 034111 (2010). 32W. Cencek and J. Rychlewski, Chem. Phys. Lett. 320, 549 (2000). 33D. Tunega and J. Noga, Theor. Chem. Acc. 100, 78 (1998). 34M. Cafiero and L. Adamowicz, Phys. Rev. Lett. 88, 033002 (2002). 35L. Wharton, L. P. Gold, and W. Klemperer, J. Chem. Phys. 37, 2149 (1962). 36B. K. Lee, J. M. Stout, and C. E. Dykstra, J. Mol. Struct.: THEOCHEM 400, 57 (1997). 37D. M. Bishop and B. Lam, Chem. Phys. Lett. 120, 69 (1985). 38F. Keil and R. Ahlrichs, J. Chem. Phys. 71, 2671 (1979). 39G. Arrighini, J. Tomasi, and C. Guidotti, Theor. Chim. Acta 18, 329 (1970). 40S. P. Karna and F. Grein, Mol. Phys. 69, 661 (1990).
J. Chem. Phys. 142, 024114 (2015) 41L.
Wharton, L. P. Gold, and W. Klemperer, Phys. Rev. 133, B270 (1964). G. Oblinsky, W. K. Yuen, and S. M. Rothstein, J. Mol. Struct.: THEOCHEM 961, 29 (2010). 43T. Shiozaki, M. Kamiya, S. Hirata, and E. F. Valeev, J. Chem. Phys. 130, 054101 (2009). 44W. Klopper, Mol. Phys. 99, 481 (2001). 45G. Maroulis, Chem. Phys. Lett. 289, 403 (1998). 46S. A. Clough, Y. Beers, G. P. Klein, and L. S. Rothman, J. Chem. Phys. 59, 2254 (1973). 47S. L. Shostak, W. L. Ebenstein, and J. S. Muenter, J. Chem. Phys. 94, 5875 (1991). 48J. Verhoeven and A. Dymanus, J. Chem. Phys. 52, 3222 (1970). 49D. Feller, J. Chem. Phys. 98, 7059 (1993). 50D. Eisenberg, J. M. Pochan, and W. H. Flygare, J. Chem. Phys. 43, 4531 (1965). 51B. J. Rosenberg and I. Shavitt, J. Chem. Phys. 63, 2162 (1975). 52J. Verhoeven, A. Dymanus, and H. Bluyssen, J. Chem. Phys. 50, 3330 (1969). 53J. Kongsted, A. Osted, K. V. Mikkelsen, and O. Christiansen, J. Chem. Phys. 118, 1620 (2003). 54L. Jensen, P. T. van Duijnen, and J. G. Snijders, J. Chem. Phys. 119, 3800 (2003). 55T. D. Poulsen, P. R. Ogilby, and K. V. Mikkelsen, J. Chem. Phys. 116, 3730 (2002). 56A. L. L. East, S. M. Rothstein, and J. Vrbik, J. Chem. Phys. 89, 4880 (1988). 42D.
Journal of Chemical Physics is copyrighted by AIP Publishing LLC (AIP). Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. For more information, see http://publishing.aip.org/authors/rights-and-permissions.