THE JOURNAL OF CHEMICAL PHYSICS 141, 194104 (2014)

Excited states from quantum Monte Carlo in the basis of Slater determinants Alexander Humeniuk and Roland Mitri´ca) Institut für Physikalische und Theoretische Chemie, Julius-Maximilians Universität Würzburg, Emil-Fischer-Straße 42, 97074 Würzburg, Germany

(Received 15 July 2014; accepted 23 October 2014; published online 17 November 2014) Building on the full configuration interaction quantum Monte Carlo (FCIQMC) algorithm introduced recently by Booth et al. [J. Chem. Phys. 131, 054106 (2009)] to compute the ground state of correlated many-electron systems, an extension to the computation of excited states (exFCIQMC) is presented. The Hilbert space is divided into a large part consisting of pure Slater determinants and a much smaller orthogonal part (the size of which is controlled by a cut-off threshold), from which the lowest eigenstates can be removed efficiently. In this way, the quantum Monte Carlo algorithm is restricted to the orthogonal complement of the lower excited states and projects out the next highest excited state. Starting from the ground state, higher excited states can be found one after the other. The Schrödinger equation in imaginary time is solved by the same population dynamics as in the ground state algorithm with modified probabilities and matrix elements, for which working formulae are provided. As a proof of principle, the method is applied to lithium hydride in the 3-21G basis set and to the helium dimer in the aug-cc-pVDZ basis set. It is shown to give the correct electronic structure for all bond lengths. Much more testing will be required before the applicability of this method to electron correlation problems of interesting size can be assessed. © 2014 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4901020] I. INTRODUCTION

Quantum Monte Carlo (QMC) has been the numerical method of choice for studying new phases of matter in systems with strong electronic correlation1 or computing accurate correlation energies of the homogeneous electron gas that enter the parametrization of density functionals.2 The earliest application of QMC to the realm of quantum chemistry dates back to Anderson3 who obtained the ground state energy of H3+ . QMC using trial wave with fixed nodes4 can nowadays be applied to systems with a few thousand electrons on parallel supercomputers.5 QMC for electronic structure calculations has been revived recently by Booth and Alavi6, 8, 9 who solved the fermion sign problem by performing a random walk in the discrete space of Slater determinants, which already incorporate the correct antisymmetry under exchange of particles. The cost incurred is a dependence on the basis set. QMC, although very time-consuming, can be helpful for the validation of less expensive methods such as equation-of-motion coupledcluster10 or time-dependent density functional theory11 in situations where chemical accuracy is required. In conventional multireference methods such as CASSCF12 the judicious selection of orbitals to include in the active space is the most difficult part, which requires experience and some prior knowledge of the compound under investigation. QMC in the space of Slater determinants automatizes this selection process. Different ways have been devised to apply quantum Monte Carlo methods to excited states. Full configuration ina) Electronic mail: [email protected]

0021-9606/2014/141(19)/194104/13/$30.00

teraction quantum Monte Carlo (FCIQMC) can be adapted to the calculation of excited states by introducing the operator 2 ˆ e−τ (H −S) that will project out the eigenstate whose energy lies closest to the parameter S.13 Ohtsuka and Nagase14 compute excited states by removing the ground state, to which the solution tends to decay, in each iteration step by projecting onto the orthogonal complement. Tenno15 divides the configuration space into two parts with the Löwdin partitioning technique and determines the effective Hamiltonian by a FCIQMC simulation with walkers hopping between the two spaces. Recently, Coe and Paterson16 proposed an approach, where the active space is expanded stochastically. The method, that will be presented here, shares some aspects of all the mentioned approaches. The population dynamics of FCIQMC is left unchanged. Only by modifying the way probabilities and matrix elements are calculated in the original FCIQMC algorithm, the population dynamics is restricted to the orthogonal complement of lower lying states. The part of the basis of the Hilbert space that is required to represent the lowest eigenstates, is thus expanded stochastically. Exact diagonalization in this active space yields a basis of orthogonal vectors that replace part of the original basis of Slater determinants. The method is only efficient, as long as excited states can be described by a few hundred excited configurations, which is the case for many molecules. In Sec. II A, the ideas of the FCIQMC algorithm are described in view of the later modifications. Section II B explains how the generation probabilities and matrix elements have to be modified for calculating excited states. Sections II C and II D are concerned with extracting the excitation energies from the controlled population dynamics. Finally, in

141, 194104-1

© 2014 AIP Publishing LLC

194104-2

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

Secs. III A and III B the potential energy curves for the lowest few electronic states of lithium hydride and the helium dimer are calculated using the method just developed. For clarity of the presentation, some technical details about the random selection of basis vectors (see Appendix A), the computation of generation probabilities (B) and matrix elements (D) and some considerations for an efficient implementation (C) have been moved to the Appendixes. In the supplementary material35 the procedure is illustrated on minimal basis set H2 , where all steps can be understood explicitly.

This operation is allowed since shifting all diagonal elements by a constant EHF + S changes the solution only by a global factor |(τ ) = e−τ (K−S1) |(0) = e ˆ

=e

ˆ τ (EH F +S) −τ H |WN⊥

e

II. METHOD DESCRIPTION

In order to find the lowest eigenvalue of a Hamiltonian Hˆ ˆ the imaginary time propagator P = e−H τ can be applied to an initial state | projecting out, in the long time limit, the lowest eigenvalue and eigenstate (unless the initial state is orthogonal to the ground state). Written in the basis of eigenstates of the Hamiltonian Hˆ the time evolution of |(τ ) is described by a superposition of exponential decays with the eigenenergies as decay rates. The vector with the smallest eigenvalue decays slowest and will dominate the superposition at large times τ →∞ ˆ I |e−EI τ |I → 0|e−E0 τ |0. (1) e−H τ | = I

It can be shown that |(τ ) satisfies the differential equation

N

|(0)

|(0). (5)

CJ (τ )|J .

(6)

Since the basis vectors are time-independent, Eq. (3) turns into a differential equation for the coefficients CI (τ ), −

dCI (τ ) = I |(Kˆ − S1)|J CJ (τ ) dτ ⊥ J ∈WN

ˆ − S)CI (τ ) + = (I |K|I

(3)

In this equation, one can replace the Hamiltonian Hˆ |WN⊥ by an ˆ where the Hartree-Fock ground state energy has operator K, been subtracted from the diagonal elements (4)

ˆ CJ (τ ). I |K|J

J ∈WN⊥ ,I =J

(7) It should be noted that since the propagator in imaginary time, ˆ e−τ (K−S) , is not unitary, the wave function will not remain normalized during propagation. Therefore, the real coefficients CI can be substituted by large integers NI such that the relaC N tive populations of the basis states, C I = N I , are maintained J J giving rise to the following rate equation:

(2)

that can be integrated by stochastic means in the space spanned by Slater determinants {|Di }. While this approach has been successfully applied to calculate the ground electronic states we wish to extend it here to the calculation of excited states. The main idea of our approach is to generate systematically the excited states by restricting the Hamiltonian to the orthogonal complement of the previously found lowest N states. This allows to carry out population dynamics in an analogous way as in the ground state FCIQMC.6 However, since the population dynamics needs to be restricted to the orthogonal complement of the lowest N state, the basis states will be split into three different groups (termed hidden, composite and pure, as explained later) and the way matrix elements and generation probabilities are generated needs to be adapted and generalized. The differential equation for imaginary time propagation in such orthogonal complement of the lowest N eigenstates, WN⊥ , reads

Kˆ = Hˆ |WN⊥ − EH F 1.

|(0) ∝ e

−τ Hˆ |W ⊥

J ∈WN⊥

A. Imaginary time propagation

d − |(τ ) = Hˆ |WN⊥ |(τ ). dτ

N

The solution vector | can be expanded into orthonormal basis states |J of the orthogonal complement space WN⊥ according to |(τ ) =

d |(τ ) = −Hˆ |(τ ) dτ

−τ (Hˆ |W ⊥ −(EH F +S)1)

−

dNI = (KI I − S)NI + dτ

KI J NJ .

(8)

J ∈WN⊥ ,I =J

This differential equation will be solved by simulating the population dynamics. Each basis state I is occupied by a number of walkers, that can have a sign, such that the sum of the signs of all walkers on the same basis state equals NI . The rate of change of the signed number of walkers on determinant I is proportional to

r the signed number of walkers on the same state with rate constant (K − S)

II r and the signed number of walkers on all other states

with rate constant KIJ . For a finite time step τ , one obtains for the change of the walker population the relation −NJ = (KJ J − S)τ NJ +

KJ I τ NI .

(9)

I ∈WN⊥ ,I =J

To simulate the population dynamics exactly, one would have to iterate over all basis states I and calculate the amount of positive or negative population transferred to basis state J. Instead one can, for a given basis state I, select other states J at random and compute population transfer only to the randomly selected states. If one normalizes by the probability to select a

194104-3

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

given state, such stochastic population dynamics approaches the exact solution in the limit of large numbers of walkers. The population dynamics used to solve the differential equation consists of three steps, which are repeated for each time step t:

r Spawning step. For each walker w , on the basis veci tor Ii , another basis vector J is selected at random with computable probability pgen (J|I). Subsequently, a new walker on the state J is born with the probability pspawn (J |Ii ) =

|KI J |t i

pgen (J |Ii )

(10)

and the sign sJ = −sign(KI J ) × (sign of walker wi ). i

(11)

In the simulation, the newly spawned walkers are stored separately from the original walkers. r Diagonal death/cloning step. Only the original walkers participate in this step. For each walker on the basis state Ii the quantity pdeath/cloning (Ii ) = (KI I − S)τ i i

(12)

is computed. If pdeath/cloning is positive, the walker dies with probability pdeath/cloning . If it is negative a clone of the walker is born with probability |pdeath/cloning |. It can happen that the cloning probability exceeds 1. In this case, Int(pdeath/cloning ) (Int(x) denotes the integer part of x) walkers are cloned with certainty and one walker is cloned with the remaining probability frac(pdeath/cloning ) (frac(x) is the fractional part of x). The energy shift S can be used to control to total number of walkers; a large S initially cranks up the walker numbers to the desired level. If S is changed so as to maintain a fixed number of walkers, S will approach the eigenenergy. In fact, (Kˆ − S1)|(τ ) = 0,

(13)

once the signed walker populations NI (τ ) have reached a stationary distribution. r Annihilation step. In the last steps the original walkers, the spawned and the cloned walkers take part. Walkers on the same basis state with opposite signs cancel. As a result, all walkers on the same determinant will have the same sign. The spawning and the cloning steps can be performed concurrently for each walker. The annihilation step can be performed concurrently for each basis state, if the walkers are sorted by basis state. B. Excited states

In the following, we describe the algorithm for the construction of the orthogonal complement of the space WN = {|0, |1, . . . |N − 1} spanned by the N lowest eigenstates of the Hamiltonian Hˆ and define basis states used to carry out population dynamics. Each of the eigenstates from WN can be

represented by a linear combination of Slater determinants CiI |Di . (14) |I = i

Then the (N + 1)-th state can be found through the stochastic population dynamics by applying the restriction of the projector operator to the orthogonal complement of the subspace spanned by the lower N vectors, which is defined by WN⊥ = {|J : I |J = 0 for all |I ∈ WN }.

(15)

It should be noted that there is no unique way to construct the orthonormal complement. Since Slater determinants are automatically orthogonal to each other, only the subspace spanned by those determinants that are contained in the states 0, . . . , N − 1 needs to be orthonormalized. A basis of WN is the set of determinants contained in the eigenstates: B = {determinants |Di such that Di |I = 0 for some eigenstate|I ∈ WN }.

(16)

The size of this basis is denoted by |B| and should be much smaller than the size of the Hilbert space for the algorithm to work efficiently. The orthogonal complement to this basis is the set of determinants that are not present in any of the N eigenstates: B ⊥ = {determinants |Di such that Di |I = 0 for all eigenstates |I ∈ WN }.

(17)

Together B and B⊥ span the whole Hilbert space V = span{B ∪ B ⊥ }.7 Note that all vectors in B and B⊥ are mutually orthogonal simply because both contain only Slater determinants as basis vectors. However, the eigenstates {|0, |1, . . . |N − 1} cannot be removed easily from B to obtain the orthogonal complement WN⊥ . In order to remove them, one can mix the pure determinants in B so as to construct another orthonormal basis where N of the basis vectors coincide with the N found eigenstates. Therefore, the basis B is replaced by a new basis C that comes about by orthogonalization of the eigenstates with the rest of the basis B. The relation between the different bases is illustrated in Fig. 1. In our simulations, we employ two different strategies in order to orthogonalize the states. One way to obtain an orthonormal basis that contains the lowest N eigenstates would be to diagonalize the restriction of the Hamiltonian to the subspace spanned by B, |αα|Hˆ |ββ| Hˆ |B = α∈B β∈B (18) ˆ C = eigen-vectors of (H |B ). This makes the population dynamics more efficient as the individual basis vectors I ∈ C should be a good first approximation to the eigenstates. However, if the composite subspace is large, solving for all eigenvectors may pose a serious computational problem. Since for the basis vectors from C the coupling matrix elements satisfy the relation I |Hˆ |J ∝ δI,J , no walker on the Ith basis vector in the composite basis C can directly spawn to another basis vector J of the composite basis. Population transfer from one composite basis vector to

194104-4

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

amplitude Btrunc = {dominant determinants |Di such that |Di |I |2 ≥ 1 − for all |I ∈ WN } ⊂ B. × i

(22)

FIG. 1. Venn Diagrams representing the partitioning of the Hilbert space before and after orthogonalizing the basis. Non-overlapping sets are orthogonal to each other. (a) and (b) The basis of Slater determinants B∪B⊥ is not suitable for removing the lowest N eigenstates, since span{B} and WN overlap. (c) After dividing the vector space into the pure basis (B⊥ ), the composite basis (C \ WN ) and the hidden basis WN⊥ , the population dynamics can be restricted to the orthogonal complement of the previously found eigenvectors.

another can only happen indirectly through the coupling with Slater determinants from the pure basis B⊥ . As the eigenstates are already orthonormal, the orthogonalization procedure will not change them. If the first N basis vectors are removed from C, then together with B⊥ one obtains an orthonormal basis of the orthogonal complement space WN⊥ , WN⊥ = (C \ {|0, . . . , |N }) ∪ B ⊥ .

I ∈WN⊥

|I I |Hˆ |J J |.

(20)

J ∈WN⊥

Each of the sums can be split into a part involving only the pure Slater determinants in B⊥ and the part (that should be much smaller) of composite vectors that come from C\{|0, . . . , |N − 1}, I ∈WN⊥

=

I ∈C\{|0,...,|N}

+

.

1. “Purifying” the eigensolution

When the population dynamics has converged, the walkers will have distributed over the composite basis and some of them will have migrated to the pure basis. Therefore, the amplitudes of the (N + 1)-th eigenstate are defined with respect to the basis C ∪ B⊥ , aI |I + ai |Di . (23) |(τ → ∞) = I ∈C

(19)

The restriction of the Hamiltonian to this subspace reads Hˆ |WN⊥ =

To this end, the determinants are sorted in descending order by their probabilities. Then the determinants with the highest probabilities are added one after the other to Btrunc until the sum of the probabilities reaches 1 − for all states in WN . The truncated composite subspace Ctrunc , that is constructed in this way, is smaller than C. Therefore, the orthogo⊥ still contains some small contamination nal complement Btrunc from the lower states. The right choice of the threshold parameter will certainly be system-specific. If a sparse subspace of low-energy eigenstates exists, it can be set close to 0.0. In the absence of sparsity, it has to be increased until a compromise between execution time and accuracy is reached.

(21)

I ∈B ⊥

Another way to orthogonalize the N lowest states with respect to the rest of the basis B, which is suitable for large basis sets where the size of the composite basis can be extremely large, involves truncating the basis to the dominant configurations. This is justified since on many states the number of walkers fluctuates but would be zero on average. So, usually each state will only contain few dominant configurations and less important configurations that are at most occupied by a few walker. We can bring the size of the composite basis down to an acceptable number, if these unimportant configurations are truncated. Instead of including into the basis B all Slater determinants that are occupied in the lower N eigenstates, only the most dominant Slater determinants are included, which account for 1 − of the total probability

Di ∈B ⊥

It is preferable to express the solution in the basis of Slater determinants only, that is in the basis B∪B⊥ . To this end, one has to transform the part of the solution that lies in C into the basis B, |(τ → ∞) = aI Di |I |Di + ai |Di . Di ∈B

I ∈C

Di ∈B ⊥

(24) So the coefficients of | in the basis of Slater determinants are aI Dβ |I for β ∈ B β| = I ∈C (25) β| = aβ

for β ∈ B ⊥ .

2. Generating random excitations

In order to sample the vector space during the Monte Carlo simulation, one has to be able to generate excitations with computable probabilities. This means that, for a walker sitting on basis vector I another connected basis vector J has to be selected randomly such that I |Hˆ |WN⊥ |J is likely to be non-zero. The probability pgen (J|I) to select J must be easily computable. If the basis would only contain pure Slater determinants, one could simply select single and double excitations uniformly as the number of possibly connected determinants is known beforehand. For Nelec electrons in K orbitals, there are nocc = Nelec occupied orbitals and nvirt = K − Nelec virtual

194104-5

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

orbitals. Each determinant is connected to Nsingle = nocc nvirt

uniform probability

single excitations and Ndouble = · double excitations, amounting to a total of Nexc = Nsingle + Ndouble excitations. Probabilities for transitions to other determinants will be zero as the Hamiltonian contains at most two-particle operators. The set of different determinants that are connected to the determinant |I is denoted by Con(I). A determinant is not connected to itself, so I ∈ Con(I). The basis of the vector space from which the lowest N eigenvectors were removed is more complicated, so that determining whether two basis vectors are connected is not as easy as using Slater-Condon rules. In order to achieve this, the complete Hilbert can be divided into subspaces spanned by three bases:

1 δ(α is connected to β). Nexc (27) If β belongs to the composite subspace (β ∈ B), reject this choice and go back to step (a). Otherwise we have found J = β.

nocc (nocc −1) 2

nvirt (nvirt −1) 2

P2 (β|α) =

2.

On the other hand, if I is a pure Slater determinant (I ∈ B⊥ ): (a)

r the hidden basis consisting of the N eigenvectors {|0, . . . , |N}, that should never be populated during the simulation so that the imaginary time propagator projects out the next energetically higher state. r the composite basis C\{|0, . . . , |N}. If Hartree-Fock orbitals provide a reasonable single particle basis from which to build Slater determinants, then each basis vector in C will be a linear combination of Slater determinants that is dominated by few excitation from the Hartree-Fock determinant. r Finally, we have the pure basis of Slater determinants B⊥ . When generating transitions to other states there is some freedom in choosing the probability for generating each state. The only requirements are that all connected basis states can be generated with some probability greater than zero and that this probability is known. Two basis states |I and |J are connected if I |Hˆ |J = 0. Suppose a walker sits on a composite basis vector I ∈ C. The task is to randomly select another basis vector J with computable probability that is different from I and does not belong to the hidden basis WN . Since I is composite and the spawning probability is proportional to I |Hˆ |J = EJ δI,J for J ∈ C, only basis vectors from the pure basis J ∈ B⊥ need to be generated as those from the composite basis C would never be populated. On the other hand, a walker that sits on a basis vector of the pure basis I ∈ B⊥ can spawn both to the pure and the composite basis. The following algorithm selects a random basis vector J given that there is a walker on basis vector J and calculates the probability that this vector will be generated

r select basis vector 1.

Given the Slater determinant |I find a connected singly or doubly excited Slater determinant |β with uniform probability P2 (β|I ) =

(b)

1 δ(I is connected to β). Nexc (28)

If β belongs to the pure subspace (β ∈ B⊥ ), we are done and J = β. If, however β belongs to the composite subspace (β ∈ B), choose a basis vector J of the composite basis (J ∈ C) with probability |J|β|2 , P3 (J |β) = |J |β|2 .

(29)

If J belongs to the hidden basis J ∈ WN , reject the choice and start from step (a) again. Otherwise accept J. When an acceptable basis vector has been found, the expected probability to generate this vector has to be calculated. First consider what is the probability to select a basis vector J given I, if no basis vectors are rejected. Then the generation probability is found by multiplying the conditional probabilities and summing over all possible combinations that lead to J. Since only acceptable basis vectors will be generated the probability to select J given I has to be normalized by the probability to select any acceptable vector without rejection

pgen (J |I ) =

(J |I ) pgen . P (any acceptable vector|I )

(30)

If the dimension of the composite subspace (that is the number of determinants that are occupied in the lower N eigenstates) is much smaller than the full configuration interaction (CI) space (|B| |B⊥ |), the expected probabilities for generating basis vectors can be computed efficiently by summing over basis vectors from the composite basis only as is explained in detail in Appendix B.

If I belongs to the composite basis (I ∈ C): (a)

Select a Slater determinant |α ∈ B randomly with probability P1 (α|I ) = |α|I |2 .

(b)

(26)

Given |α find a connected singly or doubly excited Slater determinant |β of |α with

3. Matrix elements

The probability that a walker sitting on basis vector I will spawn a child on basis vector, J = I is proportional to the modulus squared of the transition matrix element |J |Hˆ |I |2 . If both I and J are composite basis vectors (so eigenstates of the restriction of H to B), the spawning probability will be zero since I |Hˆ |J = I |Hˆ |B |J = EJ δI,J = 0

for I = J.

(31)

194104-6

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

Therefore spawning probabilities are only needed in the following three situations: 1. A composite determinant I = D ∈B CiI |Di can only i spawn to a pure determinant J = |Dj ∈ B⊥ . Calculating the transition matrix element requires summing over all determinants in the basis B, Di |Hˆ |Dj Ci∗I for I ∈ C, J ∈ B ⊥ . I |Hˆ |J = Di ∈B

(32) 2. A pure basis vector I = |Di ∈ B⊥ can spawn both into the pure and the composite bases. If I spawns to a composite basis state |J = D ∈B CjJ |Dj , the matrix elej ment becomes Di |Hˆ |Dj CjJ for I ∈ B ⊥ , J ∈ C. I |Hˆ |J = Dj ∈B

(33) 3. If a pure I spawns into the pure basis, the matrix element is calculated with the usual Slater-Condon rules17 I |Hˆ |J = Di |Hˆ |Dj

⊥

⊥

for I ∈ B , J ∈ B . (34)

The Slater-Condon rules with the correct phases are summarized in Appendix D. The additional complexity for calculating matrix elements (as compared to ground state FCIQMC) should scale approximately linearly with the number of Slater determinants that are occupied in the lower N eigenstates, which equals the size of the basis B.

C. Energy estimates

Initially, only one basis state I0 is populated. I0 is chosen as the state with the lowest energy expectation value I0 |Hˆ |I0 from the set of composite basis vectors and those single and double excitations from the Hartree-Fock ground state that do not form part of the composite subspace. Instead of working with the Hamiltonian, one tries to find the eigenvalues of Kˆ = Hˆ − EH F 1,

(35)

where the large Hartree-Fock energy is subtracted from the diagonal matrix elements. The projected energy is defined as ˆ the projection of K|(τ ) onto the reference state I0 , ˆ 0 + Eproj (τ ) = I0 |K|I

J =I0

ˆ 0 NJ (τ ) , J |K|I NI (τ )

D. Population control

The number of walkers Nw is controlled by the energy shift S that is subtracted from the diagonal matrix elements of the Hamiltonian. By manipulating S in time we want to drive the population as quickly and without oscillations to the desired target population Nwmax . This problem is well-known in engineering and can be addressed with an proportionalintegral-derivative controller (“PID controller”). As the process value (PV) which we wish to control we chose the logarithm of the walker population P V (τ ) = log(Nw (τ )).

(37)

The relative error is the deviation from the desired setpoint value (SP) divided by the desired value log(Nw (τ )) − log Nwmax e(τ ) = . (38) log Nwmax The PID controller bases the changes made to S on the current error (proportional part), the previous errors (integral part) and the predicted future errors (derivative part), S(τ ) − S(τ − τ ) 1 = Kp e(τ ) + Ki T

τ

τ −T

e(τ )dτ + Kd

d e(τ ), (39) dτ

where the coefficients Kp , Ki , and Kd and the memory of the controller T have to be fine-tuned manually. This is a generalization of the control mechanism used by Booth,6 which only contains the derivative part ζ (log(Nw (τ ))−log(Nw (τ − τ ))). τ (40) That is, the change to S is proportional to the derivative of the error e(τ ) − e(τ − τ ) S(τ ) − S(τ − τ ) = −ζ log(Nwmax ) τ de(τ ) . (41) ≈ −ζ log(Nwmax ) dτ The projected energy and the shift energy give independent estimators of the correlation energy. At convergence, they both should fluctuate randomly around the correct correlation energy. One could average the projected energy and the shift energy over many iterations and estimate confidence intervals, which should overlap, from the fluctuations. Instead, in the examples below, simply the value of the energy at the last iteration step was taken. S(τ ) − S(τ − τ ) = −

1. Switching of reference state

(36)

0

where NI (τ ) is the signed sum of walkers on basis state I. To be able to calculate the projected energy efficiently, a hash with matrix elements between J and the reference state I0 is maintained for all unique occupied states J. As the number of unique occupied determinants increases, the hash is expanded.

Since the projected energy depends on the ratio of the number of walkers to the number of walkers on connected basis states, the state with most walkers should be chosen so N that the relative fluctuations in N0 are as small as possible. The I initially chosen reference state might be depopulated during the dynamics, while another state becomes dominant. If the population on another state I exceeds (1.0 + x)N0 , I becomes the new reference state. A value of x = 0.2518 gives good

194104-7

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

results. This switching has no influence on the dynamics, but care needs to be taken when averaging the projected energy. 2. Initiators

The walker distribution starts to converge as soon as the sign structure of the solution is determined. If there are too few walkers, the sign structure might never be established because of noise. To stabilize a fledgling sign structure Cleland et al.19 introduced an additional rule, where newly spawned walkers only survive if their parents sit on determinants, whose sign has been well established (called initiator determinants). In the implementation of Ref. 19 a list of determinants with the number of walkers is stored, while in this implementation a list of walkers is stored. So, being an initiator is not a property of the basis state but of the walker, that sits on it. Therefore, the rules need to be formulated a little bit differently:

r Walkers sitting on states with more than n walkers a become initiators. The threshold na should be as low as possible. r Walkers that were not born in the current iteration are dubbed veterans. r Newly spawned walkers are called greenhorns. Greenhorns can be initiated or uninitiated depending on whether they were spawned by a walker with initiation power or not. During the annihilation step, the veteran and greenhorn walkers on a determinant are counted. Uninitiated walkers only survive in the presence of veterans or if they are sign coherent (the signed sum of the uninitiated walkers is larger in absolute value than 1). Initiated walkers always survive. III. RESULTS

FIG. 2. Growth of walkers and energy estimates (4th excited state (singlet), bond length = 2.47 Å). Initially, a large energy shift of S = 3.0 Hartree increases the walker number rapidly. After the total number of walkers exceeds the target population of 200 000, the energy shift is adjusted by the PID controller (Kp = 0, Ki = 0, Kd = −0.05) so as to hold the number of walkers constant. At convergence, the number of walkers on the reference state does not change anymore, and the energy shift oscillates around the correct excitation energy.

While a full CI calculation on such a small basis set requires no more than 10 min on a single processor, the Monte Carlo simulation runs for 1 day under the same conditions. However, whereas for full CI the computation time scales with the size of the Hilbert space, for exFCIQMC it scales with the number of Slater determinants contained in the lower lying eigenstates (size of the active space needed to correctly describe the lowest N eigenstates). This suggests that exFCIQMC might be suitable for gigantic basis sets, as long as the representation of the eigenstates in the basis of Slater determinants is sparse (see Fig. 4). Still, its usefulness is probably limited to single point calculations on small molecules or model Hamiltonians.

A. Example 1: Lithium hydride with 3-21G basis set

As a simple application of the extension of FCIQMC to excited states, the lowest few excited states of lithium hydride20 in the 3-21G basis set21, 22 will be calculated. A restricted Hartree-Fock calculation (performed with PyQuante23 ) yields 22 spin orbitals. The Hilbert space consisting of 4 electrons in 22 orbitals has a dimension of = 7315, that is small enough so that all eigendfull CI = 22 4 vectors of the Hamiltonian can be easily calculated by exact diagonalization (full CI) using LAPACK.24 The lowest few excited states are obtained by quantum Monte Carlo for 30 bond lengths between 0.8 Å and 5.0 Å , running the algorithm with a constant walker number of 200 000 for 40 000 iterations with a time step of τ = 0.01. Fig. 2 depicts how the total number of walkers and the energy estimates typically evolve during a simulation run. The resulting potential energy curves are shown in Fig. 3. The full CI values refer to the exact diagonalization of the Hamiltonian matrix.25–27 The error of the quantum Monte Carlo results as compared with the full CI energies averaged over all geometries and over the lowest 5 states (1 1+ , 3x 3 1+ , and 1 2+ ) amounts to 0.0016 Hartree.

FIG. 3. Potential energy curves for the lowest few excited states of lithium hydride with the 3-21G basis set. Solid lines correspond to a full CI calculation, dots come from a quantum Monte Carlo simulation with 200 000 walkers for 40 000 iterations (cut-off threshold = 0.01). The exFCIQMC algorithm correctly finds the lowest three singlet and triplet states.

194104-8

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

FIG. 4. For each bond length the relative size of the active space needed to describe the states 0–N is depicted. The relative size of the active space is defined as the number of Slater determinants occupied in the lowest N states divided by the dimension of the full CI space (here 7315). Of course, if we would like to calculated all eigenstates (N = dfull CI ), the relative size of the active space has to be 1. But if one is only interested in the lowest eigenstates, one can hope that the active space is a tiny fraction of the full CI space and only grows moderately with the number of the lowest excited states. Here, a cut-off threshold of = 0.01 was used to truncate the composite subspace. One sees that only 1% of the full CI space is needed to describe the lowest 10 excited states.

As an example, we analyze the walker dynamics for the bond length of rH-Li = 1.63 Å, that is close to the ground state minimum. Fig. 5 shows how the projected energies converge to the excitation energies of 4 excited states (correlation energy for ground state). Two observation are noteworthy:

r The projected energy of states 1 and 2 does not change. r States 3 and 4 are actually lower in energy than state 2. To understand the first observation, we analyze the composition of the states in the basis of Slater determinants given in Table I. States 1 and 2 predominantly contain the determinants |27 and |39. The active space needed to describe the ground state already contains the Slater determinants |27 and |39, whose amplitudes in the ground state are 0.046 and −0.046, respectively. Therefore, the exact diagonalization in this active subspace produces two other correct eigenstates, 1 and 2. In other words, the Monte Carlo simulation does not need to expand the active space. Since one of the basis vectors of the composite basis coincides with an eigenstate, only this basis vector will be populated. This is the reason why the projected energies of states 1 and 2 do not change. We would like to stress, however, that this is an artifact of the tiny ba-

FIG. 5. Projected energies for the ground state and 4 excited states (bond length = 1.63 Å ) from a run with 50 000 walkers for 10 000 iterations (τ = 0.01) and no cut-off, = 0.0. The numbering of the states refers to the order in which they were found in subsequent iterations.

sis set we chose for LiH. For larger basis sets, as shown in the He2 example below, it is very unlikely that the composite subspace will already contain the correct eigenvectors. To understand the second observation, note that initially one walker is placed onto a basis state that is expected to be contained in the next lowest excited states. In the 2nd iteration, the choice of the initial reference basis vector I0 was not correct: the population dynamics selects an eigenstate, but this is not the next lowest. Instead of the 2nd excited state the 4th excited state is found. In order to converge to the 2nd excited state, one would have had to start with a reference state with an initially higher energy (the initial reference basis vector for state 3). exFCIQMC will always converge to an eigenstate, but the order of the states can be switched depending on the choice of the initial occupation. Since we have not yet implemented spin symmetry, for a triplet state all three components are calculated separately. In particular, if only a limited number of states is computed, it can happen that not all 3 components are found. This can be seen in Fig. 3: Below 3 Å all 3 components of the triplet 3 were found, while at larger bond lengths the algorithm “chose” to converge to only two components of the triplet and to one component of the triplet 3 2+ . Had we computed more states, the missing triplet state would have been found later. Finally, we investigated the dependence of the error in the energies on the truncation threshold . As can be seen in Fig. 6, the error tends to decrease as goes to 0. At the same time the number of determinants that are included in the

TABLE I. Lowest 5 electronic states of LiH at a bond length of rH-Li = 1.63 Å. State

Dominant configurations

Total energy/Hartree

Type of exc.

0 1 2 3 4

0.986 |15 0.639 |27 + 0.639 |39 0.649 | 27 − 0.649 |39 0.910 |43 0.912 |23

−7.948 −7.848 −7.833 −7.846 −7.846

HF ground state HOMO-1 → LUMO + HOMO → LUMO+1 HOMO-1 → LUMO − HOMO → LUMO+1 HOMO-1 → LUMO+1 HOMO → LUMO

194104-9

A. Humeniuk and R. Mitric

FIG. 6. Dependence of error on threshold (for LiH example). On the left axis the error in the energy averaged over the lowest 3 singlet states 1 1+ , 1 + , and 1 at a bond length of r = 2.5 Å is plotted against 1 − for values 2 of the truncation threshold = 0.2, 0.15, 0.10, 0.05, 0.01, and 0.001. On the right axis the number of determinants are shown which are retained in the truncated composite subspace. For all points the computation was performed with 200 000 walkers propagated for 40 000 time steps (t = 0.01).

truncated composite subspace increases. An increase in the size of the composite subspace (and with it in the computation time), though, is not necessarily rewarded by a large reduction in the error. Since the population dynamics is realized by a stochastic process, the part of the error that is due to noise depends on the number of walkers, the time step and the time of propagation in much the same way as in the ground state FCIQMC.

B. Example 2: Helium dimer with aug-cc-pVDZ basis set

The lowest excited states of the helium dimer were determined for 60 equidistant bond lengths between 0.5 Å and

J. Chem. Phys. 141, 194104 (2014)

FIG. 8. Relative size of the active space for the helium dimer. Less than 0.4% of the Hilbert space (≈ 250 determinants) is required to describe the lowest 10 electronic states in the helium dimer with the aug-cc-pVDZ basis set.

7.5 Å with the aug-cc-pVDZ28 basis set. The four electrons can be distributed among 36 spin orbitals resulting in a Hilbert = 58905. A matrix of this space of dimension dfull CI = 36 4 size is already out of reach of a diagonalization method that needs to store all matrix elements in memory. The energy curves from the quantum Monte Carlo calculation are compared in Fig. 7 with full CI calculations29 which were performed using GAMESS-UK.27 The exFCIQMC algorithm correctly finds the lowest singlet and triplet states, but not necessarily in the right order. The potential energy curves of the lowest 10 states deviate from the full CI values on average by 0.0045 Hartree. Plotting the size of the active space for the lowest electronic states in Fig. 8 shows that the Hamiltonian is indeed sparse in the basis of Slater determinants. The efficiency of the algorithm rests on the condition that eigenstates that make up WN are sparse in the basis of Slater determinants, so that the overhead for the modified population

FIG. 7. Potential energy curves for the lowest few excited states of the helium dimer with the aug-cc-pVDZ basis set. Solid and dashed lines correspond to singlet and triplet states from a full CI calculation,27 dots come from a quantum Monte Carlo simulation with 200 000 walkers running for 30 000 iterations (time step τ = 0.01, cut-off threshold = 0.01). The color code is supposed to indicate that excited states are not necessarily found in the correct order.

194104-10

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

dynamics in WN⊥ does not become excessive. In a study on first row diatomics using FCIQMC, Cleland et al.30 showed that for larger diatomics the ratio of required walkers NW to N the size NFCI of the Hilbert space, f = N W , indeed decreases F CI quickly as the number of correlated electrons and the size of the basis set increases. While for He2 , as calculated here with the aug-cc-pVDZ basis set, the ratio is fHe = 2 × 10−3 , it 2 goes down to fN ≈ 10−7 for the nitrogen dimer, fO ≈ 10−8 2 2 for O2 and fF ≈ 10−12 for F2 (all using the VQZ basis set). 2 The same authors also investigated how the walkers are distributed among the occupied determinants. For most diatomics they found the ground state to be dominated by few determinants, while the populations on lesser important determinants decrease almost exponentially (see Fig. 6 in Ref. 30). For the excited state method exFCIQMC this implies that truncating determinants with very small coefficients should have little impact on the accuracy. Unfortunately, in special cases such as Be2 and N2 many determinants with small weights still have to be included for an accurate description of the electronic structure.

a short term goal, it would desirable to extend the formalism to work with spin- and symmetry-adapted configuration state functions instead of Slater determinants and to reduce the overhead from parallelization in the code so that larger molecules can be treated. APPENDIX A: SELECTING BASIS VECTORS WITH GIVEN PROBABILITY 1. Note on 1b)

In order to select a Slater determinant |α with probability P1 (α|I) = |α|I|2 , one subdivides the interval [0,1] into blocks of length |α|I|2 . A random number r is drawn for [0,1]. If r lies within the kth interval (i.e., the interval 2 2 i < k |α i |I| < r < = i ≤ k |α i |I| ), the Slater determinant α k is chosen. To make the selection of the interval more efficient, one precomputes the cumulative sums of the probabilities k (I ) = |αi |I |2 , (A1) i≤k

which equals the length of all segments up to and including the kth segment. Since k (I) is sorted in ascending order with respect to the index k, finding the index k such that

IV. CONCLUSION AND OUTLOOK

An algorithm has been developed to calculate excited electronic states using quantum Monte Carlo in the basis of Slater determinants. The accuracy of the results is controlled by a cut-off parameter that determines the size of the composite subspace (which approximates the orthogonal complement of the previously found electronic states), the number of walkers and the duration of the simulation in imaginary time. As a proof of principle the algorithm was applied to lithium-hydride and the helium dimer, and its correctness was verified by comparing with the full CI results. While the algorithm scales linearly in the number of walkers, the minimum number of walkers required to obtain a stationary distribution still scales exponentially with the size of the electron system and linearly with the size of the composite subspace. Therefore, it remains to be shown that problems which cannot be solved with existing comparable methods such as full CI31 or DMRG32 are better approached with this method. As

k−1 (I ) < r ≤ k (I )

(A2)

can be accomplished in log2 (|B|) time by binary search. 2. Note on 2b)

In a similar way, one selects a composite basis vector J given a Slater determinant β with probability P3 (J|β) = |J|β|2 by defining the cumulative probabilities k (β) = |Ji |β|2 . (A3) i≤k

APPENDIX B: PREDICTING GENERATION PROBABILITIES

If we would not reject any of the generated basis vectors, the probabilities for selecting J given I would be

r if I is a composite basis vector (I ∈ C) without rejection

pgen

(J |I ) =

P2 (J |α )P1 (α |I ) =

α

1 Nexc

|α |I |2

for I ∈ C

(B1)

α ∈B∩Con(J )

and

r if I is a pure basis vector (I ∈ B) without rejection

pgen

(J |I ) =

P3 (J |β )P2 (β |I ) =

β

1 Nexc

⎧ 1 ⎨ Nexc δ(I is connected to J ) = ⎩ 1 2 β ∈B∩Con(I ) |J |β | N exc

|J |β |2

β ∈Con(I )

if J ∈ B ⊥ if J ∈ C

for I ∈ B ⊥ .

(B2)

194104-11

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

As mentioned in the main text, one has to normalize this probability by the probability to select any acceptable basis vector given I.

r I is a composite basis vector (I ∈ C) P (any acceptable vector|I ) =

without rejection

pgen

(K |I ) =

K ∈WN⊥ \{|I }

= 1−

without rejection

pgen

(β |I )

β ∈B ⊥ without rejection

pgen

β ∈B

= 1−

(β |I ) = 1 −

⎛

α ∈B β ∈B

⎞

1 δ(α connected to β )|α |I |2 Nexc

1 ⎝ 1⎠ |α |I |2 . Nexc α ∈B β ∈B∩Con(α )

(B3)

The term in round brackets is just the number of determinants in the basis B that are connected to α , NB (α ) = 1 for α ∈ B .

(B4)

β ∈B∩Con(α )

The normalization for composite I becomes thus P (any acceptable vector|I ) = 1 −

r I is a pure Slater determinant (I ∈ B⊥ ).

P (any acceptable vector|I ) =

1 N (α )|α |I |2 Nexc α ∈B B

|K |β |2

K ∈WN⊥ \{|I } β ∈B∪B ⊥

1 = Nexc

= 1−

= 1−

⎛

1 Nexc 1 Nexc

⎞ |K |β |2 ⎠

K ∈WN ∪{|I }

β ∈Con(I )

⎛

β ∈Con(I )

⎞

⎜ |K |β |2 + ⎝ K ∈WN

β ∈B∩Con(I )

K ∈WN

(B5)

1 δ(β is connected to I ) Nexc

⎝1 −

for I ∈ C .

|I |β

⎟ ⎠

=0,since I ∈ / Con(I )

(|K |β |2 ).

The length of the projection of |β into the hidden subspace WN , xβ = |K |β |2 for β ∈ B

(B6)

(B7)

K ∈WN

can be precalculated for every β ∈ B. Therefore, P (any acceptable vector|I ) = 1 −

APPENDIX C: RUNTIME CONSIDERATIONS

In this stochastic algorithm relatively simple operations are repeated for very many iterations, giving only meaningful results in the limit of large numbers. Therefore, it is essential that the individual operations be coded as efficiently as possible. The large demand for computation time and memory necessitates parallelization on a shared memory cluster. Some speed-up can be gained by neglecting electron correla-

1 Nexc

xβ

for I ∈ B ⊥ .

(B8)

β ∈B∩Con(I )

tion in the core orbitals, as most basis sets do not provide the flexibility to describe correlation in the core, anyway.

1. Parallelization

The algorithm lends itself to full parallelization over the walker list in the same way as the ground state algorithm. During the spawning and cloning step each walker is independent, so that different processes can work concurrently on disjunct

194104-12

A. Humeniuk and R. Mitric

J. Chem. Phys. 141, 194104 (2014)

portions of the walker list. The annihilation step can also be broken up into independent parts if the walker list (sorted by determinant) is only split at positions such that walkers on the same state are processed on the same processor. In this implementation, the parallel_reduce function from Intel’s Threading Building Blocks was used to parallelize the spawning and cloning step. For the annihilation step a special range object,33 that allows splits only where the basis state of the walker changes, in combination with parallel_reduce was used. To ensure that the random numbers that are produced on different processors have no unexpected correlations, the program was linked with Tina’s random number generator34 library, which was designed with parallel Monte Carlo simulations in mind. So far the scalability seems to be limited by the memory bandwidth, as all processors need to access the matrix elements which are stored on shared memory.

The algorithm spends most of the time generating the random excitations, which is done by manipulating the bit strings encoding the occupied and unoccupied orbitals of the Slater determinants. If the bit string fits into a 64 bit register, full advantage can be taken of elementary bit operations built into the hardware. Therefore in this prototype implementation, the number of orbitals should not exceed 64 (otherwise the program runs very slowly). Molecules which need larger basis sets can still be squeezed into this limit if the usual frozen core approximation is invoked: Deep-lying core orbitals are replaced by a mean field which affects the valence orbitals and very high-lying orbitals are deleted. The input to the quantum Monte Carlo simulation are the matrix elements of the single particle part, h1a,b = a|hˆ1 |b (one-electron integrals in physicists’ notation), and the interaction part, h2abcd = ab|cd (two-electron integrals in physicists’ notation), of the molecular Hamiltonian in second quantized form 1 2 † † † h1a,b aa ab + habcd aa ab ad ac . (C1) Hˆ = 2 a,b a,b,c,d The indices a, b, c, and d enumerate the K molecular HartreeFock spin orbitals. Indices of atomic orbitals (AO) are denoted by μ, ν, ρ, and σ . One defines the core density matrix of the nc core electrons =

nc

h1a=1,...,K; b=1,...,K − → h1a=n ,...,K−n c

v

; b=nc ,...,K−nv

ha=1,...,K; b=1,...,K; c=1,...,K; d=1,...,K − → h2a=n ,...,K−n c

v

; b=nc ,...,K−nv ; c=nc ,...,K−nv ; d=nc ,...,K−nv .

(C4) The resulting Hamiltonian belongs to a system of Nelec − nc electrons in K − nc − nv orbitals in the presence of a mean field generated by the core electrons. In the examples shown above, the frozen core approximation was not used. APPENDIX D: MATRIX ELEMENTS BETWEEN SLATER DETERMINANTS

2. Frozen core approximation

c Pρ,σ

to the MO basis. The lowest nc and the highest nv orbitals are deleted

Cρ∗i Cσi ,

(C2)

i=1

where Cσi are the molecular orbital (MO) coefficients of the ith Hartree-Fock orbital. The mean field is added to the single particle Hamiltonian K K 1 1 1 c μρ|νσ − νμ|ρσ Pρ,σ → hμ,ν + . hμ,ν − 2 ρ=1 σ =1 (C3) Then the single particle Hamiltonian and the two-body part of the Hamiltonian h2μ,ν,ρ,σ are transformed from the AO h1μ,ν

Matrix elements of the electronic Hamiltonian between Slater determinants can be evaluated with the help of SlaterCondon rules,17 however, additional phases crop up because of the format in which Slater determinants are encoded. Each Slater determinant |Di is encoded in a bitstring i1 i2 . . . iK (in = 0, 1), where the in -th bit is set to 1 if the nth molecular orbital is occupied and to 0, otherwise. Internally, the Slater n−1 . The bitdeterminant is stored as an integer, i = K n=1 in 2 string can be translated into an ordered list of creation operators acting on the vacuum, †

†

†

→ (i1 +i1 a1 )(i2 +i2 a2 ) · · · (iK +iK aK )|, |Di = |i1 i2 . . . iK − (D1) where in negates the in -th bit. For instance, |1101 stands for † † † a1 a2 a4 |. If the rth creation operator is moved to the very front † † † † by using the fermion commutation relation ak ar = −ar ak , the † number of creation operators in front of ar (which equals the number of set bits with indices smaller than r) has to be counted to keep track of the correct sign. The matrix element Di |Hˆ |Dj is non-zero only for those combinations of determinants which differ by one or two occupied determinants.

r Single excitation The Slater determinants |Di and |Dj differ by one occupied orbital. In |Di orbital s is occupied and orbital r unoccupied, in |Dj r is occupied instead of s, s

r

↓

↓

|Dj = |c1 c2 . . . 0 . . . 1 . . . cK †

†

†

= (c1 + c1 a1 ) · · · ar · · · (cK + cK aK ) = (−1) = (−1)

r−1

n=1 cn

r−1

n=1 cn

†

†

†

↓

†

ar (c1 + c1 a1 ) · · · 1 · · · (cK +cK aK ) s

r

↓

ar |c1 c2 . . . 0 . . . 0 . . . cK .

On the other hand s

↓

r

↓

|Di = |c1 c2 . . . 1 . . . 0 . . . cK

(D2)

194104-13

A. Humeniuk and R. Mitric

and as |Di = (−1) = (−1) = (−1)

J. Chem. Phys. 141, 194104 (2014) 1 S.

s−1

n=1 cn

s−1

n=1 cn

s−1

n=1 cn

† as |c1 c2 †

. . . 0 . . . 0 . . . cK †

as (1−as as )|c1 c2 . . . 0 . . . 0 . . . cK †

as |c1 c2 . . . 0 . . . 0 . . . cK .

(D3)

Therefore, |Dj = (−1)

r−1

s−1

n=1 cn +

n=1 cn

†

ar as |Di .

(D4)

Now, that the correct relative phase has been found the usual Slater-Condon rules can be evoked to reduce the matrix element to a sum over one- and two-electron integrals Di |Hˆ |Dj s−1

r−1

= (−1) n=1 cn + n=1 cn ˆ 1 × r|h |s + ck (rk|sk − rk|ks) . k

(D5)

r Double excitation The Slater determinants differ by two occupied orbitals. In |Di the orbitals p and q (with p < q) are occupied, in |Dj the orbitals r and s (with r < s) are occupied, instead. |Dj = (−1)

s−1

r−1

n=1 cn +

p−1

n=1 cn +

n=1

q−1 cn + n=1 cn

† †

× ar as aq ap |Di .

(D6)

Therefore, Di |Hˆ |Dj = (−1)

s−1

r−1

n=1 cn +

p−1

n=1 cn +

n=1

× (rs|pq − rs|qp).

r No excitation

q−1 cn + n=1 cn

(D7)

The diagonal elements of the Hamiltonian are calculated as usual K ip p|hˆ1 |p Di |Hˆ |Di = p=1

1 i i (pq|pq − pq|qp). 2 p=1 q=1 p q K

+

K

(D8)

Chang, V. Pandharipande, J. Carlson, and K. Schmidt, Phys. Rev. A 70, 043602 (2004). 2 D. Ceperley and B. Alder, Phys. Rev. Lett. 45, 566 (1980). 3 J. Anderson, J. Chem. Phys. 63, 1499 (1975). 4 W. Foulker, L. Mitas, R. Needs, and G. Rajagopal, Rev. Mod. Phys. 73, 33 (2001). 5 M. Gillan, M. Towler, and D. Alfe, “Petascale computing opens new vistas for quantum Monte Carlo,” Psi-k Highlight of the Month, February 2011. 6 G. Booth, A. Thom, and A. Alavi, J. Chem. Phys. 131, 054106 (2009). 7 We make use of the following notations from set theory and linear algebra: A ∪ B: elements in A or B, A\B: elements of A that are not in B, span(B): all linear combinations of elements in B. 8 G. Booth, A. Grüneis, G. Kresse, and A. Alavi, Nature (London) 493, 365 (2013). 9 G. Booth, S. Smart, and A. Alavi, “Linear-scaling and parallelizable algorithms for stochastic quantum chemistry,” e-print arXiv:1305.6981. 10 A. Krylov, Annu. Rev. Phys. Chem. 59, 433 (2008). 11 M. Petersilka, U. Gossmann, and E. Gross, Phys. Rev. Lett. 76, 1212 (1996). 12 B. Roos and P. Taylor, Chem. Phys. 48, 157–173 (1980). 13 G. Booth and G. Chan, J. Chem. Phys. 137, 191102 (2012). 14 Y. Ohtsuka and Sh. Nagase, Chem. Phys. Lett. 463, 431 (2008). 15 S. Tenno, “Stochastic determination of effective Hamiltonian for the full configuration interaction solution of quasi-degenerate electronic states,” arXiv:1302.3924. 16 J. P. Coe and M. J. Paterson, J. Chem. Phys. 139, 154103 (2013). 17 A. Szabo and N. Ostlund, Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory (Dover, Mineola, New York, 1996). 18 The default value used in the FCIQMC implementation in MOLPRO. 19 D. Cleland, G. Booth, and A. Alavi, J. Chem. Phys. 132, 041103 (2010). 20 C. Bender and E. Davidson, J. Chem. Phys. 49, 4222 (1968). 21 J. Binkley, J. Pople, and W. Hehre, J. Am. Chem. Soc. 102, 939 (1980). 22 K. Schuchardt, B. Didier, T. Elsethagen, L. Sun, V. Gurumoorthi, J. Chase, J. Li, and T. Windus, J. Chem. Inf. Model. 47, 1045–1052 (2007). 23 Open-source quantum chemistry code developed by P. Muller, http://pyquante.sourceforge.net/. 24 E. Anderson, Z. Bai et al., LAPACK Users’ Guide, 1999. 25 P. Knowles and N. Handy, Chem. Phys. Lett. 111, 315 (1984). 26 P. Knowles and N. Handy, Comput. Phys. Commun. 54, 75 (1989). 27 GAMESS-UK is a package of ab initio programs, see “http://www.cfs.dl. ac.uk/gamess-uk/index.shtml,” M. F. Guest, I. J. Bush, H. J. J. van Dam, P. Sherwood, J. M. H. Thomas, J. H. van Lenthe, R. W. A. Havenith, and J. Kendrick, “The GAMESS-UK electronic structure package: Algorithms, developments and applications” Mol. Phys. 103, 719–747 (2005). 28 T. Dunning and D. Woon, J. Chem. Phys. 100, 2975 (1994). 29 S. Zarrabian and R. Harrison, Chem. Phys. Lett. 158, 393 (1989). 30 D. Cleland, G. Booth, C. Overy, and A. Alavi, J. Chem. Theory Comput. 8, 4138 (2012). 31 P. Knowles and N. Handy, J. Chem. Phys. 91, 2396 (1989). 32 G. Chan, J. Chem. Phys. 120, 3172 (2004). 33 J. Reinders, Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism (O’Reilly, 2007). 34 K. Bauke, Tina’s Random Number Generator Library, 2014. 35 See supplementary material at http://dx.doi.org/10.1063/1.4901020 for a worked-out illustrative example of how the exFCIQMC algorithm performs on minimal basis set H2 with all intermediate steps written out.