WWW.C-CHEM.ORG

SOFTWARE NEWS AND UPDATES

Automatic Algorithms for Completeness-Optimization of Gaussian Basis Sets Susi Lehtola* We present the generic, object-oriented C11 implementation of the completeness-optimization approach (Manninen and Vaara, J. Comput. Chem. 2006, 27, 434) in the freely available ERKALE program, and recommend the addition of basis set stability scans to the completeness-optimization procedure. The design of the algorithms is independent of the studied property, the used level of theory, as well as of the role of the optimized basis set: the procedure can be used to form auxiliary basis sets in a similar fashion. This implementation can easily be interfaced with various computer programs for the

actual calculation of molecular properties for the optimization, and the calculations can be trivially parallelized. Routines for general and segmented contraction of the generated basis sets are also included. The algorithms are demonstrated for two properties of the argon atom—the total energy and the nuclear magnetic shielding constant—and they will be used in upcoming work for generation of cost-efficient basis sets for C 2014 Wiley Periodicals, Inc. various properties. V

Introduction

fuse functions to the energy optimized basis to improve the description of the wave function in regions important for the studied property. When the property (or its individual contributions) behaves monotonically as the complete basis set (CBS) limit is approached, a variational optimization target for the augmentation functions can be constructed and the augmentation functions numerically optimized, analogously to the energy. This approach has been followed recently, for instance, by Jensen and coworkers to augment EO basis sets for magnetic properties,[32–34] and by Rappoport and Furche to improve static polarizabilities.[35] Note also that for some response properties it is possible to obtain analytic expressions for the change in the atomic orbitals on the perturbation, and use them to adjust the basis set as a shortcut to faster convergence. One notable example of this approach are the Sadlej basis sets for electric polarizabilities[36–39]; the gauge independent atomic orbital[40,41] (GIAO) approach can also be included in this category. Although elegant approaches such as the ones above are possible for specific properties, in the general case the augmentation functions are determined either ad hoc by hand (even-tempered [ET] extrapolation for tight functions), or by additional optimizations of anionic energies[42,43] (for diffuse functions). However, the

One-electron basis sets are an essential tool in quantum chemical calculations. With their help, the integrodifferential Hartree– Fock (HF) and Kohn–Sham[1,2] density-functional theory (KSDFT) equations can be transformed into the algebraic equations of the Roothaan–Hall[3] or Pople–Nesbet[4] type, which are far better suited for computer implementation than the HF or KS equations in their original form. In molecular calculations, choosing the basis set to be a linear combination of atomic orbitals (LCAO) is the obvious choice. Here, Gaussian-type orbitals (GTOs) are predominantly used, as they make even post-HF approaches feasible, thanks to the cost-efficient evaluation of two-electron integrals through recursion relations such as the McMurchie–Davidson[5] and Obara–Saika[6] schemes. Traditionally, LCAO basis sets are optimized to reproduce atomic and/or molecular ground state energies.[7] For example, this approach is followed in the correlation-consistent ccp(w)(C)VXZ[8–22] series (for post-HF methods) and the polarization consistent pc-n series[23–28] (for KS-DFT). However, sufficient convergence of the energy does not imply the sufficient convergence of other properties. Different properties often have significantly differing basis set requirements, as they may probe unequivalent aspects of the wave function. For instance, the energy is sensitive to relatively tight functions, whereas significant contributions to the dipole moment arise from diffuse functions.[29] A similar observation can also be made for different levels of theory: post-HF theories, such as Møller–Plesset perturbation theory[30] and coupled-cluster theories[31] have much larger basis set requirements than the mean-field HF and KSDFT.[1,2] Although going to larger and larger energy-optimized (EO) sets will eventually yield converged properties, this approach is often unsuitable due to the calculations becoming intractable before sufficient convergence of the property has been achieved. As an alternative route to faster basis set convergence, one common procedure is to add supplementary tight and/or dif-

DOI: 10.1002/jcc.23802

S. Lehtola Department of Physics, University of Helsinki, P. O. Box 64, FI-00014 University of Helsinki, Finland and Department of Applied Physics, COMP Center of Excellence, P. O. Box 11100, FI-00076 Aalto E-mail: [email protected] *Present address: Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, 94720 Contract grant sponsors: Vilho, Yrj€ o, and Kalle V€ais€al€a Foundation and Magnus Ehrnrooth Foundation; Contract grant sponsor: University of Helsinki; Contract grant number: 490064; Contract grant sponsor: Academy of Finland; Contract grant numbers: 251748; 263294; 1259526; 1260204 C 2014 Wiley Periodicals, Inc. V

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

1

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

former approach is somewhat arbitrary if the underlying primitives have been fully EO,† and the latter is not always possible because not all anions are bound; for instance Ne– and Mg– are not stable. Although the problem with the latter approach could be resolved to some extent by extrapolation, it would once again introduce a measure of arbitrariness to the construction of the basis set. Having established above that faster convergence can be obtained by augmenting the EO basis sets with property specific functions, the question arises whether wholly abandoning the underlying EO basis set would result in an even faster convergence to the basis set limit for the studied property. This question was answered affirmatively by Manninen and Vaara,[44] who suggested a procedure for a systematical approach to the CBS limit; namely, completeness-optimization. This is essentially a black-box method that can be used for any property at any level of theory, in which the basis set is optimized directly for the studied property. The procedure can also be applied to forming auxiliary basis sets in an identical fashion. Significant savings in computer time are usually observed with completeness-optimized (CO) basis sets, as compared to EO basis sets. So far, much faster convergence to the CBS limit has been observed for magnetic[44–50] and magnetooptic[51–55] properties as well as the radial electron momentum density.[56,57] As a heuristic rule of thumb from these studies, nf CO basis sets yield results of (n 1 1)f EO quality (for properties other than the energy). The reason for the savings is that the CO basis sets have no irrelevant degrees of freedom with respect to the studied property. The basis set is determined by a trial-and-error approach of successive expansions and reductions. To our knowledge, all of the work on CO basis sets so far (except for that in Refs. [56] and [57]) has been based on a manual approach, where the trials are performed by hand, implying a significant amount of tedious repetitive work and ample opportunity for error. As the idea in the completeness-optimization paradigm[44] is to form separate basis sets for every property at every level of theory, there is a significant amount of work to be done in parametrizing novel basis sets for different properties that are challenging for traditional, EO basis sets. Because of this, a fully automated, general procedure for the generation of CO basis sets is clearly desirable. The current manuscript presents general C11 algorithms that are currently implemented in the freely available ERKALE program.[58,59] The algorithms are trivial to adapt to new properties, only requiring the implementation of a short wrapper program, and thus allow for user-friendly generation of CO basis sets for any property. The significant difference to our previous work,[56,57] in addition to the abstraction of the algorithms, is the introduction of rigorous polarization and shell stability scans that guarantee balanced convergence to the CBS limit. The organization of the manuscript is the following. In the next section (Method), we give a brief formulation of the completeness-optimization procedure proposed by Manninen † The spacing of exponents in the core region is larger than in the valence region.

2

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

and Vaara.[44] In the third section (Implementation), the algorithmic implementation of the completeness-optimization procedure is presented in detail. In the fourth section (Results), we illustrate the algorithms by forming basis sets for post-HF calculations of energies and nuclear magnetic shielding constants. The article concludes with a brief summary.

Method The fundamental scheme in the basis set approach is to expand the one-electron states (orbitals) jwi in the basis set as 1 X jwi5 cb jbi;

(1)

b51

where the coefficient is cb 5hbjwi, and the basis set is assumed to be orthonormal: hbjb0 i5dbb0 . Now, the exact relation eq. (1) assumes that the basis set is complete 1 X

jbihbj5I;

(2)

b51

which is usually not true: in practice, the used basis sets are finite, so relation (2) is only fulfilled approximately. Thus, the ultimate goal of the basis set is to meet the resolution of identity N X

jbihbj  I

(3)

b51

as closely as possible, without making the amount of functions N unnecessarily large. This manuscript focuses on atom-centered basis sets, in which the orbitals are expressed as an atomic decomposition wðrÞ5

1 X n21 X l X X

dA Þ: cAnlm RAnl ðjr2RA jÞYlm ðr2R

(4)

atoms A n51 l50 m52l

where RA is the coordinate of the Ath atom and Ylm ð^r Þ are spherical harmonics in the real form. The radial degrees of freedom are represented in a predetermined set of basis functions RAnl ðrÞ; cAnlm being the expansion coefficients. In this work, the radial functions are either primitive GTOs[60] RGTO nl ðrÞ5

2l12 r l 1=4

½ð2pÞ

ð2l11Þ!!

ð2l13Þ=4 2fnl r2 f e 1=2 nl

(5)

or contracted GTOs (CGTOs) Nk X ðrÞ5 dk RGTO RCGTO nl kl ðrÞ:

(6)

k51

The primitive exponents ffnl g and their contraction coefficients fdkl g are pretabulated as part of the parametrization of the basis sets. While in principle it is possible to refine the pretabulated atomic basis sets in molecular calculations, this involves such a cost that better results can be obtained with less resources just using a bigger pretabulated basis set.[61,62] WWW.CHEMISTRYVIEWS.COM

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Note that reoptimizing the basis set in the molecular calculation might also destroy the ability of the basis set to reproduce properties other than the energy. Chong proposed visualizing the completeness of oneelectron basis sets by studying the conservation of the norm hal jal i51 of a primitive function jal i under the resolution of the identity [eq. (3)] as[29,63] N X Yðal Þ5 hal jbihbjal i;

(7)

b51

which obviously fulfils 0  Yðal Þ  1. In this work, the scanning function jal i is a Gaussian primitive of angular momentum l with a Gaussian exponent al [fnl 5al in eq. (5)]. In essence, Yðal Þ measures the flexibility of the used basis set to represent the test function jal i: when Y (al) 5 1 the basis set can describe the test function perfectly, and when Y (al) 5 0 it is completely unable to do so. Gaussian functions are not orthonormal, so the expression for the completeness profile [eq. (7)] becomes X Yðal Þ5 hal jlihljmi21 hmjal i;

(8)

lm

where hljmi21 denotes the ðl; mÞ element of the inverse overlap matrix. The one-center overlap of two Gaussian primitives (n, l) and ðn0 ; l0 Þ with magnetic quantum numbers m and m0 is !l=213=4 4fn0 l0 fnl 0 0 0 dll0 dmm0 ; (9) hn; l; mjn ; l ; m i5 ½fn0 l0 1fnl 2 so separate profiles are obtained for all values of the angular quantum number l, all magnetic quantum numbers m yielding the same profile. Note that since the scanning functions jal i in this work are Gaussian primitives as well, the hal jli matrix elements are also given by eq. (9). The idea of Manninen and Vaara[44] was to obtain the primitive exponents ffnl g for each angular momentum l in the basis set not by minimizing the energy, but by maximizing Y(al) in some region ½amin ; amax  that will be determined later on. Findl l ing the primitives is equivalent to minimizing the deviation from flatness[44,59] ð lg amax l 1 sl 5 ½12Yðal ; ffnl gÞdlg al ; (10) min min lg amax 2lg a lg al l l where we have stressed the parametric dependence of the completeness profile on the Gaussian exponents in the basis set ffnl g. Here, lg is the international standard notation for the 10-base logarithm. As a CBS is obtained by taking the limit sl ! 0; amin ! 0, and amax ! 1 for all values of l, smooth l l convergence is possible without having to rely on the variational properties of the quantity studied. Instead, convergence is guaranteed here by the variational approach to the CBS itself, in which the functions added in each step of the process are conceptionally orthogonal to the ones previously in the basis. Because of this quasi-variational approach, the necessary lim-

Figure 1. Profile expansion. Starting from the initial basis set (solid black line), trials are made by the addition of tighter functions (red dash-dotted lines) and more diffuse functions (blue dashed lines). Three consecutive trials in each direction are shown here for illustrative purposes; only the addition of a single function is considered during the expansion.

its of the interval ½amin ; amax  and the mean deviation from coml l pleteness sl (or, equivalently, the necessary amount of exponents on the interval) can be determined for each angular momentum shell l by trial and error for the property in question.[44]

Implementation The completeness-optimization approach for generating property-optimized basis sets is a two-step procedure.[44] First, in phase I, the CBS limit is determined by consecutively adding more functions to the basis set. Second, in phase II, unimportant functions are pruned from the CBS limit basis set to obtain costefficient primitive basis sets for actual calculations, and, optionally, the basis set is contracted to further reduce the necessary amount of basis functions. Let us study a property P, the calculated value of which depends on the used basis set fjfnl ig, P f 5P ðfjfnl igÞ:

(11)

Assume further that the property is equipped with a metric D : ðP; PÞ ! R1

(12)

that gradually goes toward zero as the CBS value for P, P CBS , is approached fjfnl ig!CBS

DðP f ; P CBS Þ !0:

(13)

In ERKALE, the completeness-optimizer is a C11 template library, which must be instantiated by defining the property type P and the metric D. Phase I: Search for the CBS limit The CBS limit is found iteratively by slowly squeezing in the estimated error in the studied property, simultaneously expanding the existing shells in the basis set and adding new polarization shells to the basis. In the expansion phase, trials corresponding to the addition of a single exponent are made Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

3

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Algorithm 1. Algorithm for finding the limit CBS limit forproperty. any property Table 1. Algorithm for finding the CBS for any

to the tight and/or diffuse ends of the profile, as illustrated in Figure 1. This is done for every angular momentum shell in the basis set. In every iteration, the trial resulting in the maximal change D of the property is accepted, until convergence to the threshold , D < , has been achieved. The automated procedure to find the CBS limit of any property through completeness-optimization,[44] further developed from Refs. [56] and [57], is illustrated in Algorithm 1. The procedure begins from an initial basis set generated for a small value of s, typically sⱗ1023. If the initial value for s is too large, the optimization will not converge. The procedure then obtains a balanced basis set by decreasing the tolerance threshold for shell expansion in Step 4, until the addition of a new polarization shell becomes more important than adding functions to the existing shells. The polarization shell is then added and the expansion of the existing shells restarted. The 4

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

parameter d controls the speed of the adjustment of the threshold; in this work d50:9 has been used. In principle any initial basis set can be used, as the irrelevant functions in the starting basis will be pruned away from the production basis sets in the reduction process. However, a small EO basis set is attractive as a starting point as it ensures that the basis set gives a reasonable description of the ground state, making the completeness-optimization computationally less heavy as less irrelevant degrees of freedom will be carried in the calculation. In this work, the initial basis is obtained by energy minimization of a given composition of CO primitives.‡ Note that depending on the studied property, if a minimal starting basis set is used, the polarization and stability scans ‡

The widths kl of the shells are fixed; only the offsets amin are found by gradient-based minimization with finite difference gradients.

WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

SOFTWARE NEWS AND UPDATES

may find a nonphysical solution to the electronic structure problem, that is, a numerical instability in the quantum chemical model. This can lead to divergence in the CO expansion algorithm. The problem can usually be circumvented by the use of a slightly larger initial basis set, for example, by adding more atomic functions or a polarization shell. The scan for polarization functions can be done in a straightforward brute force fashion by predefining a suitable search interval ½amin ; amax . Here, a logarithmic grid is used, with the grid spacing being defined as a fraction of the asymptotic ET[64] spacing parameter b of a s-complete basis set (see “Primitive Optimization” section). In our experience, a spacing of b=2 already often yields smooth scans for the polarization functions. During this work, it was noticed that the simple expansion algorithm proposed in Refs. [56] and [57] described above failed to detect significant local importance maxima for some properties, as will be shown later in the manuscript. For this reason, a shell stability scan has been introduced in this work. With this addition, the expansion algorithm is by construction tolerant to multiple local minima. The brute force stability and polarization scans act as an acid test for the basis set: if a better representation of the electronic structure is possible, the scans will reveal it. In the stability scan, trials are formed by adding a single primitive function outside of the pre-existing completeness plateau of the shell, with the CO primitives on the shell fixed. Similarly to the polarization scan, the stability scan is performed on a logarithmic grid with a spacing determined as a fraction of the asymptotic spacing parameter b. Here, the scanned exponent intervals are ½amin =bn ; amin  and ½amax ; bn amax  with n > 1 a predefined number. If the scan reveals an instability, the limits of the completeness profile are expanded until the completeness plateau includes the location of the instability. This often results in the addition of a significant amount of functions in a single step, but the redundant functions are pruned away in phase II. Polarization and stability scans.

If a termination tolerance t has not been set, the expansion is continued until the set limit for the amount of polarization shells is reached. Because in phase II irrelevant functions will anyway be thrown away, it often makes sense to further expand the basis set to establish a slightly better reference value: the better the reference value is, the more accurate and cost-effective the reduced basis sets are. For instance, when designing a quadruple zeta basis set for argon (meaning d, f, and g polarization functions in the basis set), stopping the expansion at the addition of the first g function implies a suboptimal starting point for the reduction, as there is less freedom in modifying the g shell. Reduction of the s, p, d, and f shells may change the optimal value of the g exponent, implying the shell should have more functions to start with. Here, the natural choice is to extrapolate the importance of the next polarization shell as Extrapolation toward CBS limit.

Figure 2. Illustration of the transformation from a general contraction (on the left) to a segmented contraction (on the right) via Davidson rotation for 7 s primitives contracted to 5 s functions for argon. The bullets denote nonzero entries. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

lg D5a‘1b; where ‘ is the maximum angular momentum in the basis set, by fitting the coefficients a and b to the ‘ – 1 and ‘ – 2 values of D.

Phase II: Generation of cost-efficient primitive basis sets Once the CBS limit has been found, the basis set can be pruned to remove any unnecessary functions to form costefficient basis sets as proposed by Manninen and Vaara.[44] This consists of two types of procedures. Either one can move the tight and/or the diffuse limit of completeness, amax and l amin , respectively, or one can maintain the limits but reduce l the sampling by dropping a function. Iterating over the reduction trials and always choosing the trial with the smallest distance D to the CBS limit, a variational sequence of less and less accurate basis sets is produced. In Refs. [56] and [57], production-level basis sets were produced by choosing a maximal error allowed for the property for each level of the basis set. However, determining the suitable error is not simple a priori, as the magnitude of the error depends on the studied property as well as the used error function. Instead, strictly correlation[8] (cc) and/or polarization[23] consistent (pc) basis sets (depending on the used level of theory and the systems used to optimize the basis set) for the studied property can be produced in a black-box manner in the following fashion. When the highest polarization shell only contains a single exponent, and the trial with the smallest distance corresponds to pruning away this function, a correlation/polarization consistent basis has been achieved. In contrast to the expansion phase, where new degrees of freedom are introduced at every step, the reduction phase is clearly more problematic with regard to the problem of multiple local minima, as degrees of freedom are here taken away instead. The reduction of a basis set is in principle a combinatorial problem: given a set of exponents that give the CBS result, prune out n functions (and possibly reoptimize the rest) so that the error is still within tolerance. Within the CO approach, this combinatorial problem reduces to tuning the limits famin g; famax g and the flatness l l of the plateau fsl g for all values of l, but clearly there are still a variety of possible locally optimal choices for the parameters. Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

5

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Figure 3. Minimal deviation from completeness sl for 20 functions on the s (l 5 0, solid blue line), p (l 5 1, solid cyan line), and d (l 5 2, solid red line) shells as a function of the width of the completeness interval. The value of kl for the s shell that corresponds to ss 51025 is shown by the cross and the dashed vertical line. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

After some experimentation, we decided to enforce ss  sp  sd  . . . ;

(15)

After the primitives have been sufficiently pruned, the tight functions in the basis set can be contracted to gain even more computational performance. The amount of gain achievable by contractions is inherently dependent on the property that is being modeled. Sets at the SCF level of theory lead to deeper contractions than those at post-HF levels of theory.[57] Moreover, as post-HF basis sets have significantly more functions at higher angular momentum, the relative gain in the reduction of the amount of functions by contractions is much smaller. In a similar fashion, also perturbational properties usually lead to sets that are less contracted than those for ground-state properties. The contraction can be performed in a black-box fashion using the error metric. A natural choice is to limit the error of the contracted basis set by the error of the primitive basis set Contracted sets.

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

Analogously to the primitive sets above, a restriction is set on the contracted sets so that the amount of free (uncontracted) functions satisfies Nfree  Nfree  Nfree  ...; s p d

(18)

which again is motivated by experience with other cc and pc basis sets. The contraction coefficients are usually obtained either from atomic orbital coefficients or natural atomic orbitals, which directly lead to a general contraction scheme.[65] The contraction pattern is determined by sequentially contracting the tightest exponents one by one so that eq. (16) is still satisfied. Leaving the most diffuse exponents uncontracted has been found to yield the optimal pattern both for the total energy[65] as well as molecular properties.[66,67] Not all quantum chemistry programs can treat generally contracted basis sets efficiently. The general contractions contain redundancies that can be removed by Davidson rotations[68] to obtain better segmented basis sets that are more suited for use with these programs. This procedure is illustrated in Figure 2 for the s functions of argon, where the contraction coefficients are taken from the three occupied spatial s orbitals. First, the free primitives are dropped from the contracted functions, after which the contractions are subtracted from each other (“rotated”) to eliminate linearly dependent combinations to obtain a better segmented basis set. Note that the original Davidson procedure is numerically unstable; a recently proposed, numerically more stable algorithm[69] has been implemented in ERKALE for this purpose.

(16)

Here, P c denotes the value of the property obtained with the contracted basis set, P p the value obtained with the (uncontracted) primitive set, and P CBS is the CBS value of the prop6

(17)

(14)

to guarantee a balanced composition of the basis set. In our experience so far, these two restrictions are enough to converge to a good minimum.

DðP c ; P p Þ  DðP p ; P CBS Þ:

erty obtained in the expansion phase. If D satisfies the mathematical conditions for a metric, the error of the contracted set will then be limited by DðP c ; P CBS Þ  2DðP p ; P CBS Þ:

as otherwise basis sets with odd compositions could be reproduced in the reduction. This restriction seems to be necessary for the reduction to reproduce more accurate results, that is, to not get stuck in a local minimum. Furthermore, inspired by the compositions of the cc and pc sets, a limitation on the amount of primitives on the shells is enforced in the reduction phase as Ns > Np > Nd > . . .

Figure 4. The non-negative CO exponents lg an (red squares) for the s shell, with tolerance sl 51025 . Half of the width of the resulting interval kl is shown with the black circles, the line having been drawn as an aid to the eye. The whole set of exponents can be obtained by mirroring lg an ! 2lg an . [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Primitive optimization In the evaluation of eq. (8), the canonical orthonormalization procedure[70] is used to avoid linear dependencies in the basis WWW.CHEMISTRYVIEWS.COM

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Table 1. Connection between asymptotic step size in CO sets and the spacing of ET basis sets an 5a0 bn . sl 21

10 1022 1023 1024 1025

Figure 5. The relative error in the width kl of the completeness plateau as a function of the amount of free parameters Nfree at the edges of the plateau, compared to fully optimized exponents. Four sets of data are shown corresponding to 30, 25, 20, and 15 primitives on the shell. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary. com.]

set.[29] Here, the eigenvectors of the overlap matrix with eigenvalues si < 1025 have been omitted. Also, as the overlap of Gaussian primitives [eq. (9)] is invariant on rescaling of the exponents a ! ba, the optimization quantity of eq. (10) can be reformulated as s0l ðffnl gÞ5

1 kl

ð kl =2

½12Yðal ; ffnl gÞdlg al

(19)

2kl =2

where the exponents fnl in the basis set have been scaled by 10kl =22lg amin . Now, it is evident that the optimal profile in eq. (19) has to be symmetric around the origin. Thus, only the bN=2c exponents (lg fnl > 0) need to be optimized, the negative exponents being given by inversion. For the case N odd, one exponent is always placed at the origin lg fnl 50. The measure in eq. (19) can be optimized using standard approaches. In the previously used[44–55] KRUUNUNHAKA program,[71] the optimization is performed using the stochastic differential evolution scheme.[72] In our preceding work[56,57] using ERKALE,[58,59] the exponents were optimized using the Nelder–Mead simplex method,[73] which we found to consistently yield much smaller sl values than the previously used stochastic scheme.[59] For this work, we have modified ERKALE to use Fletcher–Reeves conjugate gradients[74] as implemented in the GNU Scientific Library,[75] with the gradient computed by finite differences, as this has enabled still tighter minimization of eq. (19). In contrast to energy optimization,[76] we have not found the direct minimization of eq. (19) in terms of the exponents to cause any problems; on the contrary, we have found this to yield faster convergence in the minimization compared to the procedure of Ref. [76]. In the approach introduced in Refs. [56] and [57], the expansion of the profile is performed by keeping sl constant. Here, the exponents are obtained by searching for the width kl in

bs

bp

bd

b

3.891 2.225 1.772 1.564 1.444

2.893 1.920 1.616 1.466 1.377

2.465 1.761 1.527 1.407 1.334

2.221 1.661 1.468 1.366 1.303

eq. (19) that yields the wanted value of sl. In ERKALE, the search is performed first by bisection until sl is linear in the search interval, after which the exact value for kl is found by linear interpolation. The dependence sl ðkl Þ is illustrated in Figure 3. When the amount of primitives on a shell l grows, the mid exponents saturate quickly and, as also happens in large EO basis sets, they start following an ET formula.[64] This is illustrated with the s shell in Figure 4. This behavior can be further used in the numerical optimization of the exponents, as the middle exponents can be represented with an ET formula and only the outermost exponents need to be fully optimized. The error in the ET approximation is studied in Figure 5. The relative error in the width is smaller than 1026 with four free exponents, which has been set as the default in ERKALE and chosen for the optimizations carried out in the rest of this work. As is obvious from Figure 3, the measure sl cannot be made arbitrarily small. Correspondingly, to avoid numerical noise, a lower limit of sl  1025:5 has been set in ERKALE. Note that the same linear dependency issue is present in all atomcentered basis sets. Tracing the connection of the parameter sl to the spacing parameter in ET sets, presented in Table 1, it is seen that the achievable value of the spacing parameter b within this scheme is much smaller than the minimal exponent ratio 1.5 commonly enforced in the energy-optimization of the correlation consistent basis sets,§ which suggests that the finite minimal achievable value of sl should be small enough to guarantee convergence of any property. Parallelization Polarization exponent scans, the expansion of existing shells, and their stability scans are all composed of independent trials. This can be taken advantage of in the implementation of the property calculation. In every step, the completeness-optimizer algorithm prepares a batch of trial basis sets, for which the property can then be computed simultaneously, for example, on a computer cluster. The parallelization can be easily dealt with in the front-end program, where the most efficient parallelization strategy can be chosen for the size of the problem.

Illustration To illustrate the procedure, we apply the automatic algorithm to completeness-optimization of basis sets for two properties of the argon atom, namely, the energy and the nuclear magnetic §

K. A. Peterson, private communication.

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

7

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Figure 6. Stability and polarization scans for the total energy and the NMR shielding constant of the argon atom. The horizontal dashed black line denotes the current value of . The blue circles, the red squares, and the green diamonds denote results from the stability scans for the s, p, and d shells, respectively, while the purple triangles represent the polarization scan for the f shell. The vertical dashed lines denote limits of the completeness plateau ½amin ; amax . The spacing used in the scans is b=4, the polarization exponents are scanned in the interval ½1022 ; 106 , and n 5 6 in the stability scans. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

resonance (NMR) shielding constant. The properties are computed at the all-electron coupled cluster level of theory[31] with single and double excitations[77] using the CFOUR program.[78–80] GIAOs are used in the calculation of the shielding constants. The shielding constant is interesting for the illustration, since as a nuclear property it is sensitive to tight functions in the basis set, but due to its also being a perturbational property, it implicitly depends on the virtual orbitals as well, thus introducing sensitivity to diffuse functions. The error metrics used in the completeness-optimization for the energy E and the shielding constant r used here are simply DE 5jEtr: 2Eref: j;

(20)

Dr 5jrtr: 2rref: j;

(21)

where Xtr: and Xref: are the trial and reference values, respectively. (More elaborate error metrics have been used in Refs. [56] and [57] for the electron momentum density.) The wrapper programs used to perform the optimization consist of only a few hundred lines of code each, and parallelization has been implemented with a single OpenMP loop in the wrapper routine. The optimizations are begun from an initial 9s6p basis set with sl 51024 that is EO at the all-electron Møller–Plesset 8

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

level of theory truncated at the second order[30] (MP2), the MP2 calculations again being performed with CFOUR. Contraction coefficients are taken from HF natural orbitals that have been computed with ERKALE, similarly to Ref. [57]. The need for a brute force scan of polarization exponents and the stability analysis advocated in this work is made apparent by the plots for the change in the energy and the shielding constant in Figure 6. Here, a d shell has been added and the s, p, and d shells have been expanded until the importance of adding a new s, p, or d function is less than the determined importance of the first f function. Note that the underlying spd basis sets are different for the energy and the NMR shielding; for instance, the d exponent for the energy is roughly a factor 10 larger than the one for the shielding constant. The stability scan for the energy in Figure 6a shows that there is a significant local maximum for diffuse d functions that has not been captured by the expansion algorithm. For the NMR shielding shown in Figure 6b, instabilities exist in both the tight and the diffuse directions of the d shell, as well as for tight p functions. Here, instead of adding the f shell, the optimization algorithm expands the completeness plateaus of the existing shells to include the relevant instability maxima, and restarts the expansion. Eventually, it retries the addition of WWW.CHEMISTRYVIEWS.COM

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Table 2. Compositions of completeness-optimized basis sets: the smallest and largest exponents on the shells in 10-base logarithmic units, as well as the amount of functions on each shell. (a) Energy. s

DZ-ref TZ2ref QZ-ref 5Z-ref ref 5Z QZ TZ DZ

p

d

F

min ai

max ai

n

min ai

max ai

n

min ai

max ai

n

20.491 20.491 20.685 20.684 20.684 20.675 20.648 20.490 20.444

3.874 4.263 4.457 5.622 5.622 5.224 4.785 4.468 3.905

13 14 15 18 18 16 13 12 9

20.525 20.691 20.691 20.857 21.023 20.984 20.844 20.882 20.633

2.199 2.365 2.698 2.864 3.030 2.991 2.771 2.689 2.345

10 11 12 13 14 11 10 9 7

1.095 20.192 20.192 20.340 20.488 20.409 20.362 20.285 1.016

1.095 1.042 1.339 1.785 2.230 2.152 1.647 1.571 1.016

1 6 7 9 11 7 5 4 1

min ai

g

max ai

20.031 0.018 20.118 20.253 20.150 0.027 1.256

min ai

n

20.031 1.138 1.274 1.681 1.578 1.401 1.256

1 6 7 9 5 3 1

h

max ai

0.018 0.064 20.062 0.058 1.151

min ai

n

0.018 1.350 1.475 1.355 1.151

1 7 8 4 1

1.325 0.080 1.045

max ai

1.325 1.282 1.045

n

N

E (Ha)

r (ppm)

1 7 1

48 84 137 225 327 166 98 66 35

2526.924 2527.297 2527.426 2527.492 2527.509 2527.486 2527.418 2527.313 2526.745

1237.935 1238.021 1238.239 1238.084 1238.036 1238.042 1237.985 1238.048 1236.754

n

N

E (Ha)

r (ppm)

1 10 1

52 109 139 272 392 131 72 51 35

2527.026 2527.372 2527.401 2527.495 2527.511 2527.449 2527.017 2526.874 2526.171

1237.853 1237.926 1237.880 1237.987 1237.992 1238.011 1237.987 1238.001 1237.835

(b) NMR shielding. s

DZ-ref TZ-ref QZ-ref 5Z-ref ref 5Z QZ TZ DZ

p

d

f

min ai

max ai

n

min ai

max ai

n

min ai

max ai

n

20.685 20.685 20.685 20.685 21.073 20.662 20.348 20.348 20.186

4.069 4.457 4.457 4.845 5.234 4.823 3.770 3.770 3.446

14 15 15 16 18 14 10 10 9

20.691 21.023 21.023 21.023 21.023 20.869 20.701 20.701 20.666

2.365 3.030 3.030 3.030 3.030 2.765 2.454 2.454 2.418

11 14 14 14 14 10 8 8 7

0.101 20.737 20.737 20.737 20.737 20.507 20.104 0.249 0.367

0.101 1.387 1.387 1.685 1.982 1.600 0.835 0.664 0.367

1 9 9 10 11 6 3 2 1

min ai

g

max ai

0.198 20.428 20.425 20.424 20.070 0.482 0.427

n

0.198 0.152 1.510 2.053 1.509 0.952 0.427

1 4 9 11 4 2 1

min ai

0.241 20.332 20.332 0.063 0.385

h

max ai

0.241 1.709 1.709 0.520 0.385

min ai

n

1 10 10 2 1

0.324 20.341 0.524

max ai

0.324 1.568 0.524

The rightmost columns contain the amount of functions N in the basis set, and the energy and shielding constant computed at the CCSD(T) level of theory. The top rows contain the extension phase double-, triple-, quadruple-, and quintuple-f reference basis sets (DZ, TZ, QZ, and 5Z, respectively), with the extrapolated basis set denoted by ref. The reduced basis sets are described in the bottom part of the table.

the f shell, resulting in the plots in Figure 6c and 6d. Now, the largest maxima are for the first f function, which is then added. The expansion of the basis is performed to the quintuple-f level, that is, h functions which is the current limitation of CFOUR for NMR calculations, and then further continued to the threshold of the estimated sextuple-f level as described above in the “Extrapolation toward the CBS limit” section, yielding the primitive un-CO-E-ref and un-CO-r-ref sets for the energy and shielding constant, respectively. After this, cost-efficient basis sets are generated by pruning out the unimportant functions from the basis set, yielding the un-CO-E and un-CO-r sets. For reference, the compositions of the obtained primitive basis sets CO for the energy (un-CO-E-ref and un-CO-E) and for

the shielding constant (un-CO-r-ref and un-CO-r) are summarized in Table 2. Importantly, the compositions of the CO basis sets are not the same. The two properties probe different aspects of the wave function, and thus, the two resulting CO basis sets are different as well. The CO basis sets can also be compared to the (cc) ccpCVXZ[12] in and aug-cc-pCVXZ[9,12] series, which were obtained from the EMSL Basis Set Exchange,[81,82] shown in Table 3. Comparing the the two EO sets it is seen that the unCO-E sets span a smaller range of sp exponents than the corresponding cc sets, which can be tentatively be attributed to a combination of two factors. First, the cc sets converge the HF energy contribution much tighter than the contribution from the post-HF energy,[8] whereas in this work both contributions

Table 3. Compositions of correlation consistent basis sets in their primitive form, with overlapping functions deleted. s

cc-pCV6Z[a] cc-pCV5Z cc-pCVQZ cc-pCVTZ cc-pCVDZ aug-cc-pCV6Z[a] aug-cc-pCV5Z aug-cc-pCVQZ aug-cc-pCVTZ aug-cc-pCVDZ

p

d

f

g min ai

max ai

h

min ai

max ai

n

min ai

max ai

n

min ai

max ai

n

min ai

max ai

n

20.851 20.827 20.766 20.709 20.631 21.271 21.269 21.215 21.164 21.149

6.961 6.869 5.978 5.736 5.163 6.961 6.869 5.978 5.736 5.163

21 20 16 15 12 22 21 17 16 13

21.034 20.960 20.906 20.811 20.732 21.434 21.396 21.362 21.312 21.273

3.848 3.446 3.276 2.882 2.657 3.848 3.446 3.276 2.882 2.657

14 12 11 9 8 15 13 12 10 9

20.581 20.510 20.507 20.387 20.132 20.967 20.917 20.936 20.772 20.620

2.043 1.885 1.572 1.316 1.059 2.043 1.885 1.572 1.316 1.059

10 8 6 4 2 11 9 7 5 3

20.417 20.389 20.265 20.051

1.788 1.555 1.340 1.136

8 20.296 1.577 6 20.177 1.424 4 0.003 1.237 2

6 20.052 1.524 4 0.102 1.324 2

20.738 20.680 20.532 20.391

1.788 1.555 1.340 1.136

9 20.593 1.577 7 20.476 1.424 5 20.338 1.237 3

7 20.279 1.524 5 20.130 1.324 3

n

min ai

max ai

n

N

E (Ha)

4 267 2527.510 2 196 2527.502 125 2527.476 76 2527.411 46 2527.248 5 303 2527.511 3 232 2527.503 150 2527.478 92 2527.416 55 2527.264

r (ppm) 1238.00 1237.991 1237.952 1237.897 1237.787 1237.992 1237.968 1237.905 1237.763 1237.378

The used notation is analogous to the one in Table 2. [a] The definition of the basis set also includes h functions that have been deleted in this work.

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

9

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

Table 4. Contraction patterns of the CO sets. Basis CO-E21 CO-E22 CO-E23 CO-E24 CO-r21 CO-r22 CO-r23 CO-r24

Pattern

N – N0

[9s7p1d | 4s3p] [12s9p4d1f | 7s6p] [13s10p5d3f1g | 8s7p] [16s11p7d5f4g1h | 10s9p] [9s7p1d | 6s5p] [10s8p2d1f | 7s5p] [10s8p3d2f1g | 8s7p] [14s10p6d4f2g1h | 11s]

17 14 14 12 9 12 5 3

The notation is [primitive | contracted]. The rightmost column shows the savings the amount of basis functions from the primitive set (N0) to the contracted set (N).

Figure 7. Basis set convergence of properties with decontracted basis sets. Here, un-pCVXZ, un-apCVXZ, and un-apcS-n are short for un-cc-pCVXZ, unaug-cc-pCVXZ, and un-aug-pcS-n, respectively. The magenta solid line in b) portrays the CBS limit value estimated by the CO procedure, r51237:992 ppm, and the legend is otherwise the same as in a). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

are considered simultaneously [see eq. (20)]. Second, in the energy-optimization, the sampling of the exponent space is often nonuniform, with larger spacings occurring for the tight exponents. To study the accuracy and cost efficiency of the basis sets, calculations were made at the coupled cluster with full single and double excitations as well as perturbative triple excitations (CCSD(T))[83–86] level of theory. The results for the CO sets are shown in Figure 7 alongside the cc sets as well as the pcS-n series,[33] which has been optimized to reproduce shielding constants at the KS-DFT level of theory. For an unbiased comparison, all the basis sets are taken here in their uncontracted, primitive form,¶ denoted with the un-prefix. For the energy, shown in Figure 7a, agreeable convergence is seen in the expansion phase both with the un-CO-E-ref and the un-CO-r-ref basis sets. However, the reduction of the basis sets results in significantly different performance, the triple-f un-CO-r basis set reproducing an energy 0.5 Hartrees higher than the triple-f un-CO-E basis set, the two sets being of comparable size. Barring for the double-f set, the reduced un-CO-E basis set compares rather favorably with the un-cc-pCVXZ basis set for the reproduction of the CCSD(T) energy, taking into account that the latter has energy optimal primitives that account for a large portion of the difference in the results. ¶ The sp core correlating exponents have been deleted from the un-ccpCVXZ and un-aug-cc-pCVXZ basis sets due to significant overlap with the HF primitives.

10

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

Interestingly, the larger un-CO-E sets (at least triple-f) outperform the pcS sets, although most of the primitives in the latter are energy optimal, but for KS-DFT. Although the convergence of the energy to the CBS limit is very smooth, the NMR shielding shows clearly nonvariational behavior and the differences between the extension phase unCO-E-ref and un-CO-r-ref basis sets are obvious. The un-CO-Eref results are still not converged at the estimated sextuple-f level, whereas the un-CO-r-ref, and un-cc-pCVXZ sets have reached convergence at the quintuple-f level. However, the un-aug-cc-pCVXZ set only converges at sextuple-f level. The nonvariational behavior is best seen in the extension phase, where going from double-f to triple-f in the un-CO-r-ref set approaches the CBS limit value of the shielding constant, but

Figure 8. Benchmark of contracted basis sets. Here, pCVXZ, apCVXZ, and apcS-n are short for cc-pCVXZ, aug-cc-pCVXZ, and aug-pcS-n, respectively. The magenta solid line in b) portrays the CBS limit value estimated by the CO procedure, r51237:992 ppm, and the legend is otherwise the same as in a). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

going to quadruple-f takes the constant in the wrong direction, making the quadruple-f result worse than the triple-f one. The jump in the reproduced shielding constant is easily understood by comparison to the un-cc-pCVXZ and un-aug-ccpCVXZ results: while going to the quadruple-f level, the CO procedure adds more diffuse functions, which change the reproduced shielding constant from a value similar to un-ccpCVQZ to one similar to un-aug-cc-pCVQZ. The expansion procedure then recovers from this at the quintuple-f level, converging onto the estimated CBS limit that is in perfect agreement with the un-aug-cc-pCV6Z calculation. These kinds of skips, usually related to diffuse functions, may often be seen in completeness-optimization, but due to the variational modification of the basis set, each step taken is closer to the CBS limit than the one before. Surprisingly, the difference in the performance of the reduced primitive sets is less pronounced than for the energy. While the un-CO-r sets reproduce the shielding constant far more accurately than any other basis set, significantly outperforming the un-cc-pCVXZ as well as the un-pcS-n series, the un-CO-E set is also surprisingly accurate, comparing favorably to the un-cc-pCVXZ results, except at double-f level. Next, the primitive un-CO-E and un-CO-r basis sets are contracted, yielding the CO-E and CO-r basis sets, respectively, the contraction patterns of which are given in Table 4. As was discussed in the “Contracted sets” section, the relative savings with the contractions in the amount of basis functions are smaller for response properties (the shielding) than the energy. This is seen here clearly; the co-E sets being more contracted than the co-r sets. However, the larger co-E sets also contain more primitives to start with, making the comparison somewhat unfair. The nonvariational nature of the shielding constant is reflected in the CO-r-2 contraction pattern, which is not consistent with the CO-r-1 and CO-r-3 sets. This could be remedied by freeing up more functions in the CO-r-2 set, but this has not been done here. The CCSD(T) level results for the energy and shielding constant are shown in Figure 8. As can be seen, the accuracy of the contracted CO sets is largely the same as that for the primitive sets, and all of the analysis above applies to the contracted sets as well, further demonstrating the cost efficiency of the CO scheme.

Summary and Discussion We have presented general C11 basis set optimization algorithms that reflect the black-box nature of the completenessoptimization procedure. All of the basis set optimization and contraction algorithms are independent of the studied property and are thus easy to adapt to novel properties and to interface to different quantum chemistry programs. To adapt the algorithm to a new property, the user only has to write a short wrapper code for the calculation of the property in a given basis set, and a function to compute the distance of two values of the property. Typically, this amounts to a few hundred lines of program code. For generation of contracted

SOFTWARE NEWS AND UPDATES

basis sets, a routine to obtain the contraction coefficients is necessary as well, while the general algorithms are used to determine the contraction pattern as well as obtain segmented basis sets. The automated algorithms allow for userfriendly generation of basis sets for any property at any level of theory, and support parallelized calculations of the trials. We have demonstrated the algorithms with two properties of the argon atom—the total energy and the magnetic shielding constant—and benchmarked the obtained basis sets against EO basis sets. By definition, the EO basis sets are optimal for reproducing the energy, but the CO sets optimized for the energy are surprisingly competitive. However, most importantly, when studying properties other than the energy, the EO sets are no longer the obvious choice, and the CO scheme offers a simple way to much faster basis set convergence. EO sets minimize the atomic energy which is rarely of importance per se; rather, usually one is interested in energy differences where the atomic contributions cancel out to a large extent, and the important contributions come from the valence region, the form of which varies between molecules. It is likely that CO-E sets would fare somewhat better in comparison in an atomization energy benchmark, but it is difficult to make any significant conclusions from this limited set of data. In the application of the completeness-optimization algorithm, one must be careful to avoid saddle point convergence, especially during the expansion phase. In contrast to energy-optimization of basis sets, where saddle point convergence increases energy and is thus implicitly avoided in the optimization, here in general one cannot rely on any simple metric to determine if convergence to a saddle point has occurred, if the approach is to be kept as generally applicable as possible. If the calculations start to oscillate between multiple solutions, the completeness-optimization becomes unstable and does not converge. Fortunately, it is easy to notice when this happens, and can often be circumvented by fixing the orbital occupations by symmetry, or by using SCF stability analysis[87,88] in the optimization. Also, because calculations might not converge in all trial basis sets and the optimization procedure should not break when this happens, the metric routine D is equipped in ERKALE with an argument containing the default value for D that is returned when the property calculation fails in the trial basis set. During expansion trials this argument is D 5 0, and in reduction and contraction it is D51, so the failed trials are simply ignored. The overall quality of the basis sets obtained through CO is dependent on the property the basis set is optimized for. Although the black-box CO procedure can be applied to any property, optimization for some properties (by themselves) may reproduce rather peculiar basis sets. For instance, a CO basis set for the magnetizablity of argon reproduces a diffuse basis set that lacks tight functions altogether, because the magnetizability is dominated by contributions from the valence region. However, thanks to the black-box nature of the procedure, it is also possible to CO many properties simultaneously. Thus, here the way to proceed would be to combine the valence-sensitive magnetizability with a similar property that is more sensitive to the core region, such as the Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

11

SOFTWARE NEWS AND UPDATES

WWW.C-CHEM.ORG

NMR shielding. But, the definition of a compound error metric for multiple properties may not be straightforward, as one needs to determine a common scale for the errors in the properties. Although the CO paradigm is applicable to variational as well as nonvariational properties, its application to nonvariational properties may be harder, especially for constructing contracted basis sets. However, as seen in this work with the shielding constant, the extension algorithm correctly finds the CBS limit and the reduction algorithm produces reasonable basis sets for the reproduction of the property even in the nonvariational case. Here, the main question is just how accurate the pruned sets really are when applied to other systems, and how far off the pruned basis sets are from the truly optimal ones. In this work, CO primitive exponents have been used. However, there is no reason why ET exponents could not be used to perform completeness-optimization, as CO primitive sets anyway approach ET ones as the CBS limit is approached. Purely ET sets may be obtained with these algorithms by setting the amount of fully optimized primitives to zero. However, because the exponents at the edges are by definition relatively unimportant,** the difference between the use of CO and ET exponents should be minimal. If ET exponents are used, for example, because they are trivial to implement, the CO paradigm can be still used to fix the ET spacing parameter via the wanted sl value.

Acknowledgments I thank Pekka Manninen and Kirk Peterson for discussions. I also thank Kirk Peterson and Sam Manzer for careful reading of the manuscript. CSC – IT Center for Science Ltd. (Espoo, Finland) is acknowledged for providing computational resources used for this work. Keywords: completeness-optimization  Gaussian  coupled cluster  magnetic shielding  total energy  basis set  general contraction  segmented contraction

How to cite this article: S. Lehtola. J. Comput. Chem. 2014, DOI: 10.1002/jcc.23802

[1] [2] [3] [4] [5] [6] [7] [8] [9]

P. Hohenberg, W. Kohn, Phys. Rev. 1964, 136, B864. W. Kohn, L. J. Sham, Phys. Rev. 1965, 140, A1133. C. Roothaan, Rev. Mod. Phys. 1951, 23, 69. J. A. Pople, R. K. Nesbet, J. Chem. Phys. 1954, 22, 571. L. E. McMurchie, E. R. Davidson, J. Comput. Phys. 1978, 26, 218. S. Obara, A. Saika, J. Chem. Phys. 1986, 84, 3963. F. Jensen, Wiley Interdiscip. Rev. Comput. Mol. Sci. 2013, 3, 273. T. H. Dunning, J. Chem. Phys. 1989, 90, 1007. R. A. Kendall, T. H. Dunning, R. J. Harrison, J. Chem. Phys. 1992, 96, 6796. [10] D. E. Woon, T. H. Dunning, J. Chem. Phys. 1993, 98, 1358. [11] D. E. Woon, T. H. Dunning, J. Chem. Phys. 1994, 100, 2975. [12] D. E. Woon, T. H. Dunning, J. Chem. Phys. 1995, 103, 4572.

**If they were important, the basis set would be expanded further.

12

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

[13] A. K. Wilson, T. van Mourik, T. H. Dunning, J. Mol. Struct. Theochem 1996, 388, 339. [14] A. K. Wilson, D. E. Woon, K. A. Peterson, T. H. Dunning, J. Chem. Phys. 1999, 110, 7667. [15] T. Van Mourik, T. H. Dunning, Int. J. Quantum Chem. 2000, 76, 205. [16] T. H. Dunning, K. A. Peterson, A. K. Wilson, J. Chem. Phys. 2001, 114, 9244. [17] J. Koput, K. A. Peterson, J. Phys. Chem. A 2002, 106, 9595. [18] K. A. Peterson, T. H. Dunning, J. Chem. Phys. 2002, 117, 10548. [19] N. B. Balabanov, K. A. Peterson, J. Chem. Phys. 2005, 123, 64107. [20] N. J. DeYonker, K. A. Peterson, A. K. Wilson, J. Phys. Chem. A 2007, 111, 11383. [21] J. G. Hill, S. Mazumder, K. A. Peterson, J. Chem. Phys. 2010, 132, 054108. [22] B. P. Prascher, D. E. Woon, K. A. Peterson, T. H. Dunning, A. K. Wilson, Theor. Chem. Acc. 2011, 128, 69. [23] F. Jensen, J. Chem. Phys. 2001, 115, 9113. [24] F. Jensen, J. Chem. Phys. 2002, 116, 7372. [25] F. Jensen, J. Chem. Phys. 2002, 117, 9234. [26] F. Jensen, J. Chem. Phys. 2003, 118, 2459. [27] F. Jensen, Helgaker, T. J. Chem. Phys. 2004, 121, 3463. [28] F. Jensen, J. Chem. Phys. 2012, 136, 114107. [29] D. P. Chong, Can. J. Chem. 1995, 73, 79. [30] C. Møller, M. S. M. Plesset, Phys. Rev. 1934, 46, 618.  ızek, J. Chem. Phys. 1966, 45, 4256. [31] J. C [32] F. Jensen, J. Chem. Theory Comput. 2006, 2, 1360. [33] F. Jensen, J. Chem. Theory Comput. 2008, 4, 719. [34] U. Benedikt, A. A. Auer, F. Jensen, J. Chem. Phys. 2008, 129, 064111. [35] D. Rappoport, F. Furche, J. Chem. Phys. 2010, 133, 134105. [36] A. J. Sadlej, Chem. Phys. Lett. 1977, 47, 50. [37] A. J. Sadlej, Collect. Czechoslov. Chem. Commun. 1988, 53, 1995. [38] A. J. Sadlej, Theor. Chim. Acta 1991, 79, 123. [39] A. J. Sadlej, Theor. Chim. Acta 1991, 81, 45. [40] F. London, J. Phys. le Radium 1937, 8, 397. [41] R. Ditchfield, J. Chem. Phys. 1972, 56, 5688. [42] J. Chandrasekhar, J. G. Andrade, P. V. R. Schleyer, J. Am. Chem. Soc. 1981, 103, 5609. [43] T. Clark, J. Chandrasekhar, G. W. Spitznagel, P. V. R. Schleyer, J. Comput. Chem. 1983, 4, 294. [44] P. Manninen, J. Vaara, J. Comput. Chem. 2006, 27, 434. [45] S. Ik€al€ainen, P. Lantto, P. Manninen, J. Vaara, Phys. Chem. Chem. Phys. 2009, 11, 11404. [46] J. V€ah€akangas, S. Ik€al€ainen, P. Lantto, J. Vaara, Phys. Chem. Chem. Phys. 2013, 15, 4634. [47] P. Lantto, K. Jackowski, W. Makulski, M. Olejniczak, M. Jaszu nski, J. Phys. Chem. A 2011, 115, 10617. [48] N. Abuzaid, A. M. Kantola, J. Vaara, Mol. Phys. 2013, 111, 1390. [49] M. Jaszu nski, M. Olejniczak, Mol. Phys. 2013, 111, 1355. [50] J. Vaara, M. Hanni, J. Jokisaari, J. Chem. Phys. 2013, 138, 104313. [51] S. Ik€al€ainen, P. Lantto, P. Manninen, J. Vaara, J. Chem. Phys. 2008, 129, 124102. [52] S. Ik€al€ainen, M. Romalis, P. Lantto, J. Vaara, Phys. Rev. Lett. 2010, 105, 153001. [53] T. S. Pennanen, S. Ik€al€ainen, P. Lantto, J. Vaara, J. Chem. Phys. 2012, 136, 184502. [54] J. Shi, S. Ik€al€ainen, J. Vaara, M. V. Romalis, J. Phys. Chem. Lett. 2013, 4, 437. [55] L.-J. Fu, J. Vaara, J. Chem. Phys. 2013, 138, 204110. [56] J. Lehtola, P. Manninen, M. Hakala, K. H€am€al€ainen, J. Chem. Phys. 2012, 137, 104105. [57] S. Lehtola, P. Manninen, M. Hakala, K. H€am€al€ainen, J. Chem. Phys. 2013, 138, 044109. [58] S. Lehtola, ERKALE - HF/DFT from Hel. 2013. Available at: http://erkale. googlecode.com. [59] J. Lehtola, M. Hakala, A. Sakko, K. H€am€al€ainen, J. Comput. Chem. 2012, 33, 1572. [60] T. Helgaker, P. Jørgensen, J. Olsen, Molecular electronic-structure theory; John Wiley & Sons, Ltd., 2000, on page 234. [61] K. Hashimoto, Y. Osamura, Chem. Phys. Lett. 1989, 164, 353. [62] K. Hashimoto, Y. Osamura, J. Chem. Phys. 1991, 95, 1121. [63] D. P. Chong, S. R. Langhoff, J. Chem. Phys. 1990, 93, 570. [64] M. W. Schmidt, K. Ruedenberg, J. Chem. Phys. 1979, 71, 3951.

WWW.CHEMISTRYVIEWS.COM

WWW.C-CHEM.ORG

[65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] [77] [78]

R. C. Raffenetti, J. Chem. Phys. 1973, 58, 4452. J. Kong, R. Boyd, J. Chem. Phys. 1997, 107, 6270. F. Jensen, Theor. Chem. Acc. 2010, 126, 371. E. R. Davidson, Chem. Phys. Lett. 1996, 260, 514. F. Jensen, J. Chem. Theory Comput. 2014, 10, 1074. P.-O. L€ owdin, J. Chem. Phys. 1950, 18, 365. P. Manninen, S. Lehtola, Kruununhaka basis set tool kit. 2011. Available at: http://www.chem.helsinki.fi/~manninen/kruununhaka/. R. Storn, K. Price, J. Glob. Optim. 1997, 11, 341. J. A. Nelder, R. Mead, Comput. J. 1965, 7, 308. R. Fletcher, C. M. Reeves, Comput. J. 1964, 7, 149. M. Galassi, et al., GNU Scientific Library Reference Manual (3rd Ed.), ISBN 0954612078. G. A. Petersson, S. Zhong, J. A. Montgomery, M. J. Frisch, J. Chem. Phys. 2003, 118, 1101. G. D. Purvis, J. Chem. Phys. 1982, 76, 1910. A quantum-chemical program package by J. F. Stanton, J. Gauss, M. E. Harding, P. G. Szalay with contributions from A. A. Auer, R. J. Bartlett, U. Benedikt, C. Berger, D. E. Bernholdt, Y. J. Bomble, L. Cheng, O. Christiansen, M. Heckert, O. Heun, C. Huber, T.-C. Jagau, D. Jonsson, J. Juselius, K. Klein, W. J. Lauderdale, D. A. Matthews, T. Metzroth, L. A. M€ uck, D. P. O’Neill, D. R. Price, E. Prochnow, C. Puzzarini, K. Ruud, F. Schiffmann, W. Schwalbach, C. Simmons, S. Stopkowicz, A. Tajti, J. Vazquez, F. Wang, J. D. Watts and the integral packages MOLECULE (J.

[79] [80] [81] [82] [83] [84] [85] [86] [87] [88]

SOFTWARE NEWS AND UPDATES

Alml€ of, P. R. Taylor), PROPS (P. R. Taylor), ABACUS (T. Helgaker, H. J. Aa. Jensen, P. Jørgensen, J. Olsen), ECP routines by A. V. Mitin and C. van W€ ullen. CFOUR, Coupled-Cluster techniques for Computational Chemistry, version 2.00 beta. For the current version, available at: see http:// www.cfour.de. J. Gauss, J. F. Stanton, J. Chem. Phys. 1995, 102, 251. J. Gauss, J. F. Stanton, J. Chem. Phys. 1995, 103, 3561. K. L. Schuchardt, B. T. Didier, T. Elsethagen, L. Sun, V. Gurumoorthi, J. Chase, J. Li, T. L. Windus, J. Chem. Inf. Model. 2007, 47, 1045. EMSL basis set exchange. Available at: http://bse.pnl.gov. K. Raghavachari, G. W. Trucks, J. A. Pople, M. Head-Gordon, Chem. Phys. Lett. 1989, 157, 479. R. J. Bartlett, J. D. Watts, S. A. Kucharski, J. Noga, Chem. Phys. Lett. 1990, 165, 513. J. Gauss, J. F. Stanton, J. Chem. Phys. 1996, 104, 2574. J. F. Stanton, Chem. Phys. Lett. 1997, 281, 130. R. Seeger, J. A. Pople, J. Chem. Phys. 1977, 66, 3045. R. Bauernschmitt, R. Ahlrichs, J. Chem. Phys. 1996, 104, 9047.

Received: 8 September 2014 Revised: 4 November 2014 Accepted: 13 November 2014 Published online on 00 Month 2014

Journal of Computational Chemistry 2014, DOI: 10.1002/jcc.23802

13

Automatic algorithms for completeness-optimization of Gaussian basis sets.

We present the generic, object-oriented C++ implementation of the completeness-optimization approach (Manninen and Vaara, J. Comput. Chem. 2006, 27, 4...
736KB Sizes 2 Downloads 5 Views