chm5586_ab_initio_HF
The fundamental assumption of HF theory – each electron sees all of the others as an
average field. Neglect of electron correlation can have profound chemical consequences when
it comes to determining accurate wave functions and properties derived therefrom. However,
HF theory, in spite of its fairly significant fundamental assumption, was adopted as useful in
the ab initio philosophy because it provides a very well defined stepping stone on the way to
more sophisticated theories (i.e., theories that come closer to accurate solution of the
Schrödinger equation). An enormous amount of effort has been spent on developing
mathematical and computational techniques to reach the HF limit – to solve the HF equations
with the equivalent of an infinite basis set, with no additional approximations. If the HF limit
is achieved, then the energy error associated with the HF approximation for a given system,
the electron correlation energy Ecorr, can be determined as
$$E_{\mathrm{corr}} = E - E_{\mathrm{HF}}$$
Along the way, it became clear that HF energies and other properties could be chemically
useful.
Basis sets
The basis set is the set of mathematical functions from which the wave function is
constructed. Each MO in HF theory is expressed as a linear combination of basis functions,
the coefficients of which are determined from the iterative solution of the HF SCF equations.
The full HF wave function is expressed as a Slater determinant formed from the individual
occupied MOs. In principle, the HF limit is achieved by use of an infinite basis set; in
practice, however, one cannot make use of an infinite basis set. Thus, much work has gone into
identifying mathematical functions that allow wave functions to approach the HF limit
arbitrarily closely in as efficient a manner as possible. Three considerations make this
process efficient: (i) the number of two-electron integrals increases as N^4, so we need to keep
the total number of basis functions to a minimum; (ii) it is useful to choose basis set
functional forms that permit the integrals to be evaluated in a computationally efficient fashion: a
larger basis set can still represent a computational improvement over a smaller basis set if
evaluation of the greater number of integrals can be carried out faster; (iii) the basis functions
must be chosen to have a form that is useful in a chemical sense – the functions should have
large amplitude in regions of space where the electron probability density (the wave
function) is also large and vice versa.
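A rough feel for consideration (i) can be obtained by simply counting integrals. The sketch below is a minimal illustration, not part of the original notes: it compares the naive N^4 count with the number of symmetry-unique integrals (ij|kl), assuming real basis functions so that the usual eightfold permutational symmetry applies. The function name is made up for this example.

```python
def two_electron_integral_counts(n_basis):
    """Return (naive, unique) counts of two-electron integrals (ij|kl).

    naive  : every index combination, N^4
    unique : distinct integrals under the eightfold permutational symmetry
             of real basis functions (an assumption of this sketch)
    """
    naive = n_basis ** 4
    n_pairs = n_basis * (n_basis + 1) // 2   # unique index pairs (ij)
    unique = n_pairs * (n_pairs + 1) // 2    # unique pairs of pairs (ij|kl)
    return naive, unique


for n in (10, 100, 1000):
    naive, unique = two_electron_integral_counts(n)
    print(f"N = {n:5d}: N^4 = {naive:.3e}, unique = {unique:.3e}")
```

Symmetry removes roughly a factor of eight, but the N^4 growth itself remains, which is why the size of the basis set matters so much.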
Functional forms
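[Figure: plot with a horizontal axis running from 0 to 4; the formula and labels are not recoverable from the source.]
The decay of GTOs is exponential in r^2 [remainder of sentence missing in the source].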
The optimum combination of speed and accuracy – STO-3G. The old notation (3s)/[1s] (for
H atom) is still used sometimes. In parentheses: the number and type of primitive functions;
in brackets: the number and type of contracted functions.
For molecules containing H and first-row elements – notation is (6s3p/3s)/[2s1p/1s]. For
higher rows than H and He, the exponents used for the primitive Gaussians in the s and p
contractions can be the same (then the radial parts of all two-electron integrals are identical
irrespective of whether they are (ss|ss), (ss|sp), (ss|pp), (sp|sp), etc.). The shapes of s- and p-
type functions are different, so the contraction coefficients are not identical. Such basis
functions are sometimes called sp basis functions.
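The (3s)/[1s] contraction for hydrogen can be made concrete with a short sketch: three primitive s-type Gaussians are combined with fixed coefficients into one contracted basis function. The exponents and coefficients below are the commonly tabulated STO-3G values for the hydrogen 1s function; they are not given in these notes and are quoted here as an assumption. The function names are illustrative only.

```python
import math

# Commonly tabulated STO-3G hydrogen 1s parameters (assumed, not from the notes)
EXPONENTS = (3.42525091, 0.62391373, 0.16885540)
COEFFS = (0.15432897, 0.53532814, 0.44463454)


def primitive_s(alpha, r):
    """Normalized s-type primitive Gaussian evaluated at distance r (bohr)."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)


def contracted_s(r):
    """Contracted function: fixed linear combination of the three primitives."""
    return sum(c * primitive_s(a, r) for c, a in zip(COEFFS, EXPONENTS))


def self_overlap():
    """Analytic overlap of the contracted function with itself."""
    total = 0.0
    for ci, ai in zip(COEFFS, EXPONENTS):
        for cj, aj in zip(COEFFS, EXPONENTS):
            # overlap of two normalized s Gaussians on the same center
            total += ci * cj * (2.0 * math.sqrt(ai * aj) / (ai + aj)) ** 1.5
    return total


print(f"contracted function at r = 0: {contracted_s(0.0):.4f}")
print(f"self-overlap: {self_overlap():.5f}")
```

The self-overlap comes out very close to 1, showing that the contraction coefficients are chosen so that the single contracted function is normalized.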
Single-ζ, double-ζ, and split-valence basis sets
The STO-3G basis set is known as a ‘single-ζ’ or minimal basis set – one and only one
basis function is defined for each type of core or valence orbital. For H and He, there is only
a 1s function. For Li to Ne, there are five functions, 1s, 2s, 2px, 2py, and 2pz. For Na to Ar, 3s,
3px, 3py, and 3pz are added to the second-row set (9 functions). This number is the absolute
minimum required, nowhere near the infinite basis set limit.
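The counting rule above is easy to apply to a whole molecule. The following minimal sketch uses only the per-atom counts stated in the text (1, 5, and 9 functions); the helper name and the way the molecule is written as a list of element symbols are choices made for this example.

```python
# Minimal (single-zeta) basis functions per atom, as stated in the text
MINIMAL_BASIS_SIZE = {}
for symbol in ("H", "He"):
    MINIMAL_BASIS_SIZE[symbol] = 1      # 1s
for symbol in ("Li", "Be", "B", "C", "N", "O", "F", "Ne"):
    MINIMAL_BASIS_SIZE[symbol] = 5      # 1s, 2s, 2px, 2py, 2pz
for symbol in ("Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar"):
    MINIMAL_BASIS_SIZE[symbol] = 9      # adds 3s, 3px, 3py, 3pz


def minimal_basis_count(atoms):
    """Total number of minimal-basis functions for a list of element symbols."""
    return sum(MINIMAL_BASIS_SIZE[symbol] for symbol in atoms)


water = ["O", "H", "H"]
benzene = ["C"] * 6 + ["H"] * 6
print(minimal_basis_count(water))    # 5 + 1 + 1 = 7
print(minimal_basis_count(benzene))  # 6*5 + 6*1 = 36
```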
One way to increase the flexibility of a basis set – to ‘decontract’ it. We may take the
STO-3G basis set, and instead of constructing each basis function as a sum (linear
combination) of three Gaussians, we could construct two basis functions for each AO, the
first being a contraction of the first two primitive Gaussians and the second the normalized
third primitive. This would not double the size of our basis set – we would have to evaluate
all the same individual integrals as before – but the size of our secular equation would
increase. A basis set with two functions for each AO – ‘double-ζ’ basis. We could decontract
further and treat each primitive as a full-fledged basis function – a ‘triple-ζ’ basis, and we
could then decide to add more functions indefinitely creating higher and higher multiple-ζ
basis sets. These increasingly large basis sets must come closer and closer to the HF limit.
Valence orbitals can vary widely as a function of chemical bonding. Atoms bonded to
significantly more electronegative elements take on partial positive charge from loss of
valence electrons – their remaining density is distributed more compactly. The reverse is true
when the bonding is to a more electropositive element. From a chemical standpoint, there is
more to be gained by having flexibility in the valence basis functions than in the core – the
development of ‘split-valence’ basis sets: core orbitals continue to be represented by a single
(contracted) basis function but valence orbitals are split into arbitrarily many functions.
The most widely used split-valence basis sets (Pople’s group): 3-21G, 6-31G, 6-311G.
The first number indicates the number of primitives used in the contracted core functions.
The numbers after the hyphen indicate the number of primitives used in the valence functions
– if there are two such numbers, it is a valence-double-ζ basis, if there are three,
valence-triple-ζ. How should one choose exponents and coefficients for the contracted
functions? As the basis is no longer minimal, there is no particular advantage in fitting to
STOs. Pople and co-workers used the variational principle. A test set of atoms and/or molecules was
established, and exponents and coefficients were optimized so as to give the minimum
energy over the test set. The name of a basis set refers to its contraction scheme and a list of
all of its exponents and coefficients for each atom.
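The naming convention for the Pople split-valence sets can be read off mechanically. The sketch below is an illustration only: it handles bare names such as 3-21G, 6-31G, and 6-311G (no diffuse or polarization suffixes), and the function names are invented for this example. The basis function count assumes a first-row atom with one core orbital and four valence orbitals (2s, 2px, 2py, 2pz).

```python
import re


def parse_pople_name(name):
    """Split a bare Pople name such as '6-31G' into
    (primitives per core function, [primitives per valence function, ...])."""
    match = re.fullmatch(r"(\d)-(\d+)G", name)
    if match is None:
        raise ValueError(f"not a bare Pople split-valence name: {name!r}")
    core = int(match.group(1))
    valence = [int(digit) for digit in match.group(2)]
    return core, valence


def valence_zeta(name):
    """2 for valence-double-zeta, 3 for valence-triple-zeta, and so on."""
    return len(parse_pople_name(name)[1])


def basis_functions_first_row(name):
    """Contracted functions on a first-row atom: 1 core + 4 valence AOs per split."""
    return 1 + 4 * valence_zeta(name)


for name in ("3-21G", "6-31G", "6-311G"):
    print(name, parse_pople_name(name), basis_functions_first_row(name))
```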
Modern basis sets – ‘correlation-consistent’ basis sets of Dunning: cc-pVDZ, cc-pVTZ,
cc-pVQZ, cc-pV5Z – the exponents and contraction coefficients were variationally optimized
not only for HF calculations, but also including electron correlation.
Polarization functions
We should remember about the distinction between atomic orbitals and basis functions in
molecular calculations! Example: the inversion barrier for interconversion between equivalent
pyramidal minima in NH3 is 5.8 kcal/mol. However, an HF calculation with the equivalent of an
infinite, atomic centered basis set of s and p functions predicts the geometry to be planar!
The problem with the calculation is that s and p functions centered on the atoms do not
provide sufficient mathematical flexibility to adequately describe the wave function for the
pyramidal geometry, even though the atoms nitrogen and hydrogen can individually be
reasonably well described entirely by s and p functions. The molecular orbitals, which are
eigenfunctions of a Schrödinger equation involving multiple nuclei at various positions in
space, require more mathematical flexibility than do the atoms. This flexibility is added in
the form of basis functions with one quantum number higher angular momentum than the
valence orbitals. For a first-row atom, the most useful polarization functions are d GTOs;
for hydrogen, p GTOs. Adding d functions to the nitrogen basis set causes HF theory to predict
correctly a pyramidal minimum for ammonia.
[Figure: panels labeled HF/6-31G, HF/6-31G(d,p), HF/6-311G, and HF/6-31++G.]
Effective core potentials
Very heavy elements pose a challenge for MO theory – they have large numbers of
electrons – a large number of basis functions are required to describe them. These extra
electrons are mostly core electrons. A radical solution for this problem – replace the core
electrons with analytical functions that would reasonably accurately, and much more
efficiently, represent the combined nuclear-electronic core to the remaining electrons. These
functions are called effective core potentials (ECPs) or pseudopotentials. In ab initio theory,
ECPs represent not only Coulomb repulsion effects but also adherence to the Pauli principle
(atomic orbitals must be orthogonal to core orbitals having the same angular momentum).
The core electrons in very heavy elements reach velocities near the speed of light – they
manifest relativistic effects. A non-relativistic Hamiltonian is incapable of accounting for such
effects, which can be significant to many chemical properties. As an ECP represents the
behavior of an atomic core, relativistic effects can be included in it.
How many electrons to include in the core? ‘Large-core’ ECPs include everything but
the outermost (valence) shell; ‘small-core’ ECPs also leave the next lower shell out of the core. Polarization of
the sub-valence shell can be chemically important in heavier metals – it is usually worth the
extra cost to explicitly include that shell in the calculations. The most robust ECPs for the
elements Sc-Zn, Y-Cd, and La-Hg employ [Ne], [Ar], and [Kr] cores, respectively.
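The saving from an ECP is simple bookkeeping: the electrons in the chosen core are removed from the explicit calculation. The sketch below illustrates this for the noble-gas cores named in the text; the dictionary and function names are invented for this example, and the atoms shown (Fe, Pd) are illustrative choices.

```python
# Number of electrons replaced by each noble-gas core
CORE_ELECTRONS = {"[Ne]": 10, "[Ar]": 18, "[Kr]": 36}


def explicit_electrons(atomic_number, core):
    """Electrons that remain to be treated explicitly for a neutral atom."""
    n_core = CORE_ELECTRONS[core]
    if n_core >= atomic_number:
        raise ValueError("core is larger than the atom")
    return atomic_number - n_core


# Fe (Z = 26): small core [Ne] versus large core [Ar]
print(explicit_electrons(26, "[Ne]"))  # 16: 3s, 3p, 3d, and 4s electrons kept
print(explicit_electrons(26, "[Ar]"))  # 8: only 3d and 4s electrons kept
# Pd (Z = 46): small core [Ar] versus large core [Kr]
print(explicit_electrons(46, "[Ar]"))  # 28
print(explicit_electrons(46, "[Kr]"))  # 10
```

The small-core choice keeps the sub-valence shell explicit, at the cost of eight more electrons in these examples.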
Symmetry
The presence of symmetry in a molecule can be used to great advantage in electronic
structure calculations and the advantages of symmetry are primarily associated with
computational efficiency. The most obvious advantage – in geometry optimization. The
presence of symmetry elements removes some of the 3N – 6 degrees of molecular freedom.
Consider C6H6 (benzene) as an example. 12 atoms – 30 degrees of freedom. If we restrict
ourselves to D6h symmetry – only two degrees of freedom (C-C and C-H bond lengths or O-C
and O-H distances, where O is the center of the benzene ring). Symmetry is also very useful
in several aspects of solving the SCF equations - it simplifies evaluation of one- and two-
electron integrals and also reduces the dimension of the secular equation, as diagonalization
of a block diagonal matrix can be accomplished by separate diagonalization of each block.
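Both savings mentioned above can be estimated with a few lines. The sketch below counts internal degrees of freedom (the 3N – 5 rule for linear molecules is added here and is not in the notes) and compares the cost of diagonalizing separate symmetry blocks with that of the full matrix, assuming the usual cubic scaling of dense diagonalization. The block sizes are made-up illustrative numbers.

```python
def internal_degrees_of_freedom(n_atoms, linear=False):
    """3N - 6 internal coordinates (3N - 5 for a linear molecule)."""
    if n_atoms < 2:
        raise ValueError("need at least two atoms")
    return 3 * n_atoms - (5 if linear else 6)


def block_diagonalization_cost_ratio(block_sizes):
    """Cost of diagonalizing each block separately relative to the full
    matrix, assuming cost grows as the cube of the matrix dimension."""
    full = sum(block_sizes) ** 3
    blocked = sum(size ** 3 for size in block_sizes)
    return blocked / full


print(internal_degrees_of_freedom(12))                      # benzene: 30
print(block_diagonalization_cost_ratio([10, 10, 10, 10]))   # 0.0625
```

For benzene in D6h symmetry, the 30 internal coordinates collapse to the two independent distances mentioned in the text.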
Sometimes symmetry is not employed deliberately. Why? This reflects a reluctance to
work on a reduced dimensionality PES because minima on that PES may not be minima on
the full PES. However, the best way to evaluate the nature of a stationary point, irrespective
of whether it was located using symmetry or not, is to carry out a calculation of the
full-dimensional Hessian matrix and inspect the number of negative diagonal force constants (eigenvalues): all
force constants are positive – local minimum; one is negative – transition state, etc. Higher
symmetry calculations are more efficient in any case – impose symmetry at the start, release
it if the number of negative eigenvalues in Hessian is not correct. Symmetry constraints must
arise from molecular symmetry, not an erroneous idea of local symmetry. For example, the
three C-H bonds of a methyl group should not be constrained to have the same length
unless they are truly symmetrically related by a molecular threefold symmetry axis.
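The eigenvalue test for a stationary point amounts to counting negative force constants. A minimal sketch, assuming the Hessian eigenvalues are already available with translations and rotations removed; the function name, the tolerance, and the sample eigenvalues are illustrative, not taken from any program or calculation.

```python
def classify_stationary_point(eigenvalues, tol=1e-6):
    """Classify a stationary point from its Hessian eigenvalues
    (translational and rotational modes assumed already projected out)."""
    n_negative = sum(1 for value in eigenvalues if value < -tol)
    if n_negative == 0:
        return "local minimum"
    if n_negative == 1:
        return "transition state"
    return f"higher-order saddle point ({n_negative} negative eigenvalues)"


# Hypothetical eigenvalues, arbitrary units
print(classify_stationary_point([0.12, 0.35, 0.80]))     # local minimum
print(classify_stationary_point([-0.05, 0.35, 0.80]))    # transition state
print(classify_stationary_point([-0.05, -0.02, 0.80]))   # higher-order saddle
```

If a symmetry-constrained optimization is expected to give a minimum but the count of negative eigenvalues is not zero, the constraint should be released and the optimization repeated, as described above.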
Efficiency of implementation and use
Formally, HF theory scales as N^4, where N is the number of basis functions. However, in
practice, the situation is rarely so severe, and even linear scaling HF implementations have
begun to appear. The first issue in implementation – how to calculate the two-electron
integrals efficiently. The most straightforward way – to compute every single integral and, as
it is computed, write it to storage – then, as the Fock matrix is assembled element by element,
call back the computed values whenever they are required (most of the integrals are required
several times). In practice, this approach is useful only when the time required to write to
and read from storage is very short. Otherwise, modern processors can recompute the integral
from scratch faster than modern hardware can recover the previously computed value from
disk storage.
The process of computing each integral as it is needed – ‘direct SCF’. Only when all
integrals can be stored in memory itself (not on an external storage device) is the access
time short enough for the ‘traditional’ method to be preferable to direct SCF. For large
systems, it is possible to estimate upper bounds for two-electron integrals efficiently. If the
upper bound for an integral is negligibly small, there is no point in evaluating it, and it is
set to zero. There are a lot of very small integrals in large systems, which can be neglected,
so the scaling of HF theory improves. Fast-multipole methods can also be used to reduce the
scaling of Coulomb integral evaluation.
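The effect of integral screening can be illustrated with a toy model. The notes do not say which upper bound is used; the sketch below assumes the common Cauchy-Schwarz bound |(ij|kl)| <= Q_ij Q_kl with Q_ij = sqrt((ij|ij)). The Q values are not computed from real integrals: they are modeled as exp(-decay * |i - j|) for basis functions placed along a chain, purely to show how the fraction of surviving integrals falls as the system grows.

```python
import math


def count_surviving_quartets(n_basis, decay=1.0, threshold=1e-10):
    """Count index quartets (ij|kl) whose Schwarz-type bound Q_ij * Q_kl
    is at least `threshold`, using a toy model for Q_ij."""
    # Toy bound factors: overlap-like decay with index separation (assumption)
    pair_bounds = [
        math.exp(-decay * abs(i - j))
        for i in range(n_basis)
        for j in range(n_basis)
    ]
    kept = sum(
        1
        for q_ij in pair_bounds
        for q_kl in pair_bounds
        if q_ij * q_kl >= threshold
    )
    return kept, n_basis ** 4


for n in (10, 20, 30):
    kept, total = count_surviving_quartets(n)
    print(f"N = {n:3d}: kept {kept} of {total} ({kept / total:.1%})")
```

In this toy model the kept fraction drops as N increases, which is the qualitative reason the effective scaling of HF falls below N^4 for large systems.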
Efficiency in converging the SCF for systems with large basis sets can be enhanced by
using as an initial guess the converged wave function from a calculation with a smaller basis
set or with a less negative charge. The same can be applied to geometry optimization – it is
often very helpful to optimize the geometry first at a more efficient level of theory. This is
true not only because the geometry optimized with the lower level is a good place to start for
the higher level, but also because one can compute the force constants at the lower level and
use them as an initial guess for the higher level Hessian matrix.
Taking advantage of molecular symmetry can provide very large savings in time.
However, structures optimized under the constraints of symmetry should always be checked
by computation of force constants to verify their nature as stationary points on the full PES.
Also, it is worthwhile to verify that open-shell wave functions for symmetric molecules are
stable with respect to orbital changes that would generate other electronic states. Finally, the
use of ECP basis sets for heavy elements improves efficiency by reducing the scale of the
electronic structure problem, and relativistic effects can also be accounted for.
General Performance Overview of Ab Initio HF Theory
Energetics
HF ignores correlation and, because in its ab initio formulation (as opposed to
semiempirical) no attempt is made to correct for this deficiency, HF theory cannot
realistically be used to compute heats of formation. Examination of the atomization energies
of 66 small molecules at the HF level using the aug-cc-pVnZ basis sets gave mean absolute
errors of 85, 66, and 62 kcal/mol with n = D, T, and Q, respectively.
Thus, even as one approaches the HF limit, the intrinsic error in an absolute molecular
energy calculation can be very large. In general, the energy associated with any process
involving a change in the total number of paired electrons is very poorly predicted at the HF
level because electron correlation is not accounted for. Even if the number of paired electrons
does not change but the nature of the bonds is substantially changed, the HF level can show
large errors. Example: the atmospheric CO + OH → CO2 + H reaction is known to be
exoergic with the energy change of -23 kcal/mol. The HF level of theory using the STO-3G,
3-21G, 6-31G(d,p), and near-infinite quality basis sets predicts energy changes of +34.1,
+3.1, -5.8, and –7.6 kcal/mol, respectively.
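The size of the errors in the CO + OH example is easier to see when tabulated against the experimental value. The short sketch below uses only the numbers quoted above; the variable and function names are chosen for this example.

```python
EXPERIMENT = -23.0  # kcal/mol, CO + OH -> CO2 + H

# HF reaction energies quoted in the text, kcal/mol
HF_PREDICTIONS = {
    "STO-3G": 34.1,
    "3-21G": 3.1,
    "6-31G(d,p)": -5.8,
    "near-infinite basis": -7.6,
}


def reaction_energy_errors():
    """Signed error (HF minus experiment) for each basis set, in kcal/mol."""
    return {
        basis: round(value - EXPERIMENT, 1)
        for basis, value in HF_PREDICTIONS.items()
    }


for basis, error in reaction_energy_errors().items():
    print(f"{basis:>20s}: error = {error:+.1f} kcal/mol")
```

Even near the basis set limit, the error is about 15 kcal/mol, so improving the basis alone does not repair the missing correlation energy.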
For comparison, MNDO, AM1, and PM3 calculations for the same set had mean
absolute errors of 9.1, 7.4, and 5.8 kcal/mol, and maximal errors of 42, 24, and 23 kcal/mol.
The situation improves for conformational changes: 35 different conformational energy
differences in organic molecules with the average energy difference between conformers of
1.6 kcal/mol – at the HF/6-31+G(d,p)//HF/6-31G(d) level, the RMS error in predicted
differences was 0.6 kcal/mol. The simplest of conformational changes – rotation about a
single bond. For eight rotations about HmX-YHn single bonds (X, Y = B, C, N, O, Si, S, P),
Hehre et al. found mean absolute errors of 0.6, 0.6, and 0.3 kcal/mol at the HF/STO-3G,
HF/3-21G*, and HF/6-31G* levels, respectively.
Although HF theory poorly computes most reaction energies, because of the substantial
electron correlation effects due to making/breaking bonds, it is reasonably accurate for
predicting protonation/deprotonation energies. The proton carries with it no electrons – these
reactions are less sensitive to differential electron correlation in reactants and products. If
basis sets of polarized valence-double-ζ quality or better are used, absolute proton affinities
of neutral molecules typically have accuracy of better than 5%. Errors increase if the cations
are non-classical (with bridging protons) – such structures tend to be found as minima only
after accounting for electron correlation effects. Deprotonation energies of neutral
compounds are computed with similar absolute accuracy (±8 kcal/mol) if diffuse functions
are included in the basis set to balance the description of the anion. Otherwise, very large
errors can occur.
Another fairly conservative reaction – the removal or attachment of a single electron
from/to a molecule. Koopmans’ theorem – the energy of the HOMO equals the negative of the
IP.
This approximation ignores the effect of electronic relaxation in the ionized product – the
degree to which the remaining electrons redistribute themselves following the detachment of
one from the HOMO. Alternatively, we can calculate the IP as the difference in HF energies
for the closed-shell neutral and the open-shell product, so-called ΔSCF IP:
$$\mathrm{IP}_{\Delta\mathrm{SCF}} = E_{\mathrm{HF}}(\mathrm{A}^{+\bullet}) - E_{\mathrm{HF}}(\mathrm{A})$$
Here, orbital relaxation is included. Including relaxation results in a smaller predicted IP
because relaxation lowers the energy of the cation radical relative to the neutral. However,
we have to remember that the neutral species has one more electron and therefore there will
be larger electron correlation effect. Ignoring these effects by employing HF theory, we
destabilize the neutral more than the radical cation and the IP will be underestimated. Thus,
Koopmans’ theorem benefits from a cancellation of errors: the orbital relaxation and the
electron correlation effects may cancel each other. In practice, the cancellation can be
remarkably good; Koopmans’ theorem IPs are often within 0.3 eV or so of experiment if
basis sets of polarized valence-double-ζ quality or better are used in the HF calculation.
Koopmans’ theorem can be formally applied to electron affinities (EAs) as well – the EA can
be taken to be the negative of the orbital energy of the lowest unoccupied (virtual) orbital (LUMO). In this
case, however, relaxation and correlation effects both favor the radical anion; rather than
canceling, the errors are additive, and Koopmans’ theorem estimates will almost always
underestimate the EA. It is generally better to compute EAs from a ΔSCF approach
whenever possible.
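The two routes to IPs and EAs differ only in which numbers are taken from the HF output. The sketch below is a minimal illustration with energies in hartree and results in eV; the function names are invented here, the sign convention takes the EA as positive when the anion is lower in energy than the neutral, and the input values in the usage lines are hypothetical, not results of any calculation.

```python
HARTREE_TO_EV = 27.211386  # conversion factor


def koopmans_ip(homo_energy):
    """Koopmans' estimate: IP is the negative of the HOMO energy."""
    return -homo_energy * HARTREE_TO_EV


def koopmans_ea(lumo_energy):
    """Koopmans' estimate: EA is the negative of the LUMO energy."""
    return -lumo_energy * HARTREE_TO_EV


def delta_scf_ip(e_neutral, e_cation):
    """Delta-SCF IP: separate HF energies of the cation and the neutral."""
    return (e_cation - e_neutral) * HARTREE_TO_EV


def delta_scf_ea(e_neutral, e_anion):
    """Delta-SCF EA: positive when the anion lies below the neutral."""
    return (e_neutral - e_anion) * HARTREE_TO_EV


# Hypothetical inputs in hartree, for illustration only
print(f"Koopmans IP:  {koopmans_ip(-0.50):.2f} eV")
print(f"Delta-SCF IP: {delta_scf_ip(-76.00, -75.60):.2f} eV")
```

The Koopmans route needs only the neutral calculation, while the ΔSCF route needs a second, open-shell calculation on the ion.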
Can we use HF theory to model systems where two or more molecules are in contact, held
together by non-bonded interactions? Such interactions include electrostatic interactions
between permanent and induced charge distributions, dispersion, and hydrogen bonding.
HF theory is formally incapable of modeling dispersion because it is entirely a
consequence of electron correlation. Nevertheless, bimolecular interaction energies are often
reasonably well predicted by HF theory with basis sets like 6-31G(d) or similar – this again
reflects a cancellation of errors. Failure to account for dispersion – strongly reduces
intermolecular interactions, so the remaining errors must be in the direction of overbinding.
Two main contributions to overbinding: (i) HF charge distributions tend to be overpolarized,
which increases electrostatic interactions; (ii) a technical effect called ‘basis set superposition
error’ (BSSE). If we consider a bimolecular interaction, the HF interaction energy is
$$\Delta E_{\mathrm{bind}} = E_{\mathrm{HF}}^{a \cup b}(\mathrm{A \bullet B}) - E_{\mathrm{HF}}^{a}(\mathrm{A}) - E_{\mathrm{HF}}^{b}(\mathrm{B})$$
If a and b are not both infinite basis sets, there are more basis functions employed in the
calculation of the complex than in either of the monomers. The greater flexibility of the basis
set for the complex can provide an artificial lowering of the energy when one of the
monomers ‘borrows’ basis functions of the other to improve its own wave function. One
method to correct for BSSE – counterpoise (CP) correction. The CP corrected interaction
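energy is
$$
\begin{aligned}
\Delta E_{\mathrm{bind}}^{\mathrm{CP}} ={}& E_{\mathrm{HF}}^{a \cup b}(\mathrm{A \bullet B})_{\mathrm{A \bullet B}} - E_{\mathrm{HF}}^{a \cup b}(\mathrm{A})_{\mathrm{A \bullet B}} - E_{\mathrm{HF}}^{a \cup b}(\mathrm{B})_{\mathrm{A \bullet B}} \\
&+ \left[ E_{\mathrm{HF}}^{a}(\mathrm{A})_{\mathrm{A \bullet B}} - E_{\mathrm{HF}}^{a}(\mathrm{A})_{\mathrm{A}} \right] + \left[ E_{\mathrm{HF}}^{b}(\mathrm{B})_{\mathrm{A \bullet B}} - E_{\mathrm{HF}}^{b}(\mathrm{B})_{\mathrm{B}} \right]
\end{aligned}
$$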
The subscript after the molecular species describes the geometry employed. In the first
line, the energy of bringing two monomers together, each monomer already having the
geometry it has in the complex, is computed using a consistent basis set – in the monomer
calculations, basis functions for the missing partner are included, even though the nuclei on
which those functions are centered are not actually there (ghost basis functions).
The ghost functions lower the energies of the monomers – the overall binding energy is less.
The second line – the energy required to distort each monomer from its preferred equilibrium
structure to the structure found in the complex – the geometry distortion energies are
computed using only the nuclei-centered monomer basis set.
The borrowing of basis functions is only partly a mathematical artifact. To the extent
that some charge transfer and charge polarization indeed take place as part of forming the
bimolecular complex, some of the borrowing simply reflects chemical reality. Thus, CP
correction always overestimates BSSE.
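The counterpoise recipe described above needs seven HF energies: one for the complex, two monomer energies with ghost functions, and four monomer energies in the monomer's own basis (two at the complex geometry, two at the relaxed geometry). The sketch below only does the bookkeeping; the argument names are invented for this example and the numbers in the usage lines are hypothetical, not computed energies.

```python
def raw_binding_energy(e_complex, e_a_relaxed, e_b_relaxed):
    """Uncorrected binding energy: complex minus relaxed monomers,
    each monomer in its own basis set."""
    return e_complex - e_a_relaxed - e_b_relaxed


def cp_binding_energy(e_complex, e_a_ghost, e_b_ghost,
                      e_a_distorted, e_a_relaxed,
                      e_b_distorted, e_b_relaxed):
    """Counterpoise-corrected binding energy.

    e_a_ghost, e_b_ghost : monomers at the complex geometry, full basis a+b
    e_*_distorted        : monomers at the complex geometry, own basis
    e_*_relaxed          : monomers at their own geometry, own basis
    """
    interaction = e_complex - e_a_ghost - e_b_ghost
    distortion = (e_a_distorted - e_a_relaxed) + (e_b_distorted - e_b_relaxed)
    return interaction + distortion


# Hypothetical energies in arbitrary units, for illustration only
energies = dict(e_complex=-3.00, e_a_ghost=-1.10, e_b_ghost=-1.20,
                e_a_distorted=-1.00, e_a_relaxed=-1.05,
                e_b_distorted=-1.15, e_b_relaxed=-1.18)
cp = cp_binding_energy(**energies)
raw = raw_binding_energy(energies["e_complex"],
                         energies["e_a_relaxed"], energies["e_b_relaxed"])
print(f"raw = {raw:.2f}, CP-corrected = {cp:.2f}, BSSE estimate = {cp - raw:.2f}")
```

Because the ghost functions can only lower the monomer energies, the CP-corrected value is less binding than the raw value, and the difference is the BSSE estimate.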
Geometries
For minimum energy structures, HF geometries are usually very good when using basis
sets of relatively modest size. For basis sets of polarized valence-double-ζ quality, errors in
bond lengths between heavy atoms average about 0.03 Å, and between heavy atoms and H
about 0.015 Å. Bond angles are predicted to an average accuracy of about 1.5º and dihedral
angles are also generally well predicted.
Where do we expect errors? HF theory tends to overemphasize occupation of bonding
orbitals – errors tend to be in the direction of predicting bonds to be too short, and this effect
becomes more pronounced as the basis set increases – the quality of HF calculations of
geometry may actually degrade with basis set improvement. Polarization functions are
absolutely required for geometric accuracy in systems with hypervalent bonding. In systems
crowding many pairs of non-bonding electrons into small regions of space (four oxygen lone
pairs in a peroxide) electron correlation effects on geometries can be large – HF geometries
may not be reliable.
Dative bonds (those where both electrons in the bonding pair formally come from only one
of the atoms) are often poorly described at the HF level. Example – B-C and B-N distances in
the complexes H3B-CO and H3B-NH3 are overestimated by ~0.1 Å at the HF/6-31G* level.
In the case of transition states (TSs), the failure of HF theory to account for electron
correlation can be more problematic, since correlation effects in partial bonds can be large.
Non-bonded complexes – the failure of HF theory to account for dispersion tends to make
such complexes too loose in structure – intermolecular distances are unrealistically large.
Hydrogen bonded structures are often quite good because errors in overestimating
electrostatic interactions cancel the failure to account for dispersion.
Charge distributions
HF dipole moments are usually insensitive to increases in basis set size beyond
valence-double-ζ. A systematic error in dipole moment – overestimation by 10-25%,
molecules are predicted to be too polar. 108 molecules were examined at the HF/6-31G(d,p)
level, and the mean absolute error was 0.23 D. Results are erratic with smaller basis sets, in
part due to lower quality wave functions and in part due to poorer geometries, which affect
the dipole moment.