

9. Wave-function based approach for the RDMF 145


elements of the one-particle reduced density matrix $\rho^{(1)}_{\alpha,\alpha}$. If a diagonal element is integer for some $\alpha$, i.e. $\rho^{(1)}_{\alpha,\alpha} \in \{0,1\}$, it follows that $\rho^{(1)}_{\alpha,\beta} = \delta_{\alpha,\beta}\,\rho^{(1)}_{\alpha,\alpha}$ for this $\alpha$. This can easily be seen from Gershgorin's circle theorem [Gershgorin, 1931], which confines every eigenvalue $f_\beta$ of the Hermitian matrix $\rho^{(1)}$ to at least one of the discs
\[
  \bigl| f_\beta - \rho^{(1)}_{\alpha,\alpha} \bigr| \le \sum_{\gamma \neq \alpha} \bigl| \rho^{(1)}_{\alpha,\gamma} \bigr| .
\]
Because all occupations satisfy $0 \le f_\beta \le 1$, an integer diagonal element is only compatible with vanishing off-diagonal elements in the corresponding row.

Thus, an integer diagonal element of the one-particle reduced density matrix directly implies the existence of an integer occupation. The converse is not true. With the decomposition of the wave function according to the separation of the one-particle basis into core, active and virtual states in Eq. (9.17), we only need to consider the active-space wave function $|\tilde\Psi_i\rangle$ for the constrained minimization of the density-matrix functional. We include the expectation values of the interaction between core states and valence states and thus account for core-valence exchange.

9.3. Solution of the minimization problem

9.3.1. Lagrange function

The constrained minimization problem for the density-matrix functional defined in Eq. (5.8) can be written as
\[
  F^{\hat W}_{\beta}[\rho^{(1)}] = \min_{\{P_i\},\{|\Psi_i\rangle\}} \Bigl[ \sum_i P_i \langle\Psi_i|\hat W|\Psi_i\rangle + \frac{1}{\beta} \sum_i P_i \ln P_i \Bigr] \tag{9.27}
\]
subject to the constraints
\[
  \sum_i P_i \langle\Psi_i|\hat c^\dagger_\alpha \hat c_\beta|\Psi_i\rangle = \rho^{(1)}_{\beta,\alpha}, \qquad \sum_i P_i = 1, \qquad P_i \ge 0, \qquad \langle\Psi_i|\Psi_i\rangle = 1 .
\]
This is a mixed equality-inequality constrained non-linear minimization problem. We can write it as an equality-constrained problem by introducing the auxiliary variables $x_i$ with
\[
  P(x_i) = \bigl(1 + \cos x_i\bigr)/2 . \tag{9.30}
\]
Thus, the constraint $P_i \ge 0$ can be replaced by the unconstrained variable $x_i$ and the minimization problem can be written in the Lagrange formalism as
\[
  F^{\hat W}_{\beta}[\rho^{(1)}] = \min_{\{x_i\},\{|\Psi_i\rangle\}} \operatorname{stat}_{\Lambda,\{\lambda_i\},\{h_{\alpha,\beta}\}} L\bigl(\{x_i\},\{|\Psi_i\rangle\},\Lambda,\{\lambda_i\},\{h_{\alpha,\beta}\}\bigr) . \tag{9.31}
\]

$\Lambda$, $\lambda_i$ and $h$ are the Lagrange multipliers for enforcing the equality constraints. For ease of notation, we define the Lagrange function
\[
  L\bigl(\{x_i\},\{|\Psi_i\rangle\},\Lambda,\{\lambda_i\},\{h_{\alpha,\beta}\}\bigr)
  = \sum_i P(x_i)\, \langle\Psi_i|\hat W|\Psi_i\rangle + \frac{1}{\beta} \sum_i P(x_i) \ln P(x_i)
  - \sum_{\alpha,\beta} h_{\alpha,\beta} \Bigl[ \sum_i P(x_i)\, \langle\Psi_i|\hat c^\dagger_\alpha \hat c_\beta|\Psi_i\rangle - \rho^{(1)}_{\beta,\alpha} \Bigr]
  - \Lambda \Bigl[ \sum_i P(x_i) - 1 \Bigr]
  - \sum_i \lambda_i \bigl[ \langle\Psi_i|\Psi_i\rangle - 1 \bigr] . \tag{9.32}
\]
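The effect of the reparametrization in Eq. (9.30) can be checked numerically. The following minimal Python sketch (the function name is ours, not from any implementation) confirms that $P(x)$ stays inside $[0,1]$ for any unconstrained $x$, so the inequality constraint never has to be enforced explicitly:

```python
import math

def P(x):
    """Reparametrized weight of Eq. (9.30): P(x) = (1 + cos x) / 2."""
    return 0.5 * (1.0 + math.cos(x))

# P maps any unconstrained x into [0, 1], so the inequality constraint
# P_i >= 0 never has to be enforced explicitly during the minimization.
samples = [P(0.1 * k) for k in range(-100, 101)]
assert all(0.0 <= p <= 1.0 for p in samples)
assert abs(P(0.0) - 1.0) < 1e-12   # x = 0 corresponds to full weight
assert abs(P(math.pi)) < 1e-12     # x = pi corresponds to zero weight
```

The price of the substitution is that the minimization over $x_i$ is no longer convex in general, but the feasible set of the weights is built into the parametrization.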

The equality-constrained non-linear minimization problem of Eq. (9.27) or Eq. (9.31) includes some important conceptual and practical challenges. One can imagine two different classes of iterative minimization algorithms: one that enforces the constraints in every minimization step and one that does not require the fulfillment of the constraints in every step. The first variant can suffer from the so-called Maratos effect [Maratos, 1978].

The Maratos effect means that the algorithm can fail to rapidly converge to the solution because steps that would make good progress would violate the constraints.

The derivatives of the density-matrix functional with respect to the one-particle reduced density matrix,
\[
  \frac{\partial F^{\hat W}[\rho^{(1)}]}{\partial \rho^{(1)}_{\alpha,\beta}} , \tag{9.33}
\]
are required for the efficient minimization of the total energy. Ignoring mathematical peculiarities for a moment, the derivatives can be obtained from the Lagrange multipliers $h$ of the density-matrix constraint as
\[
  \frac{\partial F^{\hat W}[\rho^{(1)}]}{\partial \rho^{(1)}_{\alpha,\beta}} = h_{\beta,\alpha} . \tag{9.34}
\]
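The relation in Eq. (9.34) is an instance of the envelope theorem: the derivative of a constrained minimum with respect to the constraint value equals the corresponding Lagrange multiplier. A small toy illustration of this fact (the functions $f$ and $c$ below are made up for the sketch and have nothing to do with the density-matrix functional; scipy's SLSQP solves the constrained problem):

```python
import numpy as np
from scipy.optimize import minimize

# Toy analogue of Eq. (9.34): F(r) = min_{x,y} f(x,y) subject to x + y = r.
def f(v):
    x, y = v
    return x**2 + 2.0 * y**2

def F(r):
    """Constrained minimum as a function of the constraint value r."""
    con = {"type": "eq", "fun": lambda v: v[0] + v[1] - r}
    return minimize(f, x0=[0.0, 0.0], constraints=[con]).fun

r = 0.7
# Analytic solution: x = 2r/3, y = r/3, multiplier lambda = 4r/3, F(r) = 2r^2/3.
lam = 4.0 * r / 3.0
# Envelope theorem: dF/dr equals the Lagrange multiplier of the constraint.
dF_dr = (F(r + 1e-4) - F(r - 1e-4)) / 2e-4
assert abs(dF_dr - lam) < 1e-3
```

The same bookkeeping, applied to the density-matrix constraint, yields the multiplier matrix $h$ as the gradient of the functional, which is exactly what makes the multipliers useful for total-energy minimization.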

However, this is only true if the Lagrange multipliers are unique. To simplify the notation for the investigation of the uniqueness of the Lagrange multipliers, we consider the density-matrix functional with only one many-particle wave function $|\Psi\rangle$, i.e. $P_1 = 1$. The derivative of the Lagrange function in Eq. (9.32) with respect to the many-particle wave function $|\Psi\rangle$ is
\[
  \frac{\partial L}{\partial \langle\Psi|} = \hat W |\Psi\rangle - \sum_{\alpha,\beta} h_{\alpha,\beta}\, \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle - \lambda |\Psi\rangle \tag{9.35}
\]
and has to vanish at the solution. Thus, $|\Psi\rangle$ is an eigenstate of the effective Hamiltonian
\[
  \hat W - \sum_{\alpha,\beta} h_{\alpha,\beta}\, \hat c^\dagger_\alpha \hat c_\beta \tag{9.36}
\]
with the eigenvalue $\lambda$. It is not necessarily the ground state.
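The structure of this effective Hamiltonian can be made concrete in a small sketch. For two spin-orbitals we build the annihilation operators as Jordan-Wigner matrices on the four-dimensional Fock space and diagonalize $\hat W - \sum_{\alpha\beta} h_{\alpha,\beta}\hat c^\dagger_\alpha \hat c_\beta$; the matrices $h$ and $\hat W$ below are arbitrary illustrations, not taken from the thesis:

```python
import numpy as np

# Two spin-orbitals, 4-dimensional Fock space, Jordan-Wigner representation.
I2 = np.eye(2)
ann = np.array([[0.0, 1.0], [0.0, 0.0]])   # single-mode annihilator
Z = np.diag([1.0, -1.0])                    # Jordan-Wigner sign string
c = [np.kron(ann, I2), np.kron(Z, ann)]     # c_0, c_1 (anticommuting)

h = np.array([[0.3, 0.1], [0.1, 0.7]])      # Hermitian multiplier matrix (made up)
n0, n1 = c[0].T @ c[0], c[1].T @ c[1]
W = 2.0 * n0 @ n1                           # simple density-density interaction (made up)

# Effective Hamiltonian of Eq. (9.36): W - sum_ab h_ab c_a^dagger c_b
Heff = W - sum(h[a, b] * c[a].T @ c[b] for a in range(2) for b in range(2))
evals, evecs = np.linalg.eigh(Heff)
psi = evecs[:, 1]                           # any eigenstate qualifies, not only the ground state
assert np.allclose(Heff @ psi, evals[1] * psi)
```

Every eigenstate of `Heff` satisfies the stationarity condition of Eq. (9.35) with its eigenvalue playing the role of $\lambda$, which is why the stationary point need not be the ground state.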


The derivatives of the density-matrix constraint,
\[
  \frac{\partial \langle\Psi| \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle}{\partial \langle\Psi|} = \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle , \tag{9.37}
\]
determine the uniqueness of the Lagrange multipliers. To illustrate the connection, we consider two matrices of Lagrange multipliers, $h$ and $\tilde h$, for the density-matrix constraint while all other variables are identical. We assume that the minimum condition is fulfilled for both matrices, i.e. Eq. (9.35) vanishes. Thus, we have the condition

\[
  \sum_{\alpha,\beta} \bigl( h_{\alpha,\beta} - \tilde h_{\alpha,\beta} \bigr)\, \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle = 0 . \tag{9.38}
\]
We show now that the uniqueness of the Lagrange multipliers, i.e. $h = \tilde h$, is not required to fulfill the minimum condition in Eq. (9.38). For this purpose, we consider the situation where two index pairs $(\alpha',\beta') \neq (\alpha'',\beta'')$ and $a \neq 0$ exist such that

\[
  \hat c^\dagger_{\alpha''} \hat c_{\beta''} |\Psi\rangle = a\, \hat c^\dagger_{\alpha'} \hat c_{\beta'} |\Psi\rangle \tag{9.39}
\]
with $\hat c^\dagger_{\alpha'} \hat c_{\beta'} |\Psi\rangle \neq 0$. All other $\hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle$ are assumed to be linearly independent of $\hat c^\dagger_{\alpha'} \hat c_{\beta'} |\Psi\rangle$. We show that Eq. (9.38) does not imply $h = \tilde h$ in this case. The condition in Eq. (9.38) reads

\[
  0 = \sum_{\substack{\alpha,\beta:\ (\alpha,\beta)\neq(\alpha',\beta') \\ \wedge\,(\alpha,\beta)\neq(\alpha'',\beta'')}} \bigl( h_{\alpha,\beta} - \tilde h_{\alpha,\beta} \bigr)\, \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle
  + \bigl( h_{\alpha',\beta'} - \tilde h_{\alpha',\beta'} \bigr)\, \hat c^\dagger_{\alpha'} \hat c_{\beta'} |\Psi\rangle
  + \bigl( h_{\alpha'',\beta''} - \tilde h_{\alpha'',\beta''} \bigr)\, \hat c^\dagger_{\alpha''} \hat c_{\beta''} |\Psi\rangle . \tag{9.40}
\]
With Eq. (9.39) we obtain

\[
  0 = \sum_{\substack{\alpha,\beta:\ (\alpha,\beta)\neq(\alpha',\beta') \\ \wedge\,(\alpha,\beta)\neq(\alpha'',\beta'')}} \bigl( h_{\alpha,\beta} - \tilde h_{\alpha,\beta} \bigr)\, \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle
  + \bigl( h_{\alpha',\beta'} - \tilde h_{\alpha',\beta'} + a\, h_{\alpha'',\beta''} - a\, \tilde h_{\alpha'',\beta''} \bigr)\, \hat c^\dagger_{\alpha'} \hat c_{\beta'} |\Psi\rangle . \tag{9.41}
\]
The linear independence of $\hat c^\dagger_{\alpha'} \hat c_{\beta'} |\Psi\rangle$ from the other $\hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle$ with $(\alpha,\beta) \neq (\alpha',\beta')$ and $(\alpha,\beta) \neq (\alpha'',\beta'')$ requires

\[
  h_{\alpha',\beta'} - \tilde h_{\alpha',\beta'} = -a \bigl( h_{\alpha'',\beta''} - \tilde h_{\alpha'',\beta''} \bigr) . \tag{9.42}
\]
Thus, the minimum condition in Eq. (9.38) can be fulfilled with $h \neq \tilde h$ if $a \neq 0$. More generally, the Lagrange multipliers are not unique if the gradients of the constraints

\[
  \frac{\partial \langle\Psi| \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle}{\partial \langle\Psi|} = \hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle \tag{9.43}
\]
are linearly dependent. The requirement that these constraint gradients be linearly independent is known as the linear-independence constraint qualification (LICQ). In other words, if the gradients of the constraints are linearly dependent, then the Lagrange multipliers do not have to correspond to the derivatives of the density-matrix functional. We have only observed this issue in evaluations of the density-matrix functional for highly symmetric one-particle reduced density matrices and highly symmetric interaction Hamiltonians. In those cases, the linear dependence can be removed by adding a very small perturbation to the one-particle reduced density matrix.
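Whether LICQ holds at a given $|\Psi\rangle$ can be tested numerically by collecting the constraint gradients as columns of a matrix and checking its rank; rank deficiency signals non-unique multipliers. A sketch with made-up vectors standing in for the states $\hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle$:

```python
import numpy as np

# Columns = constraint gradients; the vectors are invented for illustration.
rng = np.random.default_rng(0)
g1 = rng.normal(size=8)      # gradient for the index pair (alpha', beta')
g2 = 2.5 * g1                # gradient for (alpha'', beta''): a * g1, as in Eq. (9.39)
g3 = rng.normal(size=8)      # a linearly independent gradient

G = np.column_stack([g1, g2, g3])
rank = np.linalg.matrix_rank(G)
licq_holds = (rank == G.shape[1])
assert not licq_holds        # rank 2 < 3 constraints: multipliers are not unique
```

In practice such a rank test on the vectors $\hat c^\dagger_\alpha \hat c_\beta |\Psi\rangle$ would reveal the highly symmetric cases mentioned above before the multipliers are misinterpreted as derivatives.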

9.3.3. Powell-Hestenes augmented Lagrangian

We propose to use the Powell-Hestenes augmented Lagrangian [Powell, 1969; Hestenes, 1969] for the numerical solution of the constrained minimization problem of the density-matrix functional. Consider the generic equality-constrained minimization problem

\[
  \min_{\vec x} f(\vec x) \tag{9.44}
\]

with the constraints

\[
  c_i(\vec x) = 0 . \tag{9.45}
\]

The Lagrange function $L(\vec x, \{\lambda_i\})$ for this problem is
\[
  L(\vec x, \{\lambda_i\}) = f(\vec x) - \sum_i \lambda_i\, c_i(\vec x) . \tag{9.46}
\]
The augmented Lagrangian $L(\vec x, \{\lambda_i\}, \{\mu_i\})$ adds a penalty function $p$ (the augmentation) to the Lagrange function and is defined as
\[
  L(\vec x, \{\lambda_i\}, \{\mu_i\}) = f(\vec x) - \sum_i \lambda_i\, c_i(\vec x) + \frac{1}{2} \sum_i \mu_i\, p\bigl(c_i(\vec x)\bigr) . \tag{9.47}
\]
The $\mu_i$ are the penalty parameters. The penalty function $p(c)$ can be chosen rather freely, and the properties of penalty functions have been studied in the literature; for a review we refer the reader to [Nocedal and Wright, 2006]. We choose the quadratic penalty function $p(c) = c^2$, because it has smooth derivatives and does not involve the derivatives of the constraints. The augmented Lagrangian method maps the constrained minimization to a series of unconstrained minimization problems whose solutions converge to the solution of the constrained problem. A generic augmented Lagrangian algorithm first chooses initial values for the penalty parameters $\mu^{(0)}_i$, the tolerance $\tau^{(0)}$, the Lagrange multipliers $\lambda^{(0)}_i$ and a starting point $\vec x^{(0)}$. Then the following steps are executed for $k = 0, 1, 2, \ldots$:

1. The unconstrained problem

\[
  \min_{\vec x} L\bigl(\vec x, \{\lambda^{(k)}_i\}, \{\mu^{(k)}_i\}\bigr) \tag{9.48}
\]
is solved up to a tolerance of $\lVert \nabla_x L(\vec x, \{\lambda^{(k)}_i\}, \{\mu^{(k)}_i\}) \rVert \le \tau^{(k)}$. The minimizer is used as $\vec x^{(k)}$.

2. The convergence is checked. If the current $\vec x^{(k)}$ and the estimates for the Lagrange multipliers $\lambda^{(k)}_i$ satisfy the minimum conditions, then the algorithm terminates.

3. The Lagrange multipliers are updated with the first-order multiplier update
\[
  \lambda^{(k+1)}_i = \lambda^{(k)}_i - \mu^{(k)}_i\, c_i\bigl(\vec x^{(k)}\bigr) . \tag{9.49}
\]

4. New penalty parameters $\mu^{(k+1)}_i \ge \mu^{(k)}_i$ are chosen.

5. A new tolerance $\tau^{(k+1)}$ is chosen.
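The five steps above can be sketched compactly in Python. This is a minimal illustration with the quadratic penalty $p(c) = c^2$ and the update of Eq. (9.49), not the interface of the actual implementation; the function and parameter names are invented, and scipy's BFGS solves the subproblems:

```python
import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(f, cons, x0, mu0=10.0, tol0=1e-2,
                         max_outer=20, ctol=1e-8):
    """Generic sketch of the algorithm above (names are illustrative)."""
    x, lam = np.asarray(x0, float), np.zeros(len(cons))
    mu, tau = mu0, tol0
    for _ in range(max_outer):
        def L(x):  # augmented Lagrangian of Eq. (9.47) with p(c) = c^2
            c = np.array([ci(x) for ci in cons])
            return f(x) - lam @ c + 0.5 * mu * (c @ c)
        # Step 1: solve the unconstrained subproblem up to tolerance tau
        x = minimize(L, x, method="BFGS", options={"gtol": tau}).x
        c = np.array([ci(x) for ci in cons])
        # Step 2: terminate once the constraints are (nearly) satisfied
        if np.linalg.norm(c) < ctol:
            break
        # Step 3: first-order multiplier update, Eq. (9.49)
        lam = lam - mu * c
        # Steps 4 and 5: increase the penalties, tighten the tolerance
        mu, tau = 2.0 * mu, 0.5 * tau
    return x, lam

# Example: min x^2 + y^2 subject to x + y - 1 = 0  ->  x = y = 1/2, lambda = 1.
x, lam = augmented_lagrangian(lambda v: v @ v,
                              [lambda v: v[0] + v[1] - 1.0],
                              x0=[0.0, 0.0])
assert np.allclose(x, [0.5, 0.5], atol=1e-4)
assert abs(lam[0] - 1.0) < 1e-3
```

Note that in this sketch the multiplier estimate converges to the exact multiplier while the penalty parameters stay finite, which is the behavior the theorem below makes precise.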

The convergence of the augmented-Lagrangian approach is governed by the following theorem [Nocedal and Wright, 2006]: let $\vec x^*$ be a local minimizer of the constrained minimization problem defined in Eq. (9.44) and Eq. (9.45) and let the linear-independence constraint qualification be fulfilled at $\vec x^*$. Let $\lambda^*_i$ be the exact Lagrange multipliers. Then there exists a threshold value $\bar\mu$ such that for all penalty parameters $\mu_i \ge \bar\mu$, $\vec x^*$ is a local minimizer of $L(\vec x, \{\lambda^*_i\}, \{\mu_i\})$.

This theorem states that, in contrast to the penalty method, the penalty parameters in the augmented Lagrangian do not have to be increased to infinity to find the exact solution. As a consequence, the unconstrained subproblems of the augmented Lagrangian are much less prone to ill-conditioning. Additional theorems by Bertsekas [Bertsekas, 1999; Nocedal and Wright, 2006] apply to the situation of approximate Lagrange multipliers. With the assumptions of the previous theorem, there exist $\delta > 0$, $\epsilon > 0$ and $M > 0$ such that:

1. For all $\lambda_i$ and $\mu_i$ that satisfy
\[
  \lVert \lambda - \lambda^* \rVert \le \delta \max_i \mu_i \quad\text{and}\quad \mu_i \ge \bar\mu , \tag{9.50}
\]
the problem
\[
  \min_{\vec x} L\bigl(\vec x, \{\lambda_i\}, \{\mu_i\}\bigr) \tag{9.51}
\]
has a solution $\tilde{\vec x}$ with
\[
  \lVert \tilde{\vec x} - \vec x^* \rVert \le M \lVert \lambda - \lambda^* \rVert / \min_i \mu_i . \tag{9.52}
\]

2. For all $\lambda_i$ and $\mu_i$ that satisfy
\[
  \lVert \lambda - \lambda^* \rVert \le \delta \max_i \mu_i \quad\text{and}\quad \mu_i \ge \bar\mu , \tag{9.53}
\]
we have
\[
  \lVert \lambda^{(k+1)} - \lambda^* \rVert \le M \lVert \lambda^{(k)} - \lambda^* \rVert / \min_i \mu_i . \tag{9.54}
\]

The first theorem shows that the solution $\tilde{\vec x}$ of the unconstrained subproblem will be close to the exact solution $\vec x^*$ of the constrained problem if the estimates of the Lagrange multipliers are close to the exact Lagrange multipliers or if the penalty parameters are large. The second theorem relates the change of the multipliers by the multiplier update in Eq. (9.49) to the size of the penalty parameters. Improvement is guaranteed as long as the penalty parameters are sufficiently large.

We have implemented the augmented Lagrangian scheme in a general way with well-defined interfaces so that it can work with arbitrary parametrizations of the many-particle wave function and arbitrary constraints. The unconstrained subproblems can be solved with any suitable unconstrained minimization algorithm. In situations where the derivatives of the augmented Lagrangian with respect to the variational parameters are available, we solve the unconstrained subproblems either with the non-linear conjugate-gradient method or with the limited-memory Broyden-Fletcher-Goldfarb-Shanno quasi-Newton method [Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970]. For parametrizations of many-particle wave functions that contain complex parameters, we employ the complex generalizations of the minimization algorithms [Sorber et al., 2012] to preserve the compact structure. The line searches are solved either exactly, if $L(\vec x + \alpha \vec y, \{\lambda_i\}, \{\mu_i\})$ can easily be written as a polynomial function in $\alpha$, or otherwise with a secant line search. In cases where the derivatives are not available or if there is noise in the augmented Lagrangian, we use the simultaneous perturbation stochastic approximation approach (SPSA, [Spall, 1987, 1992]). The SPSA will be discussed in detail in section 9.7.5.
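The exact line search mentioned above can be sketched as follows: when $L(\vec x + \alpha \vec y)$ is a polynomial in $\alpha$, the minimizer along the line is the real root of the derivative polynomial with the lowest function value. The quartic coefficients below are an arbitrary example of the kind of line function that arises from quadratic penalties of quadratic constraints:

```python
import numpy as np

# q(a) = 1 - 2a + 0.5a^2 + a^3 + 0.25a^4, a made-up line function L(x + a*y).
coeffs = [1.0, -2.0, 0.5, 1.0, 0.25]
q = np.polynomial.Polynomial(coeffs)
dq = q.deriv()

# Exact line search: stationary points are the real roots of q'(a);
# the global minimum along the line is the one with the lowest q value.
roots = dq.roots()
candidates = [r.real for r in roots if abs(r.imag) < 1e-10]
alpha = min(candidates, key=q)

assert abs(dq(alpha)) < 1e-8                              # stationary point
assert all(q(alpha) <= q(c) + 1e-12 for c in candidates)  # best critical point
```

A secant line search only approximates this minimizer iteratively, so the polynomial route is preferable whenever the coefficients of $L(\vec x + \alpha \vec y)$ are cheap to assemble.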

The augmented Lagrangian approach presented here is a general algorithm for the solution of constrained minimization problems. The main benefits are a rather simple formulation, the availability of derivatives and, most importantly, the numerical stability.