
5. A Strictly Feasible Sequential Convex Programming Method

5.2 Global Convergence

5.2.2 Preliminary Results

The convergence proof for Algorithm 16 is an extension of the convergence proof of Algorithm 14, see Zillober [97, 98] and [102]. For the nonlinear optimization problem (5.16), the Lagrangian function (2.8) can be written as

$$\mathcal{L}(x, y) = f(x) + y_c^T c(x) + y_e^T e(x) \tag{5.71}$$

with the corresponding gradient

$$\nabla_x \mathcal{L}(x, y) = \nabla f(x) + A_c(x)\, y_c + A_e(x)\, y_e, \tag{5.72}$$

where A_c(x) and A_e(x) contain the gradients of the constraints c(x) and e(x) as columns. The augmented Lagrangian (5.13) and its gradient are denoted by Φ_ρ(x, y) (5.73) and ∇Φ_ρ(x, y), respectively. Moreover, we consider the first order necessary optimality conditions (KKT, see Definition 6 and (2.12) - (2.19)) for the subproblem. We denote the Lagrangian function (2.8) of subproblem (5.19) in iteration k by L^(k)(z, v). As (z^(k), v^(k)) is a primal-dual solution of subproblem (5.19), it satisfies in particular the stationarity condition

$$\nabla_z \mathcal{L}^{(k)}\big(z^{(k)}, v^{(k)}\big) = 0. \tag{5.75}$$
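To fix the notation, the following minimal sketch evaluates (5.71) and (5.72) for a hypothetical two-variable problem; the functions f, c, e and all numerical data are illustrative and not taken from problem (5.16).

```python
import numpy as np

# Illustrative smooth data: one regular constraint c and one feasibility
# constraint e (both in the standard form c(x) <= 0, e(x) <= 0).
f  = lambda x: x[0]**2 + 2.0 * x[1]**2
df = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
c  = lambda x: np.array([x[0] + x[1] - 1.0])
Ac = lambda x: np.array([[1.0], [1.0]])   # constraint gradients as columns
e  = lambda x: np.array([0.5 - x[0]])
Ae = lambda x: np.array([[-1.0], [0.0]])

def lagrangian(x, y_c, y_e):
    """L(x, y) = f(x) + y_c^T c(x) + y_e^T e(x), cf. (5.71)."""
    return f(x) + y_c @ c(x) + y_e @ e(x)

def lagrangian_grad(x, y_c, y_e):
    """grad_x L(x, y) = grad f(x) + A_c(x) y_c + A_e(x) y_e, cf. (5.72)."""
    return df(x) + Ac(x) @ y_c + Ae(x) @ y_e

x   = np.array([0.6, 0.3])
y_c = np.array([0.4])
y_e = np.array([0.1])
print(lagrangian(x, y_c, y_e), lagrangian_grad(x, y_c, y_e))
```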

Applying the Taylor approximation with residuals R_f^(k)(x), R_{c_j}^(k)(x) and R_{e_j}(x) for the corresponding functions, we obtain the following equation for the objective f(x):

$$f^{(k)}(x) = f\big(x^{(k)}\big) + \nabla f\big(x^{(k)}\big)^T \big(x - x^{(k)}\big) + R_f^{(k)}(x) \tag{5.85}$$

and respectively for the constraints c_j(x), j = 1, ..., m_c, and e_j(x), j = 1, ..., m_f,

$$c_j^{(k)}(x) = c_j\big(x^{(k)}\big) + \nabla c_j\big(x^{(k)}\big)^T \big(x - x^{(k)}\big) + R_{c_j}^{(k)}(x),$$

$$e_j(x) = e_j\big(x^{(k)}\big) + \nabla e_j\big(x^{(k)}\big)^T \big(x - x^{(k)}\big) + R_{e_j}(x),$$


respectively. The exact values of R_{c_j}^(k)(x), ∇R_{c_j}^(k)(x), j = 1, ..., m_c, and R_f^(k)(x), ∇R_f^(k)(x) can be obtained according to the following lemma, see Zillober [102], Section 2.3. The lemma is used to prove Lemma 5.4, which is an essential part of showing that a sufficient descent with respect to the augmented Lagrangian is obtained, see Theorem 5.2.

Lemma 5.1. Let f^(k)(x) and c_j^(k)(x), j = 1, ..., m_c, be the convex approximations (4.21) and (4.5) formulated in x^(k) ∈ R^n by Algorithm 16. Moreover, let the corresponding asymptotes L_i^(k) and U_i^(k), i = 1, ..., n, be feasible according to Definition 4.1. Then the following equations hold for all k ≥ 0:

$$R_f^{(k)}(x) = \sum_{i \in I_+^{(k)}} \Big( \frac{\partial f(x^{(k)})}{\partial x_i} + \tau \Big) \frac{\big(x_i - x_i^{(k)}\big)^2}{U_i^{(k)} - x_i} \;+\; \sum_{i \in I_-^{(k)}} \Big( \tau - \frac{\partial f(x^{(k)})}{\partial x_i} \Big) \frac{\big(x_i - x_i^{(k)}\big)^2}{x_i - L_i^{(k)}}$$

and the analogous representation of R_{c_j}^(k)(x), j = 1, ..., m_c, with τ = 0 and the index sets I_+^(k), I_−^(k) taken with respect to the signs of ∂c_j(x^(k))/∂x_i.

Proof. We obtain the following condition by exploiting the equality of the Taylor series (5.85) and the definition of the approximation (4.21):

$$R_f^{(k)}(x) = f^{(k)}(x) - f\big(x^{(k)}\big) - \nabla f\big(x^{(k)}\big)^T \big(x - x^{(k)}\big).$$

Inserting the definition of f^(k)(x) from (4.21) and collecting the terms componentwise for i ∈ I_+^(k) and i ∈ I_−^(k) yields the stated representation of R_f^(k)(x).

The derivatives of R_f^(k)(x) can be determined easily, since ∂f(x^(k))/∂x_i, U_i^(k), L_i^(k) and x_i^(k) are constant with respect to x. In total we get:

$$\frac{\partial R_f^{(k)}(x)}{\partial x_i} =
\begin{cases}
\Big( \dfrac{\partial f(x^{(k)})}{\partial x_i} + \tau \Big) \dfrac{\big(x_i - x_i^{(k)}\big)\big(2U_i^{(k)} - x_i - x_i^{(k)}\big)}{\big(U_i^{(k)} - x_i\big)^2}, & i \in I_+^{(k)}, \\[3mm]
\Big( \tau - \dfrac{\partial f(x^{(k)})}{\partial x_i} \Big) \dfrac{\big(x_i - x_i^{(k)}\big)\big(x_i + x_i^{(k)} - 2L_i^{(k)}\big)}{\big(x_i - L_i^{(k)}\big)^2}, & i \in I_-^{(k)}.
\end{cases}$$

The partial derivatives of R_{c_j}^(k)(x), j = 1, ..., m_c, can be computed analogously by setting τ = 0.
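As a quick plausibility check, the following sketch (illustrative numbers; the residual form is the one stated in Lemma 5.1, for a single variable i ∈ I_+^(k)) verifies by finite differences that the derivative formula above matches the residual, and that the residual and its gradient vanish at x^(k):

```python
import numpy as np

# One variable i in I_+^(k) (df/dx_i >= 0), cf. Lemma 5.1; g, tau, U, xk
# are illustrative values, not data from the thesis.
g, tau, U, xk = 1.7, 0.05, 4.0, 1.0

def R(x):                     # residual term for i in I_+^(k)
    return (g + tau) * (x - xk)**2 / (U - x)

def dR(x):                    # stated derivative of the residual
    return (g + tau) * (x - xk) * (2.0 * U - x - xk) / (U - x)**2

assert R(xk) == 0.0 and dR(xk) == 0.0     # first-order correctness at x^(k)
for x in np.linspace(1.05, 3.6, 9):       # grid strictly below the asymptote U
    fd = (R(x + 1e-6) - R(x - 1e-6)) / 2e-6
    assert np.isclose(dR(x), fd, atol=1e-5) and R(x) >= 0.0
print("derivative formula and nonnegativity of R verified")
```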

The following lemma gives an important relation between the Lagrangian multipliers of the box constraints, v_u^(k) and v_l^(k), and the search direction Δx^(k). The results are needed to prove Lemma 5.3, which is used to show that a sufficient descent with respect to the augmented Lagrangian is obtained, see Theorem 5.2.

Lemma 5.2. Let the sequences {(x^(k), y^(k))} and {(z^(k), v^(k))} be computed by Algorithm 16. The box constraints of subproblem (5.19) in iteration k are given by b_u^(k)(x) and b_l^(k)(x) and the corresponding Lagrangian multipliers are denoted by v_u^(k) ∈ R^n and v_l^(k) ∈ R^n, respectively. Let Δx^(k) := z^(k) − x^(k) be the primal search direction, where z^(k) is the primal solution of subproblem (5.19) formulated in the current iterate x^(k). Then the following inequalities hold for all k = 0, 1, 2, ...:

$$v_u^{(k)T} \Delta x^{(k)} \ge 0, \tag{5.103}$$

$$v_l^{(k)T} \Delta x^{(k)} \le 0. \tag{5.104}$$

Proof. The proof exploits the complementarity conditions of subproblem (5.19) together with the definitions of the bounds (5.20) and (5.22). We start with the proof of (5.103): a component v_{u,i}^(k) can only be positive if the corresponding upper bound is active, i.e., z_i^(k) = x̄_i^(k) > x_i^(k) by (5.22), and hence Δx_i^(k) > 0; for all other components v_{u,i}^(k) = 0. Consequently,

$$v_u^{(k)T} \Delta x^{(k)} = \sum_{i=1}^n v_{u,i}^{(k)} \Delta x_i^{(k)} \;\ge\; 0.$$

Analogously, we can prove (5.104): v_{l,i}^(k) > 0 implies z_i^(k) = x̲_i^(k) < x_i^(k) by (5.20), i.e., Δx_i^(k) < 0.
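The sign pattern behind (5.103) and (5.104) can be seen in a one-dimensional toy subproblem (all data illustrative, not the actual subproblem (5.19)): whenever the upper move limit is active, the step Δx and the corresponding multiplier are both positive, so their product is nonnegative; inactive bounds contribute zero by complementarity.

```python
# 1-D toy subproblem: minimize a strictly convex phi over [xl, xu], where
# the bounds play the role of b_l^(k), b_u^(k); all numbers are illustrative.
xk = 1.0                      # current iterate x^(k)
xl, xu = 0.5, 1.8             # move limits around x^(k), cf. (5.20), (5.22)
phi  = lambda x: (x - 3.0)**2           # pushes the minimizer beyond xu
dphi = lambda x: 2.0 * (x - 3.0)

# KKT of the box-constrained problem: dphi(z) + v_u - v_l = 0 with
# complementarity; here the upper bound is active, the lower one is not.
z   = xu
v_u = max(0.0, -dphi(z))      # = -dphi(xu) > 0 since dphi(xu) < 0
v_l = 0.0
dx  = z - xk                  # primal search direction Delta x^(k)

assert v_u * dx >= 0.0        # (5.103): active upper bound => dx > 0
assert v_l * dx <= 0.0        # (5.104): holds trivially here, v_l = 0
print(f"z={z}, v_u={v_u}, dx={dx}: signs as in Lemma 5.2")
```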

The next lemma provides an upper bound on the descent of the objective function at the iterate x^(k). The proof of Zillober [102], see Section 2.3, is extended by the feasibility constraints e_j(x), j = 1, ..., m_f. The results are required in Theorem 5.2, which shows that a sufficient descent with respect to the augmented Lagrangian is obtained.

Lemma 5.3. Let the sequences {(x^(k), y^(k))} and {(z^(k), v^(k))} be computed by Algorithm 16 and let f^(k)(x) and c_j^(k)(x), j = 1, ..., m_c, be the corresponding convex approximations defined by (4.21) and (4.5), obtained by a sequence of feasible asymptotes according to Definition 4.1. Moreover, the primal search direction is denoted by Δx^(k) := z^(k) − x^(k), where z^(k) is the primal solution of subproblem (5.19) formulated in the current iterate x^(k). Then

$$\nabla f\big(x^{(k)}\big)^T \Delta x^{(k)} \;\le\; -\nabla R_f^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} - v_c^{(k)T} A_c\big(x^{(k)}\big)^T \Delta x^{(k)} - v_c^{(k)T} \nabla R_c^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} + v_e^{(k)T} \big( e\big(x^{(k)}\big) - e\big(z^{(k)}\big) \big)$$

holds for all k = 0, 1, 2, ....

Proof. Proceeding from (5.75) we get:

$$
\begin{aligned}
\nabla_x \mathcal{L}^{(k)}\big(z^{(k)}, v^{(k)}\big)
&= \underbrace{\nabla f^{(k)}\big(z^{(k)}\big)}_{=\,\nabla f(x^{(k)}) + \nabla R_f^{(k)}(z^{(k)}),\ (5.86)}
 + \underbrace{A_c^{(k)}\big(z^{(k)}\big) v_c^{(k)}}_{=\,A_c(x^{(k)})\, v_c^{(k)} + \nabla R_c^{(k)}(z^{(k)})\, v_c^{(k)},\ (5.88)}
 + A_e\big(z^{(k)}\big) v_e^{(k)} + v_u^{(k)} - v_l^{(k)} \\
&= \nabla f\big(x^{(k)}\big) + \nabla R_f^{(k)}\big(z^{(k)}\big) + A_c\big(x^{(k)}\big) v_c^{(k)} + \nabla R_c^{(k)}\big(z^{(k)}\big) v_c^{(k)} + A_e\big(z^{(k)}\big) v_e^{(k)} + v_u^{(k)} - v_l^{(k)} \\
&= 0.
\end{aligned}
$$

By reformulation and multiplication with Δx^(k) we get

$$
\begin{aligned}
\nabla f\big(x^{(k)}\big)^T \Delta x^{(k)}
&= -\nabla R_f^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} - v_c^{(k)T} A_c\big(x^{(k)}\big)^T \Delta x^{(k)} - v_c^{(k)T} \nabla R_c^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} \\
&\quad - v_e^{(k)T} \underbrace{A_e\big(z^{(k)}\big)^T \Delta x^{(k)}}_{\ge\, e(z^{(k)}) - e(x^{(k)}),\ (5.94)}
 - \underbrace{v_u^{(k)T} \Delta x^{(k)}}_{\ge 0,\ \text{see } (5.103)}
 + \underbrace{v_l^{(k)T} \Delta x^{(k)}}_{\le 0,\ \text{see } (5.104)} \\
&\le -\nabla R_f^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} - v_c^{(k)T} A_c\big(x^{(k)}\big)^T \Delta x^{(k)} - v_c^{(k)T} \nabla R_c^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} + v_e^{(k)T} \big( e\big(x^{(k)}\big) - e\big(z^{(k)}\big) \big).
\end{aligned}
$$

To prove that the search direction is a descent direction for the augmented Lagrangian, we need the following lemma, which shows that condition (5.106) holds. We review the proof of Zillober, see Section 2.3 of [102].

Lemma 5.4. Let the sequences {(x^(k), y^(k))} and {(z^(k), v^(k))} be computed by Algorithm 16 and let c_j^(k)(x), j = 1, ..., m_c, be the corresponding convex approximations, obtained by a sequence of feasible asymptotes according to Definition 4.1. Let the primal search direction be Δx^(k) := z^(k) − x^(k), where z^(k) ∈ R^n is the primal solution of subproblem (5.19) formulated in x^(k) ∈ R^n. Moreover, let the corresponding Lagrangian multipliers be defined by v_c^(k) ∈ R^{m_c}. Then

$$v_c^{(k)T} R_c^{(k)}\big(z^{(k)}\big) - v_c^{(k)T} \nabla R_c^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} \;\le\; 0 \tag{5.106}$$

holds for all k = 0, 1, 2, ....

Proof. As

$$v_c^{(k)T} R_c^{(k)}\big(z^{(k)}\big) - v_c^{(k)T} \nabla R_c^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)}
= \sum_{j=1}^{m_c} \Big[ v_{c_j}^{(k)} R_{c_j}^{(k)}\big(z^{(k)}\big) - v_{c_j}^{(k)} \nabla R_{c_j}^{(k)}\big(z^{(k)}\big)^T \Delta x^{(k)} \Big]$$

holds, we consider each constraint c_j(x), j = 1, ..., m_c, individually to show that each term of the sum is less than or equal to zero. For this purpose, we use the results of Lemma 5.1.
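Each summand carries a nonnegative multiplier v_{c_j}^(k), so it suffices to check the residual part. Under the residual representation of Lemma 5.1 (with τ = 0 for the constraints), this can be probed numerically; the following sketch samples illustrative values for a single variable i ∈ I_+^(k) (the case i ∈ I_−^(k) is symmetric):

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-variable terms of (5.106) for one constraint c_j, using the residual
# representation of Lemma 5.1 with tau = 0; all sampled data are illustrative.
for _ in range(1000):
    U  = rng.uniform(2.0, 6.0)            # upper asymptote, i in I_+^(k)
    xk = rng.uniform(0.0, U - 1.0)        # current iterate, x^(k) < U
    z  = rng.uniform(0.0, U - 0.1)        # subproblem solution, z < U
    g  = rng.uniform(0.0, 3.0)            # dc_j(x^(k))/dx_i >= 0 on I_+^(k)
    dx = z - xk
    R  = g * (z - xk)**2 / (U - z)                          # R_{c_j}(z)
    dR = g * (z - xk) * (2.0 * U - z - xk) / (U - z)**2     # its derivative
    # summand of (5.106); equals g (z - x^(k))^2 (x^(k) - U)/(U - z)^2 <= 0
    assert R - dR * dx <= 1e-9
print("each summand of (5.106) is nonpositive")
```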


Moreover, we have to consider the parameter η_i^(k), i = 1, ..., n, see (4.30), which gives an estimate of the curvature of the approximation of the objective function f^(k)(x). It is shown by Zillober, in Corollary 4.14 of [97], that η_i^(k), i = 1, ..., n, is bounded from below if the sequence of asymptotes is feasible. The parameter guarantees the sufficient descent of the augmented Lagrangian function, see Theorem 5.2.

Lemma 5.5. Let the sequences {(x^(k), y^(k))} and {(z^(k), v^(k))} be computed by Algorithm 16 with η_i^(k), i = 1, ..., n, defined by (4.30). If the sequence of asymptotes is feasible according to Definition 4.1, then

$$\eta^{(k)} := \min_{i=1,\dots,n} \eta_i^{(k)} \;>\; \eta \;>\; 0 \tag{5.107}$$

holds for all k = 0, 1, 2, ... with

$$\eta := \frac{\tau\,(2-\omega)\,\xi}{(U_{\max} - L_{\min})^2}. \tag{5.108}$$

Proof. Using (5.22) we get for a fixed i ∈ I_+^(k) with ω ∈ ]0, 1[:

$$z_i^{(k)} \le \underbrace{\bar{x}_i^{(k)}}_{=\,x_i^{(k)} + \omega \left( U_i^{(k)} - x_i^{(k)} \right),\ (5.22)}$$

i.e.

$$z_i^{(k)} \le x_i^{(k)} + \omega \big( U_i^{(k)} - x_i^{(k)} \big), \qquad -z_i^{(k)} \ge -x_i^{(k)} - \omega \big( U_i^{(k)} - x_i^{(k)} \big).$$

Adding 2U_i^(k) − x_i^(k) leads to

$$2U_i^{(k)} - x_i^{(k)} - z_i^{(k)} \;\ge\; 2U_i^{(k)} - 2x_i^{(k)} - \omega \big( U_i^{(k)} - x_i^{(k)} \big) \;=\; (2-\omega) \underbrace{\big( U_i^{(k)} - x_i^{(k)} \big)}_{\ge\, \xi,\ (5.30)}.$$

In total we get

$$2U_i^{(k)} - z_i^{(k)} - x_i^{(k)} \;\ge\; (2-\omega)\,\xi. \tag{5.109}$$

We proceed from (4.30) and use (5.109), with a fixed i ∈ I_+^(k):

$$\eta_i^{(k)} = \Big( \underbrace{\frac{\partial f(x^{(k)})}{\partial x_i}}_{\ge 0} + \tau \Big) \frac{2U_i^{(k)} - z_i^{(k)} - x_i^{(k)}}{\big( U_i^{(k)} - z_i^{(k)} \big)^2}
\;\ge\; \tau\, \frac{2U_i^{(k)} - z_i^{(k)} - x_i^{(k)}}{\big( U_i^{(k)} - z_i^{(k)} \big)^2}
\;>\; \tau\, \frac{2U_i^{(k)} - z_i^{(k)} - x_i^{(k)}}{(U_{\max} - L_{\min})^2}
\;\overset{(5.109)}{\ge}\; \underbrace{\frac{\tau\,(2-\omega)\,\xi}{(U_{\max} - L_{\min})^2}}_{=:\,\eta} \;>\; 0$$

with τ > 0. The corresponding proof for i ∈ I_−^(k) can be given analogously with (5.20) and (5.29). Altogether we get:

$$\eta_i^{(k)} \ge \eta^{(k)} > \eta > 0, \qquad \forall\, i = 1, \dots, n. \tag{5.110}$$
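The lower bound (5.108) can be probed numerically: sampling feasible configurations with U_i^(k) − x_i^(k) ≥ ξ and z_i^(k) within the move limit (5.22) never drives η_i^(k) below η. The sketch uses the form of η_i^(k) displayed above for i ∈ I_+^(k); all sampled values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

tau, omega, xi = 0.1, 0.7, 0.5            # illustrative parameter values
Umax, Lmin = 10.0, -10.0
eta_bar = tau * (2.0 - omega) * xi / (Umax - Lmin)**2   # (5.108)

for _ in range(10000):
    U  = rng.uniform(Lmin + 1.0, Umax)    # feasible upper asymptote
    xk = rng.uniform(Lmin, U - xi)        # U - x^(k) >= xi, cf. (5.30)
    z  = rng.uniform(xk, xk + omega * (U - xk))   # move limit (5.22)
    g  = rng.uniform(0.0, 5.0)            # df(x^(k))/dx_i >= 0 on I_+^(k)
    eta_i = (g + tau) * (2.0 * U - z - xk) / (U - z)**2   # cf. (4.30)
    assert eta_i > eta_bar                # (5.107) with (5.108)
print(f"eta_i stayed above eta = {eta_bar:.2e} in all samples")
```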

An important part of the convergence proof is to show that the penalty parameters and the augmented Lagrangian function are bounded. Before we can show this, we have to show that the gradients of the approximations f^(k)(x) and c_j^(k)(x), j = 1, ..., m_c, are bounded on X^(k). We review the proof for the objective function and the regular constraints c_j(x), j = 1, ..., m_c, given by Zillober [97] in Theorem 4.13.

Lemma 5.6. Let the sequence {(x^(k), y^(k))} be computed by Algorithm 16 and let f^(k)(x) and c_j^(k)(x), j = 1, ..., m_c, be the corresponding convex approximations defined by (4.21) and (4.5), obtained by a feasible sequence of asymptotes L^(k) and U^(k) according to Definition 4.1. Let the lower and upper bounds x̲_i^(k) and x̄_i^(k), i = 1, ..., n, be defined by (5.20) and (5.22). If F is nonempty and compact, then there exist M_0 > 0 and M_j > 0, j = 1, ..., m_c, such that

$$\Big| \frac{\partial f^{(k)}(x)}{\partial x_i} \Big| < M_0, \quad i = 1, \dots, n, \tag{5.111}$$

$$\Big| \frac{\partial c_j^{(k)}(x)}{\partial x_i} \Big| \le M_j, \quad i = 1, \dots, n, \; j = 1, \dots, m_c, \tag{5.112}$$

hold for all x ∈ X^(k) and k = 0, 1, 2, ....


Proof. We start with the derivatives of the approximated objective function, given in (4.23), and consider the bounds (5.20) and (5.22), which keep the distance U_i^(k) − x_i bounded away from zero on X^(k). Starting with an arbitrary i ∈ I_+^(k), the asymptote-dependent factor in (5.114) is therefore bounded. Due to Assumption 1, the partial derivatives of f are bounded as well, so that, together with the previous results and (5.113), the bound (5.111) with a suitable M_0 follows for each i ∈ I_+^(k). The same can be shown for each i ∈ I_−^(k); the bound (5.112) for the constraint approximations follows analogously.
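The mechanism of the proof can be observed numerically under the approximation form assumed above (derivative ∂f(x^(k))/∂x_i + ∂R_f^(k)/∂x_i; all data illustrative): because the move limit (5.22) keeps x away from the asymptote, U − x ≥ (1 − ω)(U − x^(k)) ≥ (1 − ω)ξ, the derivative stays uniformly bounded on X^(k).

```python
import numpy as np

rng = np.random.default_rng(2)

tau, omega, xi = 0.1, 0.7, 0.5            # illustrative parameters
gmax = 5.0                                # bound on |df/dx_i| from Assumption 1

worst = 0.0
for _ in range(10000):
    U  = rng.uniform(1.0, 10.0)
    xk = rng.uniform(U - 5.0, U - xi)     # U - x^(k) >= xi, cf. (5.30)
    g  = rng.uniform(0.0, gmax)           # i in I_+^(k)
    x  = rng.uniform(xk, xk + omega * (U - xk))   # x in X^(k), below move limit
    dfk = g + (g + tau) * (x - xk) * (2.0 * U - x - xk) / (U - x)**2
    worst = max(worst, abs(dfk))

# A uniform bound M_0 in the spirit of (5.111): on X^(k) we have
# |x - x^(k)| <= omega (U - x^(k)), 2U - x - x^(k) <= 2 (U - x^(k)),
# U - x >= (1 - omega) xi, and U - x^(k) <= 5 in the samples above.
M0 = gmax + (gmax + tau) * omega * 5.0 * 2.0 * 5.0 / ((1.0 - omega) * xi)**2
print(f"max |df^(k)/dx_i| = {worst:.3f} <= M0 = {M0:.3f}")
assert worst <= M0
```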

Moreover, we need to show that the augmented Lagrangian merit function defined in (5.73) is bounded from below. The proof including the objective function f(x) and the regular constraints c_j(x), j = 1, ..., m_c, is given in Theorem 5.3 of Zillober [97]. It has to be extended by adding the feasibility constraints e_j(x), j = 1, ..., m_f. Lemma 5.7 is needed for the main convergence Theorem 5.5.


Lemma 5.7. Let F defined by (5.18) be nonempty and compact. Then there exists an M_Φ ∈ R such that

$$\Phi_\rho(x, y) \;\ge\; M_\Phi$$

holds for all x ∈ F and all bounded multipliers y.

Proof. Considering the augmented Lagrangian merit function (5.73), we bound Φ_ρ from below term by term. As F is nonempty and compact, there exist min_{x∈F} c_j(x) ≤ 0, j = 1, ..., m_c, min_{x∈F} f(x), and min_{x∈F} e_j(x) ≤ 0, j = 1, ..., m_f, respectively. Moreover, there exists by assumption a y_max ∈ R such that |y_i| ≤ y_max, i = 1, ..., m_c + m_f. Inserting these finite bounds into (5.73) yields a finite lower bound M_Φ for Φ_ρ(x, y).

With the results of Section 5.2.2 we can prove the convergence of Algorithm 16. In the following theorem it is shown that the primal and dual variables are bounded. This is essential to give an estimation of the descent properties of the augmented Lagrangian, see also Theorem 5.2. The proof is based on Theorem 5.6 of Zillober [97] and Theorem 2.4.1 of Zillober [102], which are extended by e_j(x), j = 1, ..., m_f.

Theorem 5.1. Let the sequences {(x^(k), y^(k))} and {(z^(k), v^(k))} be computed by Algorithm 16, where the corresponding approximations f^(k)(x) and c_j^(k)(x) are defined by (4.21) and (4.5). Moreover, let the smallest eigenvalue of A_{J^(k)}(z^(k))^T A_{J^(k)}(z^(k)) be larger than a lower bound (κ^(k))² > 0. Let the asymptotes L_i^(k) and U_i^(k), i = 1, ..., n, be feasible according to Definition 4.1 and let F defined by (5.18) be nonempty and compact. Then all iterates x^(k) lie in F and the corresponding multipliers y^(k) ∈ R^{m_c + m_f} are bounded, i.e., there exists a y_max ∈ R such that |y_i^(k)| ≤ y_max, i = 1, ..., m_c + m_f, for all k = 0, 1, 2, ....

Proof. We assume an infinite sequence {(x^(k), y^(k))}. We start with a point that is feasible with respect to the feasibility constraints, i.e., x^(0) ∈ F. The solution z^(k) of subproblem (5.19) lies in F_{X^(k)} ⊆ F, see (5.11). As F is convex and x^(k) ∈ F, we obtain x^(k+1) ∈ ]x^(k), z^(k)] ⊆ F, so that by induction all iterates remain in F.