
5.4 Constrained optimization and SQP


We now examine the corresponding optimality conditions for the constrained problem

$$\begin{aligned}
\min_{z} \quad & f(z) \\
\text{s.t.} \quad & g(z) \le 0 \\
& h(z) = 0
\end{aligned} \tag{5.6}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is the objective function, $g : \mathbb{R}^n \to \mathbb{R}^{N_i}$ represents the inequality constraints, while $h : \mathbb{R}^n \to \mathbb{R}^{N_e}$ represents the equality constraints. We set $N_c = N_i + N_e$. The process of solving (5.6) is referred to as nonlinear programming (NLP). Some properties arising from the optimality conditions in this section turn out to be required properties for the sensitivity analysis discussed afterwards.

In addition, we also present an algorithm to solve (5.6) which we use throughout the thesis.

We call the set
$$\Sigma := \left\{ z \in \mathbb{R}^n \;\middle|\; \begin{array}{l} g_j(z) \le 0, \ j = 1, \dots, N_i \\ h_j(z) = 0, \ j = N_i + 1, \dots, N_c \end{array} \right\}$$
the admissible set or the feasible set. Note that with the defined indexing, no index $j$ repeats. The function

$$L(z, \lambda, \mu) := f(z) + \mu^\top g(z) + \lambda^\top h(z)$$

is called the Lagrangian function, and $\mu \in \mathbb{R}^{N_i}$, $\lambda \in \mathbb{R}^{N_e}$ are called Lagrange multipliers corresponding to the inequality and equality constraints, respectively.

Definition 5.4.1.

A point $z^* \in \Sigma$ is a global minimizer of (5.6) if
$$f(z^*) \le f(z) \quad \text{for all } z \in \Sigma$$

A point $z^* \in \Sigma$ is a local minimizer of (5.6) if there exists a neighborhood $\mathcal{N}(z^*)$ such that
$$f(z^*) \le f(z) \quad \text{for all } z \in \mathcal{N}(z^*) \cap \Sigma$$

A point $z^* \in \Sigma$ is a strict local minimizer of (5.6) if there exists a neighborhood $\mathcal{N}(z^*)$ such that
$$f(z^*) < f(z) \quad \text{for all } z \in \mathcal{N}(z^*) \cap \Sigma, \ z \ne z^*$$

Consider the following sets of indices associated with an optimal solution $z$ of (5.6):
$$\begin{aligned}
Eq &:= \{N_i + 1, \dots, N_c\} \\
In(z) &:= \{ j \in \{1, \dots, N_i\} \mid g_j(z) = 0 \} \\
\mathcal{A}(z) &:= Eq \cup In(z) \\
\mathcal{I}(z) &:= \{ j \in \{1, \dots, N_i\} \mid g_j(z) < 0 \}
\end{aligned}$$

The notation $\mathcal{A}(z)$ denotes the index set of active constraints, while the notation $\{ h_i, g_i \mid i \in \mathcal{A}(z) \}$ gives the set of active constraints for $z \in \Sigma$. The set $\mathcal{I}(z)$ is the index set of inactive constraints for $z$, and $\{ g_i \mid i \in \mathcal{I}(z) \}$ is the set of inactive constraints for $z \in \Sigma$.

Suppose $\mathcal{I}(z) \ne \emptyset$, i.e., there exists $i_0 \in \{1, \dots, N_i\}$ such that $g_{i_0}(z) < 0$. Deleting the $i_0$-th inequality constraint does not change $z$ from being the local minimizer of problem (5.6). Thus, assuming $\mathcal{A}(z)$ is the index set of active constraints of $z$ for (5.6), $z$ is also the local minimizer of the equality constrained problem
$$\begin{aligned}
\min \quad & f(z) \\
\text{s.t.} \quad & g_i(z) = 0, \ i \in \mathcal{A}(z) \\
& h(z) = 0
\end{aligned} \tag{5.7}$$

Convex problems

The optimization problem (5.6) is said to be a convex problem if it has a convex objective function and a convex feasible region.

If $g$ is convex and $h$ is linear, then the feasible region $\Sigma$ is convex. Indeed, suppose $\Sigma$ is not convex. Then there exist $x, y \in \Sigma$ and $\alpha \in (0,1)$ such that $z := \alpha x + (1-\alpha) y \notin \Sigma$, which means that either $g_j(z) > 0$ for some $j$ or $h(z) \ne 0$. Since $g_j$ is convex, $0 < g_j(z) = g_j(\alpha x + (1-\alpha) y) \le \alpha g_j(x) + (1-\alpha) g_j(y) \le 0$, giving a contradiction. Since $h$ is linear, $0 \ne h(z) = h(\alpha x + (1-\alpha) y) = \alpha h(x) + (1-\alpha) h(y) = 0$, which also gives a contradiction. If, in addition, $f(z)$ is convex, then (5.6) is a convex problem.

Linear and quadratic programming problems

Problems of the form
$$\begin{aligned}
\min_{z} \quad & c^\top z \\
\text{s.t.} \quad & A z + b \le 0 \\
& A_{eq} z + b_{eq} = 0
\end{aligned}$$
called linear programming (LP), and
$$\begin{aligned}
\min_{z} \quad & h^\top z + \tfrac{1}{2} z^\top B z \\
\text{s.t.} \quad & A z + b \le 0 \\
& A_{eq} z + b_{eq} = 0
\end{aligned}$$
with a positive semidefinite matrix $B$, called quadratic programming (QP), are convex problems.
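To make the LP form above concrete, the following sketch (not from the original text) solves a tiny made-up instance with SciPy's linprog. The constraints $A z + b \le 0$ and $A_{eq} z + b_{eq} = 0$ have to be rewritten as $A z \le -b$ and $A_{eq} z = -b_{eq}$, and the variables must be declared free, since linprog assumes nonnegative variables by default.

```python
# Minimal sketch: solving a small instance of the LP form above with SciPy.
# The data c, A, b, Aeq, beq are made up for illustration only.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])                  # objective vector c
A = np.array([[-1.0, 0.0], [0.0, -1.0]])  # inequality data:  A z + b <= 0
b = np.array([0.0, 0.0])                  #   here: -z <= 0, i.e. z >= 0
Aeq = np.array([[1.0, 1.0]])              # equality data:    Aeq z + beq = 0
beq = np.array([-1.0])                    #   here: z1 + z2 = 1

# linprog expects  A_ub z <= b_ub  and  A_eq z = b_eq, so move b to the right-hand side.
res = linprog(c, A_ub=A, b_ub=-b, A_eq=Aeq, b_eq=-beq,
              bounds=[(None, None)] * len(c))
print(res.x)  # expected minimizer: z = (1, 0) with objective value 1
```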

Nonconvex problems

For general NLPs, nonlinear equality constraints render a problem nonconvex even if $f(z)$ is convex. A nonconvex problem may have multiple local minimizers, which increases the complexity of identifying whether the problem has no solution or has a global minimizer. In this case, one can limit the analysis to a local setting.

The key advantage when (5.6) is a convex problem is given in the following theorem.

Theorem 5.4.2. If the optimization problem (5.6) is convex, then every local minimizer in $\Sigma$ is a global minimizer.


Proof. Similar to the proof of Theorem 5.2.2. In this case, the convexity of $\Sigma$ guarantees that the point $\alpha x + (1-\alpha) y$ is feasible for feasible points $x$ and $y$.

Definition 5.4.3. Let $z_0 \in \Sigma$ and $d \in \mathbb{R}^n$. Then $d$ is said to be a descent direction at $z_0$ if $\nabla f(z_0)^\top d < 0$. We define the set
$$\mathcal{D}(z_0) = \{ d \in \mathbb{R}^n \mid \nabla f(z_0)^\top d < 0 \}$$
as the set of all descent directions at $z_0$.

Definition 5.4.4. Let $z_0 \in \Sigma$ and $d \in \mathbb{R}^n \setminus \{0\}$. If there exists $\delta > 0$ such that
$$z_0 + t d \in \Sigma \quad \text{for all } t \in [0, \delta],$$
then $d$ is said to be a feasible direction of $\Sigma$ at $z_0$. The set
$$\mathcal{F}_\Sigma(z_0) = \{ d \in \mathbb{R}^n \setminus \{0\} \mid \exists \delta > 0 \text{ s.t. } z_0 + t d \in \Sigma \ \forall t \in [0, \delta] \}$$
contains all feasible directions of $\Sigma$ at $z_0$.

In the following, we define certain cone conditions derived from the linearization of the active constraints.

Definition 5.4.5. Let $z_0 \in \Sigma$. The set of all linearized feasible directions given by
$$\mathcal{C}_\Sigma(z_0) = \left\{ d \in \mathbb{R}^n \;\middle|\; \begin{array}{l} \nabla h_i(z_0)^\top d = 0, \ i \in Eq \\ \nabla g_i(z_0)^\top d \le 0, \ i \in \mathcal{A}(z_0) \end{array} \right\}$$
is called the linearized feasible cone.

Definition 5.4.6. Let $z_0 \in \Sigma$ and $d \in \mathbb{R}^n$. If there exist a sequence $\{d_k\}$ and a positive sequence $\{\delta_k\}$ such that $z_0 + \delta_k d_k \in \Sigma$ for all $k$ with $d_k \to d$ and $\delta_k \to 0$, then the limiting direction $d$ is called a sequential feasible direction of $\Sigma$ at $z_0$. The set
$$\mathcal{S}_\Sigma(z_0) = \left\{ d \in \mathbb{R}^n \;\middle|\; \exists \{d_k\}, \{\delta_k\} \text{ with } z_0 + \delta_k d_k \in \Sigma \ \forall k, \ d_k \to d, \ \delta_k \to 0 \right\}$$
is the set of all sequential feasible directions of $\Sigma$ at $z_0$.

From Definition 5.4.6, setting $z_k := z_0 + \delta_k d_k$, we obtain $z_k \to z_0$. In addition, setting $\delta_k := \|z_k - z_0\|$ gives $d_k = \frac{z_k - z_0}{\|z_k - z_0\|} \to d$. Thus, $\{z_k\}$ is a feasible point sequence with limiting direction $d$.

Definition 5.4.7. We define the tangent cone of $\Sigma$ at $z_0$ as
$$\mathcal{T}_\Sigma(z_0) = \mathcal{S}_\Sigma(z_0) \cup \{0\}$$

Lemma 5.4.8. Let $z_0 \in \Sigma$. If $g, h$ are differentiable at $z_0$, then
$$\mathcal{F}_\Sigma(z_0) \subseteq \mathcal{S}_\Sigma(z_0) \subseteq \mathcal{C}_\Sigma(z_0)$$

Proof. See proof in [64, Lemma 8.2.4].


Theorem 5.4.9. Let $z$ be a local minimizer of (5.6). If $f, g, h$ are differentiable at $z$, then
$$\nabla f(z)^\top d \ge 0 \quad \text{for all } d \in \mathcal{S}_\Sigma(z)$$

Proof. See proof in [64, Lemma 8.2.5].

Lemma 5.4.10 (Restatement of Farkas’ lemma). The equality
$$S := \left\{ d \in \mathbb{R}^n \;\middle|\; \begin{array}{l} \nabla f(z)^\top d < 0, \\ \nabla h_i(z)^\top d = 0, \ i \in Eq \\ \nabla g_i(z)^\top d \le 0, \ i \in \mathcal{A}(z) \end{array} \right\} = \emptyset$$
holds if and only if there exist $\lambda_i \in \mathbb{R}$, $i \in Eq$, and $\mu_i \ge 0$, $i \in In(z)$, such that
$$\nabla f(z) + \sum_{i \in Eq} \lambda_i \nabla h_i(z) + \sum_{i \in In(z)} \mu_i \nabla g_i(z) = 0$$

Proof. By using Theorem 5.1.2, with $v = -d$, $b = -\nabla f(z)$, $C = \nabla g(z)$, $D = \nabla h(z)$, $x = \mu$ and $y = \lambda$.

We next introduce a constraint qualification that ensures that the sequential feasible direction at a solution can be represented by the linearizations of active constraints at that point.

Definition 5.4.11. Given a local solution $z$ of (5.6) and the index set of active constraints $\mathcal{A}(z)$, the linear independence constraint qualification (LICQ) holds if the constraint gradients
$$\nabla g_i(z), \ \nabla h_i(z), \quad i \in \mathcal{A}(z)$$
are linearly independent.
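Numerically, LICQ at a given point can be checked by stacking the gradients of the active constraints as the rows of a matrix and testing whether that matrix has full row rank. The sketch below is illustrative only; the function name licq_holds and the data passed to it are made-up placeholders for the problem-specific active-constraint gradients.

```python
# Sketch of a numerical LICQ test: stack the active-constraint gradients
# (as rows) and verify that the resulting matrix has full row rank.
import numpy as np

def licq_holds(grad_active):
    """grad_active: (m, n) array whose rows are the gradients of the
    active constraints (all equality constraints and active inequalities)."""
    G = np.atleast_2d(np.asarray(grad_active, dtype=float))
    return np.linalg.matrix_rank(G) == G.shape[0]

# Toy example: two active constraints in R^2 with independent gradients.
grads = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
print(licq_holds(grads))  # True
```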

Lemma 5.4.12. If LICQ holds at $z$, then $\mathcal{T}_\Sigma(z) = \mathcal{C}_\Sigma(z)$.

Proof. See proof in [49, Lemma 12.2].

Now, we are ready to state the first-order necessary condition for (5.6).

Theorem 5.4.13 (First-order necessary condition). If $z^*$ is a local minimizer of (5.6) at which LICQ holds, then there exist $\lambda^* \in \mathbb{R}^{N_e}$ and $\mu^* \in \mathbb{R}^{N_i}$ such that
$$\nabla L(z^*, \lambda^*, \mu^*) := \nabla f(z^*) + \nabla g(z^*)^\top \mu^* + \nabla h(z^*)^\top \lambda^* = 0 \tag{5.8}$$
$$g(z^*) \le 0, \quad h(z^*) = 0 \tag{5.9}$$
$$\mu^{*\top} g(z^*) = 0, \quad \mu^* \ge 0 \tag{5.10}$$

Proof. Since $z^* \in \Sigma$, (5.9) follows. Let $d \in \mathcal{T}_\Sigma(z^*)$. Since $z^*$ is a local minimizer, by Definition 5.4.7 and Theorem 5.4.9, $\nabla f(z^*)^\top d \ge 0$. In addition, by LICQ and Lemma 5.4.12, $d \in \mathcal{C}_\Sigma(z^*)$. This means that the system
$$\begin{aligned}
\nabla f(z^*)^\top d &< 0 \\
\nabla h_i(z^*)^\top d &= 0, \ i \in Eq \\
\nabla g_i(z^*)^\top d &\le 0, \ i \in In(z^*)
\end{aligned}$$
has no solution in $\mathbb{R}^n$. Then by Farkas’ lemma,
$$\nabla f(z^*) + \sum_{i \in Eq} \lambda^*_i \nabla h_i(z^*) + \sum_{i \in In(z^*)} \mu^*_i \nabla g_i(z^*) = 0$$
where $\lambda^*_i \in \mathbb{R}$, $i \in Eq$, and $\mu^*_i \ge 0$, $i \in In(z^*)$. Setting $\mu^*_i = 0$ for $i \in \mathcal{I}(z^*)$, we obtain
$$\nabla f(z^*) + \nabla g(z^*)^\top \mu^* + \nabla h(z^*)^\top \lambda^* = 0$$
which shows (5.8) and that $\mu^* \ge 0$. Lastly, if $i \in In(z^*)$, then $g_i(z^*) = 0$, giving $\mu^*_i g_i(z^*) = 0$, and if $i \in \mathcal{I}(z^*)$, since we set $\mu^*_i = 0$, again $\mu^*_i g_i(z^*) = 0$. Summing over $i$ yields $\mu^{*\top} g(z^*) = 0$, which shows (5.10).

Conditions (5.8)–(5.10) are called the Karush-Kuhn-Tucker (KKT) conditions. This set of conditions comprises the stationarity condition (5.8), the feasibility conditions (5.9), and condition (5.10), which gives the nonnegativity of the multipliers together with the complementarity condition.
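As an illustration (not taken from the text), consider the toy problem $\min z_1 + z_2$ s.t. $z_1^2 + z_2^2 - 2 \le 0$. The candidate $z^* = (-1, -1)$ with multiplier $\mu^* = 1/2$ satisfies (5.8)–(5.10), which the following sketch verifies numerically; the problem data and function names are made up for this example.

```python
# Numerical check of the KKT conditions (5.8)-(5.10) for the toy problem
#   min z1 + z2   s.t.  g(z) = z1^2 + z2^2 - 2 <= 0
# at the candidate z* = (-1, -1) with multiplier mu* = 1/2.
import numpy as np

def grad_f(z):
    return np.array([1.0, 1.0])

def g(z):
    return np.array([z[0]**2 + z[1]**2 - 2.0])

def grad_g(z):                       # Jacobian of g, one row per constraint
    return np.array([[2.0 * z[0], 2.0 * z[1]]])

z_star = np.array([-1.0, -1.0])
mu_star = np.array([0.5])

stationarity = grad_f(z_star) + grad_g(z_star).T @ mu_star        # (5.8), should vanish
feasible = np.all(g(z_star) <= 1e-12)                              # (5.9)
complementary = abs(mu_star @ g(z_star)) < 1e-12 and np.all(mu_star >= 0)  # (5.10)

print(stationarity, feasible, complementary)   # [0. 0.] True True
```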

Next we examine the second-order necessary conditions. To this end, we first refine the definition of the index set $\mathcal{A}$. We define
$$\mathcal{A}_S(z^*) = \{ i \in In(z^*) \mid \mu^*_i > 0 \}$$
as the index set of strongly active constraints and
$$\mathcal{A}_W(z^*) = \{ i \in In(z^*) \mid \mu^*_i = 0 \}$$
as the index set of weakly active constraints. Given a local minimizer $z^*$ of (5.6) together with multipliers $\lambda^*, \mu^*$ satisfying (5.8) and (5.10), strict complementarity is said to occur if $\mathcal{A}_W(z^*) = \emptyset$.

We now consider the critical cone
$$\mathcal{G}_\Sigma(z^*) = \left\{ d \in \mathbb{R}^n \;\middle|\; \begin{array}{l} \nabla h(z^*)^\top d = 0, \\ \nabla g_i(z^*)^\top d = 0, \ i \in \mathcal{A}_S(z^*) \\ \nabla g_i(z^*)^\top d \le 0, \ i \in \mathcal{A}_W(z^*) \end{array} \right\} \tag{5.11}$$

We now state the second-order conditions for constrained optimization.

Theorem 5.4.14 (Second-order necessary condition). If $z^*$ is a local minimizer of (5.6) at which LICQ holds, together with $\lambda^*, \mu^*$ satisfying the KKT conditions (5.8)–(5.10), then
$$d^\top \nabla^2 L(z^*, \lambda^*, \mu^*) d \ge 0 \quad \text{for all } d \in \mathcal{G}_\Sigma(z^*)$$

Proof. See, e.g., the proofs of [11, Theorem 4.17] or [64, Theorem 8.3.3].

Theorem 5.4.15 (Second-order sufficient conditions (SOSC)). If $z^*$ and the multipliers $\lambda^*, \mu^*$ satisfy the KKT conditions (5.8)–(5.10) and
$$d^\top \nabla^2 L(z^*, \lambda^*, \mu^*) d > 0 \quad \text{for all nonzero } d \in \mathcal{G}_\Sigma(z^*) \tag{5.12}$$
then $z^*$ is a strict local minimizer of (5.6).

Proof. See, e.g., proofs of [11, Theorem 4.18] or [64, Theorem 8.3.4]
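Under strict complementarity, the critical cone (5.11) reduces to the null space of the gradients of the active constraints, so (5.12) can be tested numerically by checking positive definiteness of the Hessian of the Lagrangian restricted to that subspace. The sketch below (not part of the original text) illustrates this with made-up data; it is only valid under the strict-complementarity assumption.

```python
# Sketch of an SOSC test under strict complementarity: project the Hessian of
# the Lagrangian onto the null space of the active-constraint gradients and
# check positive definiteness there (condition (5.12)).
import numpy as np
from scipy.linalg import null_space

def sosc_holds(hess_L, grad_active):
    """hess_L: (n, n) Hessian of L at (z*, lambda*, mu*);
       grad_active: (m, n) rows = gradients of all active constraints at z*."""
    Z = null_space(np.atleast_2d(grad_active))   # basis of the critical subspace
    if Z.size == 0:                              # only d = 0 remains in the cone
        return True
    reduced = Z.T @ hess_L @ Z
    return bool(np.all(np.linalg.eigvalsh(reduced) > 0))

# Toy data: Hessian of L and a single active constraint gradient in R^2.
H = np.array([[2.0, 0.0],
              [0.0, -1.0]])
grads = np.array([[0.0, 1.0]])   # null space is spanned by (1, 0)
print(sosc_holds(H, grads))      # True: the reduced Hessian is [2] > 0
```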


5.4.1 Equality constrained optimization problems

We now adapt the Newton-based method (see Algorithm 5.3.1) that solves unconstrained optimization problems to the constrained setting. We first consider the constrained problem

$$\begin{aligned}
\min \quad & f(z) \\
\text{s.t.} \quad & h(z) = 0
\end{aligned} \tag{5.13}$$

with only equality constraints.

Assuming LICQ, the KKT conditions for (5.13) read
$$\nabla L(z, \lambda) = \nabla f(z) + \nabla h(z)^\top \lambda = 0, \qquad h(z) = 0.$$
Setting $w := (z, \lambda)$ and $F(w) := \begin{pmatrix} \nabla L(z, \lambda) \\ h(z) \end{pmatrix}$, the KKT conditions amount to the root-finding problem $F(w) = 0$. Applying Newton's method at an iterate $w^k = (z^k, \lambda^k)$, the linearized system can be written as
$$F(w^k) + \nabla_w F(w^k)^\top (w - w^k) = 0.$$
In terms of the steps $\Delta z^k := z^{k+1} - z^k$ and $\Delta \lambda^k := \lambda^{k+1} - \lambda^k$, this linear system reads
$$\begin{pmatrix} \nabla^2 L(z^k, \lambda^k) & \nabla h(z^k)^\top \\ \nabla h(z^k) & 0 \end{pmatrix} \begin{pmatrix} \Delta z^k \\ \Delta \lambda^k \end{pmatrix} = - \begin{pmatrix} \nabla L(z^k, \lambda^k) \\ h(z^k) \end{pmatrix} \tag{5.15}$$
whose coefficient matrix is called the KKT matrix. Solving the linear system (5.15) and applying the update rule $z^{k+1} = z^k + \Delta z^k$ and $\lambda^{k+1} = \lambda^k + \Delta \lambda^k$ allows us to compute $z^{k+1}$ and $\lambda^{k+1}$. This gives us the Newton Lagrange method [2] for solving the equality constrained problem (5.13).

Algorithm 5.4.16. (Newton Lagrange method) Choose a starting point $z^0, \lambda^0$ and a tolerance $\varepsilon$.

(1) If $\|F(w^k)\| < \varepsilon$, stop. Otherwise, for $z^k, \lambda^k$, solve the linear system (5.15).

(2) Set $z^{k+1} = z^k + \Delta z^k$, $\lambda^{k+1} = \lambda^k + \Delta \lambda^k$ and $k = k + 1$. Go to (1).

Since Algorithm 5.4.16 applies the root-finding Newton's method to $F(w) = 0$, similarly to Algorithm 5.3.1, one can also employ a choice of step length $\gamma_k$, yielding instead the update rule $z^{k+1} = z^k + \gamma_k \Delta z^k$.
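A minimal sketch of Algorithm 5.4.16 for a toy instance of (5.13) is given below; it assembles and solves the KKT system (5.15) in each iteration. The test problem and all function names are illustrative and not taken from the text.

```python
# Minimal sketch of the Newton Lagrange method (Algorithm 5.4.16) for the
# equality constrained toy problem  min z1^2 + z2^2  s.t.  z1 + z2 - 1 = 0.
import numpy as np

grad_f = lambda z: 2.0 * z                          # gradient of f
hess_L = lambda z, lam: 2.0 * np.eye(2)             # Hessian of L with respect to z
h      = lambda z: np.array([z[0] + z[1] - 1.0])    # equality constraint
grad_h = lambda z: np.array([[1.0, 1.0]])           # Jacobian of h (one row)

def newton_lagrange(z, lam, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        grad_L = grad_f(z) + grad_h(z).T @ lam
        F = np.concatenate([grad_L, h(z)])
        if np.linalg.norm(F) < tol:                 # stopping test of step (1)
            break
        n, m = z.size, lam.size
        # KKT matrix of the linear system (5.15)
        KKT = np.block([[hess_L(z, lam), grad_h(z).T],
                        [grad_h(z),      np.zeros((m, m))]])
        step = np.linalg.solve(KKT, -F)
        z, lam = z + step[:n], lam + step[n:]       # update rule of step (2)
    return z, lam

z_sol, lam_sol = newton_lagrange(np.array([5.0, -3.0]), np.array([0.0]))
print(z_sol, lam_sol)   # expected: z = (0.5, 0.5), lambda = -1
```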

It is easy to see that the pair $(\Delta z^k, \lambda^{k+1})$ also happens to be the (primal-dual) solution of the quadratic programming problem
$$\begin{aligned}
\min_{\Delta z^k} \quad & \nabla f(z^k)^\top \Delta z^k + \tfrac{1}{2} \Delta z^{k\top} \nabla^2 L(z^k, \lambda^k) \Delta z^k \\
\text{s.t.} \quad & \nabla h(z^k)^\top \Delta z^k + h(z^k) = 0
\end{aligned} \tag{5.16}$$
Indeed, in determining the KKT conditions for (5.16), we recover (5.15).

Remark 5.4.17. One can see that solving an arbitrary optimization problem of the form (5.13) by the Newton Lagrange method is equivalent to solving a sequence of quadratic programming problems (5.16) until convergence to the solution.

As with Algorithm 5.3.1, Algorithm 5.4.16 can also be adapted to address the challenges of computing derivatives and of handling large-scale problems, which lead to large matrices, in order to keep the computational cost at a tolerable level.
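In practice, SQP-type methods are usually taken from a library rather than implemented from scratch. As an illustration only, the sketch below hands a small made-up instance of (5.6) to SciPy's SLSQP routine, an SQP-type solver; note that SciPy expects inequality constraints in the form $c(z) \ge 0$, so $g(z) \le 0$ is passed as $-g(z) \ge 0$.

```python
# Illustration: solving a small NLP of the form (5.6) with SciPy's SQP-type
# solver SLSQP.  SciPy's 'ineq' constraints require c(z) >= 0, so we pass -g.
import numpy as np
from scipy.optimize import minimize

f = lambda z: (z[0] - 2.0)**2 + (z[1] - 1.0)**2     # objective
g = lambda z: np.array([z[0]**2 + z[1]**2 - 2.0])   # inequality:  g(z) <= 0
h = lambda z: np.array([z[0] - z[1]])               # equality:    h(z) = 0

res = minimize(f, x0=np.zeros(2), method='SLSQP',
               constraints=[{'type': 'ineq', 'fun': lambda z: -g(z)},
                            {'type': 'eq',   'fun': h}])
print(res.x, res.fun)   # expected: z close to (1, 1) with objective value 1
```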

5.4.2 Inequality constrained optimization problems

We consider first the QP
$$\begin{aligned}
\min_{z} \quad & h^\top z + \tfrac{1}{2} z^\top B z \\
\text{s.t.} \quad & A z + b \le 0
\end{aligned} \tag{5.17}$$
where $B$ is positive semidefinite, making the problem convex. The corresponding Lagrangian function is $L(z, \mu) = h^\top z + \tfrac{1}{2} z^\top B z + \mu^\top b + \mu^\top A z$. The KKT conditions are
$$\begin{aligned}
\nabla L(z, \mu) = B z + h + A^\top \mu &= 0 \\
A z + b &\le 0 \\
\mu \ge 0, \quad (A z + b)^\top \mu &= 0
\end{aligned} \tag{5.18}$$

Suppose $z$ is a global minimizer of (5.17). Now the left-hand side of the inequality in (5.18) can be decomposed as
$$\begin{pmatrix} A_{\mathcal{A}} \\ A_{\mathcal{I}} \end{pmatrix} z + \begin{pmatrix} b_{\mathcal{A}} \\ b_{\mathcal{I}} \end{pmatrix}$$
where $A_{\mathcal{A}} z + b_{\mathcal{A}} = 0$ represents the active constraints, while $A_{\mathcal{I}} z + b_{\mathcal{I}} < 0$ represents the inactive ones.

Theorem 5.4.18. $z$ is a global minimizer of (5.17) if and only if there exist index sets $\mathcal{A}$ and $\mathcal{I}$ and a vector $\mu_{\mathcal{A}}$ such that
$$\begin{aligned}
B z + h + A_{\mathcal{A}}^\top \mu_{\mathcal{A}} &= 0 &\quad& (5.19)\\
A_{\mathcal{A}} z + b_{\mathcal{A}} &= 0 && (5.20)\\
A_{\mathcal{I}} z + b_{\mathcal{I}} &< 0 && (5.21)\\
\mu_{\mathcal{A}} &\ge 0 && (5.22)
\end{aligned}$$
with $\mu_{\mathcal{I}} = 0$, where $\mu = \begin{pmatrix} \mu_{\mathcal{A}} \\ \mu_{\mathcal{I}} \end{pmatrix}$.
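Theorem 5.4.18 suggests a simple verification procedure: guess which constraints are active, solve the equality conditions (5.19)–(5.20) for $z$ and $\mu_{\mathcal{A}}$, and then check the sign conditions (5.21)–(5.22). The sketch below illustrates this on made-up data; iteratively revising the guessed active set along these lines is the basic idea behind active-set QP methods.

```python
# Sketch: verify the conditions of Theorem 5.4.18 for a guessed active set.
# The data B, h, A, b and the guessed set are made up for illustration.
import numpy as np

B = np.array([[2.0, 0.0], [0.0, 2.0]])   # positive semidefinite Hessian
h = np.array([-1.0, -5.0])               # linear term of the QP (5.17)
A = np.array([[1.0, 0.0], [0.0, 1.0]])   # constraints  A z + b <= 0
b = np.array([-1.0, -1.0])               # i.e.  z1 <= 1 and z2 <= 1

def check_active_set(active):
    inactive = [i for i in range(len(b)) if i not in active]
    AA, bA = A[active], b[active]
    n, m = B.shape[0], len(active)
    # Solve (5.19)-(5.20):  B z + h + AA^T mu_A = 0,  AA z + bA = 0.
    KKT = np.block([[B, AA.T], [AA, np.zeros((m, m))]])
    sol = np.linalg.solve(KKT, np.concatenate([-h, -bA]))
    z, mu_A = sol[:n], sol[n:]
    # Check (5.21)-(5.22): strict inactivity and nonnegative multipliers.
    ok = bool(np.all(A[inactive] @ z + b[inactive] < 0) and np.all(mu_A >= 0))
    return z, mu_A, ok

print(check_active_set([1]))   # guess: only the second constraint is active
```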
