
2.2 Sequential Quadratic Programming and the Trust Region Method of Yuan

In this section, we review a trust region algorithm that yields a stationary point of the continuous nonlinear program (2.1). To simplify readability we assume that the box constraints are included in the nonlinear constraints g(x), see (2.5). The algorithm was proposed by Yuan [112] and is closely related to the well-known sequential quadratic programming methods. The most common nonlinear programming algorithms are interior point methods, e.g., Wächter and Biegler [105], and sequential quadratic programming (SQP) methods, e.g., Schittkowski and Yuan [96]. These nonlinear programming algorithms converge globally towards a stationary point. A global minimum can only be guaranteed for convex problems, e.g., the continuous relaxation of the convex MINLP (1.6).

Nonlinear programming problems naturally arise during the solution process of convex mixed-integer nonlinear optimization problems (1.6), e.g., as continuous relaxation or by fixing the integer variables y ∈ N^{n_i}. We focus on the trust region method proposed by Yuan [112], since it is the basis for a novel MINLP solution method presented in Section 3.1.

Every SQP method is based on solving a series of continuous quadratic programming (QP) subproblems. In iteration k a quadratic subproblem is constructed by linearizing the constraints at the current iterate xk. The objective function is a quadratic approximation of the Lagrangian function L(x, λ, µ) of NLP (2.1), see Definition 2.4 and Definition 2.7 for a suitable choice of the matrix Bk forming the quadratic term in the objective function. The subproblem in iteration k is given by

min_{d∈R^n}  ∇x f(xk)^T d + (1/2) d^T Bk d

s.t.  gj(xk) + ∇x gj(xk)^T d = 0,  j = 1, …, me,
      gj(xk) + ∇x gj(xk)^T d ≥ 0,  j = me+1, …, m.        (2.27)

The solution of QP (2.27) is denoted by dk. The next iterate xk+1 is obtained either directly, i.e., xk+1 = xk + dk, if a trust region constraint ‖d‖ ≤ ∆k is included, where ∆k denotes the trust region radius, or by a line search, i.e., xk+1 = xk + αk dk, with αk ∈ (0, 1] reducing the step length αk until sufficient descent with respect to an appropriate merit function is obtained.

Instead, the subproblems of the trust region method of Yuan [112] approximate the L∞-penalty function introduced in Definition 2.9.

Definition 2.9. The L∞-penalty function Pσ(x) associated with the penalty parameter σ ∈ R+ and x ∈ R^n is given by

Pσ(x) = f(x) + σ ‖ḡ(x)‖∞,    (2.28)

with

ḡ(x) := (ḡ1(x), …, ḡm(x))^T,    (2.29)

ḡj(x) :=  gj(x),            j = 1, …, me,
          min{gj(x), 0},    j = me+1, …, m.    (2.30)

In each iteration k the trust region method constructs a model of the original problem based on the current iterate xk. Typically, the accuracy and therefore the quality of the model decreases with increasing distance from xk. To control the quality of the model, the maximal distance from the current iterate is restricted by the trust region radius ∆k.
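The L∞-penalty function of Definition 2.9 is straightforward to evaluate. The following numpy sketch illustrates (2.28)–(2.30) on a toy problem of our own choosing; the names `g_bar` and `penalty` are ours:

```python
import numpy as np

def g_bar(gvals, me):
    """Modified constraint vector (2.29)-(2.30): equality residuals are kept,
    inequality residuals are replaced by min{g_j(x), 0}."""
    v = np.asarray(gvals, dtype=float).copy()
    v[me:] = np.minimum(v[me:], 0.0)
    return v

def penalty(f, g, x, sigma, me):
    """L-infinity penalty P_sigma(x) = f(x) + sigma * ||g_bar(x)||_inf, cf. (2.28)."""
    return f(x) + sigma * np.linalg.norm(g_bar(g(x), me), ord=np.inf)

# Toy problem (ours): f(x) = x1^2 + x2^2, equality x1 + x2 - 1 = 0, inequality x1 >= 0.
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: np.array([x[0] + x[1] - 1.0, x[0]])
print(penalty(f, g, np.zeros(2), sigma=10.0, me=1))  # equality violated by 1 -> 10.0
```

Note that the satisfied inequality g2(0) = 0 contributes nothing to the penalty, while the violated equality enters with its full residual.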

In iteration k Yuan’s trust region algorithm approximates the L∞-penalty function, leading to the following problem:

min_{d∈R^n}  Φk(d)

s.t.  ‖d‖∞ ≤ ∆k,        (2.31)

where the objective function is given by

Φk(d) := ∇x f(xk)^T d + (1/2) d^T Bk d + σk ‖(g(xk) + [∇x g(xk)]^T d)‾‖∞,    (2.32)

where the bar operation of (2.30) is applied componentwise to the linearized constraints.

Note that problem (2.31) is equivalent to a quadratic program, see Yuan [112]. Therefore, it can be solved efficiently by any QP solver, e.g., QL of Schittkowski [94]. The optimal solution of subproblem (2.31) is denoted by dk. It provides a search direction to determine the next iterate, i.e.,

xk+1 = xk+dk. (2.33)
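To make subproblem (2.31) concrete, the following sketch minimizes Φk for a one-dimensional toy instance by brute force over a grid; all problem data here are our own choices, and a production code would of course use a QP solver instead:

```python
import numpy as np

# One-dimensional toy data at the current iterate (our choices):
grad_f, B, sigma, delta = 2.0, 2.0, 10.0, 0.5
g_k, grad_g = 1.0, 1.0        # single equality constraint with g(x_k) = 1

def phi(d):
    """Model Phi_k(d) of (2.32): quadratic model of f plus penalized
    linearized constraint violation (the infinity norm is trivial for m = 1)."""
    return grad_f * d + 0.5 * B * d**2 + sigma * abs(g_k + grad_g * d)

# Solve (2.31) by brute force over the trust region ||d||_inf <= delta.
grid = np.linspace(-delta, delta, 2001)
d_k = grid[np.argmin(phi(grid))]
print(d_k)   # the penalty term pushes d to the trust region boundary, d_k = -0.5
```

Within the trust region the linearized residual 1 + d stays positive, so Φk is the strictly increasing quadratic d² + 12d + 10 there and the minimizer sits on the boundary; a larger ∆k would allow a longer step towards feasibility.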

Problem (2.31) predicts a decrease in the L∞-penalty function. We define

Predk := Φk(0) − Φk(dk).    (2.34)

If the subsequent stopping criterion

dk=0 (2.35)

and certain assumptions hold, the current iterate xk is one of three types of stationary points, which are defined below. One of these stationary points corresponds to a KKT-point of NLP (2.1), see Yuan [112].

To reduce the constraint violation a penalty parameter is introduced that penalizes the violation of the constraints. If feasibility is not sufficiently improved, the penalty parameter σk has to be increased. The increase of the penalty parameter is controlled by a parameter δk ∈ R with δk > 0.

Comparing the predicted reduction given by (2.34) with the realized decrease of the L∞-penalty function, the quality of the model associated with problem (2.31) can be judged. The corresponding measure is denoted by

rk := (Pσk(xk) − Pσk(xk + dk)) / Predk.    (2.36)

If the model corresponding to problem (2.31) is of poor quality, e.g., rk ≤ 0, the trust region radius has to be decreased. Otherwise, the trust region radius can be enlarged.

Combining all components, we can now state the trust region method of Yuan [112].

Algorithm 2.1. (Trust Region Method of Yuan)

1. Given x0 ∈ Rn, ∆0 > 0, B0 ∈ Rn×n symmetric and positive definite, δ0 > 0, σ0 > 0 and k:=0.

Evaluate the functions f(x0) and g(x0) and determine the gradients ∇x f(x0) and ∇x g(x0).

2. Solve subproblem (2.31) determining dk.

If stopping criterion (2.35) is satisfied, then STOP.

Else evaluate f(xk+dk), g(xk+dk) and Pσk(xk+dk).

Evaluate the quality of the model with respect to the L∞-penalty function by rk (2.36).

3. Check for a descent with respect to the L-penalty function.

If rk > 0, then update the iterate, i.e., set xk+1 := xk + dk. Else set xk+1 := xk, Bk+1 := Bk, ∆k+1 := (1/4)‖dk‖∞, k := k+1.

GOTO Step 2.

4. Adapt trust region radius:

∆k+1 :=   max{∆k, 4‖dk‖∞},             if rk > 0.9,
          ∆k,                           if 0.1 ≤ rk ≤ 0.9,        (2.37)
          min{(1/4)∆k, (1/2)‖dk‖∞},     if rk < 0.1.

Choose Bk+1 to be any symmetric, positive definite matrix.

Penalty Update:

If

Φk(0) − Φk(dk) ≤ σk δk min{∆k, ‖ḡ(xk)‖∞},    (2.38)

then set

σk+1 := 2σk,    (2.39)
δk+1 := (1/2)δk.    (2.40)

Else set

σk+1 := σk,    (2.41)
δk+1 := δk.    (2.42)

5. Compute ∇x f(xk+1) and [∇x g(xk+1)], set k:=k+1 and GOTO Step 2.
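The main loop of Algorithm 2.1 can be sketched compactly. The following numpy example runs it on a toy equality-constrained problem of our own choosing; the brute-force grid subproblem solver, the fixed Bk, and all tolerances are our simplifications, not part of Yuan's method:

```python
import numpy as np

# Toy equality-constrained NLP (ours): min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0.
f      = lambda x: x[0]**2 + x[1]**2
grad_f = lambda x: 2.0 * x
g      = lambda x: np.array([x[0] + x[1] - 1.0])
jac_g  = lambda x: np.array([[1.0, 1.0]])

def P(x, sigma):
    # L-infinity penalty (2.28); with only equalities, g_bar coincides with g.
    return f(x) + sigma * np.linalg.norm(g(x), np.inf)

def phi(d, x, B, sigma):
    # Model (2.32): quadratic model of f plus penalized linearized constraints.
    return (grad_f(x) @ d + 0.5 * d @ B @ d
            + sigma * np.linalg.norm(g(x) + jac_g(x) @ d, np.inf))

def solve_subproblem(x, B, sigma, delta, npts=81):
    # Minimize (2.31) by brute force over a grid in the box ||d||_inf <= delta;
    # a production code would use a QP solver such as QL instead.
    ts = np.linspace(-delta, delta, npts)
    cands = [np.array([t1, t2]) for t1 in ts for t2 in ts]
    return min(cands, key=lambda d: phi(d, x, B, sigma))

x, delta = np.zeros(2), 1.0
sigma, dlt = 10.0, 0.01
B = 2.0 * np.eye(2)                       # exact Hessian of f, kept fixed (Step 1)
for _ in range(50):
    d = solve_subproblem(x, B, sigma, delta)          # Step 2
    if np.linalg.norm(d, np.inf) <= 1e-10:            # stopping criterion (2.35)
        break
    pred = phi(np.zeros(2), x, B, sigma) - phi(d, x, B, sigma)   # Pred_k (2.34)
    r = (P(x, sigma) - P(x + d, sigma)) / pred                   # r_k (2.36)
    viol = np.linalg.norm(g(x), np.inf)
    if r > 0:                                         # Step 3: accept step
        x = x + d
    else:                                             # Step 3: reject, shrink radius
        delta = np.linalg.norm(d, np.inf) / 4
        continue
    if r > 0.9:                                       # Step 4: radius update (2.37)
        delta = max(delta, 4 * np.linalg.norm(d, np.inf))
    elif r < 0.1:
        delta = min(delta / 4, np.linalg.norm(d, np.inf) / 2)
    if pred <= sigma * dlt * min(delta, viol):        # penalty update (2.38), simplified
        sigma, dlt = 2 * sigma, dlt / 2               # (2.39)-(2.40)

print(np.round(x, 6))   # converges to the KKT point (0.5, 0.5)
```

On this toy problem the very first subproblem already steps onto the feasible set, the ratio rk ≈ 1 lets the radius grow, and the second subproblem returns dk ≈ 0, triggering the stopping criterion.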

The trust region method of Yuan converges towards a stationary point if Assumption 2.1 is satisfied, see Yuan [112] and Jarre and Stoer [66].

Assumption 2.1.

1. f(x) and gj(x), j = 1, …, m, are continuously differentiable for all x ∈ R^n.

2. The sequences {xk} and {Bk} are bounded for all k.

Moreover, we review the theoretical properties of Algorithm 2.1, according to Yuan [112].

The limit of ‖ḡ(xk)‖∞ exists if the sequence of penalty parameters {σk} goes to infinity.

Lemma 2.1. If Assumption 2.1 holds and σk → ∞, then lim_{k→∞} ‖ḡ(xk)‖∞ exists.

Proof. See Yuan [112].

Algorithm 2.1 converges towards one of three different stationary points, see Yuan [112]. These are either a KKT-point of problem (2.1) as specified in Theorem 2.1, an infeasible stationary point, or a singular stationary point.

Yuan [112] defines an infeasible stationary point as follows.

Definition 2.10. Let F be the feasible region of NLP (2.1). x ∈ R^n \ F is called an infeasible stationary point, if

1. ‖ḡ(x)‖∞ > 0,

2. min_{d∈R^n} ‖(g(x) + [∇x g(x)]^T d)‾‖∞ = ‖ḡ(x)‖∞

holds.
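Definition 2.10 can be checked numerically on a small instance. In this sketch (problem data ours) the two inequalities x ≤ −1 and x ≥ 1 have empty intersection, and x = 0 minimizes the linearized infeasibility, so no linearized step can reduce the violation:

```python
import numpy as np

# Infeasible toy instance (ours): inequalities g1(x) = -1 - x >= 0, g2(x) = x - 1 >= 0.
g     = lambda x: np.array([-1.0 - x, x - 1.0])
jac_g = lambda x: np.array([-1.0, 1.0])

def viol(v):
    """||v_bar||_inf for pure inequalities: only negative components count, cf. (2.30)."""
    return np.linalg.norm(np.minimum(v, 0.0), np.inf)

x = 0.0
base = viol(g(x))                                   # ||g_bar(x)||_inf = 1
# Brute-force check of condition 2: minimize the linearized violation over d.
lin = min(viol(g(x) + jac_g(x) * d) for d in np.linspace(-5.0, 5.0, 10001))
print(base, lin)   # both are 1 (up to grid resolution): condition 2 holds at x = 0
```

Intuitively, moving in either direction improves one constraint but worsens the other by the same amount, so the linearized infeasibility cannot drop below ‖ḡ(x)‖∞ = 1.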

An infeasible stationary point is a minimizer of the infinity norm of the linearized constraints, see Yuan [112]. In contrast to an infeasible stationary point, a singular stationary point is feasible. It is defined as follows.

Definition 2.11. Let F be the feasible region of NLP (2.1). x ∈ F is called a singular stationary point, if the following conditions hold.

1. ‖ḡ(x)‖∞ = 0.

2. There exists a sequence {x̂k} converging towards x such that ‖ḡ(x̂k)‖∞ > 0 and

lim_{k→∞}  min_{‖d‖∞ ≤ ‖ḡ(x̂k)‖∞}  ‖(g(x̂k) + [∇x g(x̂k)]^T d)‾‖∞ / ‖ḡ(x̂k)‖∞  =  1.    (2.43)

In addition, a stationary point is defined as follows.

Definition 2.12. Let F be the feasible region of NLP (2.1). x ∈ F is called a stationary point, if the following conditions hold.

1. ‖ḡ(x)‖∞ = 0.

2. If

∇x gj(x)^T s ≥ 0,  j ∈ J>,    (2.44)
∇x gj(x)^T s = 0,  j ∈ J=    (2.45)

hold for s ∈ R^n, then

∇x f(x)^T s ≥ 0    (2.46)

is satisfied.

Note that a stationary point according to Definition 2.12 corresponds to a KKT-point of NLP (2.1), see Jarre and Stoer [66].

The following theorem shows that the trust region method of Yuan converges towards a KKT-point if the sequence of penalty parameters {σk} is bounded.

Theorem 2.3. If Assumption 2.1 holds and the sequence of penalty parameters {σk} is bounded, one member of the sequence of iterates {xk} is a stationary point as specified in Definition 2.12, or the sequence {xk} generated by Algorithm 2.1 possesses an accumulation point that is a stationary point.

Proof. See Yuan [112].

If the sequence of penalty parameters {σk} is unbounded and the constraints are violated in the limit, Yuan’s trust region method converges towards an infeasible stationary point given by Definition 2.10.

Lemma 2.2. If Assumption 2.1 holds, lim_{k→∞} σk = ∞ and lim_{k→∞} ‖ḡ(xk)‖∞ > 0, then the sequence {xk} possesses an infeasible stationary point of NLP (2.1) as an accumulation point.

Proof. See Yuan [112].

If the sequence of penalty parameters {σk} is unbounded but the constraint violation goes to zero, Yuan’s trust region method converges towards a singular stationary point as specified in Definition 2.11.

Lemma 2.3. If Assumption 2.1 holds, lim_{k→∞} σk = ∞ and lim_{k→∞} ‖ḡ(xk)‖∞ = 0, then the sequence {xk} generated by Algorithm 2.1 possesses a singular stationary point of NLP (2.1) as an accumulation point.

Proof. See Yuan [112].

2.3 Review on Solution Techniques for Convex