
SOLUTION METHOD

3.1 MIQP-Supported Linear Outer Approximation

In this section we propose a new outer approximation algorithm, which incorporates mixed-integer search steps obtained from the solution of strictly convex MIQP problems. The algorithm is designed such that convergence properties can be established under realistic conditions for the convex MINLP problem given by

min_{x ∈ R^{n_c}, y ∈ N^{n_i}}  f(x, y)
s.t.  g_j(x, y) ≥ 0,  j ∈ J.    (3.1)

Within this chapter the box constraints are included in the constraints g_j, j ∈ J. Therefore we denote the number of the original nonlinear constraints by m̃ for the moment and extend these constraints by n_c upper and n_c lower bounds on the continuous variables and n_i upper and n_i lower bounds on the integer variables, i.e.,

g_{m̃+i}(x, y) := −x_i + e_i^T x_u ≥ 0,  ∀ i ∈ {1, . . . , n_c},
g_{m̃+n_c+i}(x, y) := x_i − e_i^T x_l ≥ 0,  ∀ i ∈ {1, . . . , n_c},
g_{m̃+2n_c+i}(x, y) := −y_i + e_i^T y_u ≥ 0,  ∀ i ∈ {1, . . . , n_i},
g_{m̃+2n_c+n_i+i}(x, y) := y_i − e_i^T y_l ≥ 0,  ∀ i ∈ {1, . . . , n_i}.    (3.2)

Nevertheless, we still denote the feasible domains induced by the bounds on the continuous and integer variables by X and Y, i.e.,

X := {x ∈ R^{n_c} : g_{m̃+i}(x, y) ≥ 0, i ∈ {1, . . . , 2n_c}},    (3.3)
Y := {y ∈ N^{n_i} : g_{m̃+2n_c+i}(x, y) ≥ 0, i ∈ {1, . . . , 2n_i}}.    (3.4)

To be consistent with the notation of the previous chapters, we define the number of constraints to be m := m̃ + 2n_c + 2n_i and extend the index set J accordingly. Note that the relaxation of the set Y is denoted by Y_R.
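The reformulation (3.2) can be carried out mechanically. The following minimal sketch, assuming NumPy arrays for the bounds and constraint values, appends the 2n_c + 2n_i bound constraints to the m̃ nonlinear constraint values:

```python
import numpy as np

def constraints_with_bounds(g_nonlin, x, y, x_l, x_u, y_l, y_u):
    """Append the bound constraints (3.2) to the nonlinear constraint values,
    so that all g_j, j = 1, ..., m, are treated uniformly as inequalities >= 0."""
    return np.concatenate([
        g_nonlin,   # g_1, ..., g_m~
        x_u - x,    # -x_i + e_i^T x_u >= 0
        x - x_l,    #  x_i - e_i^T x_l >= 0
        y_u - y,    # -y_i + e_i^T y_u >= 0
        y - y_l,    #  y_i - e_i^T y_l >= 0
    ])
```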

The requirements for proving convergence are subsumed in Assumption 3.1 later on in this chapter. The main restriction is that the objective function f(x, y) is required to be convex and the constraints g_j(x, y) need to be concave for all j ∈ J on the relaxation of the feasible domain described by X × Y_R.

Furthermore, we denote by M_x an upper bound on the distance between two values x̂, x̃ ∈ X, i.e., ‖x̂ − x̃‖_2 ≤ M_x holds for all x̂, x̃ ∈ X. From a practical point of view, M_x corresponds to the maximal distance between two bounds, i.e.,

M_x := √(n_c) max_{i ∈ {1, . . . , n_c}} {(x_u)_i − (x_l)_i}.    (3.5)

As presented in Section 2.5, linear outer approximation algorithms guarantee global optimality by the successive solution of MILP master problems. The master problem is a linear relaxation of the original convex MINLP, which is refined in each iteration, yielding a monotonically increasing sequence of lower bounds on the optimal objective value of MINLP (3.1). The new algorithm to be proposed in this section guarantees convergence properties for convex MINLPs by the master problem in a similar way.
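For completeness, the constant (3.5) can be evaluated directly from the variable bounds. A minimal sketch, assuming the bounds are given as NumPy arrays:

```python
import numpy as np

def box_diameter_bound(x_l, x_u):
    """M_x from (3.5): each coordinate of x^ - x~ is bounded by the largest
    bound range, so sqrt(n_c) * max_i ((x_u)_i - (x_l)_i) bounds ||x^ - x~||_2
    for all points in the box X."""
    x_l, x_u = np.asarray(x_l, float), np.asarray(x_u, float)
    return np.sqrt(x_l.size) * np.max(x_u - x_l)
```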

As a consequence, we have to gain the desired increase in efficiency and robustness by modifying the remaining part of the linear outer approximation algorithm. The basic idea is to look for improving integer values instead of fixing the integer variables. This implies that the integer variables are no longer allowed to vary only during the solution of the master problem. Therefore, within this chapter both the continuous and the integer variables depend on the iteration index k.

As motivated in Section 2.5, it is profitable to apply the trust region method of Yuan, i.e., Algorithm 2.1, within a linear outer approximation method such as Algorithm 2.2 for solving NLP(y^k) for a given iterate y^k ∈ Y. In the remainder of this chapter NLP(y^k) is given by

min_{x ∈ R^{n_c}}  f(x, y^k)
s.t.  g_j(x, y^k) ≥ 0,  ∀ j ∈ J.    (3.6)

As established by Corollaries 2.3 and 2.5 in Section 2.5, applying Yuan's trust region method has the advantage that we need not distinguish between solving NLP(y^k) given by (3.6) and the feasibility problem F(y^k) for some fixed y^k ∈ Y, if F(y^k) is given by

min_{x ∈ R^{n_c}, η ∈ R_+}  η
s.t.  g_j(x, y^k) + η ≥ 0,  ∀ j ∈ J.    (3.7)

Note that otherwise F(y^k) needs to be solved in addition whenever an infeasible integer value y ∉ V is encountered, where the set V is given by

V := {y ∈ Y : ∃ x ∈ X with g_j(x, y) ≥ 0, ∀ j ∈ J}.    (3.8)

See Fletcher and Leyffer [50] for a more general formulation of F(y^k).

In order to identify solutions of F(y^k), the subsequent linear program, denoted by LPF(x^k, y^k), is considered; see Corollary 2.4 relating the KKT-conditions of F(y^k) and LPF(x^k, y^k). It was already introduced in (2.90).

min_{(d_F)_x ∈ R^{n_c}, η ∈ R_+}  η
s.t.  g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_F)_x + η ≥ 0,  ∀ j ∈ J.    (3.9)

The solution of LPF(x^k, y^k) given by (3.9) is denoted by ((d_F^k)_x, η^k). In addition, we introduce d_F := ((d_F)_x, 0)^T, and extend the solution ((d_F^k)_x, η^k) to (d_F^k, η^k), with d_F^k ∈ R^n given by d_F^k := ((d_F^k)_x, 0)^T.
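Since LPF(x^k, y^k) is a plain linear program in the variables ((d_F)_x, η), it can be set up with any LP solver. A minimal sketch, assuming SciPy is available and that g (constraint values) and G (constraint Jacobian ∇_x g at (x^k, y^k)) are given as NumPy arrays:

```python
import numpy as np
from scipy.optimize import linprog

def solve_lpf_model(g, G):
    """Sketch of LPF(x^k, y^k) from (3.9): minimize eta subject to
    g_j + grad g_j^T (d_F)_x + eta >= 0 for all j and eta >= 0."""
    m, n_c = G.shape
    c = np.zeros(n_c + 1)
    c[-1] = 1.0                                    # objective: eta
    A_ub = np.hstack([-G, -np.ones((m, 1))])       # -G d - eta <= g
    bounds = [(None, None)] * n_c + [(0.0, None)]  # (d_F)_x free, eta >= 0
    res = linprog(c, A_ub=A_ub, b_ub=g, bounds=bounds)
    return res.x[:n_c], res.x[-1]                  # ((d_F^k)_x, eta^k)
```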

Solutions of NLP(y^k) also need to be identified. This task is directly established by Yuan's trust region algorithm, see the subsequent Corollary 3.1.

Applying the trust region Algorithm 2.1 for solving NLP(y^k) given by (3.6) yields the following continuous subproblem at the current iterate (x^k, y^k). It approximates the L∞ penalty function of the continuous nonlinear program NLP(y^k), see Section 2.2.

min_{(d_c)_x ∈ R^{n_c}}  Φ_c^k((d_c)_x)
s.t.  ‖(d_c)_x‖ ≤ Δ_c^k,    (3.10)

where the objective function is given by

Φ_c^k((d_c)_x) := ∇_x f(x^k, y^k)^T (d_c)_x + ½ (d_c)_x^T B_c^k (d_c)_x + σ^k ‖(g(x^k, y^k) + [∇_x g(x^k, y^k)]^T (d_c)_x)⁻‖.    (3.11)

B_c^k ∈ R^{n_c×n_c} is the upper left sub-matrix of the n×n matrix B^k, which is symmetric and positive definite. B^k is possibly a quasi-Newton approximation of the Hessian matrix of the Lagrangian function, see Definitions 2.4 and 2.7. σ^k ∈ R_+ is the penalty parameter of the L∞ penalty function, (·)⁻ is defined analogously to Definition 2.9, and Δ_c^k ∈ R_+ is the trust region radius, i.e., it is equivalent to Δ^k in Algorithm 2.1.

Furthermore, we define d_c := ((d_c)_x, 0)^T ∈ R^{n_c+n_i}, where 0 is the vector of all zeros of dimension n_i. According to Section 2.2 the solution of (3.10) is denoted by (d_c^k)_x and therefore d_c^k is defined by d_c^k := ((d_c^k)_x, 0)^T ∈ R^{n_c+n_i}.

Problem (3.10) is equivalent to the quadratic program

min_{(d_c)_x ∈ R^{n_c}, η_c ∈ R_+}  ∇_x f(x^k, y^k)^T (d_c)_x + ½ (d_c)_x^T B_c^k (d_c)_x + σ^k η_c
s.t.  η_c + g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_c)_x ≥ 0,  j = 1, . . . , m,
      ‖(d_c)_x‖ ≤ Δ_c^k,    (3.12)

see Yuan [112]. QP (3.12) is denoted by QP(x^k, y^k), since it depends on the modeling point (x^k, y^k) determining function and gradient values. Furthermore, (3.12) is a strictly convex quadratic program and therefore can be solved efficiently.
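A minimal sketch of how QP(x^k, y^k) could be assembled and solved, assuming cvxpy is available and the trust region is measured in the ∞-norm so that the problem remains a QP (the chapter leaves the choice of norm to Definition 2.9, so this norm is an assumption); grad_f, g, G, B_c, sigma and delta denote the model data at (x^k, y^k):

```python
import cvxpy as cp
import numpy as np

def solve_qp_model(grad_f, g, G, B_c, sigma, delta):
    """Sketch of QP(x^k, y^k) from (3.12) with an infinity-norm trust region."""
    n_c = grad_f.size
    d = cp.Variable(n_c)             # continuous search direction (d_c)_x
    eta = cp.Variable(nonneg=True)   # relaxation variable eta_c
    objective = (grad_f @ d
                 + 0.5 * cp.quad_form(d, cp.psd_wrap(B_c))
                 + sigma * eta)
    constraints = [g + G @ d + eta >= 0,        # relaxed linearized constraints
                   cp.norm(d, "inf") <= delta]  # trust region
    cp.Problem(cp.Minimize(objective), constraints).solve()
    return d.value, eta.value
```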

The subsequent corollary relates the KKT-conditions of a KKT-point (η_c^k, (d_c^k)_x, (λ_c^k, λ_{η_c}^k)) of QP(x^k, y^k) given by (3.12) to those of NLP(y^k) given by (3.6).

Corollary 3.1. For some y^k ∈ Y and a sufficiently large value of the penalty parameter σ^k, (x̄_{y^k}, λ̄) is a KKT-point of NLP(y^k) given by (3.6), if and only if (η_c^k, (d_c^k)_x, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x̄_{y^k}, y^k) given by (3.12) with (d_c^k)_x = 0 and η_c^k = 0.

Proof. The KKT-conditions of QP(x̄_{y^k}, y^k) given by (3.12) for the KKT-point (η_c^k, (d_c^k)_x, (λ_c^k, λ_{η_c}^k)) are determined by the subsequent formulas, where the trust region constraint

‖(d_c)_x‖ ≤ Δ_c^k    (3.13)

is neglected, since it is not active for (d_c^k)_x = 0 and Δ_c^k > 0. Note that λ_c^k contains the Lagrangian multipliers (λ_c^k)_j, j ∈ J, associated with the constraints

η_c + g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_c)_x ≥ 0,  j ∈ J.    (3.14)

The Lagrangian multiplier λ_{η_c}^k is associated with the non-negativity condition for η_c. We get the following KKT-conditions:

∇_x f(x̄_{y^k}, y^k) + B_c^k (d_c^k)_x − Σ_{j=1}^m (λ_c^k)_j ∇_x g_j(x̄_{y^k}, y^k) = 0,
σ^k = Σ_{j=1}^m (λ_c^k)_j + λ_{η_c}^k,
η_c^k + g_j(x̄_{y^k}, y^k) + ∇_x g_j(x̄_{y^k}, y^k)^T (d_c^k)_x ≥ 0,  ∀ j ∈ J,
η_c^k ≥ 0,    (3.15)
(λ_c^k)_j ≥ 0, ∀ j ∈ J,  λ_{η_c}^k ≥ 0,
(λ_c^k)_j (η_c^k + g_j(x̄_{y^k}, y^k) + ∇_x g_j(x̄_{y^k}, y^k)^T (d_c^k)_x) = 0,  ∀ j ∈ J,
η_c^k λ_{η_c}^k = 0.

Case 1 '⇒': (x̄_{y^k}, λ̄) is a KKT-point of NLP(y^k) given by (3.6):
⇒ (η_c^k, (d_c^k)_x, (λ_c^k, λ_{η_c}^k)) with η_c^k := 0, (d_c^k)_x := 0, (λ_c^k)_j := λ̄_j, j ∈ J, and λ_{η_c}^k := σ^k − Σ_{j∈J} λ̄_j satisfies (3.15), since (x̄_{y^k}, λ̄) satisfies the subsequent KKT-conditions of NLP(y^k):

∇_x f(x̄_{y^k}, y^k) − Σ_{j∈J} λ̄_j ∇_x g_j(x̄_{y^k}, y^k) = 0,
g_j(x̄_{y^k}, y^k) ≥ 0,  j ∈ J,
λ̄_j g_j(x̄_{y^k}, y^k) = 0,  j ∈ J,
λ̄_j ≥ 0,  j ∈ J.    (3.16)

Case 2 '⇐': (η_c^k, (d_c^k)_x, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x̄_{y^k}, y^k) with η_c^k := 0, (d_c^k)_x := 0 satisfying (3.15):
⇒ (x̄_{y^k}, λ̄) with λ̄_j := (λ_c^k)_j, j ∈ J, satisfies the KKT-conditions of NLP(y^k) stated in (3.16) due to (3.15).

A property of the linear outer approximation Algorithm 2.2 is that it terminates after a finite number of iterations due to the finiteness of the set Y. In principle, this can only be assured if the solution process of NLP(y^k) given by (3.6) or F(y^k) given by (3.7), respectively, also terminates after a finite number of iterations in each iteration of a linear outer approximation method. This topic is usually neglected in the existing literature. To be able to prove finite termination of the algorithm to be proposed in this section, we introduce an ε-stationary point of NLP(y^k) and F(y^k) for fixed y^k ∈ Y in the subsequent definition.

Definition 3.1. A point (x^k, λ^k) with y^k ∈ V is an ε-stationary point of NLP(y^k) subject to a tolerance ε > 0, if the following approximations of the KKT-conditions of NLP(y^k) are satisfied:

‖∇_x f(x^k, y^k) − Σ_{j∈J} (λ^k)_j ∇_x g_j(x^k, y^k)‖_2 ≤ ε,
g_j(x^k, y^k) ≥ −ε,  ∀ j ∈ J,
|(λ^k)_j g_j(x^k, y^k)| ≤ ε,  ∀ j ∈ J,
(λ^k)_j ≥ 0,  ∀ j ∈ J.    (3.17)

A point (x^k, η^k, (λ^k, λ_η^k)) with y^k ∈ Y is an ε-stationary point of F(y^k) subject to a tolerance ε > 0, if the following approximations of the KKT-conditions of F(y^k) are satisfied:

g_j(x^k, y^k) + η^k ≥ −ε,  ∀ j ∈ J,  η^k ≥ 0,
|(λ^k)_j (g_j(x^k, y^k) + η^k)| ≤ ε,  ∀ j ∈ J,
(λ^k)_j ≥ 0,  ∀ j ∈ J,  λ_η^k ≥ 0.    (3.18)

The subsequent corollary establishes the relationship between an ε-stationary point of NLP(y^k) introduced in Definition 3.1 and the solution of subproblem QP(x^k, y^k).

Corollary 3.2. Let ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) be a KKT-point of QP(x^k, y^k) with

‖(d_c^k)_x‖_2 ≤ ε̃,    (3.19)
η_c^k ≤ ε̃    (3.20)

and ε̃ > 0. Furthermore let

‖(d_c^k)_x‖ < Δ_c^k    (3.21)

hold, i.e., the trust region constraint of QP(x^k, y^k) can be neglected. Then (x^k, λ_c^k) is an ε-stationary point of NLP(y^k) according to Definition 3.1 subject to an accuracy ε satisfying

ε ≥ max{M_B ε̃, (1 + M_∇g) ε̃, (1 + M_∇g) M_λ ε̃}    (3.22)

with ‖B_c^k‖_2 ≤ M_B, ‖∇_x g_j(x^k, y^k)‖_2 ≤ M_∇g, ∀ j ∈ J, and |(λ_c^k)_j| ≤ M_λ, ∀ j ∈ J.

Proof. Optimality: Since ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x^k, y^k),

∇_x f(x^k, y^k) + B_c^k (d_c^k)_x − Σ_{j∈J} (λ_c^k)_j ∇_x g_j(x^k, y^k) = 0    (3.23)

holds. As a consequence we obtain

‖∇_x f(x^k, y^k) − Σ_{j∈J} (λ_c^k)_j ∇_x g_j(x^k, y^k)‖_2 = ‖−B_c^k (d_c^k)_x‖_2
≤ ‖B_c^k‖_2 ‖(d_c^k)_x‖_2
≤ M_B ε̃.    (3.24)

Primal Feasibility: Since ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x^k, y^k),

η_c^k + g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_c^k)_x ≥ 0,  ∀ j ∈ J    (3.25)

holds. As a consequence we obtain for all j ∈ J

g_j(x^k, y^k) ≥ −η_c^k − ∇_x g_j(x^k, y^k)^T (d_c^k)_x
≥ −η_c^k − |∇_x g_j(x^k, y^k)^T (d_c^k)_x|
≥ −η_c^k − ‖∇_x g_j(x^k, y^k)‖_2 ‖(d_c^k)_x‖_2
≥ −ε̃ − M_∇g ε̃
= −(1 + M_∇g) ε̃.    (3.26)

Dual Feasibility: Since ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x^k, y^k),

(λ_c^k)_j ≥ 0,  ∀ j ∈ J    (3.27)

holds.

Complementarity: Since ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x^k, y^k),

(λ_c^k)_j (η_c^k + g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_c^k)_x) = 0,  ∀ j ∈ J    (3.28)

holds. We obtain for all j ∈ J

|(λ_c^k)_j g_j(x^k, y^k)| = |(λ_c^k)_j| |η_c^k + ∇_x g_j(x^k, y^k)^T (d_c^k)_x|
≤ |(λ_c^k)_j| (|η_c^k| + |∇_x g_j(x^k, y^k)^T (d_c^k)_x|)
≤ |(λ_c^k)_j| (|η_c^k| + ‖∇_x g_j(x^k, y^k)‖_2 ‖(d_c^k)_x‖_2)
≤ M_λ (ε̃ + M_∇g ε̃)
= (1 + M_∇g) M_λ ε̃.    (3.29)

As a consequence (x^k, λ_c^k) is an ε-stationary point of NLP(y^k) according to Definition 3.1 subject to an accuracy ε satisfying

ε ≥ max{M_B ε̃, (1 + M_∇g) ε̃, (1 + M_∇g) M_λ ε̃}.    (3.30)
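In practice Corollary 3.2 translates into a simple numerical test on the QP solution. A minimal sketch, assuming NumPy data for the QP output; the bounding constants of (3.22) are computed on the fly and all names are illustrative:

```python
import numpy as np

def qp_certifies_eps_stationarity(d_x, eta, delta_c, eps_tilde):
    """Premises (3.19)-(3.21): small step, small relaxation, inactive trust region."""
    step_norm = np.linalg.norm(d_x)
    return step_norm <= eps_tilde and eta <= eps_tilde and step_norm < delta_c

def certified_accuracy(B_c, G, lam, eps_tilde):
    """Accuracy bound (3.22) for the resulting eps-stationary point of NLP(y^k)."""
    M_B = np.linalg.norm(B_c, 2)                  # ||B_c^k||_2
    M_dg = max(np.linalg.norm(row) for row in G)  # max_j ||grad g_j||_2
    M_lam = np.max(np.abs(lam))                   # max_j |(lambda_c^k)_j|
    return max(M_B, 1 + M_dg, (1 + M_dg) * M_lam) * eps_tilde
```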

The subsequent corollary establishes the relationship between an ε-stationary point of F(y^k) introduced in Definition 3.1 and the solution of subproblem LPF(x^k, y^k).

Corollary 3.3. Let ((d_F^k)_x, η^k, (λ_F^k, λ_η^k)) be a KKT-point of LPF(x^k, y^k) with

‖(d_F^k)_x‖_2 ≤ ε̃,    (3.31)
η^k > ε̃    (3.32)

and ε̃ > 0. Then (x^k, η^k, (λ_F^k, λ_η^k)) is an ε-stationary point of F(y^k) according to Definition 3.1 subject to an accuracy ε satisfying

ε ≥ max{M_∇g ε̃, M_∇g M_λ ε̃}    (3.33)

with ‖∇_x g_j(x^k, y^k)‖_2 ≤ M_∇g, ∀ j ∈ J, and |(λ_F^k)_j| ≤ M_λ, ∀ j ∈ J.

Proof. Optimality: Since ((d_F^k)_x, η^k, (λ_F^k, λ_η^k)) is a KKT-point of LPF(x^k, y^k),

(0, 1)^T − Σ_{j∈J} (λ_F^k)_j (∇_x g_j(x^k, y^k), 1)^T − (0, λ_η^k)^T = 0    (3.34)

holds.

Primal Feasibility: Since ((d_F^k)_x, η^k, (λ_F^k, λ_η^k)) is a KKT-point of LPF(x^k, y^k),

g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_F^k)_x + η^k ≥ 0,  ∀ j ∈ J    (3.35)

holds. As a consequence we obtain

g_j(x^k, y^k) + η^k ≥ −∇_x g_j(x^k, y^k)^T (d_F^k)_x
≥ −|∇_x g_j(x^k, y^k)^T (d_F^k)_x|
≥ −‖∇_x g_j(x^k, y^k)‖_2 ‖(d_F^k)_x‖_2
≥ −M_∇g ε̃,  ∀ j ∈ J.    (3.36)

In addition

η^k ≥ 0    (3.37)

is satisfied due to (3.32).

Dual Feasibility: Since ((d_F^k)_x, η^k, (λ_F^k, λ_η^k)) is a KKT-point of LPF(x^k, y^k),

(λ_F^k)_j ≥ 0, ∀ j ∈ J,  λ_η^k ≥ 0    (3.38)

holds.

Complementarity: Since ((d_F^k)_x, η^k, (λ_F^k, λ_η^k)) is a KKT-point of LPF(x^k, y^k),

(g_j(x^k, y^k) + ∇_x g_j(x^k, y^k)^T (d_F^k)_x + η^k) (λ_F^k)_j = 0, ∀ j ∈ J,  η^k λ_η^k = 0    (3.39)

holds. We obtain

|(g_j(x^k, y^k) + η^k) (λ_F^k)_j| = |∇_x g_j(x^k, y^k)^T (d_F^k)_x| |(λ_F^k)_j|
≤ ‖∇_x g_j(x^k, y^k)‖_2 ‖(d_F^k)_x‖_2 |(λ_F^k)_j|
≤ M_∇g ε̃ M_λ,  ∀ j ∈ J.    (3.40)

As a consequence (x^k, η^k, (λ_F^k, λ_η^k)) is an ε-stationary point of F(y^k) according to Definition 3.1 subject to an accuracy ε satisfying

ε ≥ max{M_∇g ε̃, M_∇g M_λ ε̃}.    (3.41)

To ease the readability, the definition of the master problem of the linear outer approximation method introduced in Chapter 2 is repeated. To point out the dependencies, it is denoted by MILP(T_ε^k, S_ε^k, f̄, ε_OA), where T_ε^k, S_ε^k, and the incumbent objective value f̄ are specified below:

min_{x ∈ X, y ∈ Y, η}  η
s.t.  η ≤ f̄ − ε_OA,
      f(x̂, ŷ) + ∇_{x,y} f(x̂, ŷ)^T ((x, y) − (x̂, ŷ)) ≤ η,  ∀ (x̂, ŷ) ∈ T_ε^k,
      g_j(x̂, ŷ) + ∇_{x,y} g_j(x̂, ŷ)^T ((x, y) − (x̂, ŷ)) ≥ 0,  ∀ j ∈ J, ∀ (x̂, ŷ) ∈ T_ε^k ∪ S_ε^k.    (3.42)

The sets T^k and S^k introduced in (2.71) and (2.72) need to be adapted based on Definition 3.1, yielding

T_ε^k := {(x^i, y^i) : i ≤ k, (x^i, λ^i) is an ε-stationary point of NLP(y^i)},    (3.43)
S_ε^k := {(x^j, y^j) : j ≤ k, (x^j, η^j, (λ^j, λ_η^j)) is an ε-stationary point of F(y^j)}.    (3.44)

Similar to well-known linear outer approximation methods, such as Algorithm 2.2, the sets T_ε^k and S_ε^k are updated such that they contain one out of the infinitely many ε-stationary points of each NLP(y^i) given by (3.6) with i ≤ k or F(y^j) given by (3.7) with j ≤ k, that was obtained in previous iterations. Since the bounds are part of the constraints g_j, the linearizations in (3.42) account for the continuous variables and for upper and lower bounds on the integer variables as well.

Up to now we motivated the part of the new algorithm that is derived from linear outer approximation methods, such as Algorithm 2.2. As mentioned at the beginning of this chapter, we want to combine the linear outer approximation approach with ideas implemented in MISQP, which is reviewed in Section 2.10. Since the algorithmic concept of MISQP is based on a sequence of mixed-integer quadratic approximations, we focus now on the integration of a MIQP approximation into a linear outer approximation approach.

The L∞ penalty function corresponding to NLP(y) given by (3.6) can also be associated with the continuous relaxation of MINLP (3.1), which is given by

min_{x ∈ R^{n_c}, y ∈ R^{n_i}}  f(x, y)
s.t.  g_j(x, y) ≥ 0,  ∀ j ∈ J.    (3.47)

Inspired by MISQP, we apply the algorithm of Yuan, see Yuan [112], to solve the continuous relaxation of MINLP (3.1) and replace the corresponding continuous subproblems by a mixed-integer formulation. This mixed-integer problem depends on the current iteration point (x^k, y^k) and is given by

min_{d_i ∈ R^{n_c} × N^{n_i}}  Φ_i^k(d_i)
s.t.  ‖d_i‖ ≤ Δ_i^k,    (3.48)

Φ_i^k(d_i) := ∇_{x,y} f(x^k, y^k)^T d_i + ½ d_i^T B^k d_i + σ^k ‖(g(x^k, y^k) + [∇_{x,y} g(x^k, y^k)]^T d_i)⁻‖,    (3.49)

with d_i = ((d_i)_x, (d_i)_y)^T, where (d_i)_x ∈ R^{n_c} and (d_i)_y ∈ N^{n_i}. B^k ∈ R^{n×n} is a symmetric and positive definite matrix, possibly a quasi-Newton approximation of the Hessian matrix of the Lagrangian function, see Definitions 2.4 and 2.7. σ^k ∈ R_+ is the penalty parameter of the L∞ penalty function, (·)⁻ is defined analogously to Definition 2.9, and Δ_i^k ∈ R_+ is the trust region radius.

Analogous to QP(x^k, y^k), problem (3.48) is equivalent to the mixed-integer quadratic program denoted by MIQP(x^k, y^k)

min_{d_i ∈ R^{n_c} × N^{n_i}, η_i ∈ R_+}  ∇_{x,y} f(x^k, y^k)^T d_i + ½ d_i^T B^k d_i + σ^k η_i
s.t.  η_i + g_j(x^k, y^k) + ∇_{x,y} g_j(x^k, y^k)^T d_i ≥ 0,  j = 1, . . . , m,
      ‖d_i‖ ≤ Δ_i^k.    (3.50)

The solution of (3.50) is denoted by (d_i^k, η_i^k). The search direction with respect to the integer variables y^k ∈ Y is restricted to integral values, i.e., (d_i)_y ∈ N^{n_i}. As a consequence, integrality is satisfied for y^k + (d_i^k)_y, i.e., y^k + (d_i^k)_y ∈ N^{n_i}.

The main idea of the new algorithm is to compare the search step determined by MIQP(x^k, y^k) with that determined by QP(x^k, y^k) with respect to the value of the L∞ penalty function and to choose the better one, i.e., the one with a lower merit function value. As a consequence, we define an improving mixed-integer search direction as follows.

Definition 3.2. The solution d_i^k ∈ R^{n_c} × N^{n_i} of MIQP(x^k, y^k) given by (3.50) is an improving mixed-integer search direction, if it satisfies the following conditions:

1.  (P_{σ^k}(x^k, y^k) − P_{σ^k}(x^k + (d_i^k)_x, y^k + (d_i^k)_y)) / (Φ_c^k(0) − Φ_c^k((d_c^k)_x)) ≥ 0.1,    (3.51)

2.  P_{σ^k}(x^k + (d_i^k)_x, y^k + (d_i^k)_y) < P_{σ^k}(x^k + (d_c^k)_x, y^k).    (3.52)

(d_c^k)_x ∈ R^{n_c} is part of the solution (η_c^k, (d_c^k)_x) of QP(x^k, y^k) given by (3.12). P_{σ^k}(x, y) denotes the L∞ penalty function with respect to the penalty parameter σ^k, see Definition 2.9. Furthermore, Φ_c^k is defined by (3.11).
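Conditions (3.51) and (3.52) amount to two straightforward comparisons. A minimal sketch, assuming a callable P for the penalty function P_{σ^k}, precomputed steps, and a positive predicted reduction of the continuous model; all names are illustrative:

```python
def is_improving_mi_direction(P, x, y, d_i_x, d_i_y, d_c_x, pred_red_cont):
    """Definition 3.2: the MIQP step must realize at least 10% of the reduction
    predicted by the continuous model (3.51) and must yield a lower merit
    function value than the continuous step (3.52)."""
    actual_red = P(x, y) - P(x + d_i_x, y + d_i_y)
    cond_1 = actual_red >= 0.1 * pred_red_cont          # (3.51), pred_red_cont > 0
    cond_2 = P(x + d_i_x, y + d_i_y) < P(x + d_c_x, y)  # (3.52)
    return cond_1 and cond_2
```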

Definition 3.2 motivates an extension of Step 2 of the linear outer approximation method described by Algorithm 2.2, which is based on two models. The first model corresponds to the continuous quadratic problem (3.12) and is called the continuous model. It is equivalent to the subproblem that arises during the solution of NLP(y^k) by the trust region method given by Algorithm 2.1 in some iteration of a linear outer approximation method such as Algorithm 2.2. The second model, represented by MIQP (3.50), is called the mixed-integer model. Analogous to MISQP, see Section 2.10, it is a mixed-integer quadratic approximation derived from MINLP (3.1).

We introduce and motivate the parameters of the algorithm. The notation of the parameters is chosen according to Yuan’s trust region method given by Algorithm 2.1.

In the sequel of this section, the parameters that correspond to the continuous model associated with problem (3.12) are indexed by c, while parameters of the mixed-integer model corresponding to problem (3.50) are indexed by i. Therefore, Δ_c^k denotes the trust region radius for the continuous model, while Δ_i^k is the trust region radius for the mixed-integer model. σ^k denotes the penalty parameter associated with the L∞ penalty function. As soon as the penalty parameter is larger than an upper bound σ̄ ∈ R_+, the determination of mixed-integer quadratic search steps is omitted. Note that only one penalty parameter is necessary, since the same L∞ penalty function is associated with the continuous model represented by problem (3.12) and the mixed-integer model associated with problem (3.50).
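For reference, a minimal sketch of the merit function itself, assuming Definition 2.9 employs the maximum norm of the constraint violation (this choice of norm is an assumption here):

```python
import numpy as np

def penalty(f_val, g_val, sigma):
    """L_inf penalty P_sigma(x, y) = f(x, y) + sigma * ||g(x, y)^-||_inf, where
    g^- = min(g, 0) collects the constraint violations (cf. Definition 2.9)."""
    violation = np.linalg.norm(np.minimum(g_val, 0.0), np.inf)
    return f_val + sigma * violation
```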

The reduction of the merit function predicted by the corresponding model, which is either QP(x^k, y^k) given by (3.12) or MIQP(x^k, y^k) given by (3.50), is given by

Φ_c^k(0) − Φ_c^k((d_c^k)_x)  or  Φ_i^k(0) − Φ_i^k(d_i^k),

respectively, where Φ_c^k is defined in (3.11) and Φ_i^k is specified in (3.49). The reduction obtained by the solution ((d_c^k)_x, η_c^k) of QP(x^k, y^k) and (d_i^k, η_i^k) of MIQP(x^k, y^k), see (3.12) and (3.50), with respect to the penalty function P_{σ^k}, see Definition 2.9, can be evaluated by

P_{σ^k}(x^k, y^k) − P_{σ^k}(x^k + (d_c^k)_x, y^k)  or  P_{σ^k}(x^k, y^k) − P_{σ^k}(x^k + (d_i^k)_x, y^k + (d_i^k)_y).

By comparing both quantities, introducing r_c^k and r_i^k,

r_c^k := (P_{σ^k}(x^k, y^k) − P_{σ^k}(x^k + (d_c^k)_x, y^k)) / (Φ_c^k(0) − Φ_c^k((d_c^k)_x)),    (3.53)
r_i^k := (P_{σ^k}(x^k, y^k) − P_{σ^k}(x^k + (d_i^k)_x, y^k + (d_i^k)_y)) / (Φ_i^k(0) − Φ_i^k(d_i^k)),    (3.54)

the precision of the predicted reduction of the merit function of the continuous and the mixed-integer model can be measured.
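The ratios (3.53) and (3.54), together with the radius updates (3.60) and (3.65) applied later in Algorithm 3.1, follow the classical trust region pattern. A minimal sketch with illustrative names:

```python
def reduction_ratio(P0, P1, predicted_reduction):
    """Ratios (3.53)/(3.54): actual reduction of the merit function P over the
    reduction predicted by the quadratic model; values near 1 indicate an
    accurate model."""
    return (P0 - P1) / predicted_reduction

def update_radius(delta, step_norm, r):
    """Radius update used in (3.60) and (3.65): expand on very accurate models,
    keep on moderate accuracy, shrink otherwise."""
    if r > 0.9:
        return max(delta, 4.0 * step_norm)
    if r >= 0.1:
        return delta
    return min(0.25 * delta, 0.5 * step_norm)
```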

Based on these ideas, we propose an extension of a linear outer approximation method.

Since the algorithm relies on the successive solution of mixed-integer quadratic programming problems (3.50), it is called MIQP-supported linear outer approximation (MIQPSOA).

Before we describe MIQPSOA in detail, we provide a brief overview to ease the readability and understanding. The brief outline is aligned with Figure 3.1, which provides a graphical representation of MIQPSOA and its components. As already described, MIQPSOA is based on the trust region method of Yuan, see Algorithm 2.1. The corresponding components are represented in green in Figure 3.1. Combining the green and blue components, MIQPSOA yields a linear outer approximation method, such as Algorithm 2.2. The red-colored components represent the new components that are motivated by MISQP, see Section 2.10. In addition, the algorithm possesses some coordination and decision steps.

Step 1: Within the Initialization the algorithmic parameters including the tolerance ε_OA and the sets T_ε^{-1}, S_ε^{-1} as well as the best known solution and the corresponding objective value f̄ are initialized. Furthermore, a starting point (x^0, y^0) ∈ X × Y together with the corresponding function and gradient values is provided.

Step 2: After the initialization is finished, the internal iteration loop starts by solving QP(x^k, y^k), where k denotes the current iteration.

Step 3: Depending on the value of the penalty parameter, the linear subproblem LPF(x^k, y^k) derived from F(y^k), see (3.7), is solved. Alternatively, the mixed-integer quadratic program MIQP(x^k, y^k) is solved, if the flag on_MIQP^k possesses the value 1.

Step 4: The subsequent coordination step executes the Search Step Selection. Four different possibilities for selecting the search step arise:

If an improving mixed-integer search direction according to Definition 3.2 was obtained by MIQP(x^k, y^k), then it is chosen to be the search step.

Else if the current iterate is an ε-stationary point of NLP(y^k) or F(y^k) according to Definition 3.1, then the search step is determined by the outer approximation master problem MILP(T_ε^k, S_ε^k, f̄, ε_OA), see below.

Else if the solution d_c^k of QP(x^k, y^k) provides improvement with respect to the L∞ penalty function, measured by r_c^k (3.53), then the subsequent iterate is obtained by adding d_c^k.

Else no step is performed and the trust region radius is decreased.

Step 5: If the search step is determined by the solution of either QP(x^k, y^k) or MIQP(x^k, y^k), a Parameter Update is performed. This affects, among others, the continuous trust region radius and the penalty parameter.

Step 6: If the search step is to be determined by the solution of the outer approximation master problem, MILP(T_ε^k, S_ε^k, f̄, ε_OA) is solved after an update of the sets T_ε^k, S_ε^k. If MILP(T_ε^k, S_ε^k, f̄, ε_OA) is infeasible, then MINLP (3.1) is solved.

Step 7: If a search step was performed, Gradients are evaluated and the next iteration loop is started.

Note that whenever the problem functions f and g are evaluated at some point (x^k, y^k), the subsequent test checks if the current incumbent can be updated. The current best solution is updated by

(x̄, ȳ) := (x^k, y^k),    (3.55)
f̄ := f(x^k, y^k),    (3.56)

if

‖g(x^k, y^k)⁻‖ ≤ ε    (3.57)

and

f(x^k, y^k) < f̄    (3.58)

hold. In the initial step, the best known solution (x̄, ȳ) is initialized as follows:

(x̄, ȳ) := (x^0, y^0)  and  f̄ := ∞, if ‖g(x^0, y^0)⁻‖ > ε;  f̄ := f(x^0, y^0), if ‖g(x^0, y^0)⁻‖ ≤ ε.    (3.59)
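A minimal sketch of this incumbent test, assuming the violation measure ‖g⁻‖ from above is evaluated in the maximum norm; `best` is an illustrative container:

```python
import numpy as np

def update_incumbent(best, x, y, f_val, g_val, eps):
    """Tests (3.57)-(3.58): accept (x, y) as new incumbent (3.55)-(3.56) if the
    constraint violation is at most eps and the objective value improves."""
    violation = np.linalg.norm(np.minimum(g_val, 0.0), np.inf)
    if violation <= eps and f_val < best["f"]:
        best["x"], best["y"], best["f"] = x, y, f_val
    return best

def init_incumbent(x0, y0, f0, g0, eps):
    """Initialization (3.59): start with f_bar = infinity if (x0, y0) is infeasible."""
    feasible = np.linalg.norm(np.minimum(g0, 0.0), np.inf) <= eps
    return {"x": x0, "y": y0, "f": f0 if feasible else np.inf}
```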

Algorithm 3.1 (MIQPSOA).

1. Initialization:
Let x^0 ∈ X, y^0 ∈ Y be starting values and define the parameters Δ_c^0 > 0, Δ_i^0 ≥ 1, B^0 ∈ R^{n×n} symmetric and positive definite, T_ε^{-1} = S_ε^{-1} := ∅, δ^0 > 0, σ^0 > 0, ε_OA > 0, ε > 0, σ̄ ≥ 0, on_MIQP^0 := 1, n_oa := 0, k := 0.
Evaluate the functions f(x^0, y^0) and g(x^0, y^0), determine the gradients ∇_{x,y} f(x^0, y^0) and ∇_{x,y} g(x^0, y^0), and initialize the best known solution (x̄, ȳ).

2. QP(x^k, y^k):
Determine a KKT-point ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) of QP(x^k, y^k) given by (3.12).
Evaluate f(x^k + (d_c^k)_x, y^k), g(x^k + (d_c^k)_x, y^k) and P_{σ^k}(x^k + (d_c^k)_x, y^k), where P_{σ^k} is specified in Definition 2.9.
Evaluate r_c^k by (3.53).

3. MIQP(x^k, y^k) or LPF(x^k, y^k):
If on_MIQP^k = 1, then solve the mixed-integer program MIQP(x^k, y^k) given by (3.50) determining (d_i^k, η_i^k).
Evaluate f(x^k + (d_i^k)_x, y^k + (d_i^k)_y), g(x^k + (d_i^k)_x, y^k + (d_i^k)_y) and P_{σ^k}(x^k + (d_i^k)_x, y^k + (d_i^k)_y).
Calculate r_i^k given by (3.54).
Adapt the integer trust region radius:

Δ_i^{k+1} := max{Δ_i^k, 4‖d_i^k‖},  if r_i^k > 0.9,
Δ_i^{k+1} := Δ_i^k,  if 0.1 ≤ r_i^k ≤ 0.9,    (3.60)
Δ_i^{k+1} := min{Δ_i^k/4, ‖d_i^k‖/2},  if r_i^k < 0.1.

Else if σ^k > σ̄, then solve the linear program LPF(x^k, y^k) given by (3.9) and denote the solution by ((d_F^k)_x, η^k) and the corresponding Lagrangian multipliers by (λ_F^k, λ_η^k).

4. Search Step Selection:
If on_MIQP^k = 1 and d_i^k is an improving search direction according to Definition 3.2, then set (x^{k+1}, y^{k+1}) := (x^k + (d_i^k)_x, y^k + (d_i^k)_y).
GOTO Step 5.
Else if (x^k, λ_c^k) is an ε-stationary point of NLP(y^k), i.e.,

‖(d_c^k)_x‖_2 ≤ ε,  η_c^k ≤ ε,  ‖(d_c^k)_x‖ < Δ_c^k    (3.61)

holds and if in addition

ε_OA > ‖∇_x f(x^k, y^k)‖_2 ‖(d_c^k)_x‖_2 + M_x ‖(d_c^k)_x^T B_c^k‖_2 + ‖(d_c^k)_x^T B_c^k‖_2 ‖(d_c^k)_x‖_2    (3.62)

is satisfied with M_x defined by (3.5), or if σ^k > σ̄ and (x^k, η^k, (λ_F^k, λ_η^k)) is an ε-stationary point of F(y^k), i.e.,

‖(d_F^k)_x‖_2 ≤ ε,  η^k > ε    (3.63)

holds and if in addition

‖∇_x g_j(x^k, y^k)‖_2 ‖(d_F^k)_x‖_2 < ε,  ∀ j ∈ J    (3.64)

holds, then GOTO Step 6.
Else if r_c^k > 0, with r_c^k defined in (3.53), then set (x^{k+1}, y^{k+1}) := (x^k + (d_c^k)_x, y^k) and Δ_i^{k+1} := Δ_i^k.
Else set

(x^{k+1}, y^{k+1}) := (x^k, y^k),  B^{k+1} := B^k,  σ^{k+1} := σ^k,  δ^{k+1} := δ^k,  T_ε^k := T_ε^{k-1},  S_ε^k := S_ε^{k-1},
Δ_c^{k+1} := ‖(d_c^k)_x‖/4,  on_MIQP^{k+1} := on_MIQP^k,  Δ_i^{k+1} := Δ_i^k,  k := k + 1.

GOTO Step 2.

5. Parameter Update:

Δ_c^{k+1} := max{Δ_c^k, 4‖(d_c^k)_x‖},  if r_c^k > 0.9,
Δ_c^{k+1} := Δ_c^k,  if 0.1 ≤ r_c^k ≤ 0.9,    (3.65)
Δ_c^{k+1} := min{Δ_c^k/4, ‖(d_c^k)_x‖/2},  if r_c^k < 0.1.

Choose B^{k+1}, such that B^{k+1} is any symmetric, positive definite matrix.
Set T_ε^k := T_ε^{k-1}, S_ε^k := S_ε^{k-1}.
Penalty update with respect to Φ_c^k defined in (3.11):
If

Φ_c^k(0) − Φ_c^k((d_c^k)_x) ≤ σ^k δ^k min{Δ_c^k, ‖g(x^k, y^k)⁻‖},    (3.66)

then set σ^{k+1} := 2σ^k and δ^{k+1} := δ^k/4.
Else set σ^{k+1} := σ^k and δ^{k+1} := δ^k.
If σ^k > σ̄, then set on_MIQP^{k+1} := 0.
Else set on_MIQP^{k+1} := on_MIQP^k.
GOTO Step 7.

6. MILP(T_ε^k, S_ε^k, f̄, ε_OA), i.e., the outer approximation master problem:
If (x^k, λ_c^k) is an ε-stationary point of NLP(y^k) given by (3.6), then update the set T_ε^{k-1} given by (3.43):
T_ε^k := T_ε^{k-1} ∪ {(x^k, y^k)},  S_ε^k := S_ε^{k-1}.
Else, i.e., (x^k, η^k, (λ_F^k, λ_η^k)) is an ε-stationary point of F(y^k) given by (3.7), update the set S_ε^{k-1} given by (3.44):
S_ε^k := S_ε^{k-1} ∪ {(x^k, y^k)},  T_ε^k := T_ε^{k-1}.
Solve the linear outer approximation master problem MILP(T_ε^k, S_ε^k, f̄, ε_OA).
If the linear outer approximation master problem (3.42) is infeasible, then STOP.
Else denote the solution by (x^{k+1}, y^{k+1}).
Evaluate f(x^{k+1}, y^{k+1}) and g(x^{k+1}, y^{k+1}).
Set n_oa := n_oa + 1, y_oa^{n_oa} := y^{k+1} and

Δ_i^{k+1} := max{Δ_i^0, Δ_i^k},  Δ_c^{k+1} := max{Δ_c^0, Δ_c^k},  δ^{k+1} := δ^0,  σ^{k+1} := σ^0,  B^{k+1} := B^k.    (3.67)

If ∃ i ∈ {1, . . . , n_oa − 1} with y^{k+1} = y_oa^i, then set on_MIQP^{k+1} := 0, i.e., solve NLP(y^{k+1}) or F(y^{k+1}), respectively.
Else set on_MIQP^{k+1} := 1.

7. Gradients:
Evaluate ∇_{x,y} f(x^{k+1}, y^{k+1}) and [∇_{x,y} g(x^{k+1}, y^{k+1})], set k := k + 1 and GOTO Step 2.
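The control flow of Steps 2 to 7 can be summarized in a schematic driver. The following sketch only mirrors the branching logic of Algorithm 3.1; the solver callables, the state object and their attributes are assumptions supplied by the caller, and the detailed parameter bookkeeping of Steps 5 and 6 is hidden inside them:

```python
def miqpsoa_outline(state, solve_qp, solve_miqp, solve_lpf, solve_master,
                    sigma_bar, max_iter=1000):
    """Schematic control flow of Algorithm 3.1 (MIQPSOA); illustrative only."""
    for _ in range(max_iter):
        qp = solve_qp(state)                                 # Step 2
        miqp = solve_miqp(state) if state.on_miqp else None  # Step 3
        lpf = None
        if miqp is None and state.sigma > sigma_bar:
            lpf = solve_lpf(state)
        if miqp is not None and miqp.improving:              # Step 4, case 1
            state.accept(miqp.step)
            state.update_parameters(qp)                      # Step 5
        elif qp.eps_stationary or (lpf is not None and lpf.eps_stationary):
            master = solve_master(state)                     # Step 6
            if master is None:  # master problem infeasible: MINLP (3.1) solved
                break
            state.jump_to(master)
        elif qp.ratio > 0:                                   # Step 4, case 3
            state.accept(qp.step)
            state.update_parameters(qp)                      # Step 5
        else:                                                # Step 4, case 4
            state.shrink_continuous_radius(qp.step)
        state.refresh_gradients()                            # Step 7
    return state.incumbent
```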

[Figure: flowchart of Algorithm 3.1, showing the SQP-TR components (QP(x^k, y^k), Parameter Update, Gradients), the OA components (LPF(x^k, y^k), MILP(T_ε^k, S_ε^k, f̄, ε_OA), STOP), the MIQP extension (MIQP(x^k, y^k)), and the coordinating Search Step Selection with its decisions on improving search directions, ε-stationarity of NLP(y^k) or F(y^k), the sign of r_c^k, and σ^k relative to σ̄.]

Fig. 3.1: MIQP-supported Outer Approximation

Figure 3.1 illustrates the solution process of Algorithm 3.1. The previously defined algorithm is based on four different subproblems, which are QP(x^k, y^k), MIQP(x^k, y^k), LPF(x^k, y^k) and MILP(T_ε^k, S_ε^k, f̄, ε_OA). In every iteration the continuous quadratic program QP(x^k, y^k) is solved, yielding a continuous search direction. This search direction is equivalent to the search direction obtained at iterate (x^k, y^k) by the trust region algorithm of Yuan stated by Algorithm 2.1, while solving NLP(y^k) given by (3.6).

The parameter on_MIQP^k is a flag for turning on or off the calculation of the mixed-integer search direction provided by the solution of MIQP(x^k, y^k) in iteration k. As shown in the remainder of this chapter, on_MIQP^k needs to be set to 0, i.e., MIQP(x^k, y^k) is not solved, in certain situations to ensure convergence.

As will be proved below, Lemma 2.2 applies if the penalty parameter σ^k tends towards infinity and exceeds the threshold σ̄, i.e., the iteration sequence converges towards an infeasible stationary point specified in Definition 2.10. To identify these infeasible stationary points, LPF(x^k, y^k) is solved as soon as σ^k exceeds the threshold σ̄. As a consequence, a suitable value for the parameter σ̄ is σ̄ := 10^10.

The parameter ε_OA is the optimality tolerance needed by the outer approximation master problem MILP(T_ε^k, S_ε^k, f̄, ε_OA) in order to ensure finite termination of Algorithm 3.1. The same holds for any other linear outer approximation algorithm, such as Algorithm 2.2.

If (3.61) and (3.62) hold in Step 4, then (x^k, λ_c^k) is an ε-stationary point of NLP(y^k) given by (3.6) as specified in Definition 3.1, due to Corollary 3.2. In addition (x^k, y^k) satisfies

ε_OA > ‖∇_x f(x^k, y^k)‖_2 ‖(d_c^k)_x‖_2 + M_x ‖(d_c^k)_x^T B_c^k‖_2 + ‖(d_c^k)_x^T B_c^k‖_2 ‖(d_c^k)_x‖_2,    (3.68)

where M_x is determined by the maximal range of the continuous variables and ((d_c^k)_x, η_c^k, (λ_c^k, λ_{η_c}^k)) is a KKT-point of QP(x^k, y^k).

If σ^k > σ̄ as well as (3.63) and (3.64) hold in Step 4 instead, then (x^k, η^k, (λ_F^k, λ_η^k)) is an ε-stationary point of F(y^k) given by (3.7) as specified in Definition 3.1, due to Corollary 3.3. In addition (x^k, y^k) satisfies

‖∇_x g_j(x^k, y^k)‖_2 ‖(d_F^k)_x‖_2 < ε, ∀ j ∈ J,  η^k > ε,    (3.69)

where ((d_F^k)_x, η^k) is the optimal solution of LPF(x^k, y^k) given by (3.9) and (λ_F^k, λ_η^k) are the corresponding Lagrangian multipliers.

Note that the values of the constants used within the parameter updates, e.g., the update of the trust region radii, are taken from Yuan [112].

The convergence analysis is based on Assumptions 2.1 and 2.2, which are unified and restated here.

Assumption 3.1.
1. f(x, y) and g_j(x, y), j = 1, . . . , m, are continuously differentiable on X × Y_R.
2. f(x, y) is convex and g_j(x, y), j = 1, . . . , m, are concave on X × Y_R, and the set X defined by (3.3) is nonempty and compact.
3. The linear independence constraint qualification, stated in Definition 2.6, holds at each optimal solution of problem NLP(y) and F(y) for all y ∈ Y.
4. The sequences {(x^k, y^k)} and {B^k} generated by the proposed algorithm are bounded for all k.

Note that the set X defined in (3.3) is compact due to the existence of upper and lower bounds on all continuous variables.

The algorithm is designed to yield the same iteration sequence as the linear outer approximation Algorithm 2.2 under certain circumstances, where the nonlinear programs which arise as subproblems are solved by Algorithm 2.1. This is established by the subsequent corollaries in the remainder of this section. Since the mixed-integer search steps obtained in Step 3 distinguish Algorithm 3.1 from the linear outer approximation method described by Algorithm 2.2, we skip their calculation for the moment by assuming on_MIQP^k = 0, ∀ k.

Corollary 3.4 states the equivalence of the Steps 2, 5 and 7 as well as parts of Step 4,