SOLUTION TECHNIQUES

2.9 Integration of Branch-and-Bound and SQP

Apart from the LP/NLP-based branch-and-bound algorithm, all solution methods for convex MINLP problems presented in the previous sections decouple the solution process by considering continuous nonlinear optimization and integer optimization separately. Applying a decomposition enables the solution of problem (1.6), since efficient software for both mixed-integer linear and continuous nonlinear programming is available.

Recently, several algorithms have been proposed that reduce the computational effort for solving MINLPs by integrating mixed-integer and continuous nonlinear programming techniques. Leyffer [76] presents such an approach for convex MINLP problems (1.6), extending NLP-based branch-and-bound methods. The motivation is to improve the performance of the NLP-based branch-and-bound approach presented in Section 2.4 by integrating the solution process of the NLP subproblems into the branch-and-bound enumeration. This significantly reduces the effort for solving the continuous nonlinear programs, since early termination, also called early branching, becomes possible. Early branching was first introduced by Borchers and Mitchell [29]; Leyffer [76] substantially improves on their algorithmic ideas.

As seen in Section 2.4, valid lower bounds on the optimal solution of the original problem are required at each node of the branch-and-bound tree if bounding, i.e., the second fathoming rule stated in Corollary 2.2, is to be applied. Without bounding, the branch-and-bound method is very inefficient, since a huge number of possible integer values has to be enumerated.
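The effect of bounding can be seen on a toy problem. The following Python sketch is only an illustration under stated assumptions: the objective is a separable convex quadratic, sum_i (y_i - target_i)^2, over an integer box, so that each continuous relaxation has a closed-form solution (projection onto the box). The function names `solve_relaxation` and `branch_and_bound` and the sample data are hypothetical and do not come from the text; the sketch shows plain NLP-based branch-and-bound with and without bounding, not the integrated method of Leyffer [76].

```python
import math


def solve_relaxation(target, lower, upper):
    """Minimize sum_i (y_i - target_i)^2 over the box lower <= y <= upper.

    The problem is separable, so the minimizer is the projection of
    `target` onto the box.
    """
    y = [min(max(t, lo), hi) for t, lo, hi in zip(target, lower, upper)]
    value = sum((yi - t) ** 2 for yi, t in zip(y, target))
    return y, value


def branch_and_bound(target, lower, upper, use_bounding=True):
    """Depth-first branch-and-bound; returns (point, value, nodes solved)."""
    best_value, best_point = math.inf, None
    nodes_solved = 0
    stack = [(list(lower), list(upper))]
    while stack:
        lo, hi = stack.pop()
        if any(l > u for l, u in zip(lo, hi)):
            continue  # empty box: infeasible node
        y, value = solve_relaxation(target, lo, hi)
        nodes_solved += 1
        if use_bounding and value >= best_value:
            continue  # bounding: this subtree cannot beat the incumbent
        fractional = [i for i, yi in enumerate(y) if yi != math.floor(yi)]
        if not fractional:
            # integral relaxed solution: candidate for a new incumbent
            if value < best_value:
                best_value, best_point = value, [int(yi) for yi in y]
            continue
        # branch on the first fractional variable
        i = fractional[0]
        down_hi = list(hi)
        down_hi[i] = math.floor(y[i])
        up_lo = list(lo)
        up_lo[i] = math.ceil(y[i])
        stack.append((lo, down_hi))
        stack.append((up_lo, hi))
    return best_point, best_value, nodes_solved


if __name__ == "__main__":
    data = ([2.6, 1.3], [0, 0], [5, 5])
    print(branch_and_bound(*data, use_bounding=True))
    print(branch_and_bound(*data, use_bounding=False))
```

Both runs return the same integer point, but the run without bounding has to solve relaxations in subtrees that the incumbent already rules out. The gap grows quickly with the number of integer variables.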

The goal is to reduce the computational effort for solving the continuous relaxations, see also Section 2.4, given by the following NLP in branch-and-bound iteration l.

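\[
\begin{array}{ll}
\min\limits_{x \in X,\; y \in Y_R} & f(x, y) \\
\text{s.t.} & g_j(x, y) \geq 0, \quad j = 1, \ldots, m, \\
& y_i \geq (y^l_l)_i, \quad i \in I^l_L, \\
& y_i \leq (y^l_u)_i, \quad i \in I^l_U.
\end{array}
\tag{2.118}
\]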

Note that $I^l_L$ and $I^l_U$, defined by (2.50) and (2.51), denote the index sets corresponding to more stringent lower and upper bounds on some integer variables, which are determined by the branch-and-bound enumeration. The idea of early branching is to improve the performance by terminating before an optimal solution $(\bar{x}^l, \bar{y}^l)$ of NLP (2.118) is obtained. If we terminate before the optimal solution $(\bar{x}^l, \bar{y}^l)$ is determined, no valid lower bound underestimating $\bar{f}^l := f(\bar{x}^l, \bar{y}^l)$ is available. This means that the corresponding subtrees cannot be cut off, even if $\bar{f}^l \geq \hat{\eta}$ holds, where $\hat{\eta}$ denotes the current incumbent, i.e., the objective value of the best feasible solution found so far. Note that cutting off these subtrees is equivalent to fathoming the corresponding nodes according to the second fathoming rule stated in Corollary 2.2. One alternative way to obtain lower bounds in case of early branching is proposed by Borchers and Mitchell [29]. They suggest evaluating the Lagrangian dual of the NLP (2.47) in iteration l for a given set of Lagrangian multipliers $\lambda \in \mathbb{R}^m$, i.e.,

\[
\min_{x \in X,\; y \in Y_R^l} \; f(x, y) - \lambda^T g(x, y),
\tag{2.119}
\]

where $Y_R^l$ is defined by (2.58). The solution $(\hat{x}^l, \hat{y}^l)$ of (2.119) with objective value

\[
\hat{L}^l := f(\hat{x}^l, \hat{y}^l) - \lambda^T g(\hat{x}^l, \hat{y}^l)
\tag{2.120}
\]

provides a lower bound on the optimal solution $(\bar{x}^l, \bar{y}^l)$ of NLP (2.118), i.e., $\hat{L}^l \leq f(\bar{x}^l, \bar{y}^l)$ holds, see Leyffer [76].
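The bound follows from a short weak-duality argument. The chain below is a sketch under two assumptions that the text does not state explicitly: the multipliers are nonnegative, $\lambda \geq 0$, and the optimal solution $(\bar{x}^l, \bar{y}^l)$ of NLP (2.118) lies in the feasible set $X \times Y_R^l$ of (2.119).

```latex
\begin{align*}
\hat{L}^l
  &= \min_{x \in X,\, y \in Y_R^l} \bigl\{ f(x, y) - \lambda^T g(x, y) \bigr\} \\
  &\leq f(\bar{x}^l, \bar{y}^l) - \lambda^T g(\bar{x}^l, \bar{y}^l)
     && \text{since } (\bar{x}^l, \bar{y}^l) \in X \times Y_R^l \\
  &\leq f(\bar{x}^l, \bar{y}^l)
     && \text{since } \lambda \geq 0 \text{ and } g(\bar{x}^l, \bar{y}^l) \geq 0 .
\end{align*}
```

The second inequality is the step that needs the sign condition on $\lambda$; for multipliers with negative components the term $-\lambda^T g$ can be positive and the bound may fail.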

Leyffer [76] presents an improved approach in which solving the Lagrangian dual (2.119) is unnecessary. Instead of obtaining lower bounds by solving (2.119), one can interpret the constraints of a continuous quadratic program, arising while NLP (2.118) is solved by an SQP method, see Section 2.2, as supporting hyperplanes. If a convex MINLP is considered, a linearization is equivalent to an outer approximation, see also Section 2.5.

If we apply an SQP method to solve NLP (2.118), a sequence of quadratic programs

\[
\begin{array}{ll}
\min\limits_{d \in \mathbb{R}^n} & \nabla_{x,y} f(x^k, y^k)^T d + \frac{1}{2} d^T B^k d \\
\text{s.t.} & g(x^k, y^k) + [\nabla_{x,y} g(x^k, y^k)]^T d \geq 0, \\
& x^k + d_x \in X, \\
& y^k + d_y \in Y_R^l
\end{array}
\tag{2.121}
\]

is solved with $d = \begin{pmatrix} d_x \\ d_y \end{pmatrix}$. $B^k \in \mathbb{R}^{n \times n}$ denotes an approximation of the Hessian of the Lagrangian function and $Y_R^l$ is defined by (2.58), see also Section 2.2. Note that the iteration index k denotes the k-th QP subproblem while solving the l-th NLP during the branch-and-bound enumeration process. In general, the optimal objective value of QP (2.121) does not underestimate the optimal solution $f(\bar{x}^l, \bar{y}^l)$ of NLP (2.118), i.e., QP (2.121) yields no lower bound.

To obtain a lower bound on the optimal solution $f(\bar{x}^l, \bar{y}^l)$ of NLP (2.118) from the solution of the QP subproblem, Leyffer [76] suggests including a so-called objective cut, analogous to (2.73) and (2.76) for linear outer approximation, given by

\[
f(x^k, y^k) + \nabla_{x,y} f(x^k, y^k)^T d \leq \hat{\eta} - \varepsilon.
\tag{2.122}
\]

Here $\hat{\eta} \in \mathbb{R} \cup \{\infty\}$ denotes the current incumbent. Note that we consider the solution process of a single NLP during a branch-and-bound enumeration, which implies that $\hat{\eta}$ does not vary. Including this objective cut in QP (2.121) yields

\[
\begin{array}{ll}
\min\limits_{d \in \mathbb{R}^n} & \nabla_{x,y} f(x^k, y^k)^T d + \frac{1}{2} d^T B^k d \\
\text{s.t.} & g(x^k, y^k) + [\nabla_{x,y} g(x^k, y^k)]^T d \geq 0, \\
& f(x^k, y^k) + \nabla_{x,y} f(x^k, y^k)^T d \leq \hat{\eta} - \varepsilon, \\
& x^k + d_x \in X, \\
& y^k + d_y \in Y_R^l.
\end{array}
\tag{2.123}
\]

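QP (2.123) can also be interpreted as an SQP subproblem corresponding to the NLP given by

\[
\begin{array}{ll}
\min\limits_{x \in X,\; y \in Y_R^l} & f(x, y) \\
\text{s.t.} & g(x, y) \geq 0, \\
& f(x, y) \leq \hat{\eta} - \varepsilon.
\end{array}
\tag{2.124}
\]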

Due to the introduction of the objective cut, bounding corresponds to the infeasibility of NLP (2.124). The following lemma stated by Leyffer [76] shows the importance of QP (2.123).

Lemma 2.7. Let f(x, y) and g(x, y) be continuously differentiable functions and let f(x, y) be convex, while g(x, y) is concave. A sufficient condition for bounding, i.e., the second fathoming rule given by Corollary 2.2, is that QP (2.123) generated by an SQP method solving problem (2.124) is infeasible in any iteration k.

Proof. See Leyffer [76].
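The idea behind the lemma can be outlined in a few lines. This is an informal sketch under the assumptions of Lemma 2.7 and without a trust region, not a reproduction of the proof in Leyffer [76]. Take any point $(x, y) \in X \times Y_R^l$ that is feasible for NLP (2.124) and define the step $d := (x - x^k, y - y^k)$. Concavity of g and convexity of f give

```latex
\begin{align*}
g(x^k, y^k) + [\nabla_{x,y} g(x^k, y^k)]^T d
  &\geq g(x, y) \geq 0, \\
f(x^k, y^k) + \nabla_{x,y} f(x^k, y^k)^T d
  &\leq f(x, y) \leq \hat{\eta} - \varepsilon ,
\end{align*}
```

and $x^k + d_x = x \in X$, $y^k + d_y = y \in Y_R^l$ hold by construction, so d is feasible for QP (2.123). By contraposition, if QP (2.123) is infeasible, then NLP (2.124) has no feasible point, i.e., the current node contains no feasible point with objective value at most $\hat{\eta} - \varepsilon$, and the node can be fathomed.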

So far we have neglected the presence of a trust region, which ensures global convergence. A trust region truncates the feasible region of QP (2.123). As a consequence, we have to distinguish whether infeasibility is caused by the trust region or not. If infeasibility is encountered independently of the trust region, the current subtree can be fathomed according to Lemma 2.7. If, on the other hand, QP (2.123) is infeasible only because of the trust region constraint, a feasibility problem similar to (2.54) has to be solved to check whether the current node can be fathomed, see Leyffer [76] for details.
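As a minimal illustration of Lemma 2.7 and of the trust region caveat, the following Python sketch treats a node with a single relaxed integer variable y, so that the feasible steps d of the linearized problem form an interval. The functions `step_interval` and `classify_node`, the example functions f and g, and all numerical values are assumptions made for this example and are not taken from Leyffer [76]. No quadratic program is solved; only the feasibility of the constraints of QP (2.123) is checked.

```python
import math


def step_interval(constraints, lower, upper):
    """Intersect [lower, upper] with all sets {d : a*d + b >= 0}.

    Returns (lo, hi); the intersection is empty when lo > hi.
    """
    lo, hi = lower, upper
    for a, b in constraints:
        if a > 0:
            lo = max(lo, -b / a)
        elif a < 0:
            hi = min(hi, -b / a)
        elif b < 0:
            return math.inf, -math.inf  # constant constraint is violated
    return lo, hi


def classify_node(f, df, g_list, yk, y_lo, y_hi, incumbent, eps, radius):
    """Classify the linearized subproblem at the iterate yk."""
    # Linearized constraints g(yk) + g'(yk) * d >= 0.
    cons = [(dg(yk), g(yk)) for g, dg in g_list]
    # Objective cut f(yk) + f'(yk) * d <= incumbent - eps,
    # rewritten in the form a*d + b >= 0.
    cons.append((-df(yk), incumbent - eps - f(yk)))
    lo, hi = step_interval(cons, y_lo - yk, y_hi - yk)
    if lo > hi:
        return "fathom"          # infeasible regardless of the trust region
    if max(lo, -radius) > min(hi, radius):
        return "trust-region"    # infeasible only because of |d| <= radius
    return "feasible"


def f(y):
    return (y - 3.0) ** 2        # convex objective


def df(y):
    return 2.0 * (y - 3.0)


def g(y):
    return 4.0 - y ** 2          # concave constraint function, g(y) >= 0


def dg(y):
    return -2.0 * y


node = dict(f=f, df=df, g_list=[(g, dg)], yk=0.5, y_lo=0.0, y_hi=1.0, eps=1e-6)
print(classify_node(incumbent=2.0, radius=1.0, **node))
print(classify_node(incumbent=5.0, radius=0.1, **node))
print(classify_node(incumbent=5.0, radius=1.0, **node))
```

With incumbent value 2 the node is fathomed, which agrees with the fact that f is at least 4 on the interval [0, 1]. With incumbent value 5 and radius 0.1, the linearized constraints are consistent but no step inside the trust region satisfies them, which is the case where the text requires an additional feasibility problem.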

2.10 An Extension of Yuan’s Trust Region Method for Mixed-Integer Optimization

The efficient mixed-integer nonlinear SQP trust region algorithm of Exler and Schittkowski [45], whose implementation is called MISQP, is an extension of the well-known SQP methods. It is based on the trust region method proposed by Yuan [112], see Section 2.2. Analogous to continuous sequential quadratic programming methods, the mixed-integer nonlinear optimization problem is solved by a sequence of mixed-integer quadratic approximations.

If we apply Yuan’s trust region Algorithm 2.1 to solve the continuous relaxation of MINLP (1.1), the corresponding quadratic model is represented by

\[
\begin{array}{ll}
\min\limits_{d_x \in \mathbb{R}^{n_c},\; d_y \in \mathbb{R}^{n_i},\; \eta \in \mathbb{R}_+} & \nabla_{x,y} f(x^k, y^k)^T d + \frac{1}{2} d^T B^k d + \sigma^k \eta \\
\text{s.t.} & \eta + g_j(x^k, y^k) + \nabla_{x,y} g_j(x^k, y^k)^T d \geq 0, \quad j = 1, \ldots, m_e, \\
& \eta - g_j(x^k, y^k) - \nabla_{x,y} g_j(x^k, y^k)^T d \geq 0, \quad j = 1, \ldots, m_e, \\
& \eta + g_j(x^k, y^k) + \nabla_{x,y} g_j(x^k, y^k)^T d \geq 0, \quad j = m_e + 1, \ldots, m, \\
& \|d_x\|_\infty \leq \Delta^k_x, \\
& \|d_y\|_\infty \leq \Delta^k_y, \\
& x^k + d_x \in X, \\
& y^k + d_y \in Y_R,
\end{array}
\tag{2.125}
\]

with $d := \begin{pmatrix} d_x \\ d_y \end{pmatrix}$. Since problem (2.125) is equivalent to (2.31), see Yuan [112], its solution provides a search direction in iteration k. Note that $\Delta^k_x$ and $\Delta^k_y$ denote the continuous and integer trust region radius, see also Section 2.2.

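Restricting the domain of $d_y$ to $\mathbb{N}^{n_i}$ turns the continuous quadratic program into a mixed-integer quadratic model of MINLP (1.1) given by problem

\[
\begin{array}{ll}
\min\limits_{d_x \in \mathbb{R}^{n_c},\; d_y \in \mathbb{N}^{n_i},\; \eta \in \mathbb{R}_+} & \nabla_{x,y} f(x^k, y^k)^T d + \frac{1}{2} d^T B^k d + \sigma^k \eta \\
\text{s.t.} & \eta + g_j(x^k, y^k) + \nabla_{x,y} g_j(x^k, y^k)^T d \geq 0, \quad j = 1, \ldots, m_e, \\
& \eta - g_j(x^k, y^k) - \nabla_{x,y} g_j(x^k, y^k)^T d \geq 0, \quad j = 1, \ldots, m_e, \\
& \eta + g_j(x^k, y^k) + \nabla_{x,y} g_j(x^k, y^k)^T d \geq 0, \quad j = m_e + 1, \ldots, m, \\
& \|d_x\|_\infty \leq \Delta^k_x, \\
& \|d_y\|_\infty \leq \Delta^k_y, \\
& x^k + d_x \in X, \\
& y^k + d_y \in Y.
\end{array}
\tag{2.126}
\]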

Applying Yuan’s trust region method with the QP subproblems replaced by MIQP (2.126) ensures that each iterate $(x^k, y^k)$ satisfies the integrality condition, i.e., $y^k \in \mathbb{N}^{n_i}$. Problem (2.126) can be solved by any mixed-integer quadratic solver, such as MIQL of Lehmann and Schittkowski [71]. Extensive tests on academic and real-world problems show that the algorithm is very efficient in terms of the number of function evaluations. Furthermore, the algorithm is based only on local approximations, i.e., the number of linearizations does not grow from iteration to iteration and the linearizations depend only on the current iterate $(x^k, y^k)$. As a consequence, it performs very well for non-convex problems.
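To make the structure of the mixed-integer quadratic model concrete, the following Python sketch solves a tiny instance of (2.126) by brute-force enumeration. It is a toy under several assumptions that are not part of the text: there are no continuous variables and no equality constraints (m_e = 0), the set Y is a box, integer steps of either sign are allowed, and the function name `miqp_step` as well as all data are hypothetical. It is neither MISQP nor MIQL; a real implementation would call a mixed-integer quadratic solver instead of enumerating.

```python
import itertools


def miqp_step(grad_f, B, g_vals, g_grads, sigma, radius, y, y_lower, y_upper):
    """Enumerate integer steps d with ||d||_inf <= radius and y + d in the box.

    For a fixed d the smallest admissible eta >= 0 satisfying
    eta + g_j + grad_g_j^T d >= 0 for all j is
    max(0, max_j -(g_j + grad_g_j^T d)), so eta is eliminated.
    Returns (model value, step d, eta) of the best step found.
    """
    n = len(y)
    best = None
    for d in itertools.product(range(-radius, radius + 1), repeat=n):
        # Keep only steps that stay inside the box Y.
        if any(not (lo <= yi + di <= hi)
               for yi, di, lo, hi in zip(y, d, y_lower, y_upper)):
            continue
        linear = sum(gi * di for gi, di in zip(grad_f, d))
        quad = 0.5 * sum(d[i] * B[i][j] * d[j]
                         for i in range(n) for j in range(n))
        violations = [-(gv + sum(a * di for a, di in zip(gg, d)))
                      for gv, gg in zip(g_vals, g_grads)]
        eta = max([0.0] + violations)
        value = linear + quad + sigma * eta
        if best is None or value < best[0]:
            best = (value, d, eta)
    return best


# Example data: f(y) = (y1 - 2.6)^2 + (y2 - 1.3)^2 and g(y) = 3 - y1 - y2 >= 0,
# linearized at the iterate y = (0, 0) with exact Hessian B = 2 * I.
example = dict(
    grad_f=[-5.2, -2.6],
    B=[[2.0, 0.0], [0.0, 2.0]],
    g_vals=[3.0],
    g_grads=[[-1.0, -1.0]],
    sigma=10.0,
    y=[0, 0],
    y_lower=[0, 0],
    y_upper=[5, 5],
)
print(miqp_step(radius=2, **example))
print(miqp_step(radius=1, **example))
```

The two calls differ only in the integer trust region radius. With the larger radius the model selects a longer step, which illustrates why the choice of $\Delta^k_y$ influences the iterates so strongly.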

Despite the encouraging results, no convergence proof for this mixed-integer nonlinear SQP trust region algorithm has been found yet. Furthermore, extensive tests with the current implementation, called MISQP, showed that in some rare cases the global optimal solution of a convex test case was not found. In future work, we will focus on a convergence proof for an algorithm that is based on the ideas implemented in MISQP but incorporates slight modifications motivated by the existence of convex problems on which MISQP fails. Since MIQP (2.126) contains only local approximations, it provides no valid lower bounds. Furthermore, the adjustment of some important parameters, such as the trust region radius $\Delta^k_y$ in (2.126), is critical, especially for binary variables.

2.11 Convex Mixed-Integer Quadratic

In the document On Efficient Solution Methods for Mixed-Integer Nonlinear and Mixed-Integer Quadratic Optimization Problems (pages 63-68)