

7 Methods for nonconvex problems

Nonconvex minimization problems are solved in NOA by natural extensions of the methods described in Sections 3, 4 and 6, see Kiwiel (1985a, 1985d, 1986b, 1986c, 1988a).

For simplicity, let us consider the problem of minimizing a locally Lipschitzian function $f$ on $S$. In the nonconvex case the subgradient $g_f(y)$ may be used for modelling $f$ around $z$ only when $y$ is close to $z$ (we no longer have $f \ge \bar f(\cdot\,; y)$). The subgradient locality measure

$$\alpha_f(z; y) = \max\{\, |f(z) - \bar f(z; y)| \,,\; \gamma |z - y|^2 \,\} \tag{17}$$

with a parameter $\gamma > 0$ indicates how much $g_f(y)$ differs from being a subgradient of $f$ at $z$. At the $k$-th iteration the algorithm uses the following modification of (6) for finding $d^k$ via (7):

$$\hat f^k(z) = f(z^k) + \max\{\, -\alpha_f(z^k; y^j) + \langle g_f(y^j), z - z^k \rangle : j \in J_f^k \,\}.$$

In the convex case with $\gamma = 0$ this approximation is equivalent to (6) (since $f(z^k) \ge \bar f(z^k; y^j)$). For $\gamma > 0$ the local subgradients, which have small weights $\alpha_f(z^k; y^j)$, tend to influence $d^k$ more strongly than the nonlocal ones.
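To make the construction concrete, the following sketch (hypothetical names, NumPy-based, not NOA's implementation; `f` and `g` stand for oracles returning $f(y)$ and some $g_f(y) \in \partial f(y)$) shows one way the locality measure (17) and the resulting model $\hat f^k$ could be computed from a bundle of past trial points:

```python
import numpy as np

def locality_measure(f, g, z, y, gamma):
    """Subgradient locality measure (17): how much g_f(y) differs
    from being a subgradient of f at z.  f_bar is the linearization
    of f at the trial point y, evaluated at z."""
    f_bar = f(y) + g(y) @ (z - y)
    return max(abs(f(z) - f_bar), gamma * np.linalg.norm(z - y) ** 2)

def make_model(f, g, z_k, trial_points, gamma):
    """Polyhedral model f_hat^k(z) = f(z^k)
       + max_j { -alpha_f(z^k; y^j) + <g_f(y^j), z - z^k> }."""
    alphas = [locality_measure(f, g, z_k, y, gamma) for y in trial_points]
    subgrads = [g(y) for y in trial_points]
    def f_hat(z):
        return f(z_k) + max(-a + s @ (z - z_k)
                            for a, s in zip(alphas, subgrads))
    return f_hat
```

With $\gamma = 0$ and convex $f$ every weight reduces to the usual linearization error, so the sketch collapses to the cutting-plane model of (6).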

The above definition of $\alpha_f$ is rather arbitrary (cf. Mifflin, 1982), and it is not clear how the value of $\gamma$ should reflect the degree of nonconvexity of $f$ (in theory any $\gamma > 0$ will do). Of course, for convex $f$, $\gamma = 0$ is best. Larger values of $\gamma$ are essential for concluding that $z^k$ is optimal when $\hat f^k$ indicates that $f$ has no feasible descent directions at $z^k$. On the other hand, a large $\gamma$ may cause all the past subgradients to become inactive in the search direction finding after a serious step. Then the algorithm will be forced to accumulate local subgradients by performing many null steps.

It is, therefore, reassuring to observe that $\gamma = 1$ seems to work quite well in practice (cf. Kiwiel, 1988a). However, it may be necessary to scale the variables so that they are of order 1 at the solution (to justify the Euclidean norm in (17)). Since automatic scaling could be dangerous, it is not implemented in NOA, but we intend to pursue this subject in the future.
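Although NOA itself does not rescale, a user can do so when setting up the problem. A minimal sketch (hypothetical helper, not part of NOA) of diagonally rescaling the objective and its subgradient before passing them to the solver:

```python
import numpy as np

def scale_problem(f, g, typical):
    """Work in u = x / typical so the variables are of order 1 near
    the solution.  'typical' holds user-estimated magnitudes of the
    components of x."""
    typical = np.asarray(typical, dtype=float)
    f_scaled = lambda u: f(typical * u)
    # chain rule: a subgradient of u -> f(typical * u) is typical * g_f(x)
    g_scaled = lambda u: typical * g(typical * u)
    return f_scaled, g_scaled
```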

Another feature of the nonconvex case is the need for line searches. Two cases are possible when a line search explores how well $\hat f^k$ agrees with $f$ between $z^k$ and $z^k + d^k$. Either it is possible to make a serious step by finding a stepsize $t_L^k \in (0, 1]$ such that the next iterate $z^{k+1} = z^k + t_L^k d^k$ has a significantly lower objective value than $z^k$, or a null step $z^{k+1} = z^k$ ($t_L^k = 0$), which evaluates $f$ and $g_f$ at the new trial point $y^{k+1} = z^k + t_R^k d^k$ with $t_R^k \in (0, 1]$, should improve the next model $\hat f^{k+1}$ so that it yields a better $d^{k+1}$.

More specifically, a serious step $t_L^k = t_R^k > 0$ is taken if

$$f(z^k + t_L^k d^k) \le f(z^k) + m_L t_L^k v^k,$$

whereas a null step occurs with $0 = t_L^k < t_R^k \le \bar t$ and

$$-\alpha_f(z^k; y^{k+1}) + \langle g_f(y^{k+1}), d^k \rangle \ge m_R v^k,$$

where $v^k = \hat f^k(z^k + d^k) - f(z^k) < 0$ is the predicted descent and $m_L$, $m_R$, $m_u$ and $\bar t$ are positive parameters. A simple procedure for finding $t_L^k$ and $t_R^k$ is given in Kiwiel (1986c) for the case of $m_L + m_u < m_R < 1$ and $\bar t \le 1$. Since the aim of a null step is to detect discontinuities of $g_f$ on the segment $[z^k, z^k + d^k]$, this procedure requires that $f$ and $g_f$ be consistent in the sense that

$$\limsup_{i \to \infty}\, \langle g_f(z + t_i d), d \rangle \;\ge\; \liminf_{i \to \infty}\, \frac{f(z + t_i d) - f(z)}{t_i} \tag{18}$$

for all $z, d \in \mathbb{R}^n$ and all $\{t_i\} \subset \mathbb{R}_+$ with $t_i \downarrow 0$.

In practice we use $m_L = 0.1$, $m_R = 0.5$, $m_u = 0.1$, $\bar t = 0.1$ and simple quadratic interpolation for diminishing trial stepsizes (see Remark 3.3.5 in Kiwiel, 1985d). Yet our crude procedure seems to be quite efficient; it requires on average less than two function evaluations (cf. Kiwiel, 1988a). On the other hand, our experience with more sophisticated procedures that insist on directional minimizations (cf. Mifflin, 1984) is quite negative: the resulting increase in the number of $f$-evaluations is usually not offset by a reduction in the number of iterations. This is inefficient in applications where the cost of one $f$-evaluation may dominate the per-iteration effort in auxiliary operations (mainly quadratic programming).
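The following simplified sketch (hypothetical; it replaces the quadratic interpolation of Remark 3.3.5 in Kiwiel, 1985d by plain halving and omits the safeguards involving $m_u$) illustrates how the serious/null step decision above could be organized; `alpha(z, y)` evaluates the locality measure (17):

```python
def line_search(f, g, alpha, z, d, v,
                m_L=0.1, m_R=0.5, t_bar=0.1, t_min=1e-12):
    """Return (t_L, t_R, y): t_L = t_R > 0 signals a serious step,
    t_L = 0 < t_R <= t_bar a null step at the trial point y.
    v < 0 is the predicted descent f_hat^k(z + d) - f(z)."""
    t = 1.0
    while t > t_min:
        y = z + t * d
        if f(y) <= f(z) + m_L * t * v:
            return t, t, y            # serious step: sufficient descent
        if t <= t_bar and -alpha(z, y) + g(y) @ d >= m_R * v:
            return 0.0, t, y          # null step: new data improves the model
        t *= 0.5                      # NOA uses quadratic interpolation here
    raise RuntimeError("no acceptable stepsize found")
```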

We should add that in practice we employ the locality measures

$$\alpha_j^k = \max\{\, |f(z^k) - \bar f(z^k; y^j)| \,,\; \gamma (s_j^k)^2 \,\}$$

that overestimate $\alpha_f(z^k; y^j)$ by using the upper estimate

$$s_j^k = |z^j - y^j| + \sum_{i=j}^{k-1} |z^{i+1} - z^i|$$

of $|z^k - y^j|$, which can be updated without storing $y^j$.
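A sketch (hypothetical names) of the corresponding bookkeeping: after each step the stored estimates grow by the length of the step, and a new trial point enters with its exact distance, so the trial points themselves need not be retained:

```python
import numpy as np

def update_distance_estimates(s, z_old, z_new):
    """s[j] is an upper estimate of |z^k - y^j|.  After the step
    z^k -> z^{k+1}, the triangle inequality gives
    |z^{k+1} - y^j| <= s[j] + |z^{k+1} - z^k|."""
    step = np.linalg.norm(z_new - z_old)
    return [s_j + step for s_j in s]

# A new trial point y enters the bundle with the exact value
# s_new = |z_new - y|; afterwards y itself can be discarded.
```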

The extensions to the nonconvex case of the methods of Sections 4 and 6 follow the lines sketched above. We only add that all the problem functions should satisfy the semidifferentiability condition (18). In fact the convergence analysis of Kiwiel (1988a) requires the equality constraints to be continuously differentiable, but we have managed to solve many problems with nondifferentiable equality constraints.

We may add that each method of NOA has another version that uses subgradient deletion rules instead of subgradient locality measures for localizing the past subgradient information (see, for instance, Kiwiel, 1985a and Kiwiel, 1986c). It is not clear which version is preferable, since their merits are problem-dependent. We intend to clarify this situation in the near future.

8 Conclusions

We have presented an overview of several NDO algorithms that are implemented in the system NOA. The emphasis has been laid on practical difficulties, but these can only be resolved by further theoretical work. We hope, therefore, that this paper will contribute to the development of NDO methods.

9 References

Bihain, A. (1984). Optimization of upper-semidifferentiable functions. Journal of Optimization Theory and Applications, 44, pp. 545-568.

Bihain, A., Nguyen, V. H. and Strodiot, J.-J. (1987). A reduced subgradient algorithm. Mathematical Programming Study, 30, pp. 127-149.

Bronisz, P. and Krus, L. (1985). Experiments in calculation of game equilibria using nonsmooth optimization. In: Lewandowski, A. and Wierzbicki, A. P., eds., Software, theory and testing examples in decision support systems, pp. 275-286. International Institute for Applied Systems Analysis, Laxenburg, Austria.

Clarke, F. H. (1983). Optimization and nonsmooth analysis. Wiley Interscience, New York.

Kiwiel, K. C. (1985a). A linearization algorithm for nonsmooth minimization. Mathematics of Operations Research, 10, pp. 185-194.

Kiwiel, K. C. (1985b). An algorithm for linearly constrained convex nondifferentiable minimization problems. Journal of Mathematical Analysis and Applications, 105, pp. 452-465.

Kiwiel, K. C. ( 1 9 8 5 ~ ) . An exact penalty function method for nonsmooth constrained convex minimization problems. IMA Journal of Numerical Analysis, 5, pp. 11 1-1 19.

Kiwiel, K. C. (1985d). Methods of descent for nondifferentiable optimization. Lecture Notes in Mathematics, 1133. Springer, Berlin.

Kiwiel, K. C. (1986a). A method for solving certain quadratic programming problems arising in nonsmooth optimization. IMA Journal of Numerical Analysis, 6, pp. 137-152.

Kiwiel, K. C. (1986b). A method of linearizations for linearly constrained nonconvex nonsmooth optimization. Mathematical Programming, 34, pp. 175-187.

Kiwiel, K. C. ( 1 9 8 6 ~ ) . An aggregate subgradient method for nonsmooth and nonconvex minimization. Journal of Computational and Applied Mathematics, 14, pp. 391-400.

Kiwiel, K. C. (1987a). A constraint linearization method for nondifferentiable convex minimization. Numerische Mathematik, 51, pp. 395-414.

Kiwiel, K. C. (1987b). A subgradient selection method for minimizing convex functions subject to linear constraints. Computing, 39, pp. 293-305.

Kiwiel, K. C. (1988a). An exact penalty function method for nondifferentiable constrained minimization. Prace IBS PAN, 155, Warszawa.

Kiwiel, K. C. (1988b). Computational Methods for Nondifferentiable Optimization. Ossolineum, Wroclaw (in Polish).

Kiwiel, K. C. ( 1 9 8 8 ~ ) . Descent methods for quasidifferentiable minimization. Applied Math- ematics and Optimization (in press).

Kiwiel, K. C. (1988d). Proximity control in bundle methods for convex nondifferentiable minimization. Mathematical Programming (to appear).

Lemarechal, C. (1978). Nonsmooth optimization and descent methods. Report RR-78-4, International Institute for Applied Systems Analysis, Laxenburg, Austria.

Lemarechal, C. (1986). Constructing bundle methods for convex optimization. In: J. B. Hiriart-Urruty, ed., Fermat Days: Mathematics for Optimization, pp. 201-240. North-Holland, Amsterdam.

Lemarechal, C., Strodiot, J.-J., and Bihain, A. (1981). On a bundle algorithm for nonsmooth optimization. In: Nonlinear Programming, 3 (O. L. Mangasarian, R. R. Meyer and S. M. Robinson, eds.), pp. 245-281. Academic Press, New York.

Mifflin, R. (1977). Semismooth and semiconvex functions in constrained optimization. SIAM Journal on Control and Optimization, 15, pp. 959-972.

Mifflin, R. (1982). A modification and an extension of Lemarechal's algorithm for nonsmooth minimization. Mathematical Programming Study, 17, pp. 77-90.

Mifflin, R. (1984). Stationarity and superlinear convergence of an algorithm for univariate locally Lipschitz constrained minimization. Mathematical Programming, 28, pp. 50-71.

Nguyen, V. H., and Strodiot, J.-J. (1984). A linearly constrained algorithm not requiring derivative continuity. Engineering Structures, 6, pp. 7-11.

Panier, E. (1987). An active set method for solving linearly constrained nonsmooth optimization problems. Mathematical Programming, 37, pp. 269-292.

Polak, E., Mayne, D. Q., and Wardi, Y. (1983). On the extension of constrained optimization algorithms from differentiable to nondifferentiable problems. SIAM Journal on Control and Optimization, 21, pp. 179-203.

Wierzbicki, A. P. (1986). On the completeness and constructiveness of parametric characterizations to vector optimization problems. OR Spectrum, 8, pp. 73-87.
