Working Paper
CONSTRAINED OPTIMIZATION OF DISCONTINUOUS SYSTEMS
Yuri M. ERMOLIEV Vladimir I. NORKIN
WP-96-78 July 1996
IIASA
International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria
Telephone: +43 2236 807   Fax: +43 2236 71313   E-Mail: info@iiasa.ac.at
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.
Abstract
In this paper we extend the results of Ermoliev, Norkin and Wets [8] and Ermoliev and Norkin [7] to the case of constrained discontinuous optimization problems. In contrast to [7], attention is concentrated on the proof of general optimality conditions for problems with nonconvex feasible sets. An easily implementable random search technique is proposed.
Key words: Discontinuous Systems, Necessary Optimality Conditions, Averaged Functions, Mollifier Subgradients, Stochastic Optimization.
Contents
1 Introduction
2 Some classes of discontinuous functions
3 Averaged functions and mollifier subgradients
4 Necessary optimality conditions
5 On numerical optimization procedures
1 Introduction
In this paper we elaborate further on the results of Ermoliev, Norkin and Wets [8] and Ermoliev and Norkin [7] for a general constrained discontinuous optimization problem:

\[ \text{minimize}\quad F(x) \tag{1} \]
\[ \text{subject to}\quad x \in K \subset R^n, \tag{2} \]

where F(x) is a (strongly) lower semicontinuous function and K is a compact set.
As we showed in [7], the class of strongly lower semicontinuous functions is appropriate for the modeling and optimization of abruptly changing systems without instantaneous jumps and returns. In particular, we analyzed risk control problems, optimization of stochastic networks and discrete event systems, screening of irreversible changes, and stochastic pollution control. Another important application may be stochastic jumping processes describing risk reserves of interdependent insurance and reinsurance companies. In a rather general form the risk reserves can be understood as "reservoirs", where risk premiums are continuously flowing in and random claims at random time moments abruptly drain them out. A sample path of such a process is a strongly lower semicontinuous function with random jumps at claim occurrence times.
In a sense, the main aim of this article is to provide proofs of necessary optimality conditions for the general discontinuous constrained optimization problems discussed in [7]. In Section 2 we analyze situations in which the expectation function belongs to the class of strongly lower semicontinuous functions. The general idea of discontinuous optimization is presented in Section 3. Optimality conditions for discontinuous functions and general constraints are analysed in Section 4. Section 5 outlines possible computational procedures.
2 Some classes of discontinuous functions
In nonsmooth analysis different classes of continuous functions are introduced and studied. The same is necessary for discontinuous functions. We basically restrict possible discontinuity to the case of strongly lower semicontinuous functions, which seem to be most important for applications.
Definition 2.1 A function F: R^n → R^1 is called strongly lower semicontinuous at x if it is lower semicontinuous at x and there exists a sequence x^k → x with F continuous at x^k (for all k) such that F(x^k) → F(x). The function F is called strongly lower semicontinuous (strongly lsc) on X ⊂ R^n if this holds for all x ∈ X.

Definition 2.2 A lower semicontinuous function F: R^n → R^1 is called directionally continuous at x if there exists an open (direction) set D(x) containing sequences x^k ∈ D(x), x^k → x, such that F(x^k) → F(x). The function F(x) is called directionally continuous if this holds for any x ∈ R^n.

Definition 2.3 A function F(x) is called piecewise continuous if for any open set A ⊂ R^n there is another open set B ⊂ A on which F(x) is continuous.

Proposition 2.1 If a function F(x) is piecewise continuous and directionally continuous, then it is strongly lower semicontinuous.
Proof. By the definition of piecewise continuity, for any open neighborhood V(x) of x we can find an open set B ⊂ D(x) ∩ V(x) on which the function F is continuous. Hence there exists a sequence x^k ∈ D(x), x^k → x, with F continuous at x^k. By the definition of directional continuity, F(x^k) → F(x). □
The properties of directional continuity, piecewise continuity and strong lower semicontinuity can easily be verified for one-dimensional functions. For instance, if a one-dimensional function F(x), x ∈ R, is (i) lower semicontinuous, (ii) continuous almost everywhere in R, and (iii) at each point of discontinuity x ∈ R the function F is continuous either from the left or from the right, then F(x) is strongly lsc. The next proposition clarifies the structure of the multidimensional discontinuous functions of interest.
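The one-dimensional criterion above is easy to test numerically. A small sketch (illustration only; the step function and the particular sequence are our choices, not from the paper): F(x) = 0 for x ≤ 0 and F(x) = 1 for x > 0 is lsc, continuous away from 0, and continuous from the left at 0, so continuity points approaching 0 from the left realize Definition 2.1:

```python
# Illustration: F(x) = 0 for x <= 0, 1 for x > 0 is strongly lsc at x = 0.
# It is lsc (liminf of F near 0 equals F(0) = 0) and continuous from the
# left, so continuity points x_k = -1/k approach 0 with F(x_k) -> F(0).

def F(x):
    return 0.0 if x <= 0 else 1.0

# sequence of continuity points approaching the discontinuity from the left
seq = [-1.0 / k for k in range(1, 1001)]
vals = [F(x) for x in seq]

assert all(v == F(0.0) for v in vals)   # F(x_k) -> F(0) = 0

# approaching from the right instead gives the wrong limit (F is not usc)
assert F(1e-9) == 1.0 != F(0.0)
```

Approaching through the direction set D(0) = {x < 0} recovers F(0); no sequence from the right does, which is exactly why upper semicontinuity fails while strong lower semicontinuity holds.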
Proposition 2.2 If F(x) = F_0(F_1(x_1), ..., F_m(x_m)), where x = (x_1, ..., x_m), x_i ∈ R^{n_i}, the function F_0(·) is continuous and the functions F_i(x_i), i = 1, ..., m, are strongly lsc (directionally continuous), then the composite function F(x) is also strongly lsc (directionally continuous). If F(x) = F_0(F_1(x), ..., F_m(x)), x ∈ R^n, where F_0(·) is continuous and F_i(x), i = 1, ..., m, are piecewise continuous, then F(x) is also piecewise continuous. In particular, strong lower semicontinuity, directional continuity and piecewise continuity are preserved under continuous transformations.

The proof is evident.
The next proposition gives a sufficient condition for a mathematical expectation function F(x) = E f(x, ω) to be strongly lower semicontinuous.
Proposition 2.3 Assume the function f(·, ω) is locally bounded around x by an integrable (in ω) function, piecewise continuous around x, and a.s. directionally continuous at x with direction set D(x, ω) = D(x) (independent of ω). Suppose ω takes only a finite or countable number of values. Then F(x) = E f(x, ω) is strongly lsc at x.
Proof. Lower semicontinuity of F follows from Fatou's lemma. The convergence of F(x^k) to F(x) for x^k → x, x^k ∈ D(x), follows from Lebesgue's dominated convergence theorem. Hence F is directionally continuous at x in D(x). It remains to show that any open set A ⊂ R^n close to x contains points of continuity of F. For the case when ω takes a finite number of values ω_1, ..., ω_m with probabilities p_1, ..., p_m, the function F(·) = Σ_{i=1}^m p_i f(·, ω_i) is clearly piecewise continuous. For the case when ω takes a countable number of values, there is a sequence of nested closed balls B_{i+1} ⊂ B_i ⊂ A convergent to some point y ∈ A with f(·, ω_i) continuous on B_i. We shall show that F(·) = Σ_{i=1}^∞ p_i f(·, ω_i) is continuous at y. By assumption |f(x, ω_i)| ≤ C_i for x ∈ A and Σ_{i=1}^∞ p_i C_i < +∞. Then

\[ F(x) - F(y) = \sum_{i=1}^{m} p_i\,\big(f(x,\omega_i) - f(y,\omega_i)\big) + \sum_{i=m+1}^{\infty} p_i\,\big(f(x,\omega_i) - f(y,\omega_i)\big). \]

Thus for any x^k → y,

\[ \limsup_k F(x^k) \le F(y) + \sum_{i=m+1}^{\infty} 2 p_i C_i, \qquad \liminf_k F(x^k) \ge F(y) - \sum_{i=m+1}^{\infty} 2 p_i C_i. \]

Since Σ_{i=m+1}^∞ 2 p_i C_i → 0 as m → ∞, then lim_k F(x^k) = F(y). □

Let us remark that functions of the form f(x, ω) = f(x − ω), x, ω ∈ R^n, with f(·) piecewise and directionally continuous, have D(x) independent of ω.
Propositions 2.1-2.3 provide a certain calculus for strongly lsc functions.
3 Averaged functions and mollifier subgradients
In order to optimize discontinuous functions, we approximate them by so-called averaged functions, which are often considered in optimization theory (see Yudin [23], Hasminski [12], Antonov and Katkovnik [1], Zaharov [24], Katkovnik and Kulchitsky [14], Nikolaeva [18], Archetti and Betrò [2], Warga [25], Katkovnik [13], Gupal [9], [10], Gupal and Norkin [11], Rubinstein [22], Batuhtin and Maiboroda [4], Mayne and Polak [16], Mikhalevich, Gupal and Norkin [17], Ermoliev and Gaivoronski [6], Kreimer and Rubinstein [15], Batuhtin [3], Ermoliev, Norkin and Wets [8]). The convolution of a discontinuous function with an appropriate mollifier (probability density function) improves continuity and differentiability but, on the other hand, increases the computational complexity of the resulting problems, since it transforms a deterministic function F(x) into an expectation function defined as a multiple integral. Therefore, this operation is meaningful only in combination with appropriate stochastic optimization techniques. Our purpose is to introduce such a technique and to develop a certain subdifferential calculus for discontinuous functions. Let us introduce the necessary notions and facts, which are generalized in the next section to the case of constrained problems.
Definition 3.1 Given a locally integrable (discontinuous) function F: R^n → R^1 and a family of mollifiers {ψ_θ: R^n → R_+, θ ∈ R_+} that by definition satisfy

\[ \int_{R^n} \psi_\theta(z)\,dz = 1, \qquad \mathrm{supp}\,\psi_\theta := \{z \in R^n \mid \psi_\theta(z) > 0\} \subset \rho_\theta B \]

with a unit ball B, ρ_θ ↓ 0 as θ ↓ 0, the associated family {F^θ, θ ∈ R_+} of averaged functions is defined by

\[ F^\theta(x) = \int_{R^n} F(x - z)\,\psi_\theta(z)\,dz. \tag{3} \]

Mollifiers may also have unbounded support (see [8]).
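For intuition (a numerical sketch, not part of the original text), the convolution (3) can be sampled by Monte Carlo. With a Gaussian mollifier (unbounded support, as permitted above), F^θ(x) = E F(x − θη) for standard normal η; for the one-dimensional unit step the averaged function smooths the jump to the value 1/2 at the discontinuity:

```python
import random
import statistics

# Monte Carlo sketch of an averaged function (Definition 3.1) with a
# Gaussian mollifier:  F^theta(x) = E F(x - theta * eta), eta ~ N(0, 1).
# F is the unit step; its averaged version is smooth and F^theta(0) = 1/2.

def F(x):
    return 0.0 if x <= 0 else 1.0

def averaged(x, theta, n=200_000, seed=0):
    rng = random.Random(seed)
    return statistics.fmean(F(x - theta * rng.gauss(0.0, 1.0)) for _ in range(n))

approx = averaged(0.0, theta=0.1)
assert abs(approx - 0.5) < 0.01          # the jump is averaged to 1/2
assert averaged(1.0, theta=0.1) > 0.99   # far from the jump, F^theta ~ F
```

The estimate at x = 0 converges to 1/2 because half of the Gaussian mass lands on each side of the jump; away from the jump the averaged function agrees with F up to an exponentially small error.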
Example 3.1 Assume F(x) = E f(x, ω). If f(x, ω) is such that E_ω |f(x, ω)| exists and grows at infinity not faster than some polynomial in x, and the random vector η has the standard normal distribution, then for

\[ \xi^\theta(x,\eta,\omega) = \frac{1}{\theta}\,[f(x + \theta\eta, \omega) - f(x, \omega)]\,\eta \quad\text{or}\quad \xi^\theta(x,\eta,\omega) = \frac{1}{2\theta}\,[f(x + \theta\eta, \omega) - f(x - \theta\eta, \omega)]\,\eta, \qquad \theta > 0, \]

we have ∇F^θ(x) = E_{η,ω} ξ^θ(x, η, ω). The finite-difference approximations ξ^θ(x, η, ω) are unbiased estimates of ∇F^θ(x). As in [7], we can call them stochastic mollifier gradients of F(x).
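As a numerical sanity check (a sketch with a deterministic f, i.e. no ω, which is our simplification), the forward-difference estimator above can be averaged and compared with the exact gradient of the Gaussian-averaged unit step, for which F^θ(x) = Φ(x/θ) and hence dF^θ/dx at 0 equals 1/(θ√(2π)):

```python
import math
import random

# Sketch of Example 3.1's stochastic mollifier gradient for a deterministic f:
#   xi(x, eta) = (1/theta) * (F(x + theta*eta) - F(x)) * eta,  eta ~ N(0, 1),
# is an unbiased estimate of grad F^theta(x).  For the unit step,
# F^theta(x) = Phi(x/theta), so dF^theta/dx at 0 is 1/(theta*sqrt(2*pi)).

def F(x):
    return 0.0 if x <= 0 else 1.0

def xi(x, theta, rng):
    eta = rng.gauss(0.0, 1.0)
    return (F(x + theta * eta) - F(x)) / theta * eta

rng = random.Random(1)
theta = 0.5
n = 400_000
est = sum(xi(0.0, theta, rng) for _ in range(n)) / n
exact = 1.0 / (theta * math.sqrt(2.0 * math.pi))

assert abs(est - exact) < 0.02   # Monte Carlo average matches grad F^theta(0)
```

Note the trade-off visible already in this toy case: smaller θ makes F^θ a better approximation of F but inflates the variance of ξ^θ, which is why θ is kept fixed during optimization in Section 5.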
Definition 3.2 (See, for example, Rockafellar and Wets [19]). A sequence of functions {F^k: R^n → R} epi-converges to F: R^n → R relative to X ⊂ R^n if for any x ∈ X:

(i) liminf_k F^k(x^k) ≥ F(x) for all x^k → x, x^k ∈ X;
(ii) lim_k F^k(x^k) = F(x) for some sequence x^k → x, x^k ∈ X.

The sequence {F^k} epi-converges to F if this holds relative to X = R^n.
For example, if g: R^n × R^m → R̄ is (jointly) lsc at (x, ȳ) and is continuous in y at ȳ, then for any sequence y^k → ȳ the corresponding sequence of functions F^k(·) = g(·, y^k) epi-converges to F(·) = g(·, ȳ). The following important property of epi-convergent functions shows that constrained optimization of a discontinuous function F(x) can in principle be carried out through optimization of approximating epi-convergent functions F^k(x).
Theorem 3.1 If a sequence of functions {F^k: R^n → R} epi-converges to F: R^n → R̄, then for any compact K ⊂ R^n

\[ \lim_{\varepsilon\downarrow 0}\,\liminf_k\,\big(\inf_{K_\varepsilon} F^k\big) \;=\; \lim_{\varepsilon\downarrow 0}\,\limsup_k\,\big(\inf_{K_\varepsilon} F^k\big) \;=\; \inf_K F, \tag{4} \]

where K_ε = K + εB, B = {x ∈ R^n | ‖x‖ ≤ 1}. If

\[ F^k(x_\varepsilon^k) \le \inf_{K_\varepsilon} F^k + \delta_k, \qquad x_\varepsilon^k \in K_\varepsilon, \qquad \delta_k \downarrow 0 \ \text{as}\ k \to \infty, \]

then

\[ \limsup_{\varepsilon\downarrow 0}\,\big(\limsup_k\, x_\varepsilon^k\big) \;\subset\; \mathop{\mathrm{argmin}}_K F, \tag{5} \]

where (limsup_k x_ε^k) denotes the set X_ε of cluster points of the sequence {x_ε^k} and (limsup_{ε↓0} X_ε) denotes the set of cluster points of the family {X_ε, ε ∈ R_+} as ε ↓ 0.

Proof. Note that (inf_{K_ε} F^k) monotonically increases (does not decrease) as ε ↓ 0; hence the same holds for liminf_k inf_{K_ε} F^k and limsup_k inf_{K_ε} F^k. Thus the limits over ε ↓ 0 in (4) exist. Take an arbitrary sequence ε_m ↓ 0, indices k_s^m and points x_s^m ∈ K_{ε_m} such that for each fixed m

\[ \liminf_k\,\big(\inf_{K_{\varepsilon_m}} F^k\big) = \lim_{s\to\infty}\,\big(\inf_{K_{\varepsilon_m}} F^{k_s^m}\big) = \lim_{s\to\infty} F^{k_s^m}(x_s^m). \]

Thus, by a diagonal choice,

\[ \lim_{m\to\infty} F^{k_{s_m}^m}(x_{s_m}^m) = \lim_{\varepsilon\downarrow 0}\,\liminf_k\,\big(\inf_{K_\varepsilon} F^k\big), \qquad x_{s_m}^m \to \bar x \in K, \]

for some indices s_m. By property (i) of epi-convergence, lim_m F^{k_{s_m}^m}(x_{s_m}^m) ≥ inf_K F. Hence

\[ \lim_{\varepsilon\downarrow 0}\,\limsup_k\,\big(\inf_{K_\varepsilon} F^k\big) \;\ge\; \lim_{\varepsilon\downarrow 0}\,\liminf_k\,\big(\inf_{K_\varepsilon} F^k\big) \;\ge\; \inf_K F. \]

Let us prove the opposite inequality. Since F is lower semicontinuous, F(x) = inf_K F for some x ∈ K. By condition (ii) of epi-convergence there exists a sequence x^k → x such that F^k(x^k) → F(x). For k sufficiently large, x^k ∈ K_ε, hence inf_{K_ε} F^k ≤ F^k(x^k) and

\[ \lim_{\varepsilon\downarrow 0}\,\liminf_k\,\big(\inf_{K_\varepsilon} F^k\big) \;\le\; \lim_{\varepsilon\downarrow 0}\,\limsup_k\,\big(\inf_{K_\varepsilon} F^k\big) \;\le\; F(x) = \inf_K F. \]

The proof of (4) is complete.

Now prove (5). Let x_ε^k ∈ K_ε and F^k(x_ε^k) ≤ inf_{K_ε} F^k + δ_k, δ_k ↓ 0. Denote X_ε = limsup_k x_ε^k ⊂ K_ε. Let ε_m ↓ 0, x_{ε_m} ∈ X_{ε_m}, and x_{ε_m} → x ∈ K as m → ∞. By the construction of X_{ε_m}, for each fixed m there exist sequences x_{ε_m}^{k_s^m} → x_{ε_m} satisfying F^{k_s^m}(x_{ε_m}^{k_s^m}) ≤ inf_{K_{ε_m}} F^{k_s^m} + δ_{k_s^m}, δ_{k_s^m} ↓ 0 as s → ∞. By property (i),

\[ F(x_{\varepsilon_m}) \;\le\; \liminf_s F^{k_s^m}(x_{\varepsilon_m}^{k_s^m}) \;\le\; \liminf_s\,\big(\inf_{K_{\varepsilon_m}} F^{k_s^m}\big) \;\le\; \limsup_k\,\big(\inf_{K_{\varepsilon_m}} F^k\big). \]

Due to the lower semicontinuity of F and (4), we obtain

\[ F(x) \;\le\; \liminf_m F(x_{\varepsilon_m}) \;\le\; \liminf_m\,\limsup_k\,\big(\inf_{K_{\varepsilon_m}} F^k\big) \;=\; \inf_K F, \]

hence x ∈ argmin_K F, which proves (5). □
Remark that in Theorem 3.1 we could relax the constraint set K in different ways; for instance, if K = {x ∈ R^n | G(x) ≤ 0} with some lower semicontinuous function G(x), then we could define K_ε = {x ∈ R^n | G(x) ≤ ε}, ε > 0. Let us illustrate the result of Theorem 3.1 by the following example.
Example 3.2 Consider a discontinuous optimization problem

\[ \min_{x \ge 0} F(x), \qquad F(x) = \begin{cases} 0, & x \le 0, \\ 1, & x > 0. \end{cases} \]

Let F^θ(x) be a family of averaged functions for F associated with a family of mollifiers ψ(y/θ), θ > 0, where the mollifier ψ(·) is symmetric with respect to the point y = 0. Obviously, the functions F^θ(x) epi-converge to F and min_{x≥0} F^θ(x) = F^θ(0) = 1/2. If we do not relax the constraint set {x | x ≥ 0}, then optimization of the approximate functions F^θ(x) over the set {x | x ≥ 0} leads to a wrong result:

\[ \lim_{\theta\to 0}\,\min_{x\ge 0} F^\theta(x) = \frac{1}{2}. \]

The relaxation according to Theorem 3.1 leads to the true optimal value of the problem:

\[ \lim_{\theta\to 0}\,\min_{x\ge -\varepsilon} F^\theta(x) = 0, \]

and thus

\[ \lim_{\varepsilon\downarrow 0}\Big(\lim_{\theta\to 0}\,\min_{x\ge -\varepsilon} F^\theta(x)\Big) = 0 = \min_{x\ge 0} F(x). \]
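Example 3.2 can be checked numerically. With a standard Gaussian mollifier (our choice of ψ for this sketch) the averaged step function is F^θ(x) = Φ(x/θ), which is increasing, so each constrained minimum is attained at the left endpoint of the feasible interval:

```python
import math

# Numeric check of Example 3.2.  For the step function F (0 for x <= 0,
# 1 for x > 0) with a Gaussian mollifier, F^theta(x) = Phi(x/theta) is
# increasing, so min of F^theta over {x >= a} equals Phi(a/theta).

def Phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def min_averaged(a, theta):          # min of F^theta over {x >= a}
    return Phi(a / theta)

# without relaxation the limit in theta is stuck at 1/2 ...
assert abs(min_averaged(0.0, 1e-6) - 0.5) < 1e-12
# ... while relaxing to {x >= -eps} recovers the true value min F = 0
eps = 1e-3
assert min_averaged(-eps, 1e-6) < 1e-9
```

The order of limits matters exactly as in (4): θ → 0 first (for fixed ε > 0), then ε ↓ 0.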
The following statement, jointly with Theorem 3.1, shows that the averaged functions can be used for optimization of discontinuous functions.
Theorem 3.2 (Ermoliev et al. [8]). For any strongly lower semicontinuous, locally integrable function F: R^n → R, any associated sequence of averaged functions {F^{θ_k}, θ_k ↓ 0} epi-converges to F.
Jointly with Propositions 2.1 and 2.3, Theorem 3.2 gives sufficient conditions for averaged functions to epi-converge to the original discontinuous expectation function.
A subdifferential calculus for nonsmooth and discontinuous functions can be developed on the basis of their mollifier approximations.
Definition 3.3 Let a function F: R^n → R be locally integrable and {F^k := F^{θ_k}} be a sequence of averaged functions generated from F by means of the sequence of mollifiers {ψ_k := ψ_{θ_k}: R^n → R}, where θ_k ↓ 0 as k → ∞. Assume that the mollifiers are such that the averaged functions F^k are smooth (of class C^1). The set of ψ-mollifier subgradients (the subdifferential) of F at x is by definition

\[ \partial_\psi F(x) := \limsup_k\,\{\nabla F^k(x^k) \mid x^k \to x\}, \]

i.e. ∂_ψ F(x) consists of the cluster points of all possible sequences {∇F^k(x^k)} such that x^k → x.

The subdifferential ∂_ψ F(x) has the following properties (see Ermoliev, Norkin and Wets [8]):
∂_ψ F(x) = ∂F(x) for convex functions F(x);
conv ∂_ψ F(x) = ∂_Clarke F(x) for locally Lipschitzian functions F(x);
∂_ψ F(x) = ∂_Warga F(x) for continuous functions.
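The first property can be illustrated numerically. For the convex function F(x) = |x| with a Gaussian mollifier, ∇F^θ(x) = 2Φ(x/θ) − 1 (a standard computation, assumed here); cluster points of ∇F^{θ_k}(x^k) over sequences x^k → 0 fill [−1, 1] = ∂F(0):

```python
import math

# Sketch: mollifier subgradients of the convex F(x) = |x|.  With a Gaussian
# mollifier, F^theta(x) = E|x + theta*eta| and grad F^theta(x) = 2*Phi(x/theta) - 1.
# Cluster points of grad F^theta_k(x_k) over sequences x_k -> 0 fill [-1, 1],
# i.e. the convex subdifferential of |x| at 0 (property d_psi F = dF).

def Phi(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def grad_avg(x, theta):
    return 2.0 * Phi(x / theta) - 1.0

thetas = [10.0 ** (-k) for k in range(1, 8)]
# sequences x_k = c * theta_k -> 0 produce the cluster point 2*Phi(c) - 1
for c in (-2.0, -0.5, 0.0, 0.5, 2.0):
    limits = [grad_avg(c * th, th) for th in thetas]
    assert all(abs(g - (2.0 * Phi(c) - 1.0)) < 1e-12 for g in limits)
    assert -1.0 <= limits[-1] <= 1.0    # every cluster point lies in [-1, 1]
```

Varying c over R sweeps 2Φ(c) − 1 over the whole open interval (−1, 1), and the endpoints ±1 arise from sequences with |x^k|/θ_k → ∞, so the closure is exactly ∂|x|(0) = [−1, 1].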
Theorem 3.3 (Ermoliev et al. [8]). Suppose that F: R^n → R is strongly lower semicontinuous and locally integrable. Then for any sequence {ψ_{θ_k}} of smooth mollifiers, we have 0 ∈ ∂_ψ F(x) whenever x is a local minimizer of F.
4 Necessary optimality conditions
Theorem 3.3 can be used for constrained optimization problems if exact penalties are applicable. Unfortunately, this operation can practically remove some important minimums of the original problem. Consider the following example:

Point x = 0 is a reasonable minimum of the problem. We could replace this problem, for example, by the following one:

The penalty function F(x) has a single discontinuity point x = 0, where F achieves its global minimum F(0) = 0. Thus penalty functions may have isolated minimums, which are difficult to discover. Besides, we also encounter the following difficulties. Consider

\[ \min\{\sqrt{x} \mid x \ge 0\}. \tag{7} \]
In any reasonable definition of gradients, the gradient of the function √x at the point x = 0 equals +∞. Hence, to formulate necessary optimality conditions for such problems, possibly also involving discontinuities, we need a special notion which incorporates infinite quantities. An appropriate notion is the cosmic vector space R̄^n introduced by Rockafellar and Wets [20]. Denote R_+ = {x ∈ R | x ≥ 0} and R̄_+ = R_+ ∪ {+∞}.

Definition 4.1 Define the (cosmic) space R̄^n as the set of pairs x̄ = (x, a), where x ∈ R^n, ‖x‖ = 1 and a ∈ R̄_+. All pairs of the form (x, 0) are considered identical and are denoted by 0̄.
The topology in the space R̄^n is defined by means of cosmically convergent sequences.

Definition 4.2 A sequence (x^k, a_k) ∈ R̄^n is called (cosmically) convergent to an element (x, a) ∈ R̄^n (denoted c-lim_k (x^k, a_k)) if either lim_k a_k = a = 0, or there exist both limits lim_k x^k ∈ R^n and lim_k a_k ∈ R̄_+ with x = lim_k x^k, a = lim_k a_k ≠ 0, i.e.

\[ \text{c-}\lim_k\,(x^k, a_k) = \begin{cases} (\lim_k x^k,\ \lim_k a_k) & \text{if } \lim_k a_k < +\infty, \\ (\lim_k x^k,\ +\infty) & \text{if } a_k \to +\infty, \\ (\lim_k x^k,\ +\infty) & \text{if } a_k \equiv +\infty. \end{cases} \]

Denote

\[ \text{c-}\mathop{\mathrm{Limsup}}_k\,(x^k, a_k) = \{(x, a) \in \bar R^n \mid \exists\,\{k_m\}:\ (x, a) = \text{c-}\lim_m\,(x^{k_m}, a_{k_m})\}. \]

For a closed set K ⊂ R^n denote by

\[ T_K(x) = \limsup_{\tau\downarrow 0} \frac{K - x}{\tau} \]

the tangent cone to the set K at the point x, by

\[ \hat N_K(x) = \{v \in R^n \mid \langle v, w\rangle \le 0 \ \text{for all}\ w \in T_K(x)\}, \qquad N_K(x) = \limsup_{x'\to x,\ x'\in K} \hat N_K(x') \]

the normal cones, and by N̄_K(x) the extended normal cone, i.e. the closure in the cosmic space R̄^n of the set {(v/‖v‖, ‖v‖) | 0 ≠ v ∈ N_K(x)} ∪ {0̄}.
For what follows we need the following closedness property of the normal cone mapping (x, ε) → N_{K_ε}(x).
Lemma 4.1 Let K_ε = K + εB, B = {x ∈ R^n | ‖x‖ ≤ 1}. Then for any sequences x → x̄ ∈ K and ε → 0,

\[ \limsup_{x\to\bar x,\ \varepsilon\to 0} N_{K_\varepsilon}(x) \;\subset\; N_K(\bar x). \]

Proof. For x ∈ R^n define y(x) ∈ K such that

\[ \|x - y(x)\| = \min_{y\in K} \|x - y\|. \]

Let us show that T_K(y(x)) ⊂ T_{K_ε}(x) for x ∈ K_ε. Let w ∈ T_K(y(x)), i.e.

\[ w = \lim_{\nu\to\infty} \frac{y^\nu - y(x)}{\tau_\nu}, \qquad y^\nu \in K,\ y^\nu \to y(x),\ \tau_\nu \downarrow 0. \]

Denote x^ν = y^ν + (x − y(x)) ∈ K_ε. Then by definition

\[ w = \lim_{\nu\to\infty} \frac{x^\nu - x}{\tau_\nu} \in T_{K_\varepsilon}(x), \]

and thus T_K(y(x)) ⊂ T_{K_ε}(x). This inclusion implies N̂_{K_ε}(x) ⊂ N̂_K(y(x)) and N_{K_ε}(x) ⊂ N_K(y(x)). Hence

\[ \limsup_{x\to\bar x,\ \varepsilon\to 0} N_{K_\varepsilon}(x) \;\subset\; \limsup_{x\to\bar x} N_K(y(x)) \;\subset\; N_K(\bar x). \qquad\Box \]

Corollary 4.1 For extended normal cones we have the same closedness property:

\[ \limsup_{x\to\bar x,\ \varepsilon\to 0} \bar N_{K_\varepsilon}(x) \;\subset\; \bar N_K(\bar x). \]
Remark. We could use another sort of relaxation for the set K. Suppose K is convex and is given by an inequality constraint

\[ K = \{x \in R^n \mid G(x) \le 0\} \]

with some convex function G(x). Consider a relaxed set

\[ K_\varepsilon = \{x \in R^n \mid G(x) \le \varepsilon\}, \qquad \varepsilon > 0. \]

Normal cones to K_ε and K = K_0 are formed by the subdifferentials ∂G(x), x ∈ K_ε, of the function G. Now the closedness property of the mapping (x, ε) → N_{K_ε}(x) stated in Lemma 4.1 follows from the closedness of the subdifferential mapping x → ∂G(x).
a G ( x ) .Definition 4.3 Let function F : R n
-
R be locally integrable and { F k := ~ ' k be a sequence )of azjeraged functions generated from F by convolution with mollifiers { $ k := $sk : R n ---t R ) where Bk
1
0 as k --t m. Assume that the rnollifiers are such that the averaged functions F~ are smooth (of class C ' ) . The set of the extended $-mollifier subgradients of F at x is by definitionw h e n erpression is replaced by any unit vector if V F k ( x k ) = 0, i.e. a U F ' ( x ) consists
l l V F ( x
Ill
v r ( x
r,j the cluster points ( i n cosmic space
F )
of all possible sequences {(&, I I V F ~ ( X ~ ) I I ) ) suck that x k-
x . The full (extended) Q-rnolli'er subgradient set is 3 * E ' ( x ) := u i 3 + F ( x ) l~ihere $ 9 ranges over all possible sequences of mollifiers that generate smooth averaged functions.T h e extended mollifier subdifferential ~ $ F ( x ) is always a non-empty closed set in
R".
Now we can formulate necessary optimality conditions for the constrained discontinuous optimization problem min{F(x) | x ∈ K}, where F(x) may have the form of an expectation.
Theorem 4.1 Let K be a closed set in R^n. Assume that a locally integrable function F has a local minimum relative to K at some point x ∈ K and there is a sequence x^k ∈ K, x^k → x with F continuous at x^k and F(x^k) → F(x). Then, for any sequence {ψ_k} of smooth mollifiers, one has

\[ -\bar\partial_\psi F(x) \cap \bar N_K(x) \ne \emptyset, \tag{8} \]

where −∂̄_ψ F(x) = {(−g, a) ∈ R̄^n | (g, a) ∈ ∂̄_ψ F(x)}.

Proof. Let x be a local minimizer of F on K. For a sufficiently small compact neighborhood V of x, define
\[ \phi(z) := F(z) + \|z - x\|^2. \]

The function φ achieves its global minimum on K ∩ V at z = x. Consider also the averaged functions

\[ \phi^k(z) := F^k(z) + \|z - x\|^2. \]

In [8] it is shown that (i) the functions φ^k are continuously differentiable, (ii) they epi-converge to φ relative to K ∩ V, and (iii) their global minimizers z^k on K ∩ V converge to x as k → ∞. For sufficiently large k the following necessary optimality condition is satisfied:

\[ -\nabla\phi^k(z^k) \in N_{K\cap V}(z^k). \]

If ∇F^{k_m}(z^{k_m}) = 0 for some {z^{k_m} → x}, then also 0̄ ∈ ∂̄_ψ F(x) and 0̄ ∈ N̄_K(x). If ∇F^{k_m}(z^{k_m}) → g ≠ 0 for some {z^{k_m} → x}, then with h = g/‖g‖

\[ (h,\ \|g\|) \in \bar\partial_\psi F(x), \qquad (-h,\ \|g\|) \in \bar N_K(x). \]

If limsup_k ‖∇F^k(z^k)‖ = +∞, then for some {z^{k_m} → x}

\[ \frac{\nabla F^{k_m}(z^{k_m})}{\|\nabla F^{k_m}(z^{k_m})\|} \to g, \]

and (g, +∞) ∈ ∂̄_ψ F(x), (−g, +∞) ∈ N̄_K(x). □

The next proposition shows that the optimality conditions are also satisfied for limits x of some local minimizers x_ε of the relaxed problems min{F(x) | x ∈ K_ε = K + εB}.

Proposition 4.1 Let x_ε be a local minimizer such that there exists a sequence x_ε^k → x_ε, x_ε^k ∈ K_ε, with F continuous at x_ε^k and F(x_ε^k) → F(x_ε) as k → ∞. Assume x_{ε_m} → x for some ε_m ↓ 0 as m → ∞. Then (8) is satisfied at x.

Proof follows from Theorem 4.1 and the closedness of the (extended) mollifier subdifferential mapping x → ∂̄_ψ F(x) and the (extended) normal cone mapping (x, ε) → N̄_{K_ε}(x). □
Proposition 4.2 If F is strongly lsc and the constraint set K is compact, then the set X* of points satisfying the necessary optimality condition (8) is nonempty and contains at least one global minimizer of F in K.
Proof. Construct a sequence of differentiable averaged functions F^k epi-converging to F (which is possible by Theorem 3.2). Relax the constraint set K, i.e. define K_ε = K + εB, where B = {x | ‖x‖ ≤ 1}. Find a global minimizer x_ε^k of F^k over K_ε. For x_ε^k we have the necessary optimality condition (see Rockafellar and Wets [21]):

\[ -\nabla F^k(x_\varepsilon^k) \in N_{K_\varepsilon}(x_\varepsilon^k). \]

We can assume that x_ε^k → y_ε ∈ K_ε. From here it follows that

\[ -\bar\partial_\psi F(y_\varepsilon) \cap \bar N_{K_\varepsilon}(y_\varepsilon) \ne \emptyset. \]

Now let y_ε → y ∈ K, ε → 0. By Theorem 3.1, y is a global minimizer of F in K. Then by the closedness of the mappings ∂̄_ψ F(·) and N̄_{K_ε}(·) we finally obtain

\[ -\bar\partial_\psi F(y) \cap \bar N_K(y) \ne \emptyset, \]

i.e. y ∈ X*. □
Now let us come back to problem (7) and show how the developed theory resolves the exposed difficulties.
Example 4.1 Consider again the optimization problem min{√x | x ≥ 0}. Then we have

\[ \bar\partial_*\sqrt{x}\,\big|_{x=0} = (+1,\ +\infty), \qquad \bar N_{x\ge 0}(0) = \bigcup_{a\in\bar R_+} (-1,\ a), \]

and thus

\[ -\bar\partial_*\sqrt{x}\,\big|_{x=0} \cap \bar N_{x\ge 0}(0) = (-1,\ +\infty) \ne \emptyset. \]
5 On numerical optimization procedures
Theorem 4.1 and Propositions 4.1, 4.2 immediately give at least the following idea for the approximate solution of problem (1), (2). Let us fix a small smoothing parameter θ and a small constraint relaxation parameter ε, choose a mollifier ψ_θ(·) = ψ(·/θ), and instead of the original discontinuous optimization problem consider a relaxed smoothed optimization problem:

\[ \min\,[F^\theta(x) \mid x \in K_\varepsilon], \tag{9} \]

where F^θ(x) is defined by (3). Then the stochastic gradient method to solve (9) has the form: x^0 is an arbitrary starting point;

\[ x^{k+1} = \Pi_{K_\varepsilon}\big(x^k - \rho_k\,\xi^\theta(x^k)\big), \qquad k = 0, 1, \ldots, \tag{10} \]

where E{ξ^θ(x^k) | x^k} = ∇F^θ(x^k), Π_{K_ε} denotes the orthogonal projection operator onto the set K_ε, and the positive step multipliers ρ_k satisfy conditions of the form ρ_k > 0, Σ_k ρ_k = ∞, Σ_k ρ_k² < ∞. Vectors ξ^θ(x^k) can be called stochastic mollifier gradients.
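A minimal one-dimensional sketch of the relaxed smoothed method (9)-(10): F is a quadratic plus a jump at 0 (our test function, in the spirit of the examples above), K = {x ≥ 0} is relaxed to K_ε = {x ≥ −ε}, and ∇F^θ is estimated by averaging finite-difference stochastic mollifier gradients from Example 3.1. The test function, batch averaging, and step sizes are illustrative choices, not from the paper:

```python
import random

# Projected stochastic mollifier-gradient method (sketch of (9)-(10)):
#   x^{k+1} = Proj_{K_eps}( x^k - rho_k * xi^theta(x^k) ).
# F below has a smooth part x^2 plus a unit jump at 0, so its reasonable
# minimum over K = {x >= 0} sits at the discontinuity point x = 0.

def F(x):
    return x * x + (0.0 if x <= 0 else 1.0)   # smooth part + jump at 0

def xi(x, theta, rng):                         # stochastic mollifier gradient
    eta = rng.gauss(0.0, 1.0)
    return (F(x + theta * eta) - F(x)) / theta * eta

def project(x, eps):                           # projection onto K_eps = {x >= -eps}
    return max(x, -eps)

rng = random.Random(0)
theta, eps, rho = 0.05, 0.1, 0.05
x = 1.0                                        # arbitrary starting point x^0
for _ in range(300):
    g = sum(xi(x, theta, rng) for _ in range(500)) / 500.0   # batched estimate
    x = project(x - rho * g, eps)

# the smoothed jump drives the iterates just outside the original constraint
# set, into the relaxed region [-eps, 0), exactly as in Example 3.2
assert -eps <= x < 0.0
```

Letting θ → 0 and then ε → 0, such limit points approach x = 0, the minimizer that an unrelaxed smoothed method would miss.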
The convergence of such stochastic gradient methods to the stationary set

\[ X^*_{\theta,\varepsilon} = \{x \in K_\varepsilon \mid -\nabla F^\theta(x) \in N_{K_\varepsilon}(x)\} \]

follows from the results of [5]. Now, passing to the limit first in θ → 0 and then in ε → 0, we see that the limit points limsup_{ε↓0}(limsup_θ x^*_{θ,ε}) satisfy the necessary optimality condition (8).

References
[1] Antonov G.E. and Katkovnik V.Ya. (1970), Filtration and smoothing in extremum search problems for multivariable functions, Avtomatika i vychislitelnaya tehnika, N. 4, Riga. (In Russian).
[2] Archetti F. and Betrò B. (1975), Convex programming via stochastic regularization, Quaderni del Dipartimento di Ricerca Operativa e Scienze Statistiche, N. 17, Università di Pisa.
[3] Batuhtin B.D. (1994), On one approach to solving discontinuous extremal problems, Izvestia AN Rossii. Tehnicheskaia kibernetika (Communications of Russian Academy of Sciences. Technical Cybernetics), No. 3, pp. 37-46. (In Russian).
[4] Batuhtin B.D. and Maiboroda L.A. (1984), Optimization of discontinuous functions, Moscow, Nauka. (In Russian).
[5] Dorofeev P.A. (1986), A scheme of iterative minimization methods, U.S.S.R. Comput. Math. Math. Phys., Vol. 26, No. 2, pp. 131-136. (In Russian).
[6] Ermoliev Yu. and Gaivoronski A. (1992), Stochastic programming techniques for optimization of discrete event systems, Annals of Operations Research, Vol. 39, pp. 120-135.
[7] Ermoliev Yu.M. and Norkin V.I. (1995), On Nonsmooth Problems of Stochastic Systems Optimization, Working Paper WP-95-096, Int. Inst. for Appl. Syst. Anal., Laxenburg, Austria.
[8] Ermoliev Yu.M., Norkin V.I. and Wets R.J-B. (1995), The minimization of semi-continuous functions: Mollifier subgradients, SIAM J. Contr. and Opt., No. 1, pp. 149-167.
[9] Gupal A.M. (1977), On a method for the minimization of almost differentiable functions, Kibernetika, No. 1, pp. 114-116. (In Russian, English translation in: Cybernetics, Vol. 13, N. 1).
[10] Gupal A.M. (1979), Stochastic methods for solving nonsmooth extremal problems, Naukova dumka, Kiev. (In Russian).
[11] Gupal A.M. and Norkin V.I. (1977), An algorithm for minimization of discontinuous functions, Kibernetika, No. 2, pp. 73-75. (In Russian, English translation in: Cybernetics, Vol. 13, N. 2).
[12] Hasminski R.Z. (1965), Application of random noise in optimization and recognition problems, Problemy peredachi informatzii, Vol. 1, N. 3. (In Russian).
[13] Katkovnik V.Ya. (1976), Linear Estimates and Stochastic Optimization Problems, Nauka, Moscow. (In Russian).
[14] Katkovnik V.Ya. and Kulchitsky Yu. (1972), Convergence of a class of random search algorithms, Automat. Remote Control, No. 8, pp. 1321-1326. (In Russian).
[15] Kreimer J. and Rubinstein R.Y. (1992), Nondifferentiable optimization via smooth approximation: general analytical approach, Annals of Oper. Res., Vol. 39, pp. 97-119.
[16] Mayne D.Q. and Polak E. (1984), Nondifferentiable optimization via adaptive smoothing, J. of Opt. Theory and Appl., Vol. 43, pp. 601-613.
[17] Mikhalevich V.S., Gupal A.M. and Norkin V.I. (1987), Methods of nonconvex optimization, Nauka, Moscow. (In Russian).
[18] Nikolaeva N.D. (1974), On an algorithm for solving convex programming problems, Econom. i Matem. Methody, Vol. 10, pp. 941-946. (In Russian).
[19] Rockafellar R.T. and Wets R.J-B. (1984), Variational systems, an introduction, in: Multifunctions and Integrands, G. Salinetti, ed., Lecture Notes in Mathematics 1091, Springer-Verlag, Berlin, pp. 1-54.
[20] Rockafellar R.T. and Wets R.J-B. (1991), Cosmic convergence, in: Optimization and Nonlinear Analysis, eds. A. Ioffe, M. Marcus and S. Reich, Pitman Research Notes in Mathematics Series 244, Longman Scientific & Technical, Essex, U.K., pp. 249-272.
[21] Rockafellar R.T. and Wets R.J-B. (1995), Variational Analysis, a monograph to be published by Springer-Verlag.
[22] Rubinstein R.Y. (1983), Smoothed functionals in stochastic optimization, Math. Oper. Res., Vol. 8, pp. 26-33.
[23] Yudin D.B. (1965), Qualitative methods for analysis of complex systems I, Izvestia AN SSSR, Tehnich. Kibernetika, No. 1. (In Russian).
[24] Zaharov V.V. (1970), Integral smoothing method in multi-extremal and stochastic problems, Izvestia AN SSSR, Tehnich. Kibernetika, No. 4. (In Russian).
[25] Warga J. (1975), Necessary conditions without differentiability assumptions in optimal control, J. Diff. Equations, Vol. 15, pp. 41-61.