Working Paper
STOCHASTIC OPTIMIZATION TECHNIQUES FOR FINDING OPTIMAL SUBMEASURES

Alexei Gaivoronski

May 1985 WP-85-28
International Institute for Applied Systems Analysis
A-2361 Laxenburg, Austria
NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
FOREWORD

In this paper, the author looks at some quite general optimization problems on the space of probabilistic measures. These problems originated in mathematical statistics but have applications in several other areas of mathematical analysis. The author extends previous work by considering a more general form of the constraints, and develops numerical methods (based on stochastic quasigradient techniques) and some duality relations for problems of this type.

This paper is a contribution to research on stochastic optimization currently underway within the Adaptation and Optimization Project.

Alexander B. Kurzhanski
Chairman
System and Decision Sciences Program
CONTENTS

1. INTRODUCTION
2. CHARACTERIZATION OF THE OPTIMAL SOLUTIONS
3. STOCHASTIC OPTIMIZATION METHOD
4. PARAMETRIC DEPENDENCE OF g(y,H) ON MEASURE
5. NUMERICAL EXPERIMENTS

STOCHASTIC OPTIMIZATION TECHNIQUES FOR FINDING OPTIMAL SUBMEASURES

Alexei Gaivoronski

1. INTRODUCTION
Optimality conditions based on duality relations were studied in [1] for the following optimization problem: find the positive Borel measure H such that

Φ^0(H) → max

with respect to the constraints

H_l(A) ≤ H(A) ≤ H_u(A) for all Borel A ⊂ Y ⊂ R^n

where Y is some subset of the Euclidean space R^n and Φ^0(H) is a function which depends on the measure H; usually some kind of directional differentiability and convexity is assumed. H_u and H_l are some positive Borel measures.

Stochastic optimization methods for solving (1)-(4) in the case when the functions Φ^i(H) are linear with respect to H were developed in [1]. In this paper such methods are developed for nonlinear functions Φ^i(H) and for arbitrary finite measures. Interest in such problems originates from statistics, where they appear in finite population sampling [2,3].
Suppose that we have a collection D of N objects, each object being described by a pair (z_i, y_i), i = 1,...,N. The variables y_i are known and the variables z_i can be observed for each particular i in the following way:

z_i = η(y_i) + ε_i

where the ε_i are independent random variables with zero mean and the z_i are observations. It is usually assumed that the relationship between z_i and y_i is known up to a set of unknown parameters:

η(y) = θ_1 h_1(y) + ... + θ_l h_l(y)

where h(y) = (h_1(y),...,h_l(y)) are known functions and θ = (θ_1,...,θ_l) are the parameters to be determined. The problem is to select a subset d ⊂ D consisting of n objects in order to get in some sense the best possible estimate of the parameters θ. This estimate is based on the observations z_i for objects belonging to d.

Applying the usual approach of optimal experimental design [4-6], one can substitute the measure H_u(y) for the collection (y_1,...,y_N) and a measure H(y) for the subset of objects to be observed. The variance matrix D of the best linear estimate, in the case when all ε_i have the same variance, becomes after such
substitution proportional to the matrix M, defined as follows:

M(H) = ∫_Y h(y) h'(y) dH(y)

and the problem becomes to minimize some function Φ of M, such as the determinant, the trace, the largest eigenvalue, etc.:

min_H Φ(M(H))   (5)

with respect to the obvious constraint

0 ≤ H(A) ≤ H_u(A)   (6)

for all Borel A ⊂ Y. Another possible application of the problem (1)-(4) is approximation schemes for stochastic optimization [7,8].
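For a discrete design measure the matrix M(H) and a criterion Φ are straightforward to compute. The sketch below is an illustration, not the paper's code: the quadratic basis h(y) = (1, y, y²) and the D-criterion Φ(M) = −log det M are hypothetical choices consistent with (5).

```python
import numpy as np

# Hypothetical quadratic regression basis h(y) = (1, y, y^2) on Y = [0, 1].
def h(y):
    return np.array([1.0, y, y * y])

# A discrete design measure H: support points y_i with weights H({y_i}).
ys = np.linspace(0.0, 1.0, 11)
weights = np.full(len(ys), 1.0 / len(ys))       # uniform design

# Information matrix M(H) = sum_i h(y_i) h(y_i)^T H({y_i}).
M = sum(w * np.outer(h(y), h(y)) for y, w in zip(ys, weights))

# D-criterion: minimizing Phi(M) = -log det M maximizes det M, i.e.
# minimizes the volume of the confidence ellipsoid for theta.
phi = -np.log(np.linalg.det(M))
print(M.shape, phi)
```

Other criteria named in the text (trace, largest eigenvalue) drop in by replacing the last line.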
The purpose of this paper is to develop stochastic optimization methods dealing with such problems. In Section 2 the characterization of solutions for quite general classes of measures is obtained. A conceptual algorithm for solving nonlinear problems is proposed in Section 3, which is applied in Section 4 to particular problems of the kind (5)-(6). In Section 5 the results of some numerical experiments are presented.
2. CHARACTERIZATION OF THE OPTIMAL SOLUTIONS
We shall consider a subset Y of the Euclidean space R^n and some σ-field Σ on it. We shall assume that all measures specified below are defined on this σ-field. In this section, the representation of the measures H which are solutions of the following problem will be developed:

max Ψ(H)   (7)

subject to the constraints

H_l(E) ≤ H(E) ≤ H_u(E)   (8)

H(Y) = b   (9)

The constraint (8) means that H_l(E) ≤ H(E) ≤ H_u(E) for any E ∈ Σ. Define H_Δ = H_u − H_l.

In what follows the spaces L_1(Y,Σ,H_Δ) and L_∞(Y,Σ,H_Δ) play an important role, where L_1(Y,Σ,H_Δ) is the space of all H_Δ-measurable functions g(y) defined on Y and such that ∫_Y |g(y)| dH_Δ < +∞, and L_∞(Y,Σ,H_Δ) is the space of all H_Δ-measurable and H_Δ-essentially bounded functions g(y) defined on Y. We shall denote by ||·||_∞ the norm in the space L_∞(Y,Σ,H_Δ), i.e. ||g||_∞ is the H_Δ-essential supremum of |g(y)| on Y.

Let us denote by G the set of all measures satisfying (8):

G = {H : H_l(E) ≤ H(E) ≤ H_u(E) for all E ∈ Σ}

and by G_b the set of all measures satisfying in addition (9):

G_b = {H : H ∈ G, H(Y) = b}

Suppose that g(y) is some function defined on Y and c is some number, and define the following sets:

Z^+(c,g) = {y : g(y) > c},  Z^-(c,g) = {y : g(y) < c},  Z^0(c,g) = {y : g(y) = c}

In the notation below we shall substitute various particular functions into this definition instead of g. Define, as usual, by H^+, H^- and |H| the positive, negative and total variation of the measure H. We shall first consider the problem in which the function Ψ(H) is linear:

Ψ(H) = ∫_Y g(y) dH   (10)

and describe the set of all solutions of (10). The following result is a generalization of Lemma 1 from [1].
THEOREM 1. Suppose that the following conditions are satisfied:

2. For any E ∈ Σ with H_Δ(E) > 0 there exists E_1 ∈ Σ, E_1 ⊂ E, such that either E_1 is an H_Δ-atom or 0 < H_Δ(E_1) < ∞.

Then the solution of problem (10) exists and any such solution H* has the following representation:

(i) H*(A) = H_u(A) for any A ∈ Σ, A ⊂ Z^+(c*,g)

(ii) H*(A) = H_l(A) for any A ∈ Σ, A ⊂ Z^-(c*,g)

(iii) H_u(A) ≥ H*(A) ≥ H_l(A) for any A ∈ Σ, A ⊂ Z^0(c*,g), and

(H* − H_l)(Z^0(c*,g)) = b − H_l(Y) − H_Δ(Z^+(c*,g))

Conversely, any measure defined by (i)-(iii) is a solution of problem (10).
PROOF. Let us first prove that a measure with properties (i)-(iii) exists.

It is clear that any measure on (Y,Σ) is defined by its values on the subsets of Z^+(c*,g), Z^0(c*,g) and Z^-(c*,g), because these sets belong to Σ (due to g(y) ∈ L_1(Y,Σ,H_Δ)) and Y equals the union of these sets. Therefore it is sufficient to show that among the measures satisfying (i)-(ii) there exists a measure which also satisfies (iii). From the definition of Z^+(c,g) we have

Z^+(c_2,g) ⊂ Z^+(c_1,g)

and

H_Δ(Z^+(c_2,g)) ≤ H_Δ(Z^+(c_1,g))

for all c_2 > c_1, and therefore H_Δ(Z^+(c,g)) is nonincreasing in c.

According to condition 4 we have H_Δ(Y\Z^-(c*,g)) ≥ b − H_l(Y) in the case c* = 0. In fact, it is true for arbitrary c*. Suppose first that c* > 0. Note that for any c > 0 we have H_Δ(Z^+(c,g)) < ∞ because g(y) ∈ L_1(Y,Σ,H_Δ). Consider now a sequence c_s ↑ c*. We have

H_Δ(Z^+(c_s,g)) → H_Δ(Y\Z^-(c*,g))

because

Z^+(c_s,g) ↓ Z^+(c*,g) ∪ Z^0(c*,g) = Y\Z^-(c*,g)

and H_Δ(Z^+(c_1,g)) < ∞. From the definition of c* and the fact that c_s < c* we have

H_Δ(Z^+(c_s,g)) ≥ b − H_l(Y)

which gives

H_Δ(Y\Z^-(c*,g)) ≥ b − H_l(Y)

The case c* < 0 is treated in the same way, taking into account the fact that H_Δ(Z^+(c,g)) < ∞ for all c if c* < 0. Thus we obtain

H_Δ(Z^+(c*,g)) + H_Δ(Z^0(c*,g)) ≥ b − H_l(Y)

Now if H_Δ(Z^0(c*,g)) < ∞, the measure H* which coincides with H_u on Z^+(c*,g), with H_l on Z^-(c*,g), and on Z^0(c*,g) takes the value

H*(E) = H_l(E) + [(b − H_l(Y) − H_Δ(Z^+(c*,g))) / H_Δ(Z^0(c*,g))] H_Δ(E)   (11)

satisfies all the conditions (i)-(iii).

When H_Δ(Z^0(c*,g)) = ∞, condition 2 implies the existence of E_1 ⊂ Z^0(c*,g) such that either 0 < H_Δ(E_1) < ∞, or H_Δ(E_1) = ∞ and E_1 is an H_Δ-atom. In the former case H* is defined similarly to (11), using the set E_1 instead of Z^0(c*,g) and taking H* = H_l on Z^0(c*,g)\E_1. In the latter case take

H*(E_1) = H_l(E_1) + b − H_l(Y) − H_Δ(Z^+(c*,g))

and again H* = H_l on Z^0(c*,g)\E_1. All this proves the existence of a measure H* which satisfies (i)-(iii).

Let us now prove that for any measure H̃ ∈ G_b which violates (i)-(iii) we have

∫_Y g(y) dH̃ < ∫_Y g(y) dH*

Suppose that (i) does not hold for the measure H̃, i.e., there is some set E ⊂ Z^+(c*,g) such that H̃(E) < H_u(E). This implies the existence of a set E_1 ⊂ E such that g(y) > c > c* for y ∈ E_1 and H_u(E_1) − H̃(E_1) > γ > 0. Notice that H*(A) ≥ H̃(A) for A ⊂ Z^+(c*,g) and H*(A) ≤ H̃(A) for A ⊂ Z^-(c*,g), due to the definition of H*. This gives

∫_{E_1} g(y) d(H* − H̃) ≥ c (H* − H̃)(E_1) ≥ cγ

These inequalities lead to the following estimate:

∫_Y g(y) d(H* − H̃) ≥ c (H* − H̃)(E_1) + c* (H* − H̃)(Y\E_1) = (c − c*)(H* − H̃)(E_1) ≥ (c − c*)γ > 0

Thus, if H̃ ∈ G_b violates (i), it cannot be a solution of (10). The other possibilities are considered in the same way. Therefore any optimal measure has the representation (i)-(iii). It follows from the definition that all measures satisfying (i)-(iii) have the same value of ∫_Y g(y) dH and are therefore optimal. The proof is complete.
REMARK. It is clear that in the characterization of the optimal measures any c̃ such that

c* ≤ c̃ < sup {c : H_Δ(Z^+(c,g)) ≥ b − H_l(Y)}

can be taken instead of c*.

Note that if the measure H_Δ has bounded variation, conditions 2 and 4 are satisfied automatically. For such measures the structure of the solutions can be studied using general duality theory [9].
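In the discrete case (Y a finite set, all measures given by point masses) the representation (i)-(iii) of Theorem 1 reduces to a simple threshold construction: fill in H_u-mass where g is largest, keep H_l elsewhere, and split the residual budget at the threshold level c*. The sketch below is an illustration of this construction, with hypothetical data; it assumes H_l(Y) ≤ b ≤ H_u(Y).

```python
import numpy as np

def optimal_submeasure(g, Hl, Hu, b):
    """Maximize sum_i g_i H_i subject to Hl_i <= H_i <= Hu_i, sum_i H_i = b.

    Discrete instance of the linear problem (10): the optimum takes Hu on
    {g > c*}, Hl on {g < c*}, and splits the residual budget on the level
    set {g = c*}, exactly as in (i)-(iii) of Theorem 1."""
    g, Hl, Hu = map(np.asarray, (g, Hl, Hu))
    H = Hl.astype(float).copy()
    budget = b - Hl.sum()               # assumes Hl(Y) <= b <= Hu(Y)
    for i in np.argsort(-g):            # points with the largest g first
        add = min(Hu[i] - Hl[i], budget)
        H[i] += add
        budget -= add
        if budget <= 0:
            break
    return H

# Hypothetical data: three points with values g = (3, 1, 2), budget b = 1.5.
H = optimal_submeasure([3.0, 1.0, 2.0], [0.1, 0.1, 0.1], [1.0, 1.0, 1.0], 1.5)
print(H)   # -> [1.  0.1 0.4]: Hu-mass where g is largest, Hl where smallest
```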
Let us now consider in more detail the set G_b. If the measure H_Δ has finite variation we have the following representation for arbitrary H ∈ G_b:

H = H_l + (H − H_l)   (12)

where the measure H − H_l is finite, positive and continuous with respect to the measure H_Δ. If H_Δ is σ-finite we can use the Radon-Nikodym theorem [10] and for arbitrary H ∈ G_b obtain the following representation:

H(E) = H_l(E) + ∫_E h_H(y) dH_Δ,   ∀E ∈ Σ   (13)

where h_H ∈ L_1(Y,Σ,H_Δ), and this representation is unique. For arbitrary E ∈ Σ we have

0 ≤ ∫_E h_H(y) dH_Δ = H(E) − H_l(E) ≤ H_Δ(E)

and therefore 0 ≤ h_H(y) ≤ 1 H_Δ-everywhere. Consider now the set K_b ⊂ L_1(Y,Σ,H_Δ):

K_b = {h : 0 ≤ h(y) ≤ 1 H_Δ-everywhere, H_l(Y) + ∫_Y h(y) dH_Δ = b}

Each function from this set defines a measure H_h from G_b:

H_h(E) = H_l(E) + ∫_E h(y) dH_Δ,   E ∈ Σ   (14)

Therefore (13), (14) define an isomorphism between the sets G_b and K_b, such that the problem (7)-(9) is equivalent to the following one:

max Ψ̃(h)   (15)

subject to the constraints

0 ≤ h(y) ≤ 1   (16)

H_l(Y) + ∫_Y h(y) dH_Δ = b   (17)

where Ψ̃(h) = Ψ(H_h). The optimal values of problems (15)-(17) and (7)-(9) are the same, and each solution of (15)-(17) defines a solution of (7)-(9) through (14), and vice versa. This equivalence, together with certain convexity assumptions, leads to a solution representation for problems (7)-(9) similar to Theorem 1:

THEOREM 2. Suppose that the following assumptions are satisfied:
1. The measures H_l and H_u have bounded variation, H_l(Y) ≤ b, H_u(Y) ≥ b.

2. Ψ(H) is concave and finite for H ∈ G_{b,ε} = G_b + G_ε, where G_ε = {H_c : |H_c|(Y) ≤ ε, H_c is H_Δ-continuous} for some ε > 0.

Then

1) For each H_1 ∈ G_b there exists g(y,H_1) ∈ L_∞(Y,Σ,H_Δ) such that

Ψ(H_2) − Ψ(H_1) ≤ ∫_Y g(y,H_1) d(H_2 − H_1)   (18)

for all H_2 ∈ G_b.

2) The solution H* of problem (7)-(9) exists.

3) For any E ∈ Σ and any optimal solution H* of the problem (7)-(9) we have the following representation:

H*(E) = H_u(E) for E ⊂ Z^+(c*, g(y,H*))
H*(E) = H_l(E) for E ⊂ Z^-(c*, g(y,H*))   (19)
H_l(E) ≤ H*(E) ≤ H_u(E) for E ⊂ Z^0(c*, g(y,H*))

where

c* = inf {c : H_Δ(Z^+(c, g(y,H*))) ≤ b − H_l(Y)}

and

∫_Y g(y,H*) dH* ≥ ∫_Y g(y,H*) dH

for all H ∈ G_b.

Conversely, if for some H_1 ∈ G_b there exists g(y,H_1) ∈ L_∞(Y,Σ,H_Δ) such that (18) is fulfilled and H_1 can be represented according to (19), then H_1 is the optimal solution of the problem (7)-(9).

PROOF.
The previous argument shows that under the assumptions of the theorem, problem (7)-(9) is equivalent to problem (15)-(17), and there is an isomorphism between the set G_{b,ε}, as defined in condition 2, and the following set:

K_{b,ε} = K_b + {h : ||h||_∞ ≤ ε}

The function Ψ̃(h) from (15) is concave on the set K_{b,ε}, which is an ε-vicinity of K_b in L_∞(Y,Σ,H_Δ). Therefore for each h ∈ K_b there exists a subdifferential of the concave function Ψ̃(h) [11,12], which in this case is a linear continuous functional f ∈ L*_∞(Y,Σ,H_Δ) such that

Ψ̃(h_1) − Ψ̃(h) ≤ f(h_1 − h)

Taking into account the representation of L*_∞(Y,Σ,H_Δ) [10] we get

Ψ̃(h_1) − Ψ̃(h) ≤ ∫_Y g̃(y,h)(h_1(y) − h(y)) dH_Δ   (20)

where g̃(y,h) ∈ L_1(Y,Σ,H_Δ), which together with (12) implies

Ψ(H_1) − Ψ(H) ≤ ∫_Y g(y,H) d(H_1 − H)

for all H, H_1 ∈ G_b, where g(y,H) = g̃(y,h_H). Thus (18) is proved. Note that we may consider the function g(y,H_1) from (18) (possibly non-unique) as a subdifferential of the function Ψ(H) at the point H_1.

Now observe that the set K_b is weakly sequentially compact in L_1(Y,Σ,H_Δ) because H_Δ(Y) < ∞ and ∫_E h dH_Δ ≤ H_Δ(E) → 0 as H_Δ(E) → 0, uniformly for h ∈ K_b (see [10, p. 294]). Let us prove that it is also weakly closed. Consider a sequence h_s(y), h_s ∈ K_b, with

∫_Y h_s(y) g(y) dH_Δ → ∫_Y h(y) g(y) dH_Δ

for some h ∈ L_1(Y,Σ,H_Δ) and all g ∈ L_∞(Y,Σ,H_Δ). In particular, we have

∫_E h_s(y) dH_Δ → ∫_E h(y) dH_Δ

for all E ∈ Σ, because the indicator function of the set E clearly belongs to L_∞(Y,Σ,H_Δ). This gives 0 ≤ h(y) ≤ 1 H_Δ-everywhere. Taking g(y) = 1 we also have

H_l(Y) + ∫_Y h_s(y) dH_Δ → H_l(Y) + ∫_Y h(y) dH_Δ

which gives H_l(Y) + ∫_Y h(y) dH_Δ = b. Thus h ∈ K_b and K_b is weakly closed.

It follows from (20) that for any sequence h_s ∈ K_b, h_s → h weakly, h ∈ K_b, we have

lim sup_{s→∞} Ψ̃(h_s) ≤ Ψ̃(h)

This, together with the weak sequential compactness and closedness of K_b, implies the existence of h* such that

Ψ̃(h*) = max_{h ∈ K_b} Ψ̃(h)

Thus the solution of problem (7)-(9) exists.

The general results of convex analysis [11] now imply that under assumption 2 of the theorem, for any solution H* of problem (7)-(9) there exists a subdifferential g(y,H*) of the function Ψ(H) at the point H* such that

∫_Y g(y,H*) dH* ≥ ∫_Y g(y,H*) dH

for all H ∈ G_b, or, in other words, H* is one of the solutions of the following problem:

max_{H ∈ G_b} ∫_Y g(y,H*) dH   (21)

This problem is exactly of the type (10) and its solutions are characterized by Theorem 1. Conversely, if for some H' ∈ G_b there exists a subdifferential g(y,H') such that H' is a solution of problem (21), then H' is the optimal solution of the original problem. The proof is now completed by using Theorem 1.

Some related results were obtained for a special kind of function Ψ(H), an atomless probability measure H_u and H_l = 0 in [2].

Theorem 2 shows that the solutions of problem (7)-(9) can be viewed as indicator functions of some sets. Therefore many problems involving the selection of an optimal set [13] can be reformulated as problems of finding optimal measures.
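In the discrete case the correspondence (13)-(14) between G_b and K_b is just a change of variables. The sketch below, with hypothetical weights, checks that the density h_H = (H − H_l)/H_Δ lies in [0,1] and that the representation (14) recovers H:

```python
import numpy as np

# Hypothetical discrete measures with Hl <= H <= Hu (pointwise weights).
Hl = np.array([0.1, 0.0, 0.2, 0.1])
Hu = np.array([0.4, 0.3, 0.5, 0.3])
HD = Hu - Hl                            # H_Delta = H_u - H_l
H  = np.array([0.25, 0.3, 0.2, 0.25])   # a measure in G_b with b = H(Y) = 1.0

h = (H - Hl) / HD                       # Radon-Nikodym density of H - H_l
assert np.all((0.0 <= h) & (h <= 1.0))  # 0 <= h_H <= 1, as in the text

H_back = Hl + h * HD                    # representation (14)
print(np.allclose(H_back, H))           # -> True
```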
3. STOCHASTIC OPTIMIZATION METHOD
Using the results of the previous section we can construct numerical methods for solving problem (7)-(9). From now on we shall assume that the function Ψ(H) is concave and finite on some vicinity of the set G and possesses certain differentiability properties:

Ψ(H_1 + α(H_2 − H_1)) = Ψ(H_1) + α ∫_Y g(y,H_1) d(H_2 − H_1) + o(α)   (22)

where o(α)/α → 0 as α → 0, for all H_1,H_2 ∈ G. This means that the subdifferential g(y,H_1) from (18) is unique for all interior points of G, and we can assume that g(y,H*) from (19) also satisfies (22).

Consider now the mapping Γ(c,f) from R × L_∞(Y,Σ,H_Δ) to G: if H = Γ(c,f) then

H(E) = H_u(E) for E ⊂ Z^+(c,f)
H(E) = H_l(E) for E ⊂ Y\Z^+(c,f)   (23)

for any E ∈ Σ.
First of all we shall give an informal description of the algorithm. Suppose that some H^s ∈ G is the current approximation to the solution of the problem (7)-(9). According to (22) the local behavior of Ψ(H) around H^s is approximated by the linear form

Ψ(H^s) + ∫_Y g(y,H^s) d(H − H^s)

and if H̄^s is the solution of the problem

max_{H ∈ G_b} ∫_Y g(y,H^s) dH   (24)

then the direction H̄^s − H^s will be an ascent direction at the point H^s. Therefore we can take as the next approximation to the optimal solution

H^{s+1} = H^s + α_s (H̄^s − H^s)   (25)

for some α_s > 0. Consider now the problem of finding H̄^s, or a suitable approximation to it. Suppose that we know the function g(y,H^s) exactly. Then, according to Theorem 1, all the possible H̄^s are fully described by the pair (c*, g(y,H^s)), where c* is the solution of the problem

inf {c : H_Δ(Z^+(c, g(y,H^s))) ≤ b − H_l(Y)}   (26)

Observe now that the function c → H_Δ(Z^+(c, g(y,H^s))) is nonincreasing, and therefore solving (26) is equivalent to solving

max_c W^s(c),   W^s(c) = ∫_T^c W_1^s(t) dt   (27)

for some T, where

W_1^s(c) = H_Δ(Z^+(c, g(y,H^s))) − (b − H_l(Y))   (28)

can be considered as a subgradient of the function W^s(c). Therefore we can use the subgradient method for finding c*:

c^{k+1} = c^k + ρ_k W_1^s(c^k)   (29)

However, the computation of W_1^s(c^k) according to (28) involves multidimensional integration over complex regions, and this may be too complicated from the computational point of view. In this situation stochastic quasigradient methods [14] can be used. In such methods a statistical estimate of W_1^s is used in (29) instead of W_1^s.

Once c* is determined, the measure Γ(c*, g(y,H^s)) defined in (23) may be a reasonable approximation to the solution H̄^s of problem (24) and can be used in algorithm (25). However, precise estimation of c* from (29) requires an infinite number of iterations, and to make the algorithm implementable it is necessary to avoid this. It appears that under certain assumptions about the stepsizes in (25) and (29), we may take k = s in (29) and perform only one iteration of (29) per iteration of (25), using as the approximation to H̄^s the measure H̃^s = Γ(c^s, g(y,H^s)). Thus, along with the sequence H^s we also obtain the sequence of numbers c^s. Note now that although H̃^s is quite simple, the measure H^s would become excessively complex even for small s. However, H^s is only needed for computing the gradient g(y,H^s), and in particular cases some approximation f̃(s,y) to g(y,H^s) can be obtained using only H̃^s in a sort of updating formula similar to (25). Once a sequence f̃(s,y) with the property

||f̃(s,y) − g(y,H^s)||_∞ → 0

is obtained, together with a sequence c^s such that V^s(c^s) − max_c V^s(c) → 0, the optimal solution of problem (7)-(9) is defined by Theorem 2 through the accumulation points of these sequences. The structure of the optimal solution is close to (23).
Now we shall define the algorithm for solving (7)-(9) formally.

1. At the beginning, select an initial approximation H^0 to the solution, a function f̃(0,y) and a number c^0.

2. Suppose that at step number s we have the measure H^s, the function f̃(s,y) and the number c^s. Then on the next step we do the following:

2a. The pair (c^s, f̃(s,y)) defines the measure H̃^s according to (23):

H̃^s = Γ(c^s, f̃(s,y))

The new approximation to the solution is obtained in the following way:

H^{s+1} = (1 − α_s) H^s + α_s H̃^s   (30)

2b. The new number c^{s+1} is obtained:

c^{s+1} = c^s + ρ_s ζ^s   (31)

where ζ^s is a statistical estimate of V_1^s(c^s); i.e., the function V^s(c) is defined similarly to W^s(c), with the difference that f̃(s,y) is used instead of g(y,H^s).

2c. The new function f̃(s+1,y) is obtained in such a way as to approximate g(y,H^{s+1}). The precise way of achieving this can be specified only after considering particular ways of dependence of g(y,H) on H. One quite general case is considered in the next section. Here we shall only assume that

||f̃(s,y) − g(y,H^s)||_∞ → 0 as s → ∞
The method of achieving this in a particular situation will be described in the next section.

Before stating convergence results for algorithm (30)-(31), two examples of calculating ζ^s from (31) are presented.

(i) The measures H_u and H_l have piecewise-continuous densities h_u(y) and h_l(y), respectively, with respect to Lebesgue measure. Then we have

V_1^s(c) = ∫_Y q(c,y) dy − b,   q(c,y) = h_u(y) if f̃(s,y) > c and q(c,y) = h_l(y) otherwise.

Therefore, denoting by μ(Y) the Lebesgue measure of Y, we can take

ζ^s = μ(Y) q(c^s, ω^s) − b

where ω^s is distributed uniformly over Y.
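The estimate of example (i) is a one-sample Monte Carlo estimate of V_1^s(c). The sketch below checks its unbiasedness by averaging many samples; the densities, the budget b and the stand-in gradient f are all hypothetical choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data for example (i): Y = [0, 1], so mu(Y) = 1.
h_u = lambda y: 2.0 * y                 # density of H_u
h_l = lambda y: np.full_like(y, 0.1)    # density of H_l
f = lambda y: np.sin(np.pi * y)         # stand-in for the gradient f(s, y)
b, c = 0.6, 0.5

# zeta = mu(Y) * q(c, w) - b with w uniform on Y, where q picks h_u on
# Z+(c, f) and h_l elsewhere; E[zeta] = V_1^s(c) = Gamma(c, f)(Y) - b.
w = rng.uniform(0.0, 1.0, size=200_000)
q = np.where(f(w) > c, h_u(w), h_l(w))
zetas = 1.0 * q - b                     # mu(Y) = 1 here

est = zetas.mean()                      # Monte Carlo average of the estimates
print(est)                              # analytically, E[zeta] = 0.1 here
```

Here f(w) > 0.5 exactly on (1/6, 5/6), so E[zeta] = (25/36 − 1/36) + 0.1/3 − 0.6 = 0.1, and the sample mean should land very close to that value.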
(ii) The measures H_u and H_l are defined by a finite number of pairs

H_u : (y_i, p_{u,i}),   H_l : (y_i, p_{l,i}),   i = 1,...,N.

In this case

V_1^s(c) = Σ_{i : f̃(s,y_i) > c} p_{u,i} + Σ_{i : f̃(s,y_i) ≤ c} p_{l,i} − b

Therefore we can take

ζ^s = N χ^s(c^s) p_{u,ω^s} + N (1 − χ^s(c^s)) p_{l,ω^s} − b

where ω^s assumes the value i, 1 ≤ i ≤ N, with probability 1/N, and χ^s(c) = 1 if f̃(s, y_{ω^s}) > c and χ^s(c) = 0 otherwise.

Let us now investigate the convergence of algorithm (30)-(31). In all statements concerning convergence of measures from the set G we shall use the weak-L convergence, already used in the proof of Theorem 2:

H_s → H iff ∫_Y g(y) dH_s → ∫_Y g(y) dH

for all g ∈ L_∞(Y,Σ,H_Δ), and the topology induced by this convergence will be used without further reference.
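Before turning to the convergence analysis, the whole loop (30)-(31) can be sketched in the discrete setting of example (ii). Everything here is an illustrative assumption rather than the paper's setup: the separable concave objective Ψ(H) = −Σ_i (H_i − t_i)² with gradient density g_i(H) = −2(H_i − t_i) (computed exactly, so f̃(s,y) = g(y,H^s)), the stepsizes, and the data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete setting of example (ii): Y = {1,...,N}, H_l and H_u given by
# point masses p_l, p_u, and a budget b with H_l(Y) <= b <= H_u(Y).
N = 50
p_l = np.full(N, 0.002)          # H_l(Y) = 0.1
p_u = np.full(N, 0.05)           # H_u(Y) = 2.5
b = 1.0
t = np.linspace(0.0, 1.0, N)
t = t * (b / t.sum())            # hypothetical targets with sum(t) = b

# Illustrative concave objective Psi(H) = -sum_i (H_i - t_i)^2;
# its gradient density is g_i(H) = -2 (H_i - t_i).
def grad(H):
    return -2.0 * (H - t)

H = p_l.copy()
c = 0.0
for s in range(1, 20001):
    alpha = 1.0 / s              # stepsize for the averaging step (30)
    rho = 0.5 / s ** 0.75        # stepsize for the threshold step (31)
    g = grad(H)                  # exact gradient, i.e. f(s, y) = g(y, H^s)
    # Step 2a: H_tilde = Gamma(c^s, g) as in (23), then the update (30).
    H_tilde = np.where(g > c, p_u, p_l)
    H = (1.0 - alpha) * H + alpha * H_tilde
    # Step 2b: update (31) with the one-sample estimate zeta of example (ii):
    # sample an index uniformly; zeta = N * (mass H_tilde puts there) - b.
    i = rng.integers(N)
    zeta = N * H_tilde[i] - b
    c += rho * zeta

print(H.sum())                   # should settle near the budget b = 1.0
```

The averaging step keeps H^s between H_l and H_u, while the stochastic threshold update drives the total mass H^s(Y) toward b, which is the behavior Theorem 3 below makes precise.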
We shall assume that the random variables ω^1,...,ω^s,... are defined on some probability space; therefore c^s, H^s, H̃^s from (30)-(31) depend on the events of this space. For simplicity of notation this dependence will be omitted in the formulas. Convergence, boundedness, etc. will be considered almost everywhere with respect to this probability space. It should be stressed that we are primarily interested in the convergence properties of the sequences c^s and f̃(s,y). The following theorem gives results in this direction.

THEOREM 3. Suppose that the following assumptions are satisfied:

1. The measures H_l and H_u have bounded variation, H_l(Y) ≤ b, H_u(Y) ≥ b.

2. Ψ(H) is a finite concave function for H ∈ G + G_ε, where G_ε = {H_c : |H_c|(Y) ≤ ε, H_c is H_Δ-continuous} for some ε > 0, and satisfies (22) for H_1,H_2 ∈ G.

3. ||g(y,H_k) − g(y,H)||_∞ → 0 if H_k → H.

4. ||g(y,H^s) − f̃(s,y)||_∞ ≤ b_s → 0 as s → ∞.

6. One of the following conditions is satisfied:

(ii) α_s > 0 and

Then
1) Ψ(H^s) → max_{H ∈ G_b} Ψ(H), H^s(Y) → b, and all accumulation points of the sequence H^s belong to the set

Q = {H : H ∈ G_b, Ψ(H) = max_{H' ∈ G_b} Ψ(H')}

2) For any convergent subsequence c^{s_k} → c' there exists a measure H' ∈ Q such that

H'(A) = H_u(A) for A ⊂ Z^+(c', g(y,H'))
H'(A) = H_l(A) for A ⊂ Z^-(c', g(y,H'))
H_l(A) ≤ H'(A) ≤ H_u(A) for A ⊂ Z^0(c', g(y,H'))

and H^{s_l} → H', where s_l is some subsequence of the sequence s_k.
Condition 4 of the theorem means that it is possible to use approximations to the gradient g(y,H), and it is necessary that the precision of these approximations increases as s → ∞. Condition 6 is necessary to assure H^s(Y) → b, although H̃^s(Y) from (30) may not be equal to b. In the case where H̃^s(Y) = b, i.e.

H_Δ(Z^0(c^s, f̃(s,y)))