Working Paper
STOCHASTIC OPTIMIZATION TECHNIQUES FOR FINDING OPTIMAL SUBMEASURES

Alexei Gaivoronski

May 1985 WP-85-28
International Institute for Applied Systems Analysis
A-2361 Laxenburg, Austria
NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
FOREWORD

In this paper, the author looks at some quite general optimization problems on the space of probabilistic measures. These problems originated in mathematical statistics but have applications in several other areas of mathematical analysis. The author extends previous work by considering a more general form of the constraints, and develops numerical methods (based on stochastic quasigradient techniques) and some duality relations for problems of this type.

This paper is a contribution to research on stochastic optimization currently underway within the Adaptation and Optimization Project.

Alexander B. Kurzhanski
Chairman
System and Decision Sciences Program
CONTENTS

1. INTRODUCTION
2. CHARACTERIZATION OF THE OPTIMAL SOLUTIONS
3. STOCHASTIC OPTIMIZATION METHOD
4. PARAMETRIC DEPENDENCE OF g(y,H) ON MEASURE
5. NUMERICAL EXPERIMENTS

STOCHASTIC OPTIMIZATION TECHNIQUES FOR FINDING OPTIMAL SUBMEASURES

Alexei Gaivoronski

1. INTRODUCTION
Optimality conditions based on duality relations were studied in [1] for the following optimization problem: find the positive Borel measure H such that

Φ^0(H) → max

with respect to the constraints

H_l(A) ≤ H(A) ≤ H_u(A) for all Borel A ⊂ Y ⊂ R^n

where Y is some subset of the Euclidean space R^n and Φ^0(H) is a function which depends on the measure H; usually some kind of directional differentiability and convexity is assumed. H_u and H_l are some positive Borel measures.

Stochastic optimization methods for solving (1)-(4) in the case when the functions Φ^i(H) are linear with respect to H were developed in [1]. In this paper such methods are developed for nonlinear functions Φ^i(H) and for arbitrary finite measures. Interest in such problems originates from statistics, where they appear in finite population sampling [2,3].
Suppose that we have a collection D of N objects, each object being described by a pair (z_i, y_i), i = 1,...,N. The variables y_i are known and the variables z_i can be observed for each particular i in the following way:

z_i = η(y_i) + ε_i

where the ε_i are independent random variables with zero mean and the z_i are observations. It is usually assumed that the relationship between z_i and y_i is known up to a set of unknown parameters:

η(y) = θ_1 h_1(y) + ... + θ_l h_l(y)

where h(y) = (h_1(y),...,h_l(y)) are known functions and θ = (θ_1,...,θ_l) are the parameters to be determined. The problem is to select a subset d ⊂ D consisting of n objects in order to get in some sense the best possible estimate of the parameters θ. This estimate is based on the observations z_i for objects belonging to d.

Applying the usual approach of optimal experimental design [4-6], one can substitute the measure H_u(y) for the collection (y_1,...,y_N) and a measure H(y) for the subset of objects to be observed. The variance matrix D of the best linear estimate, in the case when all ε_i have the same variance, becomes after such
substitution proportional to the matrix M, defined as follows:

M(H) = ∫_Y h(y) h'(y) dH(y)

and the problem becomes to minimize some function Φ of M, such as the determinant, the trace, the largest eigenvalue, etc.:

min_H Φ(M(H))   (5)

with respect to the obvious constraint

0 ≤ H(A) ≤ H_u(A)   (6)

for all Borel A ⊂ Y. Another possible application of the problem (1)-(4) is approximation schemes for stochastic optimization [7,8].
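For a discrete design measure the matrix M(H) and a criterion Φ are straightforward to compute. The sketch below is an illustration, not the paper's code: the quadratic basis h(y) = (1, y, y²) and the D-criterion Φ(M) = −log det M are hypothetical choices consistent with (5).

```python
import numpy as np

# Hypothetical quadratic regression basis h(y) = (1, y, y^2) on Y = [0, 1].
def h(y):
    return np.array([1.0, y, y * y])

# A discrete design measure H: support points y_i with weights H({y_i}).
ys = np.linspace(0.0, 1.0, 11)
weights = np.full(len(ys), 1.0 / len(ys))       # uniform design

# Information matrix M(H) = sum_i h(y_i) h(y_i)^T H({y_i}).
M = sum(w * np.outer(h(y), h(y)) for y, w in zip(ys, weights))

# D-criterion: minimizing Phi(M) = -log det M maximizes det M, i.e.
# minimizes the volume of the confidence ellipsoid for theta.
phi = -np.log(np.linalg.det(M))
print(M.shape, phi)
```

Other criteria named in the text (trace, largest eigenvalue) drop in by replacing the last line.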
The purpose of this paper is to develop stochastic optimization methods dealing with such problems. In Section 2 the characterization of solutions for quite general classes of measures is obtained. A conceptual algorithm for solving nonlinear problems is proposed in Section 3, which is applied in Section 4 to particular problems of the kind (5)-(6). In Section 5 the results of some numerical experiments are presented.
2. CHARACTERIZATION OF THE OPTIMAL SOLUTIONS
We shall consider a subset Y of the Euclidean space R^n and some σ-field Σ on it. We shall assume that all measures specified below are defined on this σ-field. In this section, the representation of the measures H which are solutions of the following problem will be developed:

max Ψ(H)   (7)

subject to the constraints

H_l(E) ≤ H(E) ≤ H_u(E)   (8)

H(Y) = b   (9)

The constraint (8) means that H_l(E) ≤ H(E) ≤ H_u(E) for any E ∈ Σ. Define H_Δ = H_u − H_l.

In what follows the spaces L_1(Y,Σ,H_Δ) and L_∞(Y,Σ,H_Δ) play an important role, where L_1(Y,Σ,H_Δ) is the space of all H_Δ-measurable functions g(y) defined on Y and such that ∫_Y |g(y)| dH_Δ < +∞, and L_∞(Y,Σ,H_Δ) is the space of all H_Δ-measurable and H_Δ-essentially bounded functions g(y) defined on Y. We shall denote by ||·||_∞ the norm in the space L_∞(Y,Σ,H_Δ), i.e. ||g||_∞ is the H_Δ-essential supremum of |g(y)| on Y.

Let us denote by G the set of all measures satisfying (8):

G = {H : H_l(E) ≤ H(E) ≤ H_u(E) for all E ∈ Σ}

and by G_b the set of all measures satisfying in addition (9):

G_b = {H : H ∈ G, H(Y) = b}

Suppose that g(y) is some function defined on Y and c is some number, and define the following sets:

Z^+(c,g) = {y : g(y) > c},  Z^-(c,g) = {y : g(y) < c},  Z^0(c,g) = {y : g(y) = c}

In the notation below we shall substitute various particular functions into this definition instead of g. Define, as usual, by H^+, H^- and |H| the positive, negative and total variation of the measure H. We shall first consider the problem in which the function Ψ(H) is linear:

Ψ(H) = ∫_Y g(y) dH   (10)

and describe the set of all solutions of (10). The following result is a generalization of Lemma 1 from [1].
THEOREM 1. Suppose that the following conditions are satisfied:

2. For any E ∈ Σ with H_Δ(E) > 0 there exists E_1 ∈ Σ, E_1 ⊂ E, such that either E_1 is an H_Δ-atom or 0 < H_Δ(E_1) < ∞.

Then the solution of problem (10) exists and any such solution H* has the following representation:

(i) H*(A) = H_u(A) for any A ∈ Σ, A ⊂ Z^+(c*,g)

(ii) H*(A) = H_l(A) for any A ∈ Σ, A ⊂ Z^-(c*,g)

(iii) H_u(A) ≥ H*(A) ≥ H_l(A) for any A ∈ Σ, A ⊂ Z^0(c*,g), and

(H* − H_l)(Z^0(c*,g)) = b − H_l(Y) − H_Δ(Z^+(c*,g))

Conversely, any measure defined by (i)-(iii) is a solution of problem (10).
PROOF. Let us first prove that a measure with properties (i)-(iii) exists.

It is clear that any measure on (Y,Σ) is defined by its values on the subsets of Z^+(c*,g), Z^0(c*,g) and Z^-(c*,g), because these sets belong to Σ (due to g(y) ∈ L_1(Y,Σ,H_Δ)) and Y equals the union of these sets. Therefore it is sufficient to show that among the measures satisfying (i)-(ii) there exists a measure which also satisfies (iii). From the definition of Z^+(c,g) we have

Z^+(c_2,g) ⊂ Z^+(c_1,g)

and

H_Δ(Z^+(c_2,g)) ≤ H_Δ(Z^+(c_1,g))

for all c_2 > c_1, and therefore H_Δ(Z^+(c,g)) is nonincreasing in c.

According to condition 4 we have H_Δ(Y\Z^-(c*,g)) ≥ b − H_l(Y) in the case c* = 0. In fact, it is true for arbitrary c*. Suppose first that c* > 0. Note that for any c > 0 we have H_Δ(Z^+(c,g)) < ∞ because g(y) ∈ L_1(Y,Σ,H_Δ). Consider now a sequence c_s ↑ c*. We have

H_Δ(Z^+(c_s,g)) → H_Δ(Y\Z^-(c*,g))

because

Z^+(c_s,g) ↓ Z^+(c*,g) ∪ Z^0(c*,g) = Y\Z^-(c*,g)

and H_Δ(Z^+(c_1,g)) < ∞. From the definition of c* and the fact that c_s < c* we have

H_Δ(Z^+(c_s,g)) ≥ b − H_l(Y)

which gives

H_Δ(Y\Z^-(c*,g)) ≥ b − H_l(Y)

The case c* < 0 is treated in the same way, taking into account the fact that H_Δ(Z^+(c,g)) < ∞ for all c if c* < 0. Thus we obtain

H_Δ(Z^+(c*,g)) + H_Δ(Z^0(c*,g)) ≥ b − H_l(Y)

Now if H_Δ(Z^0(c*,g)) < ∞, the measure H* which coincides with H_u on Z^+(c*,g), with H_l on Z^-(c*,g), and on Z^0(c*,g) takes the value

H*(E) = H_l(E) + [(b − H_l(Y) − H_Δ(Z^+(c*,g))) / H_Δ(Z^0(c*,g))] H_Δ(E)   (11)

satisfies all the conditions (i)-(iii).

When H_Δ(Z^0(c*,g)) = ∞, condition 2 implies the existence of E_1 ⊂ Z^0(c*,g) such that either 0 < H_Δ(E_1) < ∞, or H_Δ(E_1) = ∞ and E_1 is an H_Δ-atom. In the former case H* is defined similarly to (11), using the set E_1 instead of Z^0(c*,g) and taking H* = H_l on Z^0(c*,g)\E_1. In the latter case take

H*(E_1) = H_l(E_1) + b − H_l(Y) − H_Δ(Z^+(c*,g))

and again H* = H_l on Z^0(c*,g)\E_1. All this proves the existence of a measure H* which satisfies (i)-(iii).

Let us now prove that for any measure H̃ ∈ G_b which violates (i)-(iii) we have

∫_Y g(y) dH̃ < ∫_Y g(y) dH*

Suppose that (i) does not hold for the measure H̃, i.e., there is some set E ⊂ Z^+(c*,g) such that H̃(E) < H_u(E). This implies the existence of a set E_1 ⊂ E such that g(y) > c > c* for y ∈ E_1 and H_u(E_1) − H̃(E_1) > γ > 0. Notice that H*(A) ≥ H̃(A) for A ⊂ Z^+(c*,g) and H*(A) ≤ H̃(A) for A ⊂ Z^-(c*,g), due to the definition of H*. This gives

∫_{E_1} g(y) d(H* − H̃) ≥ c (H* − H̃)(E_1) ≥ cγ

These inequalities lead to the following estimate:

∫_Y g(y) d(H* − H̃) ≥ c (H* − H̃)(E_1) + c* (H* − H̃)(Y\E_1) = (c − c*)(H* − H̃)(E_1) ≥ (c − c*)γ > 0

Thus, if H̃ ∈ G_b violates (i), it cannot be a solution of (10). The other possibilities are considered in the same way. Therefore any optimal measure has the representation (i)-(iii). It follows from the definition that all measures satisfying (i)-(iii) have the same value of ∫_Y g(y) dH and are therefore optimal. The proof is complete.
REMARK. It is clear that in the characterization of the optimal measures any c̃ such that

c* ≤ c̃ < sup {c : H_Δ(Z^+(c,g)) ≥ b − H_l(Y)}

can be taken instead of c*.

Note that if the measure H_Δ has bounded variation, conditions 2 and 4 are satisfied automatically. For such measures the structure of the solutions can be studied using general duality theory [9].
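In the discrete case (Y a finite set, all measures given by point masses) the representation (i)-(iii) of Theorem 1 reduces to a simple threshold construction: fill in H_u-mass where g is largest, keep H_l elsewhere, and split the residual budget at the threshold level c*. The sketch below is an illustration of this construction, with hypothetical data; it assumes H_l(Y) ≤ b ≤ H_u(Y).

```python
import numpy as np

def optimal_submeasure(g, Hl, Hu, b):
    """Maximize sum_i g_i H_i subject to Hl_i <= H_i <= Hu_i, sum_i H_i = b.

    Discrete instance of the linear problem (10): the optimum takes Hu on
    {g > c*}, Hl on {g < c*}, and splits the residual budget on the level
    set {g = c*}, exactly as in (i)-(iii) of Theorem 1."""
    g, Hl, Hu = map(np.asarray, (g, Hl, Hu))
    H = Hl.astype(float).copy()
    budget = b - Hl.sum()               # assumes Hl(Y) <= b <= Hu(Y)
    for i in np.argsort(-g):            # points with the largest g first
        add = min(Hu[i] - Hl[i], budget)
        H[i] += add
        budget -= add
        if budget <= 0:
            break
    return H

# Hypothetical data: three points with values g = (3, 1, 2), budget b = 1.5.
H = optimal_submeasure([3.0, 1.0, 2.0], [0.1, 0.1, 0.1], [1.0, 1.0, 1.0], 1.5)
print(H)   # -> [1.  0.1 0.4]: Hu-mass where g is largest, Hl where smallest
```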
Let us now consider in more detail the set G_b. If the measure H_Δ has finite variation we have the following representation for arbitrary H ∈ G_b:

H = H_l + (H − H_l)   (12)

where the measure H − H_l is finite, positive and continuous with respect to the measure H_Δ. If H_Δ is σ-finite we can use the Radon-Nikodym theorem [10] and for arbitrary H ∈ G_b obtain the following representation:

H(E) = H_l(E) + ∫_E h_H(y) dH_Δ,   ∀E ∈ Σ   (13)

where h_H ∈ L_1(Y,Σ,H_Δ), and this representation is unique. For arbitrary E ∈ Σ we have

0 ≤ ∫_E h_H(y) dH_Δ = H(E) − H_l(E) ≤ H_Δ(E)

and therefore 0 ≤ h_H(y) ≤ 1 H_Δ-everywhere. Consider now the set K_b ⊂ L_1(Y,Σ,H_Δ):

K_b = {h : 0 ≤ h(y) ≤ 1 H_Δ-everywhere, H_l(Y) + ∫_Y h(y) dH_Δ = b}

Each function from this set defines a measure H_h from G_b:

H_h(E) = H_l(E) + ∫_E h(y) dH_Δ,   E ∈ Σ   (14)

Therefore (13), (14) define an isomorphism between the sets G_b and K_b, such that the problem (7)-(9) is equivalent to the following one:

max Ψ̃(h)   (15)

subject to the constraints

0 ≤ h(y) ≤ 1   (16)

H_l(Y) + ∫_Y h(y) dH_Δ = b   (17)

where Ψ̃(h) = Ψ(H_h). The optimal values of problems (15)-(17) and (7)-(9) are the same, and each solution of (15)-(17) defines a solution of (7)-(9) through (14), and vice versa. This equivalence, together with certain convexity assumptions, leads to a solution representation for problems (7)-(9) similar to Theorem 1:

THEOREM 2. Suppose that the following assumptions are satisfied:
1. The measures H_l and H_u have bounded variation, H_l(Y) ≤ b, H_u(Y) ≥ b.

2. Ψ(H) is concave and finite for H ∈ G_{b,ε} = G_b + G_ε, where G_ε = {H_c : |H_c|(Y) ≤ ε, H_c is H_Δ-continuous} for some ε > 0.

Then

1) For each H_1 ∈ G_b there exists g(y,H_1) ∈ L_∞(Y,Σ,H_Δ) such that

Ψ(H_2) − Ψ(H_1) ≤ ∫_Y g(y,H_1) d(H_2 − H_1)   (18)

for all H_2 ∈ G_b.

2) The solution H* of problem (7)-(9) exists.

3) For any E ∈ Σ and any optimal solution H* of the problem (7)-(9) we have the following representation:

H*(E) = H_u(E) for E ⊂ Z^+(c*, g(y,H*))
H*(E) = H_l(E) for E ⊂ Z^-(c*, g(y,H*))   (19)
H_l(E) ≤ H*(E) ≤ H_u(E) for E ⊂ Z^0(c*, g(y,H*))

where

c* = inf {c : H_Δ(Z^+(c, g(y,H*))) ≤ b − H_l(Y)}

and

∫_Y g(y,H*) dH* ≥ ∫_Y g(y,H*) dH

for all H ∈ G_b.

Conversely, if for some H_1 ∈ G_b there exists g(y,H_1) ∈ L_∞(Y,Σ,H_Δ) such that (18) is fulfilled and H_1 can be represented according to (19), then H_1 is the optimal solution of the problem (7)-(9).

PROOF.
The previous argument shows that under the assumptions of the theorem, problem (7)-(9) is equivalent to problem (15)-(17), and there is an isomorphism between the set G_{b,ε}, as defined in condition 2, and the following set:

K_{b,ε} = K_b + {h : ||h||_∞ ≤ ε}

The function Ψ̃(h) from (15) is concave on the set K_{b,ε}, which is an ε-vicinity of K_b in L_∞(Y,Σ,H_Δ). Therefore for each h ∈ K_b there exists a subdifferential of the concave function Ψ̃(h) [11,12], which in this case is a linear continuous functional f ∈ L*_∞(Y,Σ,H_Δ) such that

Ψ̃(h_1) − Ψ̃(h) ≤ f(h_1 − h)

Taking into account the representation of L*_∞(Y,Σ,H_Δ) [10] we get

Ψ̃(h_1) − Ψ̃(h) ≤ ∫_Y g̃(y,h)(h_1(y) − h(y)) dH_Δ   (20)

where g̃(y,h) ∈ L_1(Y,Σ,H_Δ), which together with (12) implies

Ψ(H_1) − Ψ(H) ≤ ∫_Y g(y,H) d(H_1 − H)

for all H, H_1 ∈ G_b, where g(y,H) = g̃(y,h_H). Thus (18) is proved. Note that we may consider the function g(y,H_1) from (18) (possibly non-unique) as a subdifferential of the function Ψ(H) at the point H_1.

Now observe that the set K_b is weakly sequentially compact in L_1(Y,Σ,H_Δ) because H_Δ(Y) < ∞ and ∫_E h dH_Δ ≤ H_Δ(E) → 0 as H_Δ(E) → 0, uniformly for h ∈ K_b (see [10, p. 294]). Let us prove that it is also weakly closed. Consider a sequence h_s(y), h_s ∈ K_b, with

∫_Y h_s(y) g(y) dH_Δ → ∫_Y h(y) g(y) dH_Δ

for some h ∈ L_1(Y,Σ,H_Δ) and all g ∈ L_∞(Y,Σ,H_Δ). In particular, we have

∫_E h_s(y) dH_Δ → ∫_E h(y) dH_Δ

for all E ∈ Σ, because the indicator function of the set E clearly belongs to L_∞(Y,Σ,H_Δ). This gives 0 ≤ h(y) ≤ 1 H_Δ-everywhere. Taking g(y) = 1 we also have

H_l(Y) + ∫_Y h_s(y) dH_Δ → H_l(Y) + ∫_Y h(y) dH_Δ

which gives H_l(Y) + ∫_Y h(y) dH_Δ = b. Thus h ∈ K_b and K_b is weakly closed.

It follows from (20) that for any sequence h_s ∈ K_b, h_s → h weakly, h ∈ K_b, we have

lim sup_{s→∞} Ψ̃(h_s) ≤ Ψ̃(h)

This, together with the weak sequential compactness and closedness of K_b, implies the existence of h* such that

Ψ̃(h*) = max_{h ∈ K_b} Ψ̃(h)

Thus the solution of problem (7)-(9) exists.

The general results of convex analysis [11] now imply that under assumption 2 of the theorem, for any solution H* of problem (7)-(9) there exists a subdifferential g(y,H*) of the function Ψ(H) at the point H* such that

∫_Y g(y,H*) dH* ≥ ∫_Y g(y,H*) dH

for all H ∈ G_b, or, in other words, H* is one of the solutions of the following problem:

max_{H ∈ G_b} ∫_Y g(y,H*) dH   (21)

This problem is exactly of the type (10) and its solutions are characterized by Theorem 1. Conversely, if for some H' ∈ G_b there exists a subdifferential g(y,H') such that H' is a solution of problem (21), then H' is the optimal solution of the original problem. The proof is now completed by using Theorem 1.

Some related results were obtained for a special kind of function Ψ(H), an atomless probability measure H_u and H_l = 0 in [2].

Theorem 2 shows that the solutions of problem (7)-(9) can be viewed as indicator functions of some sets. Therefore many problems involving the selection of an optimal set [13] can be reformulated as problems of finding optimal measures.
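In the discrete case the correspondence (13)-(14) between G_b and K_b is just a change of variables. The sketch below, with hypothetical weights, checks that the density h_H = (H − H_l)/H_Δ lies in [0,1] and that the representation (14) recovers H:

```python
import numpy as np

# Hypothetical discrete measures with Hl <= H <= Hu (pointwise weights).
Hl = np.array([0.1, 0.0, 0.2, 0.1])
Hu = np.array([0.4, 0.3, 0.5, 0.3])
HD = Hu - Hl                            # H_Delta = H_u - H_l
H  = np.array([0.25, 0.3, 0.2, 0.25])   # a measure in G_b with b = H(Y) = 1.0

h = (H - Hl) / HD                       # Radon-Nikodym density of H - H_l
assert np.all((0.0 <= h) & (h <= 1.0))  # 0 <= h_H <= 1, as in the text

H_back = Hl + h * HD                    # representation (14)
print(np.allclose(H_back, H))           # -> True
```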
3. STOCHASTIC OPTIMIZATION METHOD
Using the results of the previous section we can construct numerical methods for solving problem (7)-(9). From now on we shall assume that the function Ψ(H) is concave and finite on some vicinity of the set G and possesses certain differentiability properties:

Ψ(H_1 + α(H_2 − H_1)) = Ψ(H_1) + α ∫_Y g(y,H_1) d(H_2 − H_1) + o(α)   (22)

where o(α)/α → 0 as α → 0, for all H_1,H_2 ∈ G. This means that the subdifferential g(y,H_1) from (18) is unique for all interior points of G, and we can assume that g(y,H*) from (19) also satisfies (22).

Consider now the mapping Γ(c,f) from R × L_∞(Y,Σ,H_Δ) to G: if H = Γ(c,f) then

H(E) = H_u(E) for E ⊂ Z^+(c,f)
H(E) = H_l(E) for E ⊂ Y\Z^+(c,f)   (23)

for any E ∈ Σ.
First of all we shall give an informal description of the algorithm. Suppose that some H^s ∈ G is the current approximation to the solution of the problem (7)-(9). According to (22) the local behavior of Ψ(H) around H^s is approximated by the linear form

Ψ(H^s) + ∫_Y g(y,H^s) d(H − H^s)

and if H̄^s is the solution of the problem

max_{H ∈ G_b} ∫_Y g(y,H^s) dH   (24)

then the direction H̄^s − H^s will be an ascent direction at the point H^s. Therefore we can take as the next approximation to the optimal solution

H^{s+1} = H^s + α_s (H̄^s − H^s)   (25)

for some α_s > 0. Consider now the problem of finding H̄^s, or a suitable approximation to it. Suppose that we know the function g(y,H^s) exactly. Then, according to Theorem 1, all the possible H̄^s are fully described by the pair (c*, g(y,H^s)), where c* is the solution of the problem

inf {c : H_Δ(Z^+(c, g(y,H^s))) ≤ b − H_l(Y)}   (26)

Observe now that the function c → H_Δ(Z^+(c, g(y,H^s))) is nonincreasing, and therefore solving (26) is equivalent to solving

max_c W^s(c),   W^s(c) = ∫_T^c W_1^s(t) dt   (27)

for some T, where

W_1^s(c) = H_Δ(Z^+(c, g(y,H^s))) − (b − H_l(Y))   (28)

can be considered as a subgradient of the function W^s(c). Therefore we can use the subgradient method for finding c*:

c^{k+1} = c^k + ρ_k W_1^s(c^k)   (29)

However, the computation of W_1^s(c^k) according to (28) involves multidimensional integration over complex regions, and this may be too complicated from the computational point of view. In this situation stochastic quasigradient methods [14] can be used. In such methods a statistical estimate of W_1^s is used in (29) instead of W_1^s.

Once c* is determined, the measure Γ(c*, g(y,H^s)) defined in (23) may be a reasonable approximation to the solution H̄^s of problem (24) and can be used in algorithm (25). However, precise estimation of c* from (29) requires an infinite number of iterations, and to make the algorithm implementable it is necessary to avoid this. It appears that under certain assumptions about the stepsizes in (25) and (29), we may take k = s in (29) and perform only one iteration of (29) per iteration of (25), using as the approximation to H̄^s the measure H̃^s = Γ(c^s, g(y,H^s)). Thus, along with the sequence H^s we also obtain the sequence of numbers c^s. Note now that although H̃^s is quite simple, the measure H^s would become excessively complex even for small s. However, H^s is only needed for computing the gradient g(y,H^s), and in particular cases some approximation f̃(s,y) to g(y,H^s) can be obtained using only H̃^s in a sort of updating formula similar to (25). Once a sequence f̃(s,y) with the property

||f̃(s,y) − g(y,H^s)||_∞ → 0

is obtained, together with a sequence c^s such that V^s(c^s) − max_c V^s(c) → 0, the optimal solution of problem (7)-(9) is defined by Theorem 2 through the accumulation points of these sequences. The structure of the optimal solution is close to (23).
Now we shall define the algorithm for solving (7)-(9) formally.

1. At the beginning, select an initial approximation H^0 to the solution, a function f̃(0,y) and a number c^0.

2. Suppose that at step number s we have the measure H^s, the function f̃(s,y) and the number c^s. Then on the next step we do the following:

2a. The pair (c^s, f̃(s,y)) defines the measure H̃^s according to (23):

H̃^s = Γ(c^s, f̃(s,y))

The new approximation to the solution is obtained in the following way:

H^{s+1} = (1 − α_s) H^s + α_s H̃^s   (30)

2b. The new number c^{s+1} is obtained:

c^{s+1} = c^s + ρ_s ζ^s   (31)

where ζ^s is a statistical estimate of V_1^s(c^s); i.e., the function V^s(c) is defined similarly to W^s(c), with the difference that f̃(s,y) is used instead of g(y,H^s).

2c. The new function f̃(s+1,y) is obtained in such a way as to approximate g(y,H^{s+1}). The precise way of achieving this can be specified only after considering particular ways of dependence of g(y,H) on H. One quite general case is considered in the next section. Here we shall only assume that

||f̃(s,y) − g(y,H^s)||_∞ → 0 as s → ∞
The method of achieving this in a particular situation will be described in the next section.

Before stating convergence results for algorithm (30)-(31), two examples of calculating ζ^s from (31) are presented.

(i) The measures H_u and H_l have piecewise-continuous densities h_u(y) and h_l(y), respectively, with respect to Lebesgue measure. Then we have

V_1^s(c) = ∫_Y q(c,y) dy − b,   q(c,y) = h_u(y) if f̃(s,y) > c and q(c,y) = h_l(y) otherwise.

Therefore, denoting by μ(Y) the Lebesgue measure of Y, we can take

ζ^s = μ(Y) q(c^s, ω^s) − b

where ω^s is distributed uniformly over Y.
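The estimate of example (i) is a one-sample Monte Carlo estimate of V_1^s(c). The sketch below checks its unbiasedness by averaging many samples; the densities, the budget b and the stand-in gradient f are all hypothetical choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data for example (i): Y = [0, 1], so mu(Y) = 1.
h_u = lambda y: 2.0 * y                 # density of H_u
h_l = lambda y: np.full_like(y, 0.1)    # density of H_l
f = lambda y: np.sin(np.pi * y)         # stand-in for the gradient f(s, y)
b, c = 0.6, 0.5

# zeta = mu(Y) * q(c, w) - b with w uniform on Y, where q picks h_u on
# Z+(c, f) and h_l elsewhere; E[zeta] = V_1^s(c) = Gamma(c, f)(Y) - b.
w = rng.uniform(0.0, 1.0, size=200_000)
q = np.where(f(w) > c, h_u(w), h_l(w))
zetas = 1.0 * q - b                     # mu(Y) = 1 here

est = zetas.mean()                      # Monte Carlo average of the estimates
print(est)                              # analytically, E[zeta] = 0.1 here
```

Here f(w) > 0.5 exactly on (1/6, 5/6), so E[zeta] = (25/36 − 1/36) + 0.1/3 − 0.6 = 0.1, and the sample mean should land very close to that value.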
(ii) The measures H_u and H_l are defined by a finite number of pairs

H_u : (y_i, p_{u,i}),   H_l : (y_i, p_{l,i}),   i = 1,...,N.

In this case

V_1^s(c) = Σ_{i : f̃(s,y_i) > c} p_{u,i} + Σ_{i : f̃(s,y_i) ≤ c} p_{l,i} − b

Therefore we can take

ζ^s = N χ^s(c^s) p_{u,ω^s} + N (1 − χ^s(c^s)) p_{l,ω^s} − b

where ω^s assumes the value i, 1 ≤ i ≤ N, with probability 1/N, and χ^s(c) = 1 if f̃(s, y_{ω^s}) > c and χ^s(c) = 0 otherwise.

Let us now investigate the convergence of algorithm (30)-(31). In all statements concerning convergence of measures from the set G we shall use the weak-L convergence, already used in the proof of Theorem 2:

H_s → H iff ∫_Y g(y) dH_s → ∫_Y g(y) dH

for all g ∈ L_∞(Y,Σ,H_Δ), and the topology induced by this convergence will be used without further reference.
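Before turning to the convergence analysis, the whole loop (30)-(31) can be sketched in the discrete setting of example (ii). Everything here is an illustrative assumption rather than the paper's setup: the separable concave objective Ψ(H) = −Σ_i (H_i − t_i)² with gradient density g_i(H) = −2(H_i − t_i) (computed exactly, so f̃(s,y) = g(y,H^s)), the stepsizes, and the data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete setting of example (ii): Y = {1,...,N}, H_l and H_u given by
# point masses p_l, p_u, and a budget b with H_l(Y) <= b <= H_u(Y).
N = 50
p_l = np.full(N, 0.002)          # H_l(Y) = 0.1
p_u = np.full(N, 0.05)           # H_u(Y) = 2.5
b = 1.0
t = np.linspace(0.0, 1.0, N)
t = t * (b / t.sum())            # hypothetical targets with sum(t) = b

# Illustrative concave objective Psi(H) = -sum_i (H_i - t_i)^2;
# its gradient density is g_i(H) = -2 (H_i - t_i).
def grad(H):
    return -2.0 * (H - t)

H = p_l.copy()
c = 0.0
for s in range(1, 20001):
    alpha = 1.0 / s              # stepsize for the averaging step (30)
    rho = 0.5 / s ** 0.75        # stepsize for the threshold step (31)
    g = grad(H)                  # exact gradient, i.e. f(s, y) = g(y, H^s)
    # Step 2a: H_tilde = Gamma(c^s, g) as in (23), then the update (30).
    H_tilde = np.where(g > c, p_u, p_l)
    H = (1.0 - alpha) * H + alpha * H_tilde
    # Step 2b: update (31) with the one-sample estimate zeta of example (ii):
    # sample an index uniformly; zeta = N * (mass H_tilde puts there) - b.
    i = rng.integers(N)
    zeta = N * H_tilde[i] - b
    c += rho * zeta

print(H.sum())                   # should settle near the budget b = 1.0
```

The averaging step keeps H^s between H_l and H_u, while the stochastic threshold update drives the total mass H^s(Y) toward b, which is the behavior Theorem 3 below makes precise.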
We shall assume that the random variables ω^1,...,ω^s,... are defined on some probability space; therefore c^s, H^s, H̃^s from (30)-(31) depend on the events of this space. For simplicity of notation this dependence will be omitted in the formulas. Convergence, boundedness, etc. will be considered almost everywhere with respect to this probability space. It should be stressed that we are primarily interested in the convergence properties of the sequences c^s and f̃(s,y). The following theorem gives results in this direction.

THEOREM 3. Suppose that the following assumptions are satisfied:

1. The measures H_l and H_u have bounded variation, H_l(Y) ≤ b, H_u(Y) ≥ b.

2. Ψ(H) is a finite concave function for H ∈ G + G_ε, where G_ε = {H_c : |H_c|(Y) ≤ ε, H_c is H_Δ-continuous} for some ε > 0, and satisfies (22) for H_1,H_2 ∈ G.

3. ||g(y,H_k) − g(y,H)||_∞ → 0 if H_k → H.

4. ||g(y,H^s) − f̃(s,y)||_∞ ≤ b_s → 0 as s → ∞.

6. One of the following conditions is satisfied:

(ii) α_s > 0 and

Then
1) Ψ(H^s) → max_{H ∈ G_b} Ψ(H), H^s(Y) → b, and all accumulation points of the sequence H^s belong to the set

Q = {H : H ∈ G_b, Ψ(H) = max_{H' ∈ G_b} Ψ(H')}

2) For any convergent subsequence c^{s_k} → c' there exists a measure H' ∈ Q such that

H'(A) = H_u(A) for A ⊂ Z^+(c', g(y,H'))
H'(A) = H_l(A) for A ⊂ Z^-(c', g(y,H'))
H_l(A) ≤ H'(A) ≤ H_u(A) for A ⊂ Z^0(c', g(y,H'))

and H^{s_l} → H', where s_l is some subsequence of the sequence s_k.
Condition 4 of the theorem means that it is possible to use approximations to the gradient g(y,H), and it is necessary that the precision of these approximations increases as s → ∞. Condition 6 is necessary to assure H^s(Y) → b, although H̃^s(Y) from (30) may not be equal to b. In the case where H̃^s(Y) = b, i.e.

H_Δ(Z^0(c^s, f̃(s,y)))