Working Paper

COMBINING GENERALIZED PROGRAMMING AND SAMPLING TECHNIQUES FOR
STOCHASTIC PROGRAMS WITH RECOURSE

A. Gaivoronski
J.L. Nazareth

September 1986
WP-86-44

International Institute for Applied Systems Analysis
A-2361 Laxenburg, Austria

NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHORS

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
FOREWORD

This paper deals with an application of generalized linear programming techniques to stochastic programming problems, particularly to stochastic programming problems with recourse. The major points which needed clarification here were the possibility of using estimates of the objective function instead of the exact values, and of using approximate solutions of the dual subproblem instead of the exact ones.

In this paper conditions are presented which allow the use of estimates and approximate solutions while still maintaining convergence. The paper is a part of the effort on the development of stochastic optimization techniques at the Adaptation and Optimization Project of the System and Decision Sciences Program.

Alexander B. Kurzhanski
Chairman
System and Decision Sciences Program
CONTENTS

1 Introduction
2 A Conceptual Algorithm
3 Extension
References
COMBINING GENERALIZED PROGRAMMING AND SAMPLING TECHNIQUES FOR
STOCHASTIC PROGRAMS WITH RECOURSE

A. Gaivoronski and J.L. Nazareth

1. INTRODUCTION
Generalized Programming Techniques of Wolfe (see Dantzig [1]) enjoyed early use for solving stochastic programs with simple recourse (Williams [14], Parikh [8]), and there has recently been renewed interest in their relevance for solving more general classes of stochastic programs (see Nazareth & Wets [7] for stochastic programs with recourse and nonstochastic tenders, and Ermoliev, Gaivoronski & Nedeva [4] for stochastic programs with incomplete information). Our interest here is in stochastic programs with recourse of the form:

$$\min_x \; E_\omega[\langle c(\omega), x\rangle + Q(x, \omega)] \quad \text{s.t.}\;\; Ax = b, \;\; l \le x \le u \tag{1.1a}$$

where

$$Q(x, \omega) = \min_y \{\langle q(\omega), y\rangle : W(\omega)y = h(\omega) - T(\omega)x, \; y \ge 0\} \tag{1.1b}$$

and $\omega$ is an element of some probability space $(W, B, P)$, $A$ ($m_1 \times n_1$) is a fixed matrix, $T(\cdot)$ ($m_2 \times n_1$) and $W(\cdot)$ ($m_2 \times n_2$) are random matrices, $c(\cdot)$ ($n_1$), $q(\cdot)$ ($n_2$) and $h(\cdot)$ ($m_2$) are random vectors, and $b$ ($m_1$) is a fixed vector. We assume complete recourse, i.e. (1.1b) always has a solution. $E_\omega$ denotes expectation. Define $\bar c = E_\omega[c(\omega)]$. Then we can express (1.1a,b) as

$$\min_x \; \langle \bar c, x\rangle + Q(x) = F(x) \quad \text{s.t.}\;\; Ax = b, \;\; l \le x \le u \tag{1.2a}$$

where

$$Q(x) = E_\omega[Q(x, \omega)]. \tag{1.2b}$$

The set of constraints in (1.2a) we shall denote by $X$. Properties of (1.2a,b) have been extensively studied (see Wets [13]) and, in particular, $Q(x)$ can be shown to be convex but is, in general, nonsmooth.
The generalized programming approach applied to (1.2a) involves inner or grid linearization of this convex program and requires coordinated solution of a master program and a (Lagrangian) subproblem, defined as follows:

Master:

$$\min_\lambda \; \sum_{j=1}^k \langle \bar c, x^j\rangle\lambda_j + \sum_{j=1}^k Q(x^j)\lambda_j \tag{1.3a}$$

$$\text{s.t.}\;\; \sum_{j=1}^k (Ax^j)\lambda_j = b, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0$$

where $\pi^k$, $v^k$ are the dual multipliers associated with the optimal solution of (1.3a).

Subproblem: Find $x^{k+1} \in R^{n_1}$ such that $l \le x^{k+1} \le u$ and

$$\langle \bar c, x^{k+1}\rangle + Q(x^{k+1}) - \langle \pi^k, Ax^{k+1}\rangle < v^k \tag{1.3b}$$

by partially optimizing the problem

$$\min_{l \le x \le u} \; \langle \bar c - A^T\pi^k, x\rangle + Q(x). \tag{1.3c}$$

We temporarily ignore all considerations related to initialization of (1.3a), unboundedness of the solution in (1.3b), recognition of optimality and so on. (1.3a-c) show only the essential features of the method, namely, that the master sends prices $\pi^k$ to the subproblem which, in turn, uses these quantities to identify an improving (grid) point $x^{k+1}$.
In many practical applications, the probability distribution of the random events is discrete with relatively few points in the distribution, and randomness is often restricted to certain components of (1.1a-b), for example, to $h(\cdot)$. In such cases judicious computation enables $Q(x^j)$ and its subgradients to be found exactly, see Nazareth [6]. These quantities are required both to define the objective function of the master (1.3a) and, during the solution of (1.3c), to give an improving point satisfying (1.3b). More generally, however, $Q(x^j)$ can only be approximated in (1.2b), for example by a sampling procedure, and exact computation of its value or of its subgradients is out of the question because it would be too expensive. We then seek to replace $Q(x^j)$ in (1.3a) by an estimate, say $\hat Q^j$. The generalized programming approach, extended in this manner, still continues to appear viable and deserves further investigation, for the following reasons:

a) It is well known (and in the nature of a "folk theorem") that fairly crude approximations of the underlying distribution in (1.1a-b) (which then permit exact solution of the resulting approximated recourse program) often produce quite reasonable estimates of the "optimal" first stage decision. This can be interpreted to mean that fairly crude estimates $\hat Q^j$ in the master program will often be adequate to guide the algorithm to a "reasonable" neighborhood of the desired solution of the original recourse problem (1.1a-b).

b) The (Lagrangian) subproblem (1.3c) does not have to be optimized at each cycle. For example, all that is needed in the case of exact estimates $Q(x^{k+1})$ to produce an improving point is that the condition (1.3b) be satisfied. This suggests that one seek to reexpress this condition in terms of estimates $\hat Q^j$, and to combine it with stochastic quasi-gradient procedures utilizing stochastic estimates of subgradients (see Ermoliev and Gaivoronski [5]), which are generally effective when applied to a problem that does not have to be pushed all the way to optimality.
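As a concrete illustration of the sampling alternative just described, the following sketch estimates the recourse function at a fixed first-stage point by averaging exact recourse-LP values over independent observations. The instance data (recourse matrix, costs and the distribution of the right-hand side) form a hypothetical toy example, not taken from the paper, and the recourse LP is solved with SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Hypothetical toy recourse data (for illustration only):
# Q(x, w) = min_y { <q, y> : W y = h(w) - T x, y >= 0 }
W = np.array([[1.0, -1.0]])   # recourse matrix (m2=1, n2=2) -> complete recourse
T = np.array([[1.0, 0.0]])    # technology matrix (m2=1, n1=2)
q = np.array([1.0, 1.5])      # recourse costs

def Q_sample(x, h_w):
    """Exact value of Q(x, w) for one observation h_w, computed via an LP."""
    res = linprog(q, A_eq=W, b_eq=h_w - T @ x, bounds=[(0, None)] * len(q))
    return res.fun

def Q_hat(x, s):
    """Sample-mean estimate of Q(x) from s observations of h(w) ~ N(1, 0.5^2)."""
    return np.mean([Q_sample(x, rng.normal(1.0, 0.5, size=1)) for _ in range(s)])

x = np.array([0.2, 0.0])
# Precision improves as the sample size s grows.
print(Q_hat(x, 5), Q_hat(x, 50))
```

With only a few samples the estimate is crude but, as point a) above suggests, it may already be adequate to steer the master program.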
Our paper can be viewed as a study of generalized programming in the presence of noise (whose magnitude decreases as the number of iterations increases), with the special characteristics of recourse problems taken into consideration. In section 2 we state a conceptual algorithm and establish convergence under appropriate assumptions, thereby extending the standard proofs (see, for example, Shapiro [10], for the case when $Q(x)$ is known exactly). Some considerations concerning implementation are briefly discussed. Finally, extension to other stochastic programming problems is considered in section 3.
2. A CONCEPTUAL ALGORITHM

We use the term "conceptual" here in the sense of Polak [9], and study the following algorithm for solving (1.1a-b). It will be convenient to assume that all bounds $l$ and $u$ are finite, so that $l \le x \le u$ is a compact set.

The algorithm generates a sequence of points $x^0, \ldots, x^k, \ldots$ which depend on an element $\omega$ of some probability space $(W, B, P)$, where $\omega \in W \subset R^p$, $B$ is a $\sigma$-field and $P$ a probability measure. The sequence $x^k$ converges to the solution of the problem (1.1) in a certain probabilistic sense.

Step 1 (Initialize): Choose a set of $m_1$ grid points $x^1, \ldots, x^{m_1}$ so that the constraints

$$\sum_{j=1}^{m_1} (Ax^j)\lambda_j = b, \quad \sum_{j=1}^{m_1} \lambda_j = 1, \quad \lambda_j \ge 0 \tag{2.1}$$

have a feasible solution. Set $k \leftarrow m_1$.
Step 2 (Form estimates): Define a subset $N_k$ of integers, $N_k \subset \{1, \ldots, k\}$, this being the set of grid-point indices for which estimates will be made. Define an integer $s(k)$, which controls the precision of estimates; generally speaking, $s(k)$ is the number of observations of the function $Q(x, \omega)$ used to form the estimate. Obtain new estimates $\hat Q_k^j$ of $Q(x^j)$ for $j \in N_k$, and for $j \notin N_k$ take $\hat Q_k^j = \hat Q_{k-1}^j$. It will be assumed that for $j \in N_k$, $\hat Q_k^j \to Q(x^j)$ in some suitable probabilistic sense. Initially, for $k = m_1$, let $N_k = \{1, \ldots, m_1\}$. For subsequent $k$, the set $N_k$, the integer $s(k)$ and the estimates $\hat Q_k^j$ can be selected in a number of different ways, some of which will be specified later.

Step 3 (Solve Master):

$$\min_\lambda \; \sum_{j=1}^k (\langle \bar c, x^j\rangle + \hat Q_k^j)\lambda_j \tag{2.2}$$

$$\text{s.t.}\;\; \sum_{j=1}^k (Ax^j)\lambda_j = b, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0$$

Let $\pi^k$ and $v^k$ be the associated optimal dual multipliers and $\lambda_j^k$ the optimal primal variables. Define $\Lambda_k = \{j : \lambda_j^k > 0\}$. In some versions of our method it is necessary at this point to redefine the set $N_k$ and go to step 2 (examples will be given later). Otherwise, go to step 4.

Step 4 (Define new grid point $x^{k+1}$):
Define

$$\sigma^k = \bar c - A^T\pi^k$$

and consider the (Lagrangian) subproblem

$$\min_{l \le x \le u} \; \langle \sigma^k, x\rangle + Q(x) \tag{2.3}$$

The new point $x^{k+1}$ is taken to be an "approximate" solution to this problem; more precisely, it is necessary that for almost all $\omega \in W$ there exists a subsequence $k_r(\omega)$ such that

$$\langle \sigma^{k_r}, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \min_{l \le x \le u}\big[\langle \sigma^{k_r}, x\rangle + Q(x)\big] \to 0.$$

Note that it is not necessary that

$$\langle \sigma^k, x^{k+1}\rangle + Q(x^{k+1}) - \min_{l \le x \le u}\big[\langle \sigma^k, x\rangle + Q(x)\big] \to 0$$

for the whole sequence $x^k$. This makes it possible, for instance, to use random search techniques for obtaining $x^{k+1}$. Some particular methods of choosing the point $x^{k+1}$ with this property will be specified at the end of this section.

Step 5 (Iterate): $k \leftarrow k + 1$. Go to step 2.

This algorithm has two important differences from the usual generalized linear programming algorithm. Firstly, it does not require exact values of the objective function (step 2); it is only necessary to have estimates of the objective values at the grid points, whose precision gradually increases. Secondly, it is not necessary to minimize the Lagrangian subproblem at step 4 precisely; it is only necessary that the current point $x^{k+1}$ regularly comes to the vicinity of such a solution.
Both modifications are necessary in order to make use of generalized linear programming in a stochastic setting.
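To make the master step concrete, here is a minimal sketch of the analogue of step 3: a linear program over the weights $\lambda_j$, built from grid points and estimated objective values, whose equality-constraint duals supply the prices sent to the subproblem. All data are a hypothetical toy instance, and the dual extraction relies on SciPy 1.7+ (HiGHS-based `linprog`, whose sign conventions for `marginals` we simply pass through).

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical small instance: minimize <c,x> + Q(x) s.t. Ax = b, with Q
# replaced by noisy estimates Q_est[j] at grid points x^j (stand-ins for sampling).
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
grid = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
Q_est = [0.9, 0.4, 0.6]

def solve_master(grid, Q_est):
    """Master: min sum_j (<c,x^j> + Q_est_j) lam_j
       s.t. sum_j (A x^j) lam_j = b, sum_j lam_j = 1, lam >= 0.
       Returns primal weights lam and dual multipliers (pi, v)."""
    cost = np.array([c @ xj + qj for xj, qj in zip(grid, Q_est)])
    A_eq = np.vstack([np.column_stack([A @ xj for xj in grid]),
                      np.ones((1, len(grid)))])
    b_eq = np.concatenate([b, [1.0]])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(grid))
    duals = res.eqlin.marginals          # duals of the equality constraints
    return res.x, duals[:-1], duals[-1]  # lam, pi (resource rows), v (convexity row)

lam, pi, v = solve_master(grid, Q_est)
# Prices sent to the subproblem: sigma^k = c - A^T pi
sigma = c - A.T @ pi
print(lam, pi, v, sigma)
```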
In order to prove convergence of this algorithm let us consider its dual reformulation. Take

$$\eta(x, \pi) = \langle \bar c, x\rangle + Q(x) - \langle \pi, Ax - b\rangle, \qquad \psi(\pi) = \min_{l \le x \le u} \eta(x, \pi) \tag{2.4-2.5}$$

$$\eta_k(j, \pi) = \langle \bar c, x^j\rangle + \hat Q_k^j - \langle \pi, Ax^j - b\rangle, \qquad \psi_k(\pi) = \min_{1 \le j \le k} \eta_k(j, \pi) \tag{2.6-2.7}$$

Then the algorithm (2.1)-(2.3) can be considered as a maximization method for the concave function $\psi(\pi)$ by successive polyhedral approximation of $\psi(\pi)$ by $\psi_k(\pi)$. At step 1 the initial polyhedral approximation is constructed; in step 3 the current polyhedral approximation $\psi_k(\pi)$ is maximized, the optimal dual multipliers $\pi^k$ being the solution of the problem $\max_\pi \psi_k(\pi)$. In steps 2 and 4 the polyhedral approximation is updated.
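The dual object just described, a concave piecewise-linear lower envelope of affine functions of the prices, can be sketched numerically; the grid points and estimated values below are hypothetical.

```python
import numpy as np

# Hypothetical data: two grid points x^j with estimated values Q_est[j],
# and master data c, A, b (so A x^j - b is the "slack" of each column).
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
grid = [np.array([0.2, 0.0]), np.array([1.0, 0.8])]
Q_est = [0.9, 0.4]

def psi_k(pi):
    """Polyhedral dual approximation: the minimum over grid points j of the
       affine functions  <c, x^j> + Q_est[j] - <pi, A x^j - b>.
       It is concave and piecewise linear in pi; the master maximizes it."""
    return min(c @ xj + qj - pi @ (A @ xj - b) for xj, qj in zip(grid, Q_est))

# Concavity along a segment: the value at the midpoint dominates the average.
p0, p1 = np.array([0.0]), np.array([2.0])
assert psi_k((p0 + p1) / 2) >= (psi_k(p0) + psi_k(p1)) / 2 - 1e-12
print(psi_k(p0), psi_k(p1))
```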
Theorem 1. Make the following assumptions:

1. The initial points $x^1, \ldots, x^{m_1}$ are such that $b \in \operatorname{int}\operatorname{co}\{Ax^j,\; j = 1, \ldots, m_1\}$, where int means interior and co convex hull.

2. $\max\big(|\hat Q_k^k - Q(x^k)|,\; \max_{j \in \Lambda_k}|\hat Q_k^j - Q(x^j)|\big) = \varepsilon_k \to 0$ a.s.

3. $\displaystyle \liminf_{k \to \infty}\Big[\langle \sigma^k, x^{k+1}\rangle + Q(x^{k+1}) - \min_{l \le x \le u}\big(\langle \sigma^k, x\rangle + Q(x)\big)\Big] = 0$ a.s.

Then $F(\bar x^k) \to \min_{x \in X} F(x)$ a.s., where $\bar x^k = \sum_{j \in \Lambda_k} \lambda_j^k x^j$, and all accumulation points of the sequence $\bar x^k$ are solutions of (1.2) a.s.

Proof. Due to assumption 2 we have

$$\sup_{k,\, j \in \Lambda_k} |\hat Q_k^j - Q(x^j)| \le C < \infty \quad \text{a.s.}$$

This together with the boundedness of the $x^k$ gives the boundedness of the values $\langle \bar c, x^j\rangle + \hat Q_k^j$, $j \in \Lambda_k$. This together with assumption 1 implies the boundedness of the sequence $\pi^k$, which can be seen as follows. Indeed, $\psi_k(\pi^k) = \max_\pi \psi_k(\pi)$, and therefore $\psi_k(\pi^k)$ is bounded from below uniformly in $k$, which follows from (2.5) and the bound above. On the other hand,

$$\psi_k(\pi^k) \le \min_{1 \le j \le m_1}\big[\langle \bar c, x^j\rangle + \hat Q_k^j - \langle \pi^k, Ax^j - b\rangle\big]$$

$$\le \max_{1 \le j \le m_1}\big[\langle \bar c, x^j\rangle + \hat Q_k^j\big] + \min_{1 \le j \le m_1}\big[-\langle \pi^k, Ax^j - b\rangle\big]$$

$$= C_1 - \max_{1 \le j \le m_1}\langle \pi^k, Ax^j - b\rangle = C_1 - \|\pi^k\|\max_{1 \le j \le m_1}\Big\langle \frac{\pi^k}{\|\pi^k\|}, Ax^j - b\Big\rangle \le C_1 - \|\pi^k\|\delta$$

for some $\delta > 0$, due to assumption 1. Thus $\|\pi^k\| \le (C_1 - \psi_k(\pi^k))/\delta$, which gives the boundedness of the sequence $\pi^k$.

According to assumption 3 of the theorem, for almost all $\omega \in W$ there exists a subsequence $k_r(\omega)$ such that

$$\langle \sigma^{k_r}, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \min_{l \le x \le u}\big[\langle \sigma^{k_r}, x\rangle + Q(x)\big] \to 0 \quad \text{as } r \to \infty.$$

Using the equality $\sigma^{k_r} = \bar c - A^T\pi^{k_r}$ we obtain

$$\langle \bar c, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \langle \pi^{k_r}, Ax^{k_r+1} - b\rangle - \psi(\pi^{k_r}) \to 0. \tag{2.8}$$

Due to the boundedness of the sequence $\pi^k$ we may now assume without loss of generality that $\pi^{k_r} \to \pi^*$ and $\|\pi^{k_r} - \pi^{k_r+1}\| \to 0$. Furthermore, from the definitions (2.4)-(2.5) of the function $\psi(\pi)$ and the boundedness of the admissible set $X$ it follows that $\psi(\pi)$ satisfies a Lipschitz condition uniformly in $\pi$, and therefore

$$|\psi(\pi^{k_r}) - \psi(\pi^{k_r+1})| \le C_2\|\pi^{k_r} - \pi^{k_r+1}\| \to 0 \quad \text{as } r \to \infty, \;\; 0 < C_2 < \infty.$$

Thus (2.8) implies

$$\langle \bar c, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \langle \pi^{k_r+1}, Ax^{k_r+1} - b\rangle - \psi(\pi^{k_r+1}) \le \gamma_r \tag{2.9}$$

where $\max(0, \gamma_r) \to 0$ as $r \to \infty$. Consequently

$$\langle \bar c, x^{k_r+1}\rangle + \hat Q_{k_r+1}^{k_r+1} - \langle \pi^{k_r+1}, Ax^{k_r+1} - b\rangle - \psi(\pi^{k_r+1}) \le \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}).$$

But

$$\langle \bar c, x^{k_r+1}\rangle + \hat Q_{k_r+1}^{k_r+1} - \langle \pi^{k_r+1}, Ax^{k_r+1} - b\rangle \ge \min_{1 \le j \le k_r+1}\big[\langle \bar c, x^j\rangle + \hat Q_{k_r+1}^j - \langle \pi^{k_r+1}, Ax^j - b\rangle\big] = \psi_{k_r+1}(\pi^{k_r+1}). \tag{2.10}$$

Inequalities (2.9) and (2.10) give

$$\psi_{k_r+1}(\pi^{k_r+1}) - \psi(\pi^{k_r+1}) \le \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1})$$

which together with assumption 2 means

$$\psi_{k_r+1}(\pi^{k_r+1}) - \psi(\pi^{k_r+1}) \le \gamma_r^1 \tag{2.11}$$

where $\max(0, \gamma_r^1) \to 0$ as $r \to \infty$ a.s. On the other hand,

$$\psi_{k_r}(\pi^{k_r}) = \max_\pi \psi_{k_r}(\pi) \ge \max_\pi \min_{j \in \Lambda_{k_r}}\big[\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle\big] - \varepsilon_{k_r} \ge \max_\pi \psi(\pi) - \varepsilon_{k_r},$$

since for every $\pi$ and $j$ we have $\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle \ge \psi(\pi)$. Inequality (2.11) now gives

$$\psi(\pi^{k_r}) + \gamma_r^2 \ge \max_\pi \psi(\pi) - \varepsilon_{k_r}$$

which implies

$$\psi(\pi^{k_r}) - \max_\pi \psi(\pi) \to 0 \tag{2.12}$$

and

$$\psi_{k_r}(\pi^{k_r}) \le \max_\pi \psi(\pi) + \gamma_r^3$$

where $\max(0, \gamma_r^3) \to 0$ as $r \to \infty$ a.s. The last inequality together with (2.12) gives

$$\psi_{k_r}(\pi^{k_r}) - \max_\pi \psi(\pi) \to 0. \tag{2.13}$$

Taking now an arbitrary $k > k_r$ we get

$$\psi_k(\pi^k) - \psi_{k_r}(\pi^{k_r}) \le \max_\pi \min_{j \in \Lambda_k}\big[\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle\big] - \max_\pi \min_{j \in \Lambda_{k_r}}\big[\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle\big] + 2\max_{k_r \le i \le k}\varepsilon_i \le 2\max_{k_r \le i \le k}\varepsilon_i,$$

which together with (2.12) and (2.13) gives

$$\psi_k(\pi^k) - \max_\pi \psi(\pi) \to 0 \quad \text{as } k \to \infty. \tag{2.14}$$

The problem of maximization of $\psi_k(\pi)$ is dual to (2.2) and therefore

$$\sum_{j \in \Lambda_k} (\langle \bar c, x^j\rangle + \hat Q_k^j)\lambda_j^k = \psi_k(\pi^k) \to \max_\pi \psi(\pi) = \min_{x \in X} F(x).$$

Finally, due to the convexity of $F(x)$,

$$F(\bar x^k) \le \sum_{j \in \Lambda_k} \lambda_j^k F(x^j) \le \sum_{j \in \Lambda_k} (\langle \bar c, x^j\rangle + \hat Q_k^j)\lambda_j^k + \varepsilon_k$$

which together with (2.14) gives $F(\bar x^k) \to \min_{x \in X} F(x)$ a.s., which completes the proof.
We now study in turn each of the assumptions upon which the preceding theorem depends. Assumption 1 of the theorem can always be satisfied if the matrix $A$ is of rank $m_1$.

Let us consider in more detail assumption 2, which deals with the precision of the function-value estimates at "essential" points. Its fulfillment depends on the rule used at step 2 to determine the set $N_k$ of current new estimates, the integer $s(k)$ which controls accuracy, and the method of obtaining estimates. Consider two such rules which guarantee that condition 2 is satisfied.
1. This is the simplest ad hoc rule. Before starting the algorithm, define a sequence $\{k_p\}_{p=1}^\infty$, $k_{p+1} > k_p$, take $s(m_1) = s_0$, and set

$$N_k = \{1, \ldots, k\}, \quad s(k) = s(k-1) + 1 \quad \text{if } k = k_p \text{ for some } p,$$

$$N_k = \{k\}, \quad s(k) = s(k-1) \quad \text{otherwise;}$$

in other words, for $k = k_p$ the estimates at all grid points are updated with increased accuracy, while for $k \ne k_p$ the estimate is made only at the latest point $x^k$ to enter the set of grid points. The estimates themselves should possess only the property that $\hat Q_k^j \to Q(x^j)$ a.s. as $s(k) \to \infty$. An example of such an estimate is

$$\hat Q_k^j = \frac{1}{s(k)}\sum_{i=1}^{s(k)} Q(x^j, \omega^i) \tag{2.15}$$

where the $\omega^i$ are independent observations of the random parameters from (1.1).

2. The previous rule does not discriminate between recent points and old ones, which might become redundant. Furthermore, it is better to base decisions on whether to increase precision on information which becomes available during the iterations. The following adaptive precision rule takes account of these factors.
Let us define, for each estimate $\hat Q_k^j$ of the function value $Q(x^j)$, the number $k_j$ such that

$$x^j \in N_{k_j}, \qquad x^j \notin N_i \;\text{ for } k_j < i \le k,$$

i.e. $k_j$ is the step number at which the estimate $\hat Q_k^j$ was last updated. The precision of the estimate is then characterized by the number $s(k_j)$. Steps 2 and 3 of the method with this adaptive precision rule are specified as follows:
Step 2 (Form estimates): There are two possibilities.

(i) The preceding step was step 3. Then $N_k = \{j : j \in \Lambda_k \text{ and } s(k_j) < s(k)\}$, and $s(k)$ remains the same. For $j \in N_k$ get estimates $\hat Q_k^j$ such that

$$\hat Q_k^j \to Q(x^j) \;\text{ a.s. as } s(k) \to \infty \tag{2.16}$$

and go to step 3.

(ii) The preceding step was step 5. Take $s(k) = s(k-1)$ and get an estimate $\hat Q_k^k$ with the property (2.16). Put $\hat Q_k^j = \hat Q_{k-1}^j$ for $j < k$. If condition (2.17) is satisfied, then take $s(k) = s(k) + 1$ and update the estimates for $j \in N_k$ so that (2.16) is satisfied. If (2.17) is not satisfied, do not do any additional estimation and go to step 3.
Step 3 (Solve Master): Solve (2.2) and take $\Lambda_k = \{j : \lambda_j^k > 0\}$, where the $\lambda_j^k$ are the solutions of (2.2). If $s(k_j) = s(k)$ for all $j \in \Lambda_k$, go to step 4; otherwise go to step 2.

Thus, in this modification it is always assured, through the repetition of steps 2 and 3, that we obtain a set $\Lambda_k$ such that for all $j \in \Lambda_k$ the precision of the estimate $\hat Q_k^j$ corresponds to the number $s(k)$. In this case, besides property (2.16), some mild "independence" conditions should be satisfied. Let us denote by $B_k$ the $\sigma$-field generated by $\{x^1, \ldots, x^k, \hat Q^1, \ldots, \hat Q^k\}$ at the moment when $k_j = k$ for all $j \in \Lambda_k$. It is necessary that there exist $\delta > 0$ and, for any $s(k)$, a $\beta_{s(k)} > 0$ such that condition (2.18) holds. These conditions are satisfied, for instance, for estimates of the type (2.15):

$$\hat Q_k^j = \frac{1}{s(k)}\Big[\, s(k_j)\,\hat Q_{k_j}^j + \sum_{i=s(k_j)+1}^{s(k)} Q(x^j, \omega^i)\Big] \tag{2.19}$$

This formula is also valid for the first estimate at the point $x^k$ if we take in this case $s(k_k) = 0$. It is assumed that the values $\omega^i$ of the random parameters are independent. Estimates (2.19) satisfy property (2.18) except in the trivial case $Q(x^j) \equiv Q(x^j, \omega)$ for almost all $\omega$.
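The incremental sample-mean update just described, which folds new observations into an existing estimate without recomputing it from scratch, can be sketched as follows; the noisy observation oracle is a hypothetical stand-in for solving the recourse program at a grid point.

```python
import numpy as np

rng = np.random.default_rng(1)

class GridPointEstimate:
    """Sample-mean estimate of Q(x^j), updated incrementally: when the target
       precision rises above the number of observations already used, only the
       missing observations are drawn and folded into the running mean."""
    def __init__(self):
        self.s = 0          # number of observations used so far
        self.mean = 0.0     # current estimate

    def update(self, sample_Q, s_target):
        # sample_Q(): one observation Q(x^j, w) for a fresh independent w
        while self.s < s_target:
            self.s += 1
            self.mean += (sample_Q() - self.mean) / self.s
        return self.mean

# Hypothetical noisy oracle with true expectation 2.0:
noisy_Q = lambda: 2.0 + rng.normal(0.0, 0.5)

est = GridPointEstimate()
est.update(noisy_Q, 50)     # initial precision: 50 observations
est.update(noisy_Q, 500)    # later, precision raised to 500 observations
print(est.mean)
```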
Theorem 2. Suppose that conditions 1 and 3 of Theorem 1 are satisfied and that, in addition, (2.16) and (2.18) are fulfilled and $\pi^k$ is bounded a.s. Then (2.17) is satisfied infinitely often with probability 1 and, consequently, for the precision control rule based on (2.17), assumption 2 of Theorem 1 is satisfied.
Proof. Suppose that there exists a set $W_1 \subset W$ such that for $\omega \in W_1$ condition (2.17) is satisfied only on a finite number of iterations. This means that for any $\omega \in W_1$ there exists $k_1(\omega)$ such that for $k > k_1(\omega)$ we have $s(k) = s(\omega) = \text{const}$. Therefore any index can enter the set $N_k$ only once for $k > k_1(\omega)$, and for $\omega \in W_1$ the transition from step 3 to step 2 can occur only a finite number of times. Thus, for almost all $\omega \in W_1$ there exists $k_2(\omega) \ge k_1(\omega)$ such that for $k > k_2(\omega)$ there are no transitions from step 3 to step 2, i.e., for $k > k_2(\omega)$ only the new estimates $\hat Q_k^k$ are made, and therefore $\psi_k(\pi^k) = \max_\pi \psi_k(\pi)$, with $\psi_k(\pi)$ as defined in (2.7). According to assumption 3 of the theorem, for almost all $\omega \in W_1$ there exists a sequence $k_r(\omega)$ such that

$$\langle \sigma^{k_r}, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \min_{l \le x \le u}\big[\langle \sigma^{k_r}, x\rangle + Q(x)\big] \to 0.$$

Due to the boundedness of the sequence $\pi^k$ we can assume without loss of generality that $\pi^{k_r} \to \pi^*$. Taking into account the fact that $\psi(\pi)$ and $\psi_k(\pi)$ satisfy the Lipschitz condition uniformly over $\pi$ and $k$, we obtain for $\omega \in W_1$ and $k_r > k_2(\omega)$:

$$\psi_{k_r+1}(\pi^*) \le \psi(\pi^*) + \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}) + \bar\gamma_r \tag{2.20}$$

where $\bar\gamma_r = 2C_2\|\pi^{k_r} - \pi^*\| \to 0$ and $\max(0, \gamma_r) \to 0$ as $r \to \infty$. Condition (2.18) gives, for $k_r > k_2(\omega)$, constants $\delta > 0$ and $\beta = \beta_{s(\omega)} > 0$ bounding from below the conditional probability that the new estimate falls short of the true value by at least $\delta$. Therefore for almost all $\omega \in W_1$ there exist $k_r > k_2(\omega)$ such that

$$\hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}) < -\delta$$

and $\gamma_r + \bar\gamma_r < \delta/2$. This together with (2.20) gives, for sufficiently large $r$,

$$\psi_{k_r}(\pi^*) \le \psi(\pi^*) - \delta/2$$

and therefore $\psi_{k_r}(\pi^{k_r}) \le \psi(\pi^{k_r})$ for sufficiently large $r$ and $\omega \in W_1$. Hence

$$\psi_{k_r+1}(\pi^{k_r+1}) - \psi(\pi^{k_r+1}) \ge \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}) + \bar\gamma_r. \tag{2.21}$$

The condition (2.18) also implies that, with $\beta > 0$ and $k_r > k_2(\omega)$, for almost all $\omega \in W_1$ there exist $k_r > k_2(\omega)$ with $|\gamma_r| < \beta/2$. Together with (2.21) this yields, for almost all $\omega \in W_1$ and some $k_r > k_2(\omega)$, an inequality which contradicts our initial assumption. Therefore assumption 2 of Theorem 1 is satisfied, and the proof is complete.
Let us now consider in more detail assumption 3 of Theorem 1 and the specific procedures for selecting the point $x^{k+1}$ at step 4 of the algorithm. These procedures should satisfy assumption 3 of the theorem, namely: with probability 1 there exists a subsequence $k_r$ such that

$$\eta(x^{k_r+1}, \pi^{k_r}) - \min_{l \le x \le u}\eta(x, \pi^{k_r}) \to 0. \tag{2.22}$$

The best choice is $\eta(x^{k+1}, \pi^k) = \min_{l \le x \le u}\eta(x, \pi^k)$, but this is not feasible because of the inaccessibility of exact function values $\eta(x, \pi)$. We shall consider two procedures which do not require objective function values.

1. Random search. Take a probability measure $R$ with nonzero density on the set $l \le x \le u$ and take the successive points $x^1, \ldots, x^k$ as independent observations of a random variable $x$ with distribution $R$. Then (2.22) is fulfilled due to the continuity of $\eta(x, \pi)$.

2. Stochastic quasi-gradient method (Ermoliev [3]). This method will produce a sequence of points $x^{k,s}$ such that

$$\eta(x^{k+1}, \pi^k) - \min_{l \le x \le u}\eta(x, \pi^k) \to 0. \tag{2.23}$$

On each iteration the following calculations are performed at step 4 of the algorithm: starting from $x^{k,0} = x^k$, for $s = 0, 1, \ldots, m_k$ set

$$x_i^{k,s+1} = \begin{cases} l_i & \text{if } x_i^{k,s} - \rho_s\xi_i^s < l_i \\ u_i & \text{if } x_i^{k,s} - \rho_s\xi_i^s > u_i \\ x_i^{k,s} - \rho_s\xi_i^s & \text{otherwise} \end{cases} \tag{2.24}$$

and take $x^{k+1} = x^{k,m_k}$. In particular, it is possible to take

$$\xi^s = \sigma^k - T(\omega^s)^T u^s$$

where the $u^s$ are optimal dual multipliers of the following problem:

$$Q(x^{k,s}, \omega^s) = \min_y\{\langle q(\omega^s), y\rangle : W(\omega^s)y = h(\omega^s) - T(\omega^s)x^{k,s}, \; y \ge 0\} \tag{2.25}$$

and the $\omega^s$ are independent observations of the random parameters. If problem (2.25) has bounded solutions for all $\omega$, $\sum_{s=0}^\infty \rho_s = \infty$, $\sum_{s=0}^\infty \rho_s^2 < \infty$ and $m_k \to \infty$ as $k \to \infty$, then (2.23) is satisfied and, consequently, assumption 3 of Theorem 1 is satisfied too.
3. EXTENSION

The method described in the previous section is applicable not only to stochastic programs with recourse (1.1) but also to more general problems of stochastic programming. Consider the following problem:

$$\min_x \; Ef(x, \omega) \tag{2.26}$$

$$\text{subject to}\;\; p(x) \le 0, \;\; x \in X$$

The method and the results remain essentially the same if we denote $Ef(x, \omega) = Q(x)$ and substitute everywhere in the above discussion $Q(x)$ for $\langle \bar c, x\rangle + Q(x)$ and $p(x)$ for $Ax - b$. The initial points should now satisfy the condition given in assumption 2 of Theorem 1a below. The master problem (2.2) takes the form

$$\min_\lambda \; \sum_{j=1}^k \hat Q_k^j\lambda_j$$

$$\text{s.t.}\;\; \sum_{j=1}^k p(x^j)\lambda_j \le 0, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0$$

where the $\hat Q_k^j$ are estimates of $Ef(x^j, \omega)$. Subproblem (2.3) becomes

$$\min_{x \in X} \; Ef(x, \omega) - \langle \pi^k, p(x)\rangle.$$
Theorem 1a is proved similarly to Theorem 1.

Theorem 1a. Make the following assumptions:

1. The functions $Ef(x, \omega)$ and $p(x)$ are convex, and the set $X$ is compact.

2. There exists $\tilde x \in X$ such that $p(\tilde x) < 0$, and the initial points $x^1, \ldots, x^{m_1}$ are such that

$$\max_{e \ge 0,\, \|e\| = 1}\; \min_{1 \le j \le m_1}\langle e, p(x^j)\rangle < 0.$$

3. $\max\big(|\hat Q_k^k - Ef(x^k, \omega)|,\; \max_{j \in \Lambda_k}|\hat Q_k^j - Ef(x^j, \omega)|\big) = \varepsilon_k \to 0$ a.s.

4. $\displaystyle \liminf_{k \to \infty}\Big[Ef(x^{k+1}, \omega) - \langle \pi^k, p(x^{k+1})\rangle - \min_{x \in X}\big(Ef(x, \omega) - \langle \pi^k, p(x)\rangle\big)\Big] = 0$ a.s.

Then $Ef(\bar x^k, \omega) \to \min\{Ef(x, \omega) : p(x) \le 0, \; x \in X\}$, where $\bar x^k = \sum_{j \in \Lambda_k}\lambda_j^k x^j$, and all accumulation points of the sequence $\bar x^k$ are solutions of the problem (2.26).

Although our primary concern here is with a conceptual algorithm, let us conclude this section with a brief discussion of some considerations which apply in order to make the algorithm implementable.
a) Purging Strategy for Grid Points: The above algorithm assumes that all grid points are retained but, when storage is limited, it will be necessary to periodically remove grid points. This subject has been extensively studied, see Eaves and Zangwill [2] and Topkis [11] in the context of cutting plane algorithms, and similar considerations apply here.

b) Variance of Estimates: When developing estimates $\hat Q_k^j$ using, for example, (2.15) or (2.19), we can also maintain and update the variance of the estimates for each grid point $x^j$. These can then be usefully employed in refining the decision rules at steps 2 and 3.

c) Induced Constraints: When the assumption of complete recourse (i.e. that (1.1b) always has a solution) cannot be verified a priori, it may happen that for some combination of grid point $x^j$ and random parameters $\omega^i$ (in (2.15) and (2.19)) the problem (1.1b) is infeasible. Following Van Slyke and Wets [12], an induced constraint or feasibility cut must then be deduced and introduced into the problem (1.1a) and correspondingly into the master program (2.2). This extension requires further study.
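Consideration b) can be supported with the standard one-pass (Welford) recurrence, which maintains the running mean together with an estimate of the variance of that mean; the observations below are hypothetical.

```python
import math

class MeanVar:
    """One-pass (Welford) running mean and variance for the observations
       Q(x^j, w^i) at a grid point, usable in decision rules at steps 2-3."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def add(self, obs):
        self.n += 1
        delta = obs - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (obs - self.mean)

    def variance_of_mean(self):
        # Estimated variance of the sample-mean estimate at this grid point
        return self.m2 / (self.n - 1) / self.n if self.n > 1 else math.inf

mv = MeanVar()
for obs in [2.0, 2.4, 1.8, 2.2]:
    mv.add(obs)
print(mv.mean, mv.variance_of_mean())
```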
There are also a number of special cases of the general problem (1.1a,b) which permit refinements of the algorithm described above, with a view to enhancing efficiency. One case of practical interest is stochastic programs with recourse and non-stochastic tenders (see Nazareth and Wets [7]), where $T$ ($m_2 \times n_1$) is a fixed matrix. The master/subproblem pair corresponding to (2.2) and (2.3) can then be reformulated as follows:

Master:

$$\min \; \langle \bar c, x\rangle + \sum_{j=1}^k \hat Q_k^j\lambda_j \tag{2.27}$$

$$\text{s.t.}\;\; Ax = b, \quad Tx = \sum_{j=1}^k \chi^j\lambda_j, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0, \quad l \le x \le u$$

where $\hat Q_k^j$ is an estimate of the value $Q(\chi^j)$ at the grid point (tender) $\chi^j$, obtained from the recourse linear program

$$Q(\chi, \omega) = \min_y\{\langle q(\omega), y\rangle : W(\omega)y = h(\omega) - \chi, \; y \ge 0\} \tag{2.28}$$

and $\sigma^k$, $\pi^k$ and $v^k$ are the dual multipliers associated with the optimal solution of the master (2.27).

Subproblem: Consider the (Lagrangian) subproblem

$$\min_{L \le \chi \le U} \; \langle \sigma^k, \chi\rangle + Q(\chi) \tag{2.29}$$

where $L$ and $U$ are any suitable bounds implied by $\chi = Tx$ and $l \le x \le u$. The new tender $\chi^{k+1}$ is again taken to be an "approximate" solution of (2.29), in the sense discussed in step 4, after expression (2.3).

It frequently happens that $m_2 \ll n_1$, i.e. that only a few elements of the problem are stochastic. In this case the above reformulation can considerably enhance efficiency, because the optimization in the subproblem (2.29) and the linear program in (2.28), which must be solved to obtain the estimates $\hat Q_k^j$, are both in a space of relatively low dimension.

REFERENCES
[1] Dantzig, G.B., Linear Programming and Extensions, Princeton University Press (1963).

[2] Eaves, B.C. and W. Zangwill, "Generalized cutting plane algorithms," SIAM J. Control 9 (1971).

[3] Ermoliev, Yu., Methods of Stochastic Programming (in Russian), Nauka, Moscow (1976).

[4] Ermoliev, Yu., A. Gaivoronski and C. Nedeva, "Stochastic programming problems with incomplete information on objective functions," SIAM J. Control and Optimization, vol. 23, no. 5.

[5] Ermoliev, Yu. and A. Gaivoronski, "Stochastic quasigradient methods and their implementation," IIASA Working Paper WP-84-55 (1984).

[6] Nazareth, J.L., "Design and implementation of a stochastic programming optimizer with recourse and tenders," IIASA Working Paper WP-85-063 (1985).

[7] Nazareth, J.L. and R.J-B. Wets, "Algorithms for stochastic programs: the case of non-stochastic tenders," in A. Prékopa and R.J-B. Wets, eds., Mathematical Programming Study 28 (1986).

[8] Parikh, S.C., Lecture notes on stochastic programming, unpublished, University of California, Berkeley (1968).

[9] Polak, E., Theory of Optimal Control and Mathematical Programming, McGraw-Hill, New York (1970).

[10] Shapiro, J.F., Mathematical Programming: Structures and Algorithms, John Wiley, New York (1979).

[11] Topkis, D., "Cutting plane methods without nested constraint sets," Operations Research 18 (1970).

[12] Van Slyke, R. and R.J-B. Wets, "L-shaped linear programs with applications to optimal control and stochastic linear programs," SIAM Journal on Applied Mathematics 17 (1969), 638-663.

[13] Wets, R., "Programming under uncertainty: the complete problem," Z. Wahrsch. verw. Gebiete 4 (1966), 316-339.

[14] Williams, A.C., "Approximation formulas for stochastic linear programming," SIAM Journal on Applied Mathematics 14 (1966), 668-677.