Working Paper

COMBINING GENERALIZED PROGRAMMING AND SAMPLING TECHNIQUES FOR
STOCHASTIC PROGRAMS WITH RECOURSE

A. Gaivoronski
J.L. Nazareth

September 1986
WP-86-44

International Institute for Applied Systems Analysis
A-2361 Laxenburg, Austria

NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHORS

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
FOREWORD

This paper deals with an application of generalized linear programming techniques to stochastic programming problems, particularly to stochastic programming problems with recourse. The major points which needed clarification here were the possibility of using estimates of the objective function instead of the exact values, and of using approximate solutions of the dual subproblem instead of the exact ones.

In this paper conditions are presented which allow the use of estimates and approximate solutions while still maintaining convergence. The paper is a part of the effort on the development of stochastic optimization techniques at the Adaptation and Optimization Project of the System and Decision Sciences Program.

Alexander B. Kurzhanski
Chairman
System and Decision Sciences Program
CONTENTS

1 Introduction
2 A Conceptual Algorithm
3 Extension
References
COMBINING GENERALIZED PROGRAMMING AND SAMPLING TECHNIQUES FOR
STOCHASTIC PROGRAMS WITH RECOURSE

A. Gaivoronski and J.L. Nazareth

1. INTRODUCTION
Generalized Programming Techniques of Wolfe (see Dantzig [1]) enjoyed early use for solving stochastic programs with simple recourse (Williams [14], Parikh [8]), and there has recently been renewed interest in their relevance for solving more general classes of stochastic programs (see Nazareth & Wets [7] for stochastic programs with recourse and nonstochastic tenders, and Ermoliev, Gaivoronski & Nedeva [4] for stochastic programs with incomplete information). Our interest here is in stochastic programs with recourse of the form:

$$\min_x \; E_\omega[\langle c(\omega), x\rangle + Q(x, \omega)] \quad \text{s.t.}\;\; Ax = b, \;\; l \le x \le u \tag{1.1a}$$

where

$$Q(x, \omega) = \min_y \{\langle q(\omega), y\rangle : W(\omega)y = h(\omega) - T(\omega)x, \; y \ge 0\} \tag{1.1b}$$

and $\omega$ is an element of some probability space $(W, B, P)$, $A$ ($m_1 \times n_1$) is a fixed matrix, $T(\cdot)$ ($m_2 \times n_1$) and $W(\cdot)$ ($m_2 \times n_2$) are random matrices, $c(\cdot)$ ($n_1$), $q(\cdot)$ ($n_2$) and $h(\cdot)$ ($m_2$) are random vectors, and $b$ ($m_1$) is a fixed vector. We assume complete recourse, i.e. (1.1b) always has a solution. $E_\omega$ denotes expectation. Define $\bar c = E_\omega[c(\omega)]$. Then we can express (1.1a,b) as

$$\min_x \; \langle \bar c, x\rangle + Q(x) = F(x) \quad \text{s.t.}\;\; Ax = b, \;\; l \le x \le u \tag{1.2a}$$

where

$$Q(x) = E_\omega[Q(x, \omega)]. \tag{1.2b}$$

The set of constraints in (1.2a) we shall denote by $X$. Properties of (1.2a,b) have been extensively studied (see Wets [13]) and, in particular, $Q(x)$ can be shown to be convex but is, in general, nonsmooth.
The generalized programming approach applied to (1.2a) involves inner or grid linearization of this convex program and requires coordinated solution of a master program and a (Lagrangian) subproblem, defined as follows:

Master:

$$\min_\lambda \; \sum_{j=1}^k \langle \bar c, x^j\rangle\lambda_j + \sum_{j=1}^k Q(x^j)\lambda_j \tag{1.3a}$$

$$\text{s.t.}\;\; \sum_{j=1}^k (Ax^j)\lambda_j = b, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0$$

where $\pi^k$, $v^k$ are the dual multipliers associated with the optimal solution of (1.3a).

Subproblem: Find $x^{k+1} \in R^{n_1}$ such that $l \le x^{k+1} \le u$ and

$$\langle \bar c, x^{k+1}\rangle + Q(x^{k+1}) - \langle \pi^k, Ax^{k+1}\rangle < v^k \tag{1.3b}$$

by partially optimizing the problem

$$\min_{l \le x \le u} \; \langle \bar c - A^T\pi^k, x\rangle + Q(x). \tag{1.3c}$$

We temporarily ignore all considerations related to initialization of (1.3a), unboundedness of the solution in (1.3b), recognition of optimality and so on. (1.3a-c) show only the essential features of the method, namely, that the master sends prices $\pi^k$ to the subproblem which, in turn, uses these quantities to identify an improving (grid) point $x^{k+1}$.
In many practical applications, the probability distribution of the random events is discrete with relatively few points in the distribution, and randomness is often restricted to certain components of (1.1a-b), for example, to $h(\cdot)$. In such cases judicious computation enables $Q(x^j)$ and its subgradients to be found exactly, see Nazareth [6]. These quantities are required both to define the objective function of the master (1.3a) and, during the solution of (1.3c), to give an improving point satisfying (1.3b). More generally, however, $Q(x^j)$ can only be approximated in (1.2b), for example by a sampling procedure, and exact computation of its value or of its subgradients is out of the question because it would be too expensive. We then seek to replace $Q(x^j)$ in (1.3a) by an estimate, say $\hat Q^j$. The generalized programming approach, extended in this manner, still continues to appear viable and deserves further investigation, for the following reasons:

a) It is well known (and in the nature of a "folk theorem") that fairly crude approximations of the underlying distribution in (1.1a-b) (which then permit exact solution of the resulting approximated recourse program) often produce quite reasonable estimates of the "optimal" first stage decision. This can be interpreted to mean that fairly crude estimates $\hat Q^j$ in the master program will often be adequate to guide the algorithm to a "reasonable" neighborhood of the desired solution of the original recourse problem (1.1a-b).

b) The (Lagrangian) subproblem (1.3c) does not have to be optimized at each cycle. For example, all that is needed in the case of exact estimates $Q(x^{k+1})$ to produce an improving point is that the condition (1.3b) be satisfied. This suggests that one seek to reexpress this condition in terms of estimates $\hat Q^j$, and to combine it with stochastic quasi-gradient procedures utilizing stochastic estimates of subgradients (see Ermoliev and Gaivoronski [5]), which are generally effective when applied to a problem that does not have to be pushed all the way to optimality.
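As a concrete illustration of the sampling alternative just described, the following sketch estimates the recourse function at a fixed first-stage point by averaging exact recourse-LP values over independent observations. The instance data (recourse matrix, costs and the distribution of the right-hand side) form a hypothetical toy example, not taken from the paper, and the recourse LP is solved with SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Hypothetical toy recourse data (for illustration only):
# Q(x, w) = min_y { <q, y> : W y = h(w) - T x, y >= 0 }
W = np.array([[1.0, -1.0]])   # recourse matrix (m2=1, n2=2) -> complete recourse
T = np.array([[1.0, 0.0]])    # technology matrix (m2=1, n1=2)
q = np.array([1.0, 1.5])      # recourse costs

def Q_sample(x, h_w):
    """Exact value of Q(x, w) for one observation h_w, computed via an LP."""
    res = linprog(q, A_eq=W, b_eq=h_w - T @ x, bounds=[(0, None)] * len(q))
    return res.fun

def Q_hat(x, s):
    """Sample-mean estimate of Q(x) from s observations of h(w) ~ N(1, 0.5^2)."""
    return np.mean([Q_sample(x, rng.normal(1.0, 0.5, size=1)) for _ in range(s)])

x = np.array([0.2, 0.0])
# Precision improves as the sample size s grows.
print(Q_hat(x, 5), Q_hat(x, 50))
```

With only a few samples the estimate is crude but, as point a) above suggests, it may already be adequate to steer the master program.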
Our paper can be viewed as a study of generalized programming in the presence of noise (whose magnitude decreases as the number of iterations increases), with the special characteristics of recourse problems taken into consideration. In section 2 we state a conceptual algorithm and establish convergence under appropriate assumptions, thereby extending the standard proofs (see, for example, Shapiro [10], for the case when $Q(x)$ is known exactly). Some considerations concerning implementation are briefly discussed. Finally, extension to other stochastic programming problems is considered in section 3.
2. A CONCEPTUAL ALGORITHM

We use the term "conceptual" here in the sense of Polak [9], and study the following algorithm for solving (1.1a-b). It will be convenient to assume that all bounds $l$ and $u$ are finite, so that $l \le x \le u$ is a compact set.

The algorithm generates a sequence of points $x^0, \ldots, x^k, \ldots$ which depend on an element $\omega$ of some probability space $(W, B, P)$, where $\omega \in W \subset R^p$, $B$ is a $\sigma$-field and $P$ a probability measure. The sequence $x^k$ converges to the solution of the problem (1.1) in a certain probabilistic sense.

Step 1 (Initialize): Choose a set of $m_1$ grid points $x^1, \ldots, x^{m_1}$ so that the constraints

$$\sum_{j=1}^{m_1} (Ax^j)\lambda_j = b, \quad \sum_{j=1}^{m_1} \lambda_j = 1, \quad \lambda_j \ge 0 \tag{2.1}$$

have a feasible solution. Set $k \leftarrow m_1$.
Step 2 (Form estimates): Define a subset $N_k$ of integers, $N_k \subset \{1, \ldots, k\}$, this being the set of grid-point indices for which estimates will be made. Define an integer $s(k)$, which controls the precision of estimates; generally speaking, $s(k)$ is the number of observations of the function $Q(x, \omega)$ used to form the estimate. Obtain new estimates $\hat Q_k^j$ of $Q(x^j)$ for $j \in N_k$, and for $j \notin N_k$ take $\hat Q_k^j = \hat Q_{k-1}^j$. It will be assumed that for $j \in N_k$, $\hat Q_k^j \to Q(x^j)$ in some suitable probabilistic sense. Initially, for $k = m_1$, let $N_k = \{1, \ldots, m_1\}$. For subsequent $k$, the set $N_k$, the integer $s(k)$ and the estimates $\hat Q_k^j$ can be selected in a number of different ways, some of which will be specified later.

Step 3 (Solve Master):

$$\min_\lambda \; \sum_{j=1}^k (\langle \bar c, x^j\rangle + \hat Q_k^j)\lambda_j \tag{2.2}$$

$$\text{s.t.}\;\; \sum_{j=1}^k (Ax^j)\lambda_j = b, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0$$

Let $\pi^k$ and $v^k$ be the associated optimal dual multipliers and $\lambda_j^k$ the optimal primal variables. Define $\Lambda_k = \{j : \lambda_j^k > 0\}$. In some versions of our method it is necessary at this point to redefine the set $N_k$ and go to step 2 (examples will be given later). Otherwise, go to step 4.

Step 4 (Define new grid point $x^{k+1}$):
Define

$$\sigma^k = \bar c - A^T\pi^k$$

and consider the (Lagrangian) subproblem

$$\min_{l \le x \le u} \; \langle \sigma^k, x\rangle + Q(x) \tag{2.3}$$

The new point $x^{k+1}$ is taken to be an "approximate" solution to this problem; more precisely, it is necessary that for almost all $\omega \in W$ there exists a subsequence $k_r(\omega)$ such that

$$\langle \sigma^{k_r}, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \min_{l \le x \le u}\big[\langle \sigma^{k_r}, x\rangle + Q(x)\big] \to 0.$$

Note that it is not necessary that

$$\langle \sigma^k, x^{k+1}\rangle + Q(x^{k+1}) - \min_{l \le x \le u}\big[\langle \sigma^k, x\rangle + Q(x)\big] \to 0$$

for the whole sequence $x^k$. This makes it possible, for instance, to use random search techniques for obtaining $x^{k+1}$. Some particular methods of choosing the point $x^{k+1}$ with this property will be specified at the end of this section.

Step 5 (Iterate): $k \leftarrow k + 1$. Go to step 2.

This algorithm has two important differences from the usual generalized linear programming algorithm. Firstly, it does not require exact values of the objective function (step 2); it is only necessary to have estimates of the objective values at the grid points, whose precision gradually increases. Secondly, it is not necessary to minimize the Lagrangian subproblem at step 4 precisely; it is only necessary that the current point $x^{k+1}$ regularly comes to the vicinity of such a solution.
Both modifications are necessary in order to make use of generalized linear programming in a stochastic setting.
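To make the master step concrete, here is a minimal sketch of the analogue of step 3: a linear program over the weights $\lambda_j$, built from grid points and estimated objective values, whose equality-constraint duals supply the prices sent to the subproblem. All data are a hypothetical toy instance, and the dual extraction relies on SciPy 1.7+ (HiGHS-based `linprog`, whose sign conventions for `marginals` we simply pass through).

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical small instance: minimize <c,x> + Q(x) s.t. Ax = b, with Q
# replaced by noisy estimates Q_est[j] at grid points x^j (stand-ins for sampling).
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
grid = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
Q_est = [0.9, 0.4, 0.6]

def solve_master(grid, Q_est):
    """Master: min sum_j (<c,x^j> + Q_est_j) lam_j
       s.t. sum_j (A x^j) lam_j = b, sum_j lam_j = 1, lam >= 0.
       Returns primal weights lam and dual multipliers (pi, v)."""
    cost = np.array([c @ xj + qj for xj, qj in zip(grid, Q_est)])
    A_eq = np.vstack([np.column_stack([A @ xj for xj in grid]),
                      np.ones((1, len(grid)))])
    b_eq = np.concatenate([b, [1.0]])
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(grid))
    duals = res.eqlin.marginals          # duals of the equality constraints
    return res.x, duals[:-1], duals[-1]  # lam, pi (resource rows), v (convexity row)

lam, pi, v = solve_master(grid, Q_est)
# Prices sent to the subproblem: sigma^k = c - A^T pi
sigma = c - A.T @ pi
print(lam, pi, v, sigma)
```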
In order to prove convergence of this algorithm let us consider its dual reformulation. Take

$$\eta(x, \pi) = \langle \bar c, x\rangle + Q(x) - \langle \pi, Ax - b\rangle, \qquad \psi(\pi) = \min_{l \le x \le u} \eta(x, \pi) \tag{2.4-2.5}$$

$$\eta_k(j, \pi) = \langle \bar c, x^j\rangle + \hat Q_k^j - \langle \pi, Ax^j - b\rangle, \qquad \psi_k(\pi) = \min_{1 \le j \le k} \eta_k(j, \pi) \tag{2.6-2.7}$$

Then the algorithm (2.1)-(2.3) can be considered as a maximization method for the concave function $\psi(\pi)$ by successive polyhedral approximation of $\psi(\pi)$ by $\psi_k(\pi)$. At step 1 the initial polyhedral approximation is constructed; in step 3 the current polyhedral approximation $\psi_k(\pi)$ is maximized, the optimal dual multipliers $\pi^k$ being the solution of the problem $\max_\pi \psi_k(\pi)$. In steps 2 and 4 the polyhedral approximation is updated.
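The dual object just described, a concave piecewise-linear lower envelope of affine functions of the prices, can be sketched numerically; the grid points and estimated values below are hypothetical.

```python
import numpy as np

# Hypothetical data: two grid points x^j with estimated values Q_est[j],
# and master data c, A, b (so A x^j - b is the "slack" of each column).
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
grid = [np.array([0.2, 0.0]), np.array([1.0, 0.8])]
Q_est = [0.9, 0.4]

def psi_k(pi):
    """Polyhedral dual approximation: the minimum over grid points j of the
       affine functions  <c, x^j> + Q_est[j] - <pi, A x^j - b>.
       It is concave and piecewise linear in pi; the master maximizes it."""
    return min(c @ xj + qj - pi @ (A @ xj - b) for xj, qj in zip(grid, Q_est))

# Concavity along a segment: the value at the midpoint dominates the average.
p0, p1 = np.array([0.0]), np.array([2.0])
assert psi_k((p0 + p1) / 2) >= (psi_k(p0) + psi_k(p1)) / 2 - 1e-12
print(psi_k(p0), psi_k(p1))
```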
Theorem 1. Make the following assumptions:

1. The initial points $x^1, \ldots, x^{m_1}$ are such that $b \in \operatorname{int}\operatorname{co}\{Ax^j,\; j = 1, \ldots, m_1\}$, where int means interior and co convex hull.

2. $\max\big(|\hat Q_k^k - Q(x^k)|,\; \max_{j \in \Lambda_k}|\hat Q_k^j - Q(x^j)|\big) = \varepsilon_k \to 0$ a.s.

3. $\displaystyle \liminf_{k \to \infty}\Big[\langle \sigma^k, x^{k+1}\rangle + Q(x^{k+1}) - \min_{l \le x \le u}\big(\langle \sigma^k, x\rangle + Q(x)\big)\Big] = 0$ a.s.

Then $F(\bar x^k) \to \min_{x \in X} F(x)$ a.s., where $\bar x^k = \sum_{j \in \Lambda_k} \lambda_j^k x^j$, and all accumulation points of the sequence $\bar x^k$ are solutions of (1.2) a.s.

Proof. Due to assumption 2 we have

$$\sup_{k,\, j \in \Lambda_k} |\hat Q_k^j - Q(x^j)| \le C < \infty \quad \text{a.s.}$$

This together with the boundedness of the $x^k$ gives the boundedness of the values $\langle \bar c, x^j\rangle + \hat Q_k^j$, $j \in \Lambda_k$. This together with assumption 1 implies the boundedness of the sequence $\pi^k$, which can be seen as follows. Indeed, $\psi_k(\pi^k) = \max_\pi \psi_k(\pi)$, and therefore $\psi_k(\pi^k)$ is bounded from below uniformly in $k$, which follows from (2.5) and the bound above. On the other hand,

$$\psi_k(\pi^k) \le \min_{1 \le j \le m_1}\big[\langle \bar c, x^j\rangle + \hat Q_k^j - \langle \pi^k, Ax^j - b\rangle\big]$$

$$\le \max_{1 \le j \le m_1}\big[\langle \bar c, x^j\rangle + \hat Q_k^j\big] + \min_{1 \le j \le m_1}\big[-\langle \pi^k, Ax^j - b\rangle\big]$$

$$= C_1 - \max_{1 \le j \le m_1}\langle \pi^k, Ax^j - b\rangle = C_1 - \|\pi^k\|\max_{1 \le j \le m_1}\Big\langle \frac{\pi^k}{\|\pi^k\|}, Ax^j - b\Big\rangle \le C_1 - \|\pi^k\|\delta$$

for some $\delta > 0$, due to assumption 1. Thus $\|\pi^k\| \le (C_1 - \psi_k(\pi^k))/\delta$, which gives the boundedness of the sequence $\pi^k$.

According to assumption 3 of the theorem, for almost all $\omega \in W$ there exists a subsequence $k_r(\omega)$ such that

$$\langle \sigma^{k_r}, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \min_{l \le x \le u}\big[\langle \sigma^{k_r}, x\rangle + Q(x)\big] \to 0 \quad \text{as } r \to \infty.$$

Using the equality $\sigma^{k_r} = \bar c - A^T\pi^{k_r}$ we obtain

$$\langle \bar c, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \langle \pi^{k_r}, Ax^{k_r+1} - b\rangle - \psi(\pi^{k_r}) \to 0. \tag{2.8}$$

Due to the boundedness of the sequence $\pi^k$ we may now assume without loss of generality that $\pi^{k_r} \to \pi^*$ and $\|\pi^{k_r} - \pi^{k_r+1}\| \to 0$. Furthermore, from the definitions (2.4)-(2.5) of the function $\psi(\pi)$ and the boundedness of the admissible set $X$ it follows that $\psi(\pi)$ satisfies a Lipschitz condition uniformly in $\pi$, and therefore

$$|\psi(\pi^{k_r}) - \psi(\pi^{k_r+1})| \le C_2\|\pi^{k_r} - \pi^{k_r+1}\| \to 0 \quad \text{as } r \to \infty, \;\; 0 < C_2 < \infty.$$

Thus (2.8) implies

$$\langle \bar c, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \langle \pi^{k_r+1}, Ax^{k_r+1} - b\rangle - \psi(\pi^{k_r+1}) \le \gamma_r \tag{2.9}$$

where $\max(0, \gamma_r) \to 0$ as $r \to \infty$. Consequently

$$\langle \bar c, x^{k_r+1}\rangle + \hat Q_{k_r+1}^{k_r+1} - \langle \pi^{k_r+1}, Ax^{k_r+1} - b\rangle - \psi(\pi^{k_r+1}) \le \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}).$$

But

$$\langle \bar c, x^{k_r+1}\rangle + \hat Q_{k_r+1}^{k_r+1} - \langle \pi^{k_r+1}, Ax^{k_r+1} - b\rangle \ge \min_{1 \le j \le k_r+1}\big[\langle \bar c, x^j\rangle + \hat Q_{k_r+1}^j - \langle \pi^{k_r+1}, Ax^j - b\rangle\big] = \psi_{k_r+1}(\pi^{k_r+1}). \tag{2.10}$$

Inequalities (2.9) and (2.10) give

$$\psi_{k_r+1}(\pi^{k_r+1}) - \psi(\pi^{k_r+1}) \le \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1})$$

which together with assumption 2 means

$$\psi_{k_r+1}(\pi^{k_r+1}) - \psi(\pi^{k_r+1}) \le \gamma_r^1 \tag{2.11}$$

where $\max(0, \gamma_r^1) \to 0$ as $r \to \infty$ a.s. On the other hand,

$$\psi_{k_r}(\pi^{k_r}) = \max_\pi \psi_{k_r}(\pi) \ge \max_\pi \min_{j \in \Lambda_{k_r}}\big[\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle\big] - \varepsilon_{k_r} \ge \max_\pi \psi(\pi) - \varepsilon_{k_r},$$

since for every $\pi$ and $j$ we have $\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle \ge \psi(\pi)$. Inequality (2.11) now gives

$$\psi(\pi^{k_r}) + \gamma_r^2 \ge \max_\pi \psi(\pi) - \varepsilon_{k_r}$$

which implies

$$\psi(\pi^{k_r}) - \max_\pi \psi(\pi) \to 0 \tag{2.12}$$

and

$$\psi_{k_r}(\pi^{k_r}) \le \max_\pi \psi(\pi) + \gamma_r^3$$

where $\max(0, \gamma_r^3) \to 0$ as $r \to \infty$ a.s. The last inequality together with (2.12) gives

$$\psi_{k_r}(\pi^{k_r}) - \max_\pi \psi(\pi) \to 0. \tag{2.13}$$

Taking now an arbitrary $k > k_r$ we get

$$\psi_k(\pi^k) - \psi_{k_r}(\pi^{k_r}) \le \max_\pi \min_{j \in \Lambda_k}\big[\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle\big] - \max_\pi \min_{j \in \Lambda_{k_r}}\big[\langle \bar c, x^j\rangle + Q(x^j) - \langle \pi, Ax^j - b\rangle\big] + 2\max_{k_r \le i \le k}\varepsilon_i \le 2\max_{k_r \le i \le k}\varepsilon_i,$$

which together with (2.12) and (2.13) gives

$$\psi_k(\pi^k) - \max_\pi \psi(\pi) \to 0 \quad \text{as } k \to \infty. \tag{2.14}$$

The problem of maximization of $\psi_k(\pi)$ is dual to (2.2) and therefore

$$\sum_{j \in \Lambda_k} (\langle \bar c, x^j\rangle + \hat Q_k^j)\lambda_j^k = \psi_k(\pi^k) \to \max_\pi \psi(\pi) = \min_{x \in X} F(x).$$

Finally, due to the convexity of $F(x)$,

$$F(\bar x^k) \le \sum_{j \in \Lambda_k} \lambda_j^k F(x^j) \le \sum_{j \in \Lambda_k} (\langle \bar c, x^j\rangle + \hat Q_k^j)\lambda_j^k + \varepsilon_k$$

which together with (2.14) gives $F(\bar x^k) \to \min_{x \in X} F(x)$ a.s., which completes the proof.
We now study in turn each of the assumptions upon which the preceding theorem depends. Assumption 1 of the theorem can always be satisfied if the matrix $A$ is of rank $m_1$.

Let us consider in more detail assumption 2, which deals with the precision of the function-value estimates at "essential" points. Its fulfillment depends on the rule used at step 2 to determine the set $N_k$ of current new estimates, the integer $s(k)$ which controls accuracy, and the method of obtaining estimates. Consider two such rules which guarantee that condition 2 is satisfied.
1. This is the simplest ad hoc rule. Before starting the algorithm, define a sequence $\{k_p\}_{p=1}^\infty$, $k_{p+1} > k_p$, take $s(m_1) = s_0$, and set

$$N_k = \{1, \ldots, k\}, \quad s(k) = s(k-1) + 1 \quad \text{if } k = k_p \text{ for some } p,$$

$$N_k = \{k\}, \quad s(k) = s(k-1) \quad \text{otherwise;}$$

in other words, for $k = k_p$ the estimates at all grid points are updated with increased accuracy, while for $k \ne k_p$ the estimate is made only at the latest point $x^k$ to enter the set of grid points. The estimates themselves should possess only the property that $\hat Q_k^j \to Q(x^j)$ a.s. as $s(k) \to \infty$. An example of such an estimate is

$$\hat Q_k^j = \frac{1}{s(k)}\sum_{i=1}^{s(k)} Q(x^j, \omega^i) \tag{2.15}$$

where the $\omega^i$ are independent observations of the random parameters from (1.1).

2. The previous rule does not discriminate between recent points and old ones, which might become redundant. Furthermore, it is better to base decisions on whether to increase precision on information which becomes available during the iterations. The following adaptive precision rule takes account of these factors.
Let us define, for each estimate $\hat Q_k^j$ of the function value $Q(x^j)$, the number $k_j$ such that

$$x^j \in N_{k_j}, \qquad x^j \notin N_i \;\text{ for } k_j < i \le k,$$

i.e. $k_j$ is the step number at which the estimate $\hat Q_k^j$ was last updated. The precision of the estimate is then characterized by the number $s(k_j)$. Steps 2 and 3 of the method with this adaptive precision rule are specified as follows:
Step 2 (Form estimates): There are two possibilities.

(i) The preceding step was step 3. Then $N_k = \{j : j \in \Lambda_k \text{ and } s(k_j) < s(k)\}$, and $s(k)$ remains the same. For $j \in N_k$ get estimates $\hat Q_k^j$ such that

$$\hat Q_k^j \to Q(x^j) \;\text{ a.s. as } s(k) \to \infty \tag{2.16}$$

and go to step 3.

(ii) The preceding step was step 5. Take $s(k) = s(k-1)$ and get an estimate $\hat Q_k^k$ with the property (2.16). Put $\hat Q_k^j = \hat Q_{k-1}^j$ for $j < k$. If condition (2.17) is satisfied, then take $s(k) = s(k) + 1$ and update the estimates for $j \in N_k$ so that (2.16) is satisfied. If (2.17) is not satisfied, do not do any additional estimation and go to step 3.
Step 3 (Solve Master): Solve (2.2) and take $\Lambda_k = \{j : \lambda_j^k > 0\}$, where the $\lambda_j^k$ are the solutions of (2.2). If $s(k_j) = s(k)$ for all $j \in \Lambda_k$, go to step 4; otherwise go to step 2.

Thus, in this modification it is always assured, through the repetition of steps 2 and 3, that we obtain a set $\Lambda_k$ such that for all $j \in \Lambda_k$ the precision of the estimate $\hat Q_k^j$ corresponds to the number $s(k)$. In this case, besides property (2.16), some mild "independence" conditions should be satisfied. Let us denote by $B_k$ the $\sigma$-field generated by $\{x^1, \ldots, x^k, \hat Q^1, \ldots, \hat Q^k\}$ at the moment when $k_j = k$ for all $j \in \Lambda_k$. It is necessary that there exist $\delta > 0$ and, for any $s(k)$, a $\beta_{s(k)} > 0$ such that condition (2.18) holds. These conditions are satisfied, for instance, for estimates of the type (2.15):

$$\hat Q_k^j = \frac{1}{s(k)}\Big[\, s(k_j)\,\hat Q_{k_j}^j + \sum_{i=s(k_j)+1}^{s(k)} Q(x^j, \omega^i)\Big] \tag{2.19}$$

This formula is also valid for the first estimate at the point $x^k$ if we take in this case $s(k_k) = 0$. It is assumed that the values $\omega^i$ of the random parameters are independent. Estimates (2.19) satisfy property (2.18) except in the trivial case $Q(x^j) \equiv Q(x^j, \omega)$ for almost all $\omega$.
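The incremental sample-mean update just described, which folds new observations into an existing estimate without recomputing it from scratch, can be sketched as follows; the noisy observation oracle is a hypothetical stand-in for solving the recourse program at a grid point.

```python
import numpy as np

rng = np.random.default_rng(1)

class GridPointEstimate:
    """Sample-mean estimate of Q(x^j), updated incrementally: when the target
       precision rises above the number of observations already used, only the
       missing observations are drawn and folded into the running mean."""
    def __init__(self):
        self.s = 0          # number of observations used so far
        self.mean = 0.0     # current estimate

    def update(self, sample_Q, s_target):
        # sample_Q(): one observation Q(x^j, w) for a fresh independent w
        while self.s < s_target:
            self.s += 1
            self.mean += (sample_Q() - self.mean) / self.s
        return self.mean

# Hypothetical noisy oracle with true expectation 2.0:
noisy_Q = lambda: 2.0 + rng.normal(0.0, 0.5)

est = GridPointEstimate()
est.update(noisy_Q, 50)     # initial precision: 50 observations
est.update(noisy_Q, 500)    # later, precision raised to 500 observations
print(est.mean)
```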
Theorem 2. Suppose that conditions 1 and 3 of Theorem 1 are satisfied and that, in addition, (2.16) and (2.18) are fulfilled and $\pi^k$ is bounded a.s. Then (2.17) is satisfied infinitely often with probability 1 and, consequently, for the precision control rule based on (2.17), assumption 2 of Theorem 1 is satisfied.
Proof. Suppose that there exists a set $W_1 \subset W$ such that for $\omega \in W_1$ condition (2.17) is satisfied only on a finite number of iterations. This means that for any $\omega \in W_1$ there exists $k_1(\omega)$ such that for $k > k_1(\omega)$ we have $s(k) = s(\omega) = \text{const}$. Therefore any index can enter the set $N_k$ only once for $k > k_1(\omega)$, and for $\omega \in W_1$ the transition from step 3 to step 2 can occur only a finite number of times. Thus, for almost all $\omega \in W_1$ there exists $k_2(\omega) \ge k_1(\omega)$ such that for $k > k_2(\omega)$ there are no transitions from step 3 to step 2, i.e., for $k > k_2(\omega)$ only the new estimates $\hat Q_k^k$ are made, and therefore $\psi_k(\pi^k) = \max_\pi \psi_k(\pi)$, with $\psi_k(\pi)$ as defined in (2.7). According to assumption 3 of the theorem, for almost all $\omega \in W_1$ there exists a sequence $k_r(\omega)$ such that

$$\langle \sigma^{k_r}, x^{k_r+1}\rangle + Q(x^{k_r+1}) - \min_{l \le x \le u}\big[\langle \sigma^{k_r}, x\rangle + Q(x)\big] \to 0.$$

Due to the boundedness of the sequence $\pi^k$ we can assume without loss of generality that $\pi^{k_r} \to \pi^*$. Taking into account the fact that $\psi(\pi)$ and $\psi_k(\pi)$ satisfy the Lipschitz condition uniformly over $\pi$ and $k$, we obtain for $\omega \in W_1$ and $k_r > k_2(\omega)$:

$$\psi_{k_r+1}(\pi^*) \le \psi(\pi^*) + \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}) + \bar\gamma_r \tag{2.20}$$

where $\bar\gamma_r = 2C_2\|\pi^{k_r} - \pi^*\| \to 0$ and $\max(0, \gamma_r) \to 0$ as $r \to \infty$. Condition (2.18) gives, for $k_r > k_2(\omega)$, constants $\delta > 0$ and $\beta = \beta_{s(\omega)} > 0$ bounding from below the conditional probability that the new estimate falls short of the true value by at least $\delta$. Therefore for almost all $\omega \in W_1$ there exist $k_r > k_2(\omega)$ such that

$$\hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}) < -\delta$$

and $\gamma_r + \bar\gamma_r < \delta/2$. This together with (2.20) gives, for sufficiently large $r$,

$$\psi_{k_r}(\pi^*) \le \psi(\pi^*) - \delta/2$$

and therefore $\psi_{k_r}(\pi^{k_r}) \le \psi(\pi^{k_r})$ for sufficiently large $r$ and $\omega \in W_1$. Hence

$$\psi_{k_r+1}(\pi^{k_r+1}) - \psi(\pi^{k_r+1}) \ge \gamma_r + \hat Q_{k_r+1}^{k_r+1} - Q(x^{k_r+1}) + \bar\gamma_r. \tag{2.21}$$

The condition (2.18) also implies that, with $\beta > 0$ and $k_r > k_2(\omega)$, for almost all $\omega \in W_1$ there exist $k_r > k_2(\omega)$ with $|\gamma_r| < \beta/2$. Together with (2.21) this yields, for almost all $\omega \in W_1$ and some $k_r > k_2(\omega)$, an inequality which contradicts our initial assumption. Therefore assumption 2 of Theorem 1 is satisfied, and the proof is complete.
Let us now consider in more detail assumption 3 of Theorem 1 and the specific procedures for selecting the point $x^{k+1}$ at step 4 of the algorithm. These procedures should satisfy assumption 3 of the theorem, namely: with probability 1 there exists a subsequence $k_r$ such that

$$\eta(x^{k_r+1}, \pi^{k_r}) - \min_{l \le x \le u}\eta(x, \pi^{k_r}) \to 0. \tag{2.22}$$

The best choice is $\eta(x^{k+1}, \pi^k) = \min_{l \le x \le u}\eta(x, \pi^k)$, but this is not feasible because of the inaccessibility of exact function values $\eta(x, \pi)$. We shall consider two procedures which do not require objective function values.

1. Random search. Take a probability measure $R$ with nonzero density on the set $l \le x \le u$ and take the successive points $x^1, \ldots, x^k$ as independent observations of a random variable $x$ with distribution $R$. Then (2.22) is fulfilled due to the continuity of $\eta(x, \pi)$.

2. Stochastic quasi-gradient method (Ermoliev [3]). This method will produce a sequence of points $x^{k,s}$ such that

$$\eta(x^{k+1}, \pi^k) - \min_{l \le x \le u}\eta(x, \pi^k) \to 0. \tag{2.23}$$

On each iteration the following calculations are performed at step 4 of the algorithm: starting from $x^{k,0} = x^k$, for $s = 0, 1, \ldots, m_k$ set

$$x_i^{k,s+1} = \begin{cases} l_i & \text{if } x_i^{k,s} - \rho_s\xi_i^s < l_i \\ u_i & \text{if } x_i^{k,s} - \rho_s\xi_i^s > u_i \\ x_i^{k,s} - \rho_s\xi_i^s & \text{otherwise} \end{cases} \tag{2.24}$$

and take $x^{k+1} = x^{k,m_k}$. In particular, it is possible to take

$$\xi^s = \sigma^k - T(\omega^s)^T u^s$$

where the $u^s$ are optimal dual multipliers of the following problem:

$$Q(x^{k,s}, \omega^s) = \min_y\{\langle q(\omega^s), y\rangle : W(\omega^s)y = h(\omega^s) - T(\omega^s)x^{k,s}, \; y \ge 0\} \tag{2.25}$$

and the $\omega^s$ are independent observations of the random parameters. If problem (2.25) has bounded solutions for all $\omega$, $\sum_{s=0}^\infty \rho_s = \infty$, $\sum_{s=0}^\infty \rho_s^2 < \infty$ and $m_k \to \infty$ as $k \to \infty$, then (2.23) is satisfied and, consequently, assumption 3 of Theorem 1 is satisfied too.
3. EXTENSION

The method described in the previous section is applicable not only to stochastic programs with recourse (1.1) but also to more general problems of stochastic programming. Consider the following problem:

$$\min_x \; Ef(x, \omega) \tag{2.26}$$

$$\text{subject to}\;\; p(x) \le 0, \;\; x \in X$$

The method and the results remain essentially the same if we denote $Ef(x, \omega) = Q(x)$ and substitute everywhere in the above discussion $Q(x)$ for $\langle \bar c, x\rangle + Q(x)$ and $p(x)$ for $Ax - b$. The initial points should now satisfy the condition given in assumption 2 of Theorem 1a below. The master problem (2.2) takes the form

$$\min_\lambda \; \sum_{j=1}^k \hat Q_k^j\lambda_j$$

$$\text{s.t.}\;\; \sum_{j=1}^k p(x^j)\lambda_j \le 0, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0$$

where the $\hat Q_k^j$ are estimates of $Ef(x^j, \omega)$. Subproblem (2.3) becomes

$$\min_{x \in X} \; Ef(x, \omega) - \langle \pi^k, p(x)\rangle.$$
Theorem 1a is proved similarly to Theorem 1.

Theorem 1a. Make the following assumptions:

1. The functions $Ef(x, \omega)$ and $p(x)$ are convex, and the set $X$ is compact.

2. There exists $\tilde x \in X$ such that $p(\tilde x) < 0$, and the initial points $x^1, \ldots, x^{m_1}$ are such that

$$\max_{e \ge 0,\, \|e\| = 1}\; \min_{1 \le j \le m_1}\langle e, p(x^j)\rangle < 0.$$

3. $\max\big(|\hat Q_k^k - Ef(x^k, \omega)|,\; \max_{j \in \Lambda_k}|\hat Q_k^j - Ef(x^j, \omega)|\big) = \varepsilon_k \to 0$ a.s.

4. $\displaystyle \liminf_{k \to \infty}\Big[Ef(x^{k+1}, \omega) - \langle \pi^k, p(x^{k+1})\rangle - \min_{x \in X}\big(Ef(x, \omega) - \langle \pi^k, p(x)\rangle\big)\Big] = 0$ a.s.

Then $Ef(\bar x^k, \omega) \to \min\{Ef(x, \omega) : p(x) \le 0, \; x \in X\}$, where $\bar x^k = \sum_{j \in \Lambda_k}\lambda_j^k x^j$, and all accumulation points of the sequence $\bar x^k$ are solutions of the problem (2.26).

Although our primary concern here is with a conceptual algorithm, let us conclude this section with a brief discussion of some considerations which apply in order to make the algorithm implementable.
a) Purging Strategy for Grid Points: The above algorithm assumes that all grid points are retained but, when storage is limited, it will be necessary to periodically remove grid points. This subject has been extensively studied, see Eaves and Zangwill [2] and Topkis [11] in the context of cutting plane algorithms, and similar considerations apply here.

b) Variance of Estimates: When developing estimates $\hat Q_k^j$ using, for example, (2.15) or (2.19), we can also maintain and update the variance of the estimates for each grid point $x^j$. These can then be usefully employed in refining the decision rules at steps 2 and 3.

c) Induced Constraints: When the assumption of complete recourse (i.e. that (1.1b) always has a solution) cannot be verified a priori, it may happen that for some combination of grid point $x^j$ and random parameters $\omega^i$ (in (2.15) and (2.19)) the problem (1.1b) is infeasible. Following Van Slyke and Wets [12], an induced constraint or feasibility cut must then be deduced and introduced into the problem (1.1a) and correspondingly into the master program (2.2). This extension requires further study.
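Consideration b) can be supported with the standard one-pass (Welford) recurrence, which maintains the running mean together with an estimate of the variance of that mean; the observations below are hypothetical.

```python
import math

class MeanVar:
    """One-pass (Welford) running mean and variance for the observations
       Q(x^j, w^i) at a grid point, usable in decision rules at steps 2-3."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def add(self, obs):
        self.n += 1
        delta = obs - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (obs - self.mean)

    def variance_of_mean(self):
        # Estimated variance of the sample-mean estimate at this grid point
        return self.m2 / (self.n - 1) / self.n if self.n > 1 else math.inf

mv = MeanVar()
for obs in [2.0, 2.4, 1.8, 2.2]:
    mv.add(obs)
print(mv.mean, mv.variance_of_mean())
```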
There are also a number of special cases of the general problem (1.1a,b) which permit refinements of the algorithm described above, with a view to enhancing efficiency. One case of practical interest is stochastic programs with recourse and non-stochastic tenders (see Nazareth and Wets [7]), where $T$ ($m_2 \times n_1$) is a fixed matrix. The master/subproblem pair corresponding to (2.2) and (2.3) can then be reformulated as follows:

Master:

$$\min \; \langle \bar c, x\rangle + \sum_{j=1}^k \hat Q_k^j\lambda_j \tag{2.27}$$

$$\text{s.t.}\;\; Ax = b, \quad Tx = \sum_{j=1}^k \chi^j\lambda_j, \quad \sum_{j=1}^k \lambda_j = 1, \quad \lambda_j \ge 0, \quad l \le x \le u$$

where $\hat Q_k^j$ is an estimate of the value $Q(\chi^j)$ at the grid point (tender) $\chi^j$, obtained from the recourse linear program

$$Q(\chi, \omega) = \min_y\{\langle q(\omega), y\rangle : W(\omega)y = h(\omega) - \chi, \; y \ge 0\} \tag{2.28}$$

and $\sigma^k$, $\pi^k$ and $v^k$ are the dual multipliers associated with the optimal solution of the master (2.27).

Subproblem: Consider the (Lagrangian) subproblem

$$\min_{L \le \chi \le U} \; \langle \sigma^k, \chi\rangle + Q(\chi) \tag{2.29}$$

where $L$ and $U$ are any suitable bounds implied by $\chi = Tx$ and $l \le x \le u$. The new tender $\chi^{k+1}$ is again taken to be an "approximate" solution of (2.29), in the sense discussed in step 4, after expression (2.3).

It frequently happens that $m_2 \ll n_1$, i.e. that only a few elements of the problem are stochastic. In this case the above reformulation can considerably enhance efficiency, because the optimization in the subproblem (2.29) and the linear program in (2.28), which must be solved to obtain the estimates $\hat Q_k^j$, are both in a space of relatively low dimension.

REFERENCES
[1] Dantzig, G.B., Linear Programming and Extensions, Princeton University Press (1963).

[2] Eaves, B.C. and W. Zangwill, "Generalized cutting plane algorithms," SIAM J. Control 9 (1971).

[3] Ermoliev, Yu., Methods of Stochastic Programming (in Russian), Nauka, Moscow (1976).

[4] Ermoliev, Yu., A. Gaivoronski and C. Nedeva, "Stochastic programming problems with incomplete information on objective functions," SIAM J. Control and Optimization, vol. 23, no. 5.

[5] Ermoliev, Yu. and A. Gaivoronski, "Stochastic quasigradient methods and their implementation," IIASA Working Paper WP-84-55 (1984).

[6] Nazareth, J.L., "Design and implementation of a stochastic programming optimizer with recourse and tenders," IIASA Working Paper WP-85-063 (1985).

[7] Nazareth, J.L. and R.J-B. Wets, "Algorithms for stochastic programs: the case of non-stochastic tenders," in A. Prékopa and R.J-B. Wets, eds., Mathematical Programming Study 28 (1986).

[8] Parikh, S.C., Lecture notes on stochastic programming, unpublished, University of California, Berkeley (1968).

[9] Polak, E., Theory of Optimal Control and Mathematical Programming, McGraw-Hill, New York (1970).

[10] Shapiro, J.F., Mathematical Programming: Structures and Algorithms, John Wiley, New York (1979).

[11] Topkis, D., "Cutting plane methods without nested constraint sets," Operations Research 18 (1970).

[12] Van Slyke, R. and R.J-B. Wets, "L-shaped linear programs with applications to optimal control and stochastic linear programs," SIAM Journal on Applied Mathematics 17 (1969), 638-663.

[13] Wets, R., "Programming under uncertainty: the complete problem," Z. Wahrsch. verw. Gebiete 4 (1966), 316-339.

[14] Williams, A.C., "Approximation formulas for stochastic linear programming," SIAM Journal on Applied Mathematics 14 (1966), 668-677.