Working Paper
STOCHASTIC PROGRAMMING WITH INCOMPLETE INFORMATION
Jitka Dupačová
February 1986 WP-86-08
International Institute for Applied Systems Analysis
A-2361 Laxenburg, Austria
NOT FOR QUOTATION
WITHOUT THE PERMISSION OF THE AUTHOR
STOCHASTIC PROGRAMMING
WITH INCOMPLETE INFORMATION
February 1986 WP-86-08
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS 2361 Laxenburg, Austria
FOREWORD
One of the activities of the Adaptation and Optimization Project of the System and Decision Sciences Program is to develop mathematical methods and approaches for treating models of systems characterized by limited information about parameter distribution.
This paper presents three approaches which reflect different assumptions about the incomplete knowledge of the distribution and which can be applied to model building as well as to sensitivity analysis, approximation and robustness studies in stochastic programming problems. The suggested methods build a bridge between the purely deterministic approaches of nonlinear programming stability and the tools of mathematical statistics.
Alexander B. Kurzhanski Chairman System and Decision Sciences Program
CONTENTS
Abstract
1. Introduction
2. Examples
3. Nonlinear programming stability results and estimated parameters
4. Contaminated distributions
5. The minimax approach
References
STOCHASTIC PROGRAMMING WITH INCOMPLETE INFORMATION
Jitka Dupačová
Abstract
The possibility of successful applications of stochastic programming decision models has been limited by the assumed complete knowledge of the distribution $F$ of the random parameters as well as by the limited scope of the existing numerical procedures.
We shall introduce selected methods which can be used to deal with the incomplete knowledge of the distribution $F$, to study robustness of the optimal solution and the optimal value of the objective function relative to small changes of the underlying distribution and to get error bounds in approximation schemes.
The research was mostly carried out at the Department of Statistics, Charles University, Prague and it was stimulated by a close collaboration of the author with the ADO project of SDS. The present version of the paper was written at IIASA Laxenburg.
1. Introduction
Quite a large class of stochastic programming decision problems can be transformed to the following mathematical programming problem

$$
\begin{aligned}
&\text{maximize} && g_0(x;F) \\
&\text{subject to} && g_i(x;F) \ge 0, \quad 1 \le i \le m, \\
& && g_i(x;F) = 0, \quad m+1 \le i \le m+p, \\
& && x \in X,
\end{aligned}
\tag{1.1}
$$

where $X \subset R^n$ is a given nonempty set. The functions $g_i$, $0 \le i \le m+p$, do not depend on random parameters directly but by means of their distribution $F$ only.

An example of (1.1) is when a nonlinear program

$$
\begin{aligned}
&\text{maximize} && h_0(x;\omega) \\
&\text{subject to} && h_k(x;\omega) \ge 0, \quad 1 \le k \le l, \\
& && h_k(x;\omega) = 0, \quad l+1 \le k \le s, \\
& && x \in X_0
\end{aligned}
\tag{1.2}
$$

contains random parameters $\omega$ in $h_k(x;\omega)$, $0 \le k \le s$, and the decision $x \in X_0$ has to be chosen before the values of these parameters are observed.
Among others, two well known decision models of stochastic programming can evidently be written in form (1.1):

Stochastic program with recourse

$$
\begin{aligned}
&\text{maximize} && E_F\{h_0(x;\omega) - q(x;\omega)\} \\
&\text{subject to} && x \in X \subset X_0,
\end{aligned}
\tag{1.3}
$$

where the penalty function $q(x;\omega)$ evaluates the loss corresponding to the case that the chosen $x \in X$ does not fulfill the constraints $h_k(x;\omega) \ge 0$, $1 \le k \le l$, $h_k(x;\omega) = 0$, $l+1 \le k \le s$, for the observed values of the random parameters. The set $X \subset X_0$ is defined by induced constraints which guarantee that $q$ is well defined.

Stochastic program with probabilistic constraints

$$
\begin{aligned}
&\text{maximize} && E_F\{h_0(x;\omega)\} \\
&\text{subject to} && P_F\{h_k(x;\omega) \ge 0,\ k \in I_i\} \ge \alpha_i, \quad 1 \le i \le m, \\
& && x \in X \subset X_0,
\end{aligned}
\tag{1.4}
$$

where $I_i \subset \{1, \dots, l\}$, $\alpha_i \in \langle 0,1 \rangle$, $1 \le i \le m$, are given in advance.

For both mentioned basic types of decision models, numerous remarkable theoretical results were achieved and numerical approaches suggested. However, the numerical solution is rather complicated in general, mainly due to the fact that repeated evaluation of function values and gradients is needed, which is rather time consuming and demands special simulation and/or approximation techniques. The question of error bounds is evidently both of practical and theoretical interest.
The optimal solution $x(F)$ and the optimal value of the objective function in (1.1) depend on the chosen type of model and on the distribution $F$ which is usually assumed to be completely known and independent of the chosen decision $x$. However, the distribution $F$ is hardly known completely in real situations. The numerical results obtained should thus be at least complemented by additional information about sensitivity of the optimal solution with respect to eventual changes of the distribution $F$. In the robust case, a small change in the distribution $F$ should cause only a small change in the optimal solution.
A first idea could be to study stability of the optimal solution of program (1.1) with respect to the underlying distribution $F$ directly. However, the space of probability measures provided with a metric corresponding to the weak topology is not a linear one, so that the general results of parametric programming are not applicable directly.
In this paper three approaches will be presented. They reflect different assumptions on the (incomplete) knowledge of the distribution $F$. As we shall see, they may be used to perform sensitivity analysis and postoptimality studies, to get error bounds and to solve problems of stochastic programming under an explicitly given assumption of incomplete knowledge of the distribution $F$.
(i) Assuming that the considered distribution is known to belong to a parametric family of distributions, say $F \in \{F_y,\ y \in Y\}$, we can rewrite program (1.1) making the dependence on the parameter vector $y$ explicit:

$$
\begin{aligned}
&\text{maximize} && g_0(x;y) \\
&\text{subject to} && g_i(x;y) \ge 0, \quad 1 \le i \le m, \\
& && g_i(x;y) = 0, \quad m+1 \le i \le m+p, \\
& && x \in X,
\end{aligned}
\tag{1.5}
$$

where $g_i(x;y)$, $0 \le i \le m+p$, are used instead of $g_i(x;F_y)$, $0 \le i \le m+p$, respectively. The stability of the optimal solution of program (1.5) with respect to the parameter vector $y \in Y$ can be studied to a certain extent through the methods of parametric programming and through the methods developed for nonlinear programming stability studies (see e.g. Armacost and Fiacco (1974), Garstka
Having in mind the statistical background of the parameter values, which are typically statistical estimates of the true parameter values, the results of parametric programming have been complemented by statistical approaches (see Dupačová (1983), (1984) for problem (1.3), Dupačová (1986a) for problem (1.4)). The results are summarized in Section 3.
(ii) The local behaviour of the optimal solution $x(F)$ with respect to small changes of the underlying distribution $F$ can be studied via t-contamination of $F$ by a suitably chosen distribution $G$, i.e., instead of $F$, distributions of the form

$$F_t = (1-t)F + tG, \quad 0 \le t \le 1$$

are considered (see Dupačová (1983), (1985a) for problem (1.3), Dupačová (1986a) for problem (1.4)). The original stability problem thus reduces to that linearly perturbed by a scalar parameter $t$. This approach gives a basis for performing sensitivity analysis of the optimal solution $x(F)$ and for post-optimality studies. (See Section 4.)
(iii) In typical cases of incomplete knowledge of the distribution, $F$ is known to belong to a specified set $\mathcal{F}$ of distributions. One approach is via minimax. We shall discuss in Section 5 the case when the constraints in (1.1) do not depend on $F$ and are incorporated into $X$. For a convex compact set $\mathcal{F}$, the minimax solution $x(F^*)$ is the optimal solution of the problem (1.1) corresponding to the least favourable distribution $F^* \in \mathcal{F}$ and, similarly, the maximax solution $x(F^{**})$ corresponds to the most favourable distribution $F^{**} \in \mathcal{F}$. Even without compactness of $\mathcal{F}$ we may get minimax and maximax bounds

$$\max_{x \in X} \inf_{F \in \mathcal{F}} g_0(x;F)$$

and

$$\max_{x \in X} \sup_{F \in \mathcal{F}} g_0(x;F)$$

which provide an interval estimate for the optimal value $\max_{x \in X} g_0(x;F)$ for any $F \in \mathcal{F}$. This fact can be used to draw conclusions about the dependence of the optimal solution on changes of $F$ within the given set $\mathcal{F}$ and to get error bounds in numerical methods.
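The interval estimate can be illustrated on a small finite example. The following Python sketch is only an illustration under assumptions that are not part of the paper: the set of distributions is a hypothetical finite family of discrete distributions, the decision set is a finite grid, and the objective `expected_profit` is an invented expectation-type function. It computes the two bounds and the optimal values for each member of the family.

```python
# Toy illustration; every number and function name below is an assumption.
# A discrete distribution is a list of (outcome, probability) pairs.

def expected_profit(x, dist, price=3.0, cost=1.0):
    """Hypothetical g0(x; F) = E_F[price * min(x, omega) - cost * x]."""
    return sum(p * (price * min(x, w) - cost * x) for w, p in dist)

def minimax_maximax_bounds(decisions, family, g0):
    """Return (max_x min_F g0(x; F), max_x max_F g0(x; F)) over finite sets."""
    lower = max(min(g0(x, F) for F in family) for x in decisions)
    upper = max(max(g0(x, F) for F in family) for x in decisions)
    return lower, upper

# Hypothetical family of three distributions of the random parameter.
family = [
    [(1.0, 0.5), (3.0, 0.5)],
    [(2.0, 1.0)],
    [(0.0, 0.25), (2.0, 0.5), (4.0, 0.25)],
]
decisions = [0.5 * k for k in range(9)]  # grid 0, 0.5, ..., 4

lower, upper = minimax_maximax_bounds(decisions, family, expected_profit)

# Optimal value of the program for each single member of the family.
optimal_values = [max(expected_profit(x, F) for x in decisions) for F in family]
```

For every member of the family the optimal value lies between `lower` and `upper`, whichever member is the true distribution; a wider family gives a wider interval.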
In specific cases (reliability, worst case analysis) the minimax solution itself is of great interest. In addition, it is possible to combine complete and incomplete knowledge of the distribution of specific random parameters of the given problem (Dupačová (1985b)). Even in the minimax approach, however, the solution depends on the choice of the set of distributions $\mathcal{F}$ and it is necessary to choose such a set which fits the presented problem as well as possible, using all the available information. For getting error bounds (as a part of an iterative algorithm) one probably cannot increase the level of information too much.
As the set $\mathcal{F}$ is often defined by prescribing values of certain moments of the distributions $F \in \mathcal{F}$, the results of the moment problem can be used to get computable minimax/maximax solutions and bounds (see Dupačová (1977), (1978)). When the prescribed values $\eta$ of moments are not known precisely enough, namely, when they are estimated on the basis of observed data, the problem of stability of the minimax solution comes to the fore and to solve it, the methods mentioned under (i) can be applied.
2. Examples
To get some motivation, let us first consider a few examples.

Example 2.1. The cattle-feed problem (van de Panne and Popp (1963)). The problem is to find the amounts $x_j$ of input $j$ which lead to the minimum cost of the final mixture in which restraints on the nutrition contents are satisfied. In the formulation, the protein content weight percentages per ton, $a_j$, for each of four considered inputs are assumed to be normally distributed random variables with means $\mu_j$ and variances $\sigma_j^2$, $1 \le j \le 4$. Besides deterministic linear constraints, one probabilistic constraint

is constructed.

Under normality assumption, (2.1) can be written in the following way

where $\Phi^{-1}(\alpha)$ denotes the $\alpha$-quantile of the $N(0,1)$ distribution. The parameters $\mu_j$, $\sigma_j^2$, $1 \le j \le 4$, are estimated by sampling and in applications, the estimates are used instead of the true parameter values. In Armacost and Fiacco (1974) the problem of stability of the optimal solution with respect to parameter values was solved, namely, derivatives of the optimal solution with respect to the parameter values were obtained.

Having in mind the statistical background of the considered parameters we shall aim to complement the deterministic stability results by statistical ones.
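The passage from a probabilistic constraint to its deterministic equivalent under normality can be sketched numerically. The Python fragment below is a hedged illustration, not the formulation of the cited paper: it assumes a constraint of the form $P\{\sum_j a_j x_j \ge b\} \ge \alpha$ with independent $a_j \sim N(\mu_j, \sigma_j^2)$, and all numerical values, as well as the function names, are hypothetical. Under these assumptions the constraint holds exactly when $\sum_j \mu_j x_j - \Phi^{-1}(\alpha)\,(\sum_j \sigma_j^2 x_j^2)^{1/2} \ge b$.

```python
from math import sqrt
from statistics import NormalDist

def chance_constraint_slack(x, mu, sigma2, b, alpha):
    """Slack of the deterministic equivalent of P{sum_j a_j x_j >= b} >= alpha
    for independent a_j ~ N(mu_j, sigma2_j); the constraint holds iff slack >= 0."""
    mean = sum(m * xj for m, xj in zip(mu, x))
    std = sqrt(sum(s2 * xj * xj for s2, xj in zip(sigma2, x)))
    z = NormalDist().inv_cdf(alpha)  # alpha-quantile of N(0, 1)
    return mean - z * std - b

def satisfaction_probability(x, mu, sigma2, b):
    """P{sum_j a_j x_j >= b} computed directly from the normal distribution."""
    mean = sum(m * xj for m, xj in zip(mu, x))
    std = sqrt(sum(s2 * xj * xj for s2, xj in zip(sigma2, x)))
    return 1.0 - NormalDist(mean, std).cdf(b)

# Hypothetical data for four inputs (not taken from the cited paper).
mu = [12.0, 12.0, 42.0, 52.0]
sigma2 = [0.3, 0.2, 20.0, 0.6]
x = [0.6, 0.0, 0.3, 0.1]
alpha = 0.95

slack_ok = chance_constraint_slack(x, mu, sigma2, 21.0, alpha)
prob_ok = satisfaction_probability(x, mu, sigma2, 21.0)
slack_bad = chance_constraint_slack(x, mu, sigma2, 24.0, alpha)
prob_bad = satisfaction_probability(x, mu, sigma2, 24.0)
```

The sign of the slack agrees with the direct probability check in both cases, and the slack depends smoothly on the estimated parameters $\mu_j$, $\sigma_j^2$, which is what makes the stability analysis with respect to these parameters possible.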
Example 2.2. A simple stochastic model of water reservoir design. The problem is to minimize the required capacity $c$ of the reservoir subject to the following constraints:

Freeboard constraint

$$P\{s_i \le c - v_i\} \ge \alpha_1, \quad 1 \le i \le n,$$

Minimum storage constraint

$$P\{s_i \ge m_i\} \ge \alpha_2, \quad 1 \le i \le n,$$

Minimum release constraint

$$P\{x_i \ge y_i\} \ge \alpha_3, \quad 1 \le i \le n,$$

where, in the particular time interval, $s_i$ is the storage, $v_i$ is the flood control freeboard storage, $m_i$ is the minimum storage, $x_i$ is the total release and $y_i$ is the prescribed minimum release.

Using a linear decision rule, the variables $x_i$, $s_i$ are expressed via the monthly inflows $r_i$ whose marginal distributions $F_i$ are supposed to be known. Usually, the log-normal distribution is used and its parameters are estimated on the basis of relatively long time series of the observed monthly inflows. However, in particular months, specific deviations from the assumed distribution may appear: in spring, the distribution may be relatively close to the normal one. Under these circumstances, we can accept the hypothesis that the true marginal distributions are mixtures of given log-normal and normal ones. We are interested in describing changes of the original optimal decision due to the influence of the alternative distribution.

Even in this simple example, three different types of variables typical for stochastic models of water resources systems can be distinguished at first sight:

- constant coefficients and parameters, such as system reliabilities, flood control freeboard storage, minimum storage or rule curve and penalty coefficients in the corresponding recourse model
- random variables with a known distribution (i.e., with a well estimated distribution), e.g. the monthly inflows
- random variables with an incomplete knowledge of distribution, such as the future demands (Dupačová (1985b)).

A deeper insight into the modelled real life problem, however, leads to the conclusion that the parameters are far from being known precisely, that the distribution has been estimated from time series of data which are observed with a relatively high measurement error or that the type of the distribution follows from the past experience and the parameters of the distribution are estimated on the basis of random input data. On the other hand, the final decision should not be too sensitive to the changes of the parameters and distributions; it should be robust enough.
Example 2.3. The STABIL model (Prékopa et al. (1980)) was applied to the fourth Five-Year Plan of the electrical energy sector of Hungary. Besides numerous deterministic linear constraints, one joint probabilistic constraint

was used; the four right-hand sides $\omega_i$, $1 \le i \le 4$, were regarded as stochastic and the joint distribution of these random variables was supposed to be normal. Due to the lack of reliable data, some of the correlations could not be given precisely enough. That is why two alternative correlation matrices were considered in Prékopa et al. (1980) and the numerical results were compared.

Alternatively, instead of given normal distributions $N(\mu, \Sigma_1)$ or $N(\mu, \Sigma_2)$, their mixture

$$(1-t)\, N(\mu, \Sigma_1) + t\, N(\mu, \Sigma_2) \tag{2.2}$$

can be considered, which helps to study the changes of the optimal solution in principle for $0 \le t \le 1$; (2.2) corresponds to the gross error or contamination model.

Example 2.4. Project planning. The problem is to fix the completion time $T$ of the given project. The reduction of the completion time is profitable at the rate of $c \ge 0$ and the eventual delay in the completion is penalized with $q \ge c$ per time unit. The project is represented by a network whose arcs correspond to the planned activities. Assume that there is one sink and one source only and that the activities are numbered by indices $1 \le i \le n$. Whereas the structure of the project (the network) is supposed to be given, the completion times, say $\omega_i$, of the activities are random variables and so is the total completion time $T(\omega)$. According to our formulation, the decision $T$ has to be made before the realizations of the $\omega_i$'s are known and one has to solve the stochastic program

$$\min_T \left\{ cT + q\, E_F [T(\omega) - T]^+ \right\} \tag{2.3}$$

where $F$ denotes the joint distribution of the n-dimensional random vector $\omega$ and the explicit form of $T(\omega)$ can be derived.

In practice, the distribution $F$ is hardly known completely. Using the PERT method, one usually solves the problem (2.3) under the assumption that the random completion times $\omega_i$ are independently distributed with a Beta distribution over a given interval. The parameters $p$, $q$ of the Beta distribution are usually fixed on the basis of the available information about some characteristics of the distribution, such as the mean value, mode and variance.
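For a problem of the form (2.3) the dependence of the decision on the distribution is easy to see. For a continuous distribution and $q > 0$, setting the derivative $c - q\,P\{T(\omega) > T\}$ of the objective to zero gives the optimal $T$ as the $(1 - c/q)$-quantile of the distribution of $T(\omega)$. The Python sketch below is an illustration only: it replaces $F$ by a hypothetical sample of total completion times (the sample values and function names are assumptions) and picks the corresponding empirical quantile.

```python
from math import ceil

def expected_cost(T, samples, c, q):
    """Sample-average version of c*T + q*E[(T(omega) - T)^+]."""
    overrun = sum(max(s - T, 0.0) for s in samples) / len(samples)
    return c * T + q * overrun

def optimal_completion_time(samples, c, q):
    """Smallest sample value T whose empirical P{T(omega) <= T} is >= 1 - c/q."""
    ordered = sorted(samples)
    k = max(1, ceil(len(ordered) * (1.0 - c / q)))
    return ordered[k - 1]

# Hypothetical observed total completion times and cost rates (q >= c > 0).
samples = [float(d) for d in range(1, 11)]
c, q = 1.0, 4.0

T_star = optimal_completion_time(samples, c, q)
cost_at_optimum = expected_cost(T_star, samples, c, q)
```

Because the sample-average objective is piecewise linear and convex with kinks at the sample points, comparing `cost_at_optimum` with the cost at every sample value confirms the choice. The optimal decision is a quantile of $F$, so any misspecification of the tails of $F$ is passed directly to $T$.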
Example 2.5. (Seppälä (1975)). In his stochastic multi-facility problem Seppälä considers the case of stochastically dependent weight coefficients. In order to eliminate the estimation of the correlation coefficients, he introduces a parameter to the model which weights the totally correlated case and the uncorrelated one.
3. Nonlinear programming stability results and estimated parameters

As our starting point, consider the following deterministic nonlinear program depending on a vector parameter $y$:

Let $Y \subset R^q$ be an open set, $h = (h_0, \dots, h_{m+p}) : R^n \times Y \to R^{m+p+1}$ be given continuously differentiable functions. For a fixed $y \in Y$, the problem is to

$$
\begin{aligned}
&\text{maximize} && h_0(x;y) \\
&\text{subject to} && h_i(x;y) \ge 0, \quad 1 \le i \le m, \\
& && h_i(x;y) = 0, \quad m+1 \le i \le m+p.
\end{aligned}
\tag{$M(y)$}
$$

The corresponding Lagrange function has the form

$$L(x,u,v;y) = h_0(x;y) + \sum_{i=1}^{m} u_i h_i(x;y) + \sum_{i=1}^{p} v_i h_{m+i}(x;y)$$

and by $w(y) = [x(y), u(y), v(y)] \in R^n \times R^m_+ \times R^p$, the Kuhn-Tucker point of $M(y)$ will be denoted. The knowledge of the Kuhn-Tucker conditions of the first and second order as well as the knowledge of the linear independence condition and the strict complementarity conditions (Fiacco (1976), Robinson (1980)) will be assumed throughout the text.

Theorem 3.1. Let $y^0 \in Y$ and let $w(y^0)$ be the Kuhn-Tucker point of $M(y^0)$ for which the Kuhn-Tucker conditions of the first and second order, the linear independence condition and the strict complementarity conditions hold true. Let on a neighbourhood of $[x(y^0); y^0]$, $h_i$, $0 \le i \le m+p$, be twice continuously differentiable with respect to $x$ and let continuous derivatives $\dfrac{\partial^2 h_i(x;y)}{\partial y_k\, \partial x_j}$ exist for all $1 \le k \le q$, $1 \le j \le n$, $0 \le i \le m+p$.

Then the following statements hold true:

(a) For $y$ in a neighbourhood $O_1(y^0)$ of $y^0$, there exists a unique once continuously differentiable function $w(y) = [x(y), u(y), v(y)]$ satisfying the Kuhn-Tucker conditions of the first and second order, the linear independence condition and the strict complementarity conditions for $M(y)$.

(b) Let $I(y) \subset \{1, \dots, m\}$ contain the indices of the active inequality constraints $h_i(x(y);y) = 0$, $i \in I(y)$, and denote by

$$\nabla_x h_I(x;y) = [\nabla_x h_i(x;y),\ i \in I(y),\ \nabla_x h_i(x;y),\ m+1 \le i \le m+p],$$

$$\nabla_y h_I(x;y) = [\nabla_y h_i(x;y),\ i \in I(y),\ \nabla_y h_i(x;y),\ m+1 \le i \le m+p].$$
Let further

and the remaining components of $\partial w(y) / \partial y$ equal to 0.
The statements of Theorem 3.1 are a modification of results by Fiacco (1976) and Robinson (1974). Due to the assumptions, the implicit function theorem can be applied to the system of equations which correspond to the active constraints in the Kuhn-Tucker conditions of the first order. Namely, the strict complementarity conditions play an important role, reducing locally the program $M(y^0)$ to a classical maximization problem with equality constraints.
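The implicit function argument can be followed on a small example. The Python sketch below uses a hypothetical toy program, maximize $-x_1^2 - x_2^2$ subject to $x_1 + x_2 - y = 0$, which is an assumption made for illustration only. Its first-order Kuhn-Tucker system in $w = (x_1, x_2, v)$ is $-2x_1 + v = 0$, $-2x_2 + v = 0$, $x_1 + x_2 - y = 0$, and differentiating this system with respect to $y$ gives a linear system for $dw/dy$. The helper names `solve_linear` and `kkt_sensitivity` are illustrative.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * sol = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol

def kkt_sensitivity():
    """dw/dy for the toy program, from the differentiated Kuhn-Tucker system."""
    # Jacobian of the Kuhn-Tucker system with respect to w = (x1, x2, v).
    jac_w = [[-2.0, 0.0, 1.0],
             [0.0, -2.0, 1.0],
             [1.0, 1.0, 0.0]]
    # Derivative of the Kuhn-Tucker system with respect to the parameter y.
    jac_y = [0.0, 0.0, -1.0]
    # Implicit function theorem: jac_w * dw/dy = -jac_y.
    return solve_linear(jac_w, [-b for b in jac_y])

dw_dy = kkt_sensitivity()
```

For this toy program the Kuhn-Tucker point is available in closed form, $x(y) = (y/2, y/2)$, $v(y) = y$, so the computed derivative can be compared with $(1/2, 1/2, 1)$.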
The assumptions can be weakened using results by Robinson (1980): Without assuming the strict complementarity conditions in $M(y)$, let us denote

$$I^+(y) = \{ i \in I(y) : u_i(y) > 0 \}$$

and formulate the strong second order sufficient condition:

For each $n \ne 0$ with

$$n^T \nabla_x h_i(x;y) = 0, \quad i \in I^+(y), \qquad n^T \nabla_x h_i(x;y) = 0, \quad m+1 \le i \le m+p,$$

the inequality $n^T \nabla^2_{xx} L(w(y);y)\, n < 0$ holds true.

Except for the differentiability of the Kuhn-Tucker points $w(y)$, the first assertion of Theorem 3.1 can be reformulated in a parallel way. The differentiability property was studied, e.g., by Jittorntrum (1984). It is possible to get directional derivatives of $w(y)$ in any direction under the strong second order sufficient condition without assuming strict complementarity conditions. We shall use this result later in connection with the contamination method (see Section 4). The most general result on differentiability is due to Robinson (1984); for its application see the forthcoming paper Dupačová (1986b).
As we shall see later, the parameter vector $y$ may correspond to the parameters of the underlying distribution $F$ (see Theorem 3.2), to the parameter of contamination (see Section 4) and, eventually, to the probability levels $\alpha_i$, $1 \le i \le m$, in (1.4) or to other parameters used to build a specific decision model of stochastic programming.

Assume now that the parameter vector $y$ in $M(y)$ is connected with statistical assumptions about the distribution $F$ of random coefficients in a stochastic programming decision model. This is typically the case when $F$ is known to belong to a parametric family of distributions $\{F_y,\ y \in Y\}$, so that $y$ is the parameter vector identifying the distribution.

For the stochastic program with recourse (1.3) it means that for a fixed distribution $F_y$, $M(y)$ is the program

$$\text{maximize } g_0(x;y) := E_{F_y}[h_0(x;\omega) - q(x;\omega)]$$

on a set $X$ which does not depend on $F_y$, e.g.,

$$X = \{ x \in R^n : g_i(x) \ge 0,\ 1 \le i \le m,\ g_i(x) = 0,\ m+1 \le i \le m+p \};$$

for the stochastic program with probabilistic constraints (1.4), $M(y)$ is the program

$$
\begin{aligned}
&\text{maximize} && g_0(x;y) := E_{F_y}\{h_0(x;\omega)\} \\
&\text{subject to} && g_i(x;y) := P_{F_y}\{h_k(x;\omega) \ge 0,\ k \in I_i\} - \alpha_i \ge 0, \quad 1 \le i \le m, \\
& && x \in X_0.
\end{aligned}
$$

In general our aim is to solve program (1.5) for the true parameter vector, say $\eta \in Y$. However, our decision can only be based on the knowledge of an estimate, say $y^N$, of $\eta$. As a result, the substitute program $M(y^N)$ is solved instead of $M(\eta)$. Under the asymptotic normality assumption on the distribution of the estimate $y^N$ in $M(y^N)$, the deterministic stability results of Theorem 3.1 can be complemented by statistical ones.
Theorem 3.2. Let $y^N$ be an asymptotically normally distributed estimate of the true parameter vector $\eta$ that is based on the sample of size $N$:

$$\sqrt{N}\,(y^N - \eta) \sim N(0, \Sigma)$$

with a known variance matrix $\Sigma$. Let the assumptions of Theorem 3.1 be fulfilled for $M(\eta)$. Then the optimal solution $x(y^N)$ of $M(y^N)$ is asymptotically normal

$$\sqrt{N}\,(x(y^N) - x(\eta)) \sim N(0, V) \tag{3.4}$$

with the variance matrix

$$V = \left[\frac{\partial x(\eta)}{\partial y}\right] \Sigma \left[\frac{\partial x(\eta)}{\partial y}\right]^T,$$

where $\left[\dfrac{\partial x(\eta)}{\partial y}\right]$ is the $(n, q)$ submatrix of (3.3).

Proof. Under assumptions of Theorem 3.1, $x(y)$ is a continuously differentiable (vector) function on a neighbourhood of $\eta$. Using the normality assumption and the $\delta$-method [Rao, 1973, p. 388], we get the result immediately.

Remark 3.3. All elements of $\left[\dfrac{\partial x(y)}{\partial y}\right]$ are continuous on a neighbourhood of $\eta$, so that the asymptotic distribution (3.4) can be substituted by

$$N\!\left(0,\ \left[\frac{\partial x(y^N)}{\partial y}\right] \Sigma \left[\frac{\partial x(y^N)}{\partial y}\right]^T\right),$$

see Rao (1973, p. 388).
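The variance matrix in (3.4) is a product of the Jacobian of the solution map with the variance matrix of the estimate. The Python sketch below illustrates this computation under assumptions that are not from the paper: the solution map `toy_solution` belongs to a hypothetical one-dimensional program, maximize $-(x - y_1 y_2)^2$, whose solution is $x(y) = y_1 y_2$; the Jacobian is approximated by central finite differences instead of formula (3.3); the parameter values and the matrix `sigma` are invented.

```python
def jacobian(solution_map, eta, step=1e-6):
    """Central finite-difference Jacobian of solution_map at eta (n rows, q columns)."""
    n = len(solution_map(eta))
    cols = []
    for k in range(len(eta)):
        up = list(eta)
        up[k] += step
        down = list(eta)
        down[k] -= step
        f_up, f_down = solution_map(up), solution_map(down)
        cols.append([(a - b) / (2.0 * step) for a, b in zip(f_up, f_down)])
    # cols[k][j] approximates d x_j / d y_k; transpose so that rows are indexed by j.
    return [[cols[k][j] for k in range(len(eta))] for j in range(n)]

def asymptotic_variance(jac, sigma):
    """V = J * Sigma * J^T for an (n, q) Jacobian J and a (q, q) matrix Sigma."""
    n, q = len(jac), len(sigma)
    j_sigma = [[sum(jac[i][k] * sigma[k][l] for k in range(q)) for l in range(q)]
               for i in range(n)]
    return [[sum(j_sigma[i][l] * jac[r][l] for l in range(q)) for r in range(n)]
            for i in range(n)]

def toy_solution(y):
    """x(y) for the hypothetical program: maximize -(x - y1*y2)^2."""
    return [y[0] * y[1]]

eta = [2.0, 3.0]                   # hypothetical true parameter vector
sigma = [[0.5, 0.0], [0.0, 0.1]]   # hypothetical variance matrix of the estimate

J = jacobian(toy_solution, eta)
V = asymptotic_variance(J, sigma)
```

Here the exact Jacobian is $(y_2, y_1) = (3, 2)$, so $V = 9 \cdot 0.5 + 4 \cdot 0.1 = 4.9$. In line with Remark 3.3, in practice `eta` would be replaced by the estimate $y^N$.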
Example 3.4. The application of Theorem 3.2 to Example 2.1 is straightforward. Let $y$ be the vector consisting of asymptotically normal estimates $s_j^2$, $1 \le j \le 4$, of the true variances $\sigma_j^2$, $1 \le j \le 4$. According to Theorem 3.1, the derivatives $\dfrac{\partial x}{\partial y}$ exist and their values were obtained by Armacost and Fiacco (1974). We have thus asymptotic normality of the optimal solution. To get the variance matrix of the resulting distribution, the variance matrix $\Sigma$ (diagonal in our case) should be known besides $\dfrac{\partial x}{\partial y}$.
Special cases 3.5. In some special cases, it is possible to get explicit formulas for the derivatives $\dfrac{\partial x}{\partial y}$ and thus for the variance matrix $V$ of the asymptotic distribution (3.4). We shall introduce the results applied to the simple recourse problem (see Dupačová (1984)):

$$\text{maximize } g_0(x;y) := c^T x - E_{F_y}\left\{ \sum_{i=1}^{m} q_i \left[ \sum_{j=1}^{n} a_{ij} x_j - \omega_i \right]^+ \right\}$$

on the set

$$X = \{ x \in R^n : Px = p,\ x \ge 0 \},$$

where $P$ is a given $(r, n)$ matrix of rank $r$, $c$ and $p$ are fixed vectors, $q_i > 0$, $1 \le i \le m$, are given and $A = (a_{ij})$ is of full column rank.

To get regularity we assume that $X$ is nonempty and bounded with nondegenerate vertices. Further we assume asymptotic normality of the estimates $y^N$ of the true parameter vector $\eta$. The differentiability properties of $g_0(x;\eta)$ in a neighbourhood of $[x(\eta), \eta]$ are implied by assuming that the marginal densities $f_i$ are continuous and positive in neighbourhoods of the points $\left[ \sum_{j=1}^{n} a_{ij} x_j(\eta);\ \eta \right]$, $1 \le i \le m$, respectively.
Two types of parametric families will be considered:

3.5.1. $y_i$, $1 \le i \le m$, are location parameters. Then we have for the nonzero components $x_j(\eta)$, $j \in J$, of the optimal solution $x(\eta)$
where

$$P_J = (p_{ij})_{1 \le i \le r,\ j \in J}, \quad C = -A^T Q A, \quad B = A^T Q$$

with
3.5.2. $y_i$, $1 \le i \le m$, are scale parameters, $y_i > 0 \ \forall i$. Then

where

$$C = -A^T Q A, \quad B = A^T Q \, \mathrm{diag}\left\{ \sum_{j} a_{ij} x_j(\eta),\ 1 \le i \le m \right\}$$

and $Q$ is given by (3.5).
4. Contaminated distributions
Throughout this section, the functions $g_i$, $0 \le i \le m+p$, in (1.1) will be assumed to depend linearly on the distribution $F$. This assumption is evidently satisfied for the stochastic programs with recourse as well as for those with probabilistic constraints, and in all cases when $g_i$ are expectations of suitable functions derived from $h_i$. Furthermore, we shall assume that $X = R^n$; it means only that the original deterministic constraints and the induced ones have been incorporated into the explicit constraints in (1.1) (with $g_i(x;F)$ independent of $F$, of course).

The local behaviour of the optimal solution $x(F)$ of the program (1.1) with respect to small changes of the distribution $F$ can be studied via t-contamination of the distribution $F$ by a suitably chosen distribution $G$, i.e., instead of $F$, distributions of the form

$$F_t = (1-t)F + tG, \quad 0 \le t \le 1 \tag{4.1}$$

will be considered. In (4.1), $F_t$ is called distribution $F$ t-contaminated by distribution $G$. Due to our assumption, the original stability problem thus reduces to that linearly perturbed by a scalar parameter $t \in \langle 0, 1 \rangle$:

$$
\begin{aligned}
&\text{maximize} && (1-t)\, g_0(x;F) + t\, g_0(x;G) \\
&\text{subject to} && (1-t)\, g_i(x;F) + t\, g_i(x;G) \ge 0, \quad 1 \le i \le m, \\
& && (1-t)\, g_i(x;F) + t\, g_i(x;G) = 0, \quad m+1 \le i \le m+p.
\end{aligned}
\tag{4.2}
$$
In principle, i t is possible t o g e t t h e t r a j e c t o r y of t h e optimal solutions z (Ft ) , 0 5
t
5 1; f o r a n a p p r o p r i a t e method see e.g. G f r e r e r et al. (1983).We shall aim t o obtain t h e ~ S t e a u x differential d z ( F ; G -F) of t h e optimal solution of (1.1) in t h e direction of G -F. To g e t t h e explicit r e s u l t s , one h a s t o check t h e differentiability and r e g u l a r i t y assumptions of Theorem 3.1 and t o com- p u t e matrices B ( O ) , D(0) corresponding t o t h e contamination p a r a m e t e r
t =
0 .The knowledge of t h e Ggteaux differential of z ( F ) at F in t h e direction of G-F is useful not only f o r t h e f i r s t o r d e r approximation of t h e optimal solutions corresponding t o distributions belonging t o a neighbourhood of F b u t a l s o f o r d e e p e r statistical conclusions on robustness, namely, in connection with statistical p r o p e r t i e s of t h e estimate z ( F , ) of z ( F ) , which i s based on t h e empirical distribu- tion F,. F o r fixed c o n s t r a i n t s in (4.2) and f o r t h e special choices G
=
6, (degen- e r a t e d distribution c o n c e n t r a t e d at o n e point w ) , t h e Ggteaux differential d z ( F ; 6, -F) c o r r e s p o n d s t o t h e influence c u r v e R F ( w ) widely used in asymptotic statistics. Different c h a r a c t e r i s t i c s of R F ( w ) suggested by Hampel (1974) measure t h e e f f e c t of contamination of t h e d a t a by g r o s s e r r o r s , t h e local e f f e c t of round- ing o r grouping of t h e observations, etc. For a n example see ~ u p a z o v d (1985a).Theorem 4.1. For t h e program maximize go
( z
; F )s u b j e c t t o gi
(z
; F ) 2 0 , 1 5 i 5 m , g i ( z ; F ) = O , m + 1 5 i S m + passume:
( i ) g f ( 0 ; F ) :
R n
-,R '
a r e twice c o n t i n u o u s l y d m e r e n t i a b l e , 0s
is
m + p , ( i i ) t h e K u h n - i k c k e r c o n d i t i o n s of t h e f i r s t a n d second order, t h e l i n e a rindependence c o n d i t i o n a n d t h e s t r i c t complementarity c o n d i t i o n s a r e fulfilled for w ( F )
= [ z
( F ) , u ( F ) , v ( F ) ] ER n
XRT
XRP
,(iii) there i s a neighbourhood 19(z ( F ) )
c R n
o n w h i c h gi (* ; G ) , 0s
is
m +p a r e twice c o n t i n u o u s l y dinerentiable..
Then:
( a ) There i s a neighbourhood 19(w ( F ) )
c R n
xRT
XRP
, a real n u m b e rto >
0 a n d a c o n t i n u o u s f u n c t i o n w :<
0 ,t o )
-, 19(w ( F ) ) , w ( 0 )=
w ( F ) s u c h t h a t for a n yt
E<
O , t o ) , w ( t )=
[ x ( t ) ,u ( t )
, v ( t ) ] i s t h e Kuhn-Tucker p o i n t of(4.2) for w h i c h t h e second order sugpicient condition, t h e l i n e a r i n d e p e n - dence c o n d i t i o n a n d t h e s t r i c t complementarity c o n d i t i o n s a r e fulfilled.
(b) The Gâteaux differential $dw(F; G-F)$ of the Kuhn-Tucker point $w(F)$ of (4.3) in the direction of $G - F$ is given by

$$dw_I(F; G-F) = -\begin{pmatrix} \nabla^2_{xx} L(w(F); F) & \nabla_x g_I(x(F); F) \\ \nabla_x g_I(x(F); F)^T & 0 \end{pmatrix}^{-1} \begin{pmatrix} \nabla_x L(w(F); G) \\ g_I(x(F); G) \end{pmatrix}. \tag{4.4}$$

The remaining components of $dw(F; G-F)$, which correspond to the non-active constraints in (4.3), are equal to 0.
Proof is a straightforward application of Theorem 3.1. We took the liberty of adapting the notation to our case; namely

$$w_I(F) = [x(F),\ u_i(F),\ i \in I(F),\ v(F)]$$

with $I(F) \subset \{1, \dots, m\}$ containing the indices of the active inequality constraints $g_i(x(F); F) = 0$; $g_I(x(F); G)$ and $g_I(x(F); F)$ are vectors consisting of the components $g_i(x(F); G)$ and $g_i(x(F); F)$ for $i \in I(F)$, $m+1 \le i \le m+p$, respectively; $\nabla_x g_I(x(F); F)$ is a matrix consisting of the columns $\nabla_x g_i(x(F); F)$ for $i \in I(F)$ and $m+1 \le i \le m+p$.

Due to the fact that (4.2) is linearly perturbed, we have

$$\nabla^2_{tx} L(w; t) = \nabla_x L(w; G) - \nabla_x L(w; F), \qquad \nabla_t g_I(x; t) = g_I(x; G) - g_I(x; F),$$

so that, since $\nabla_x L(w(F); F) = 0$ and $g_I(x(F); F) = 0$ at the Kuhn-Tucker point,

$$\nabla^2_{tx} L(w(F); t) = \nabla_x L(w(F); G), \qquad \nabla_t g_I(x(F); t) = g_I(x(F); G).$$

Remark 4.2. For fixed constraints in (4.2), i.e., for $g_i(x; F)$ independent of $F$, we have evidently $g_i(x(F); G) = g_i(x(F); F) = 0$ for $i \in I(F)$ or $m+1 \le i \le m+p$ in (4.4). In the case of the stochastic program with recourse

$$\text{maximize } g_0(x; F) \text{ on a set } X$$

described by fixed constraints $g_i(x) \ge 0$, $1 \le i \le m$, $g_i(x) = 0$, $m+1 \le i \le m+p$, we have thus
Theorem 4.3. Let the assumptions of Theorem 4.1 hold true.

(a) Let the matrix $L = \nabla^2_{xx} L(w(F); F)$ be nonsingular. Then the Gâteaux differential of the isolated local maximizer $x(F)$ of (4.3) in the direction of $G - F$ is given by

$$dx(F; G-F) = -C^{-1} \nabla_x L(w(F); G)$$

where

$$C^{-1} = [\,I - L^{-1} P (P^T L^{-1} P)^{-1} P^T\,]\, L^{-1}, \qquad P = \nabla_x g_I(x(F); F),$$

and $I$ is the $n$-dimensional unit matrix.

(b) Let the matrix $P = \nabla_x g_I(x(F); F)$ be of rank $n$. Then the Gâteaux differential of the isolated local maximizer $x(F)$ of (4.3) in the direction of $G - F$ is given by

$$dx(F; G-F) = -(P^T)^{-1} g_I(x(F); G).$$

Proof follows from (4.4) by well known formulas for inversion of the matrix

$$\begin{pmatrix} L & P \\ P^T & 0 \end{pmatrix}$$

which is nonsingular and which contains the nonsingular square submatrix $L$ in case (a) or the nonsingular square submatrix $P$ in case (b).
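The block-inversion step behind Theorem 4.3(a) can be checked numerically. The following is a minimal sketch on hypothetical data (the matrices `L`, `P` and the vector `grad_G` are made up for illustration and do not come from the paper); it assumes fixed constraints, so that $g_I(x(F); G) = 0$, and compares the solution of the full Kuhn-Tucker system with the closed-form expression $-C^{-1} \nabla_x L(w(F); G)$.

```python
from fractions import Fraction as Fr

def solve(M, b):
    """Solve M y = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(M)
    aug = [[Fr(v) for v in row] + [Fr(b[i])] for i, row in enumerate(M)]
    for col in range(n):
        # find a row with a nonzero pivot and move it into position
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Hypothetical data: n = 2 variables, one active constraint.
L = [[-2, 1], [1, -3]]        # stands in for the Hessian of the Lagrangian
P = [[1], [2]]                # stands in for the gradient of the active constraint
grad_G = [Fr(1), Fr(-1, 2)]   # stands in for grad_x L(w(F); G)
g_G = [Fr(0)]                 # g_I(x(F); G), zero for fixed constraints

# Route 1: solve [[L, P], [P^T, 0]] dw = -[grad_G; g_G] directly.
K = [L[0] + [P[0][0]], L[1] + [P[1][0]], [P[0][0], P[1][0], 0]]
dw = solve(K, [-grad_G[0], -grad_G[1], -g_G[0]])
dx_system = dw[:2]

# Route 2: dx = -C^{-1} grad_G with
# C^{-1} = [I - L^{-1} P (P^T L^{-1} P)^{-1} P^T] L^{-1}.
Linv_grad = solve(L, grad_G)                           # L^{-1} grad_G
Linv_P = solve(L, [P[0][0], P[1][0]])                  # L^{-1} P (one column)
s = P[0][0] * Linv_P[0] + P[1][0] * Linv_P[1]          # P^T L^{-1} P (scalar)
pt = P[0][0] * Linv_grad[0] + P[1][0] * Linv_grad[1]   # P^T L^{-1} grad_G
dx_formula = [-(Linv_grad[i] - Linv_P[i] * pt / s) for i in range(2)]

print(dx_system, dx_formula)
```

Exact rational arithmetic is used so that the two routes can be compared without rounding tolerance; both also satisfy $P^T dx = 0$, which is the linearized active constraint.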
The assumptions of strict complementarity play an essential role in the proof of Theorem 4.1. They guarantee that the interval $[0, t_0)$ on which $w(t)$ is the Kuhn-Tucker point of (4.2) is nonempty. Alternatively, the strict complementarity conditions can be replaced by the strong second order sufficient condition which was stated in Section 3. Thanks to the fact that we have a scalar parameter only and that we are in fact interested in the right-hand derivatives of the optimal solution with respect to the parameter at the given point $t = 0$, the result of Jittorntrum (1984) applied to our problem gives the desired assertion on the Gâteaux differential.

Denote $I^+(F) = \{i \in I(F): u_i(F) > 0\}$, $I^0(F) = \{i \in I(F): u_i(F) = 0\}$. Under strict complementarity conditions, $I^0(F) = \emptyset$.

Theorem 4.4. Let in the assumptions of Theorem 4.1 the strict complementarity conditions be replaced by the strong second order sufficient condition. Then:
(a) There is a neighbourhood $\mathcal{O}(w(F)) \subset R^n \times R^m_+ \times R^p$, a real number $t_0 > 0$ and a continuous function $w: [0, t_0) \to \mathcal{O}(w(F))$, $w(0) = w(F)$, such that for any $t \in [0, t_0)$, $w(t) = [x(t), u(t), v(t)]$ is the Kuhn-Tucker point of (4.2) for which the strong second order sufficient condition and the linear independence condition are fulfilled.
(b) There is a set $R$ of indices such that

$$I^+(F) \subset R \subset I^+(F) \cup I^0(F) = I(F),$$

for which the nonzero components of the Gâteaux differential $dw(F; G-F)$ are given by

$$dw_R(F; G-F) = -\begin{pmatrix} \nabla^2_{xx} L(w(F); F) & \nabla_x g_R(x(F); F) \\ \nabla_x g_R(x(F); F)^T & 0 \end{pmatrix}^{-1} \begin{pmatrix} \nabla_x L(w(F); G) \\ g_R(x(F); G) \end{pmatrix}.$$
Special cases 4.5. For specific decision models of stochastic programming, we can get correspondingly the specific form of the assumptions as well as the explicit formulas for the Gâteaux differentials. The assumptions can be subdivided into three categories:

(A) The basic model assumptions, including the absolute continuity of the distribution $F$.

(B) The general assumptions such as the existence of the Kuhn-Tucker point for which strict complementarity conditions are fulfilled.

(C) The assumptions of differentiability, the linear independence condition and the second order sufficient condition, which can be fitted to the considered model.

In the following survey of results, we shall list mostly the form of the assumptions of the last category and we shall give explicit formulas for the reduced vectors of the Gâteaux differentials containing the nonzero components only. The full statements can be found in Dupačová (1983), (1985a), (1986a).
4.5.1. Simple recourse problem (Dupačová (1983))

with $q_i > 0$, $1 \le i \le m$, $A = (a_{ij})$ and $c$ given and $F$ such that $E_F \omega$ exists.

Assumptions:

(i) Denote $J = \{j: x_j(F) > 0\}$; the matrix $A_J = (a_{ij})_{1 \le i \le m,\ j \in J}$ has full column rank.

(ii) The marginal densities $f_i$, $1 \le i \le m$, are continuous and positive at the points $X_i(F) = \sum_j a_{ij} x_j(F)$, $1 \le i \le m$, respectively.

(iii) $G$ is an $m$-dimensional distribution whose marginal distribution functions $G_i$ have continuous derivatives on neighbourhoods of the points $X_i(F)$, $1 \le i \le m$, respectively.

Gâteaux differential:

$$dx_J(F; G-F) = (A_J^T K A_J)^{-1} (c_J - A_J^T k),$$

where $k$ is the $m$-vector with components $k_i = q_i G_i(X_i(F))$, $1 \le i \le m$, and

$$K = \mathrm{diag}\{q_i f_i(X_i(F)),\ 1 \le i \le m\}.$$
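The formula of 4.5.1 can be compared with a brute-force derivative on a small instance. The sketch below is illustrative only: the problem statement is not legible in this copy, so the code assumes the objective $c x - \sum_i q_i E_F (a_i x - \omega_i)^+$ with a single variable $x > 0$ (so $J = \{1\}$) and two rows, which is one form consistent with the stated differential. The normal marginals and all numbers are hypothetical.

```python
from statistics import NormalDist

# Hypothetical instance: one decision variable, two rows.
a = [1.0, 2.0]
q = [1.0, 1.0]
c = 1.2
F = [NormalDist(1.0, 1.0), NormalDist(1.0, 1.0)]   # marginals of F
G = [NormalDist(1.0, 2.0), NormalDist(1.0, 2.0)]   # marginals of G

def gradient(x, t):
    """d/dx of c*x - sum_i q_i E(a_i x - omega_i)^+ under F_t = (1-t)F + tG."""
    return c - sum(
        qi * ai * ((1 - t) * Fi.cdf(ai * x) + t * Gi.cdf(ai * x))
        for ai, qi, Fi, Gi in zip(a, q, F, G)
    )

def optimum(t, lo=0.0, hi=5.0):
    """Bisection on the decreasing gradient; assumes the maximizer is in (lo, hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gradient(mid, t) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = optimum(0.0)
X = [ai * x0 for ai in a]                                 # X_i(F) = a_i x(F)
k = [qi * Gi.cdf(Xi) for qi, Gi, Xi in zip(q, G, X)]      # k_i = q_i G_i(X_i(F))
K = [qi * Fi.pdf(Xi) for qi, Fi, Xi in zip(q, F, X)]      # diagonal of K

# (A_J^T K A_J)^{-1} (c_J - A_J^T k) for a single column A_J = a
dx_formula = (c - sum(ai * ki for ai, ki in zip(a, k))) / sum(
    ai * ai * Ki for ai, Ki in zip(a, K)
)

t = 1e-5
dx_finite_difference = (optimum(t) - x0) / t
print(dx_formula, dx_finite_difference)
```

The one-sided difference quotient matches the right-hand derivative at $t = 0$ that the Gâteaux differential describes, so the two printed numbers should agree up to a term of order $t$.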
4.5.2. Individual probabilistic constraints (Dupačová (1986a))

$$\text{maximize } c(x) \quad \text{subject to } P_F\Big\{\sum_{j=1}^n a_{ij} x_j \ge \omega_i\Big\} \ge \alpha_i,\ 1 \le i \le m,$$

with $\alpha_i \in (0, 1)$, $1 \le i \le m$, and $A = (a_{ij})$ given; $a_i$ denotes the $i$-th row of $A$.

Assumptions:

(i) $c: R^n \to R^1$ is twice continuously differentiable.

(ii) The rank of $A_I = (a_{ij})_{i \in I,\ 1 \le j \le n}$ equals $\mathrm{card}\, I(F) = \mathrm{card}\, I$.

(iii) The marginal densities $f_i$, $1 \le i \le m$, are continuously differentiable on neighbourhoods of $X_i(F) = \sum_j a_{ij} x_j(F)$, $1 \le i \le m$, respectively, and

(iv) $G$ is an $m$-dimensional distribution whose marginal distribution functions $G_i$ are twice continuously differentiable on neighbourhoods of the points $X_i(F)$, $1 \le i \le m$, respectively.

(v) For all $l \in R^n$, $l \ne 0$, for which $A_I l = 0$, the inequality $l^T \nabla^2_{xx} c(x(F))\, l < 0$ holds true and the matrix

$$L = \nabla^2_{xx} L(w(F); F) = \nabla^2_{xx} c(x(F)) + \sum_{i \in I(F)} u_i(F) f_i'(X_i(F))\, a_i^T a_i$$

is nonsingular.
where

$$G_I = [G_i(X_i(F))]_{i \in I(F)} \quad \text{and} \quad \alpha_I = [\alpha_i]_{i \in I(F)}. \tag{4.9}$$

4.5.3. For the case of individual probabilistic constraints and a linear objective function $c(x) = c^T x$, substantially weaker assumptions can be used to get a result comparable with that of Theorem 4.3; see Dupačová (1986a).

Assumptions:
(i) $c(x) = c^T x$.

(ii) The optimal solution $x(F)$ is unique and nondegenerate.

(iii) The marginal densities $f_i$, $1 \le i \le m$, are continuous and positive at the points $X_i(F)$, $1 \le i \le m$, respectively.

(iv) $G$ is an $m$-dimensional distribution whose marginal distribution functions $G_i$ have continuous derivatives on neighbourhoods of the points $X_i(F)$, $1 \le i \le m$, respectively.

Gâteaux differential:
where $f_I$, $G_I$, $\alpha_I$ are given by (4.8) and (4.9).

Comment. In the last case, the assumptions on the distributions are comparable with those for the simple recourse problem, which is quite natural. Contrary to the case of a nonlinear objective function, $x(F)$ is the optimal solution of the linear program

$$\text{maximize } c^T x \quad \text{subject to } Ax \ge b \tag{4.11}$$

where $b_i = F_i^{-1}(\alpha_i)$. Similarly, $x(t)$ is the optimal solution of the linear program

$$\text{maximize } c^T x \quad \text{subject to } Ax \ge b_t,$$

where $b_{it} = F_{it}^{-1}(\alpha_i)$, $1 \le i \le m$, are the quantiles of the contaminated marginal distribution functions

$$F_{it}(X) = (1 - t) F_i(X) + t G_i(X).$$
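Let us approximate $F_{it}^{-1}(\alpha_i)$ linearly (see e.g. Serfling (1980)):

$$F_{it}^{-1}(\alpha_i) \approx F_i^{-1}(\alpha_i) + t\, \frac{\alpha_i - G_i(F_i^{-1}(\alpha_i))}{f_i(F_i^{-1}(\alpha_i))}$$

and approximate $x(t)$ by the optimal solution $\tilde{x}(t)$ of the following linear parametric program:

$$\text{maximize } c^T x \quad \text{subject to } \sum_{j=1}^n a_{ij} x_j \ge F_i^{-1}(\alpha_i) + t\, \frac{\alpha_i - G_i(F_i^{-1}(\alpha_i))}{f_i(F_i^{-1}(\alpha_i))},\ 1 \le i \le m.$$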
Let $B = A_I^T$ be the optimal basis of the linear program dual to (4.11); then $x(F) = A_I^{-1} b_I$ and $X_i(F) = F_i^{-1}(\alpha_i)$, $i \in I$. According to our assumptions, $x(F)$ is unique and nondegenerate, so that $B$ is optimal for $t$ belonging to a neighbourhood of zero and the optimal solution of the linear parametric program equals $x(F) + t\, dx(F; G-F)$, using the result (4.10).
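The linearization of the contaminated quantile, which supplies the right-hand sides of the parametric linear program, can be checked for a single marginal. The sketch below assumes hypothetical normal marginals $F_i$ and $G_i$ and a hypothetical level $\alpha_i = 0.9$; the slope follows from differentiating $F_{it}(b_{it}) = \alpha_i$ with respect to $t$ at $t = 0$.

```python
from statistics import NormalDist

# Hypothetical marginals and probability level.
F = NormalDist(0.0, 1.0)
G = NormalDist(0.0, 3.0)
alpha = 0.9

def contaminated_quantile(t, lo=-50.0, hi=50.0):
    """alpha-quantile of F_t = (1-t)F + tG, by bisection on its increasing cdf."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (1 - t) * F.cdf(mid) + t * G.cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

b = F.inv_cdf(alpha)                      # b_i = F_i^{-1}(alpha_i)
slope = (alpha - G.cdf(b)) / F.pdf(b)     # derivative of the quantile at t = 0

for t in (0.01, 0.05, 0.1):
    exact = contaminated_quantile(t)
    linear = b + t * slope
    print(t, exact, linear, exact - linear)
```

The gap between the exact and the linearized quantile shrinks quadratically as $t$ decreases, which is what makes the parametric program a first-order substitute for re-solving the contaminated problem.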
4.5.4. One joint probabilistic constraint (Dupačová (1986a))

$$\text{maximize } c(x) \quad \text{subject to } P_F\{Ax \ge \omega\} \ge \alpha \tag{4.13}$$

with $\alpha \in (0, 1)$, $A = (a_{ij})$ given.

Assumptions:

(i) $c: R^n \to R^1$ is twice continuously differentiable.

(ii) There is a Kuhn-Tucker point $w(F) = [x(F); u(F)]$ for (4.13) such that $u(F) > 0$ and the second-order sufficient condition is fulfilled.

(iii) In a neighbourhood of $X(F) := Ax(F)$, the distribution functions $F$ and $G$ are twice continuously differentiable and

$$A^T \nabla_X F(X(F)) \ne 0.$$
where

$$L = \nabla^2_{xx} L(w(F); F) = \nabla^2_{xx} c(x(F)) + u(F)\, A^T \nabla^2_{XX} F(X(F))\, A,$$

$$l(F) = \nabla_X F(X(F))^T A L^{-1} A^T \nabla_X F(X(F)).$$
Comment. Having solved the original problem (4.13), we know $x(F)$ and we have to compute $u(F)$, $L$ and to evaluate $G(Ax(F)) - \alpha$, $\nabla_X G(Ax(F))$, $\nabla_X F(Ax(F))$ to get the Gâteaux differential. For given $x(F)$, $u(F)$, $F$ and $G$, it depends on the difference between the values of the distribution functions $F(Ax(F))$, $G(Ax(F))$ and on the relative differences of their gradients, which are measured by [...].

For the gradient of $F$ we have

$$\nabla_X F(X) = f\, F^{(\cdot)}(X)$$

where $f = \mathrm{diag}\{f_i(X_i),\ 1 \le i \le m\}$ and $F^{(\cdot)}(X)$ is the $m$-vector of the conditional distribution functions $F(X^{(i)} \mid X_i)$, $1 \le i \le m$; here $X^{(i)}$ denotes the $(m-1)$-dimensional subvector of $X$ in which the $i$-th component, $X_i$, was deleted. Similar formulas hold true for $\nabla_X G(X)$.

In Example 2.3, the two considered distributions $F$ and $G$ are multinormal ones and differ by their correlation matrices only. In this case, $g_i = f_i$, $1 \le i \le m$, the conditional distributions $F(X^{(i)} \mid X_i)$, $G(X^{(i)} \mid X_i)$, $1 \le i \le m$, are normal and [...] with [...].

These circumstances make the numerical evaluation of the Gâteaux differential realistic.
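For a bivariate normal distribution the gradient formula reduces to a product of a marginal density and a univariate normal distribution function. The following minimal sketch assumes standard normal marginals and a hypothetical correlation `rho` (neither taken from Example 2.3) and compares one gradient component with a central finite difference of the joint distribution function, itself computed by Simpson's rule.

```python
from math import sqrt
from statistics import NormalDist

std = NormalDist()   # standard normal marginal
rho = 0.5            # hypothetical correlation coefficient

def biv_cdf(x1, x2, n=2000, lower=-9.0):
    """F(x1, x2) of a standard bivariate normal with correlation rho.

    Integrates phi(s) * Phi((x2 - rho*s) / sqrt(1 - rho^2)) over s in
    [lower, x1] by Simpson's rule (n must be even); the mass below
    `lower` is negligible.
    """
    h = (x1 - lower) / n

    def integrand(s):
        return std.pdf(s) * std.cdf((x2 - rho * s) / sqrt(1 - rho * rho))

    total = integrand(lower) + integrand(x1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3

def grad_component(xi, xother):
    """dF/dX_i = f_i(X_i) * P(other component <= xother | component i = xi)."""
    return std.pdf(xi) * std.cdf((xother - rho * xi) / sqrt(1 - rho * rho))

x1, x2, eps = 0.3, -0.2, 1e-4
fd_2 = (biv_cdf(x1, x2 + eps) - biv_cdf(x1, x2 - eps)) / (2 * eps)
print(fd_2, grad_component(x2, x1))
```

Only univariate normal densities and distribution functions are needed for the gradient in this case; in dimension $m$ the conditional distribution functions are $(m-1)$-dimensional normal probabilities.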
5. The minimax approach
Assume now that the set $X$ of admissible solutions is defined by fixed constraints only and that the objective function $g_0(x; F)$ is linear with respect to $F$ (for a generalization to the nonlinear case see Gaivoronski (1985)). In this case, we can set

$$g_0(x; F) = E_F f_0(x; \omega),$$

where $f_0(x; \omega)$ may e.g. correspond to the difference $h_0(x; \omega) - \varphi(x; \omega)$ in the general stochastic program with recourse (1.3). Let $\mathcal{F}$ be a given set of distributions to which $F$ is known to belong. (The case of complete knowledge of the distribution corresponds to $\mathcal{F} = \{F\}$.) Consider the two-person zero-sum game

$$H = (X, \mathcal{F}, g_0) \tag{5.1}$$

where $X$ is the set of strategies of the first player, $\mathcal{F}$ is the set of strategies of the second player and $g_0$ is the pay-off function. Any optimal pure strategy of the first player in the game (5.1) will be called the minimax solution of the stochastic program

$$\max_{x \in X} g_0(x; F) \quad \text{for } F \in \mathcal{F}. \tag{5.2}$$
Under quite general assumptions on $\mathcal{F}$, $X$ and $f_0$, a minimax solution exists and

$$\sup_{x \in X} \min_{F \in \mathcal{F}} g_0(x; F) = \min_{F \in \mathcal{F}} \sup_{x \in X} g_0(x; F).$$

(See e.g. Žáčková (1966), Theodorescu (1969).)
To find a minimax solution means in general to solve the optimization problem

$$\text{maximize } \inf_{F \in \mathcal{F}} g_0(x; F) \text{ on the set } X. \tag{5.3}$$

If the set $\mathcal{F}$ of distributions is defined, inter alia, by prescribed values of certain moments of the distributions $F \in \mathcal{F}$, it is possible to use general results of the moment problem to get $\inf_{F \in \mathcal{F}} g_0(x; F)$ in a form suitable for further computations (see Dupačová (1977)). We shall outline the results of the moment problem briefly and we shall indicate their application in stochastic programming.
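When both $X$ and $\mathcal{F}$ are finite, problem (5.3) can be solved by direct enumeration. The sketch below is a toy illustration only: the scenario set, the three candidate distributions and the profit function `f0` are hypothetical and are not taken from the paper.

```python
# Hypothetical finite example: omega is a demand level with five scenarios.
scenarios = [0, 1, 2, 3, 4]
family = {                       # the set of admissible distributions
    "F1": [0.10, 0.20, 0.40, 0.20, 0.10],
    "F2": [0.30, 0.30, 0.20, 0.10, 0.10],
    "F3": [0.05, 0.10, 0.20, 0.30, 0.35],
}
price, cost = 3.0, 1.0

def f0(x, omega):
    """Profit of ordering x units when the demand is omega (illustrative f_0)."""
    return price * min(x, omega) - cost * x

def g0(x, probs):
    """g_0(x; F) = E_F f_0(x; omega) for a discrete distribution F."""
    return sum(p * f0(x, w) for p, w in zip(probs, scenarios))

X = [0, 1, 2, 3, 4]              # finite set of admissible decisions

# Inner problem: worst-case expected profit of each decision over the family.
worst_case = {x: min(g0(x, probs) for probs in family.values()) for x in X}

# Outer problem: the decision with the best worst-case value.
x_minimax = max(X, key=lambda x: worst_case[x])
print(x_minimax, worst_case[x_minimax])
```

In this toy instance the worst case is attained by the distribution that puts most weight on low demand, and the minimax decision is correspondingly conservative. For infinite families defined by moment conditions, enumeration is replaced by the moment-problem results outlined in the text.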
Let $\kappa := (\kappa_1, \dots, \kappa_k): \Omega \to R^k$, $\kappa_0: \Omega \to R^1$ be Borel measurable mappings. Denote by $\kappa(\Omega)$ the image of the set $\Omega$ under the mapping $\kappa$, by $Y := \mathrm{conv}\, \kappa(\Omega)$ the convex hull of $\kappa(\Omega)$, and assume that $\mathrm{int}\, Y \ne \emptyset$. For $y \in \mathrm{int}\, Y$ denote by $\mathcal{F}_y$ the set of distributions of a random vector $\omega$ on $(\Omega, \mathcal{B})$ such that $\kappa_1, \dots, \kappa_k, \kappa_0$ are integrable with respect to all elements $F \in \mathcal{F}_y$
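and

$$E_F \kappa_j(\omega) = y_j, \quad 1 \le j \le k.$$

The moment problem is to find

$$L(y) := \inf_{F \in \mathcal{F}_y} E_F \kappa_0(\omega).$$

Under the above assumptions,

$$L(y) = \sup_{d \in D} \Big( d_0 + \sum_{j=1}^k d_j y_j \Big) \tag{5.6}$$

where

$$D = \Big\{ d \in R^{k+1}: d_0 + \sum_{j=1}^k d_j \kappa_j(\omega) \le \kappa_0(\omega) \text{ for all } \omega \in \Omega \Big\}.$$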
In many important cases, e.g., for $\Omega$ compact, $\kappa_1, \dots, \kappa_k$ continuous and $\kappa_0$ lower semi-continuous, the infimum (5.5b) and the supremum (5.6) are achieved. In this case, there exist a distribution $F^* \in \mathcal{F}_y$ and a vector $d^* \in D$ such that

$$L(y) = E_{F^*} \kappa_0(\omega) = d_0^* + \sum_{j=1}^k d_j^* y_j \tag{5.7}$$

and for the given $y \in \mathrm{int}\, Y$, problem (5.5b) reduces to

Evidently, as a function of the parameter $y$, $L(y)$ is convex. Parallel results can be given for the upper bound $U(y)$. It is important from the point of view of computation that $F^*$ in (5.7) is a discrete distribution. The corresponding probability measure must evidently be concentrated in the points $\omega \in \Omega$ for which

$$d_0^* + \sum_{j=1}^k d_j^* \kappa_j(\omega) = \kappa_0(\omega).$$

Denote

Then for almost all $y \in \mathrm{int}\, Y$, there is a unique $d^* \in D$ such that $y \in \mathrm{conv}\, B(d^*)$,

$$y = \sum_{i=1}^{l} p_i \kappa(\omega_i) \quad \text{with } \kappa(\omega_i) \in B(d^*),\ 1 \le i \le l,$$

and

Corresponding to this representation,
For these and other related results see e.g. Kemperman (1968); the case of inequality constraints on the moments was studied by Kemperman (1972); different approaches to the case of noncompact $\Omega$ can be found e.g. in Richter (1957), Kemperman (1972), Cipra (1985), Gassman and Ziemba (1985).

When applying the above results to the minimax problem, it is quite natural to put $\mathcal{F} = \mathcal{F}_y$ and

$$\kappa_0(\omega) = f_0(x; \omega).$$
The dependence of $f_0$ on the decision variables $x$, together with the final goal of solving the "outer" maximization problem

$$\text{maximize } \inf_{F \in \mathcal{F}} E_F f_0(x; \omega) \text{ on the set } X,$$

is the reason why the direct application is possible only in special cases.

This will be the case if the set of the considered discrete distributions possessing the properties (5.8) is relatively small and independent of $x$, or if it is possible to reduce the corresponding moment problem to a finite number of one-dimensional moment problems. As an example of the first mentioned possibility, we have
Theorem 5.1. Let $\Omega \subset R^k$ be a convex polyhedron with extreme points $\omega_1, \dots, \omega_N$ and let $y \in \mathrm{int}\, \Omega$. Let $f_0 : X \times$