Asymptotic Behavior of Statistical Estimators and Optimal Solutions for Stochastic Optimization Problems



NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHORS

ASYMPTOTIC BEHAVIOR OF STATISTICAL ESTIMATORS AND OPTIMAL SOLUTIONS FOR STOCHASTIC OPTIMIZATION PROBLEMS

Jitka Dupačová
Roger Wets

August 1986
WP-86-41

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS
2361 Laxenburg, Austria

FOREWORD

This paper presents the first results on a new statistical approach to the problem of incomplete information in stochastic programming. The tools of nondifferentiable optimization used here help to prove the consistency of (approximate) optimal solutions based on increasing information on the true probability distribution, without unnatural smoothness assumptions. They also make it possible to take fully into account the presence of constraints.

Alexander B. Kurzhanski
Chairman
System and Decision Sciences Program

CONTENTS

1 Introduction
2 Examples
3 Consistency: Convergence of Optimal Solutions
References

1. INTRODUCTION

The calculation of estimates for various statistical parameters has been one of the main concerns of Statistics since its inception, and a number of elegant formulas have been developed to obtain such estimates in a number of particular instances. Typically such cases correspond to a situation when the random phenomenon is univariate in nature, and there are no "active" restrictions on the estimate of the unknown statistical parameter. However, that is not the case in general: many estimation problems are multivariate in nature and there are restrictions on the choice of the parameters. These could be simple nonnegativity constraints, but also much more complex restrictions involving certain mathematical relations between the parameters that need to be estimated. Classical techniques, which can still be used to handle, for example, least squares estimation with linear equality constraints on the parameters, break down if there are inequality constraints or a nondifferentiable criterion function. In such cases one cannot expect that a simple formula will yield the relationship between the samples and the best estimates. Usually, the latter must be found by solving an optimization problem.

Naturally the solution of such a problem depends on the collected samples, and one is confronted with the questions of the consistency and of the asymptotic behavior of such estimators. This is the subject of this article.

To overcome the technical problems caused by the intrinsic lack of smoothness, we rely on the guidelines and the tools provided by the theory of nondifferentiable optimization. In fact, the problem of proving consistency of the estimators, and the study of their asymptotic behavior, is closely related to that of obtaining confidence intervals for the solution of stochastic optimization problems when there is only partial information about the probability distribution of the random coefficients of the problem. In fact it was the need to deal with this class of problems that originally motivated this study. We shall see in Section 2 that stochastic optimization problems as well as the problem of finding statistical estimators are two instances of the following general class of problems:

$$\text{find } x \in R^n \text{ that minimizes } E\{f(x, \xi)\},$$

where $f: R^n \times \Xi \to R \cup \{+\infty\}$ is an extended real-valued function and $\xi$ is a random variable with values in $\Xi$; for more details see Section 3. It is implicit in this formulation that the expectation is calculated with respect to the true probability distribution $P$ of the random variable $\xi$, whereas in fact all that is known is a certain approximate $P^\nu$. Our objective is to study the behavior of the optimal solution (estimate) $x^\nu$, obtained by solving the optimization problem using $P^\nu$ instead of $P$ to calculate the expectation, when $\{P^\nu, \nu = 1, \dots\}$ is a sequence of probability measures converging to $P$. In Section 3 we give conditions under which consistency can be proved. Constraints on the choice of the optimal $x$ are incorporated in the formulation of the problem by allowing the function $f$ to take on the value $+\infty$. The results are obtained without explicit reference to the form of these constraints.
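As a minimal illustration of this set-up (a sketch under stated assumptions, not the authors' procedure), the following Python fragment takes the hypothetical criterion $f(x, \xi) = |x - \xi|$, encodes the constraint $x \ge 0$ through the value $+\infty$, uses the empirical measure of a three-point sample in the role of $P^\nu$, and approximates $x^\nu$ by a crude grid search. The function names, the sample and the grid are all assumptions made for the example.

```python
import math

def f(x, xi):
    """Extended real-valued criterion: |x - xi| on the feasible set x >= 0, +inf elsewhere."""
    return abs(x - xi) if x >= 0 else math.inf

def expected_value(x, sample):
    """E^nu f(x): expectation of f(x, .) under the empirical measure of the sample."""
    return sum(f(x, xi) for xi in sample) / len(sample)

def argmin_on_grid(sample, grid):
    """Crude stand-in for an optimization routine: best grid point for E^nu f."""
    return min(grid, key=lambda x: expected_value(x, sample))

sample = [-3.0, -1.0, 2.0]                 # hypothetical observations of xi
grid = [i * 0.5 - 5.0 for i in range(21)]  # -5.0, -4.5, ..., 5.0
x_nu = argmin_on_grid(sample, grid)
```

In this example the unconstrained minimizer would be the sample median $-1$, but every infeasible point carries the value $+\infty$, so the search settles on the boundary point $0$ without any explicit reference to the form of the constraint.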

There is of course a substantial statistical literature dealing with the questions broached here, beginning with the seminal article of Wald (1949) and the work of Huber (1967) on maximum likelihood estimators. Of more direct parentage, at least as far as formulation and use of mathematical techniques, is the work on stochastic programming problems with partial information. Wets (1979) reports some preliminary results; further developments were presented at the 1980 meeting on stochastic optimization at IIASA (Laxenburg, Austria) and recorded in Solis and Wets (1981); see also Dupačová (1983a, b) and (1984b) for a special case. In a projected paper we shall deal with estimates of the convergence rates, as well as with the convergence of the associated Lagrangian function.

2. EXAMPLES

The results apply equally well to estimation or stochastic optimization problems with or without constraints, with differentiable or nondifferentiable criterion function. However, the examples that we detail here are those that fall outside the classical mold of unconstrained smooth problems.

Restrictions on the statistical estimates or the optimal decisions of stochastic optimization problems follow from technical and modeling considerations as well as natural statistical assumptions. The least squares estimation problem with linear equality constraints, a basic statistical method, see e.g. Rao (1965), can be solved by the usual tools of differential calculus. Inequality constraints, however, introduce a lack of smoothness that does not allow us to fall back on the old stand-bys.

In Judge and Takayama (1966) and Liew (1976), the theory of quadratic programming is used to exhibit and discuss the statistical properties of least squares estimates subject to inequality constraints for the case of large and small samples.

In connection with maximum likelihood estimation, the case of parameter restrictions in the form of smooth nonlinear equations was studied by Aitchison and Silvey (1958), including results on asymptotic normality of the estimates. The Lagrangian approach was further developed by Silvey (1959), and extended to the case of a multisample situation by Sen (1979), including analysis of the situation when the true parameter value does not fulfill the constraints (the nonnull case).

Typically one must take into account nonnegativity restrictions in the estimation of variances and variance components. Unconstrained maximum likelihood estimation in factor analysis and in more complicated structural analysis models, see e.g. Lee (1980), may lead to negative estimates of the variances. Replacing these inappropriate estimates by zeros gives estimates which are no longer optimal with respect to the chosen fitting function. Similarly, there is a problem of getting negative estimates of variance components, see Example 2.3. In statistical practice, these nonpositive variance estimates are usually fixed at zero and the data are eventually reanalyzed. In general, such an approach may lead to plausible results when only one restricted parameter is estimated, and it is mostly inappropriate in multi-dimensional situations; see e.g. the evidence given by Lee (1980).

The possibility of using mathematical programming techniques to get constrained estimates was explored by Arthanari and Dodge (1981). As mentioned in the introduction, we use mathematical programming theory not only to get inequality constrained estimates but to get asymptotic results for a large class of decision and estimation problems which contains, inter alia, restricted M-estimates and stochastic programming with incomplete information. In comparison with the results of ad hoc approaches, valid mostly for one-dimensional restricted estimation, our method can be used for high-dimensional cases and without unnatural smoothness assumptions, in spite of the fact that the violation of differentiability assumptions cannot be easily bypassed by the use of directional derivatives (in contrast to the one-dimensional case).

EXAMPLE 2.1 Inequality constrained least squares estimation of regression coefficients. Assume that the dependent variable $y$ can be explained or predicted on the basis of information provided by independent variables $x_1, \dots, x_p$. In the simplest case of a linear model, the observations $y_j$ on $y$ are supposed to be generated according to

$$y_j = \sum_{i=1}^{p} x_{ij}\,\beta_i + \varepsilon_j, \qquad j = 1, \dots, \nu,$$

where $\beta_1, \dots, \beta_p$ are unknown parameters to be estimated, $\varepsilon_j$, $j = 1, \dots, \nu$, denote the observed values of the residual and $X = (x_{ij})$ is a $(p, \nu)$ matrix whose rows consist of the observed values of the independent variables.

In the practical implementation of this model, there may be in addition some a priori constraints imposed on the parameters, such as nonnegativity constraints on the elasticities, see Liew (1976), or a required preassigned positive difference between input and output tonnage due to the melting loss, Arthanari and Dodge (1981). Assume that these constraints are of the form

$$A\beta \le c,$$

where $A(m, p)$, $c(m, 1)$ are given matrices. The use of the least squares method leads to the optimization problem:

$$\text{minimize } \sum_{j=1}^{\nu} \Bigl[\, y_j - \sum_{i=1}^{p} x_{ij}\,\beta_i \Bigr]^2 \quad \text{subject to} \quad \sum_{i=1}^{p} a_{ki}\,\beta_i \le c_k, \quad k = 1, \dots, m, \tag{2.1}$$

which can be solved by quadratic programming techniques.

In our general framework, problem (2.1) corresponds to the case of objective function:

$$f(\beta, \xi) = \begin{cases} \Bigl[\, y - \sum_{i=1}^{p} x_i\,\beta_i \Bigr]^2 & \text{if } \beta \in S = \{\beta \mid A\beta \le c\}, \\ +\infty & \text{otherwise,} \end{cases} \tag{2.2}$$

where $\xi = (y, x_1, \dots, x_p)$, with the $P^\nu$ the empirical distributions.
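To make the effect of an inequality constraint on least squares concrete, here is a small Python sketch for the special case of nonnegativity constraints $\beta \ge 0$. It uses a projected gradient iteration, which is a technique swapped in for simplicity and not the quadratic programming method referred to in the text; the data, the step size and the function names are assumptions chosen for the example, and the rows of `X` are observations (the transpose of the layout used in the text).

```python
def residuals(X, y, beta):
    """r_j = y_j - sum_i x_ji * beta_i for each observation j (rows of X are observations)."""
    return [yj - sum(xji * bi for xji, bi in zip(row, beta)) for row, yj in zip(X, y)]

def sum_of_squares(X, y, beta):
    """Least squares criterion evaluated at beta."""
    return sum(r * r for r in residuals(X, y, beta))

def nonnegative_least_squares(X, y, step, iters=200):
    """Projected gradient: gradient step on the sum of squares, then projection onto beta >= 0."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        r = residuals(X, y, beta)
        grad = [-2.0 * sum(row[i] * rj for row, rj in zip(X, r)) for i in range(len(beta))]
        beta = [max(0.0, b - step * g) for b, g in zip(beta, grad)]
    return beta

# Hypothetical data: three observations of two regressors.
X = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0, -1.5]

beta_constrained = nonnegative_least_squares(X, y, step=1.0 / 6.0)
# Unconstrained least squares solution (13/6, -4/3) with its negative entry replaced by zero.
beta_clipped = [13.0 / 6.0, 0.0]
```

For these data the constrained minimizer is $(1.5, 0)$ with a residual sum of squares of $2.75$, whereas simply replacing the negative unconstrained coefficient by zero gives a larger value of the criterion. The step size $1/6$ is the reciprocal of the Lipschitz constant of the gradient for this particular design.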

Alternatively, minimizing the sum of absolute errors corresponds to the optimization problem

$$\text{minimize } \sum_{j=1}^{\nu} \Bigl|\, y_j - \sum_{i=1}^{p} x_{ij}\,\beta_i \Bigr| \quad \text{subject to} \quad \sum_{i=1}^{p} a_{ki}\,\beta_i \le c_k, \quad 1 \le k \le m, \tag{2.3}$$

which can be solved by means of the simplex method for linear programming, see e.g. Arthanari and Dodge (1981). The formulation of (2.3) is again based on the empirical distribution function $P^\nu$; the objective function is:

$$f(\beta, \xi) = \begin{cases} \Bigl|\, y - \sum_{i=1}^{p} x_i\,\beta_i \Bigr| & \text{if } \beta \in S = \{\beta \mid A\beta \le c\}, \\ +\infty & \text{otherwise.} \end{cases} \tag{2.4}$$

Note that this function $f$ is not differentiable on $S$.

Finally, when robustizing the least squares approach, instead of minimizing a sum of squares, a sum of less rapidly increasing functions of residuals is minimized, see e.g. Huber (1973):

$$\text{minimize } \sum_{j=1}^{\nu} \rho\Bigl( y_j - \sum_{i=1}^{p} x_{ij}\,\beta_i \Bigr) \quad \text{subject to} \quad \sum_{i=1}^{p} a_{ki}\,\beta_i \le c_k, \quad 1 \le k \le m. \tag{2.5}$$

The function $\rho$ is assumed to be convex, non-monotone and to possess bounded derivatives of sufficiently high order, e.g.

$$\rho(u) = \begin{cases} \tfrac{1}{2}u^2 & \text{for } |u| < c, \\ c|u| - \tfrac{1}{2}c^2 & \text{for } |u| \ge c. \end{cases}$$

This also fits the general framework; the objective function is:

$$f(\beta, \xi) = \begin{cases} \rho\Bigl( y - \sum_{i=1}^{p} x_i\,\beta_i \Bigr) & \text{if } \beta \in S = \{\beta \mid A\beta \le c\}, \\ +\infty & \text{otherwise,} \end{cases} \tag{2.6}$$

and the empirical distribution function $P^\nu$ is again used to obtain (2.5).
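The piecewise function $\rho$ given as an example above can be written down directly. The following Python sketch evaluates it and the corresponding sum over residuals; the function names and the convention that rows of `X` are observations are assumptions of the example, and the constraints are left out for brevity.

```python
def huber_rho(u, c):
    """Huber's function: quadratic for |u| < c, linear with slope c beyond."""
    if abs(u) < c:
        return 0.5 * u * u
    return c * abs(u) - 0.5 * c * c

def robust_criterion(beta, X, y, c):
    """Sum over observations of rho(residual); rows of X are observations."""
    return sum(
        huber_rho(yj - sum(xji * bi for xji, bi in zip(row, beta)), c)
        for row, yj in zip(X, y)
    )
```

The two branches agree at $|u| = c$, where both equal $\tfrac{1}{2}c^2$, so large residuals are penalized linearly rather than quadratically while the criterion stays continuous and convex.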

EXAMPLE 2.2 Heywood cases in factor analysis. The model for confirmatory factor analysis (Jöreskog (1969)) is

$$x = \Lambda f + e,$$

where $x(n, 1)$ is a column vector containing the observed variables, $f$ is a column vector containing the $k$ common factors, $e(n, 1)$ is a column vector containing the individual parts of the observed components and $\Lambda(n, k)$ is the matrix of factor loadings. It is assumed that $f$ and $e$ are normally distributed with mean zero, $\operatorname{var} f = \Phi$ and $\operatorname{var} e = \Psi$, which is diagonal. Consequently, $x$ is normally distributed with mean zero and with the variance matrix

$$\Sigma = \Lambda \Phi \Lambda^T + \Psi. \tag{2.7}$$

The parameter vector consists of the free elements of $\Lambda$, $\Phi$ and $\Psi$, and it should be estimated using the sample variance matrix $S$ of the observables $x$. This is done by minimizing a suitable fitting function, such as

$$f_1(\Sigma, S) = \log|\Sigma| + \operatorname{tr}(S\Sigma^{-1}) - \log|S| - n \tag{2.8}$$

(the maximum likelihood method), or

$$f_2(\Sigma, S) = \tfrac{1}{2} \operatorname{tr}\bigl[ \bigl( (S - \Sigma) V \bigr)^2 \bigr], \tag{2.9}$$

where $V$ is a matrix of weights (the weighted least squares method). Evidently, both (2.8) and (2.9), with (2.7) substituted for $\Sigma$, are objective functions of nontrivial unconstrained optimization problems, which can be solved by different methods such as the method of Davidon-Fletcher-Powell (see Fletcher and Powell (1963)) or by the Gauss-Newton algorithm. In practice, however, about one third of the data yield one or more nonpositive estimates of the diagonal elements of the matrix $\Psi$, which are individual variances. These solutions are called Heywood cases and, to deal with them, (2.8) or (2.9) should be minimized under the conditions $\psi_{ii} \ge 0$, $i = 1, \dots, n$. Thus the appropriate formulation defines $f$ as follows:

$$f = \begin{cases} f_1(\Sigma, S) & \text{if } \psi_{ii} \ge 0,\ i = 1, \dots, n, \\ +\infty & \text{otherwise,} \end{cases}$$

and similarly for $f_2$.

EXAMPLE 2.3 Negative estimates of variance components. Consider a general linear model with random effects

$$y = Z\gamma + \sum_{i=1}^{p} X_i \beta_i + \varepsilon, \tag{2.10}$$

where $y(\nu, 1)$ is the vector of observations on the variable $y$, $Z(\nu, r)$, $X_i(\nu, r_i)$, $i = 1, \dots, p$, are given matrices, $\beta_i$, $i = 1, \dots, p$, and $\varepsilon$ are mutually uncorrelated random vectors with $E\beta_i = 0$, $\operatorname{var} \beta_i = \sigma_i^2 I_{r_i}$, $i = 1, \dots, p$, and $E\varepsilon = 0$, $\operatorname{var} \varepsilon = \sigma_0^2 I_\nu$, and $\gamma_1, \dots, \gamma_r$, $\sigma_0^2, \dots, \sigma_p^2$ are unknown parameters to be estimated.

One of the simplest examples is the following variance analysis model for random effect one-way classification: Consider $k$ populations where the $j$-th measurement (observation) in the $i$-th population is given by

$$y_{ij} = \mu + a_i + e_{ij}, \qquad i = 1, \dots, k, \quad j = 1, \dots, n. \tag{2.11}$$

In (2.11), $\mu$ is the fixed effect, $a_i$, $i = 1, \dots, k$, is the random effect of the $i$-th population and $e_{ij}$ is the residual. Random variables $a_1, \dots, a_k$ and $e_{11}, \dots, e_{kn}$ are independent with distributions $N(0, \sigma_a^2)$ and $N(0, \sigma_e^2)$, respectively. The parameters $\mu$, $\sigma_a^2$, $\sigma_e^2$ are to be estimated. The traditional estimates of the variance components $\sigma_a^2$, $\sigma_e^2$ in model (2.11) are obtained by a simple procedure: one equates the mean squares

$$S_e = \frac{1}{k(n-1)} \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \bar y_{i\cdot})^2$$

and

$$S_a = \frac{n}{k-1} \sum_{i=1}^{k} (\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2,$$

where $\bar y_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} y_{ij}$, $i = 1, \dots, k$, and $\bar y_{\cdot\cdot} = \frac{1}{nk} \sum_{i=1}^{k} \sum_{j=1}^{n} y_{ij}$, with their expectations $\sigma_e^2$ and $n\sigma_a^2 + \sigma_e^2$; this gives the estimates

$$s_e^2 = S_e, \tag{2.12}$$

$$s_a^2 = \frac{1}{n}\,(S_a - S_e). \tag{2.13}$$

Whereas $s_e^2$ is evidently nonnegative, this need not be the case for $s_a^2$, so that the problem of a negative estimate of the variance component $\sigma_a^2$ comes to the fore.
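A small numerical sketch shows how a negative estimate can arise. The Python function below computes the usual balanced one-way analysis-of-variance mean squares (within groups, with expectation $\sigma_e^2$, and between groups, with expectation $n\sigma_a^2 + \sigma_e^2$) and solves for the two variance components; the function name and the data are hypothetical and chosen so that the group means coincide.

```python
def variance_component_estimates(groups):
    """Method-of-moments estimates (s_e^2, s_a^2) for a balanced one-way random effects layout."""
    k = len(groups)
    n = len(groups[0])
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / k
    # Mean square within groups, with expectation sigma_e^2.
    ms_within = sum(
        (yij - m) ** 2 for g, m in zip(groups, group_means) for yij in g
    ) / (k * (n - 1))
    # Mean square between groups, with expectation n * sigma_a^2 + sigma_e^2.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    return ms_within, (ms_between - ms_within) / n

# Hypothetical data: identical group means, large spread within groups.
s_e2, s_a2 = variance_component_estimates([[1.0, 5.0], [2.0, 4.0], [0.0, 6.0]])
```

With these data the between-group mean square is zero, so the estimate of the between-group variance component equals minus the within-group mean square divided by $n$, which is negative although the quantity being estimated is a variance.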

The resulting estimates (2.12), (2.13) of the variance components in (2.11) follow also as a special result of the MIVQUE and MINQUE estimation developed for the general model (2.10): Unbiased estimates of a linear parametric function $\sum_{i=0}^{p} q_i \sigma_i^2$ are sought in the form $y^T A y$, where

$$AZ = 0, \quad A(\nu, \nu) \text{ is a symmetric matrix,} \tag{2.14}$$

and which are optimal in some sense. The MIVQUE estimates correspond to a matrix $A$ that minimizes the variance of $y^T A y$ subject to the conditions (2.14), and the MINQUE estimates correspond to a matrix $A$ that minimizes

$$\operatorname{tr}\Bigl( A \Bigl( I + \sum_{i=1}^{p} X_i X_i^T \Bigr) \Bigr)^2$$

subject to conditions (2.14). In none of the mentioned approaches, however, are the natural nonnegativity constraints on the estimates of the variances $\sigma_i^2$, $i = 1, \dots, p$, introduced explicitly.

Again, there are two possible explanations of negative estimates of variance components: the model may be incorrect, or statistical noise obscured the underlying situation. Among others, Herbach (1959) and Thompson (1962) studied variance analysis models with random effects by means of different variants of the maximum likelihood method under nonnegativity constraints. Correspondingly, in terms of the general model, we have for instance

$$f(\sigma_a^2, \sigma_e^2, \mu, y) = \begin{cases} -(2\pi)^{-\frac{nk}{2}} (\sigma_e^2 + n\sigma_a^2)^{-\frac{k}{2}} (\sigma_e^2)^{-\frac{k(n-1)}{2}} \exp\Bigl\{ -\dfrac{1}{2\sigma_e^2} \Bigl[ \sum_{i=1}^{k} \sum_{j=1}^{n} (y_{ij} - \mu)^2 - \dfrac{\sigma_a^2}{\sigma_e^2 + n\sigma_a^2} \sum_{i=1}^{k} \Bigl( \sum_{j=1}^{n} (y_{ij} - \mu) \Bigr)^2 \Bigr] \Bigr\} & \text{if } \sigma_a^2 \ge 0,\ \sigma_e^2 > 0, \\ +\infty & \text{otherwise.} \end{cases}$$

Similarly, nonnegative MINQUE and MIVQUE estimates are of interest.

EXAMPLE 2.4 M-estimates. Let $\Theta$ be a given locally compact parameter set, $(\Xi, A, P)$ a probability space and $f: \Theta \times \Xi \to R$ a given function. For a sample $\{\xi_1, \dots, \xi_\nu\}$ from the considered distribution, any estimate $T^\nu = T^\nu(\xi_1, \dots, \xi_\nu) \in \Theta$ defined by the condition

$$T^\nu \in \operatorname{argmin}_{T \in \Theta} \sum_{j=1}^{\nu} f(T, \xi_j) \tag{2.15}$$

is called an M-estimate. In the pioneering paper by Huber (1967) (see also Huber (1981)), nonstandard sufficient conditions were given under which $\{T^\nu\}$ converges a.s. (or in probability) to a constant $\theta_0 \in \Theta$, and asymptotic normality of $\sqrt{\nu}\,(T^\nu - \theta_0)$ was proved under the assumption that $\Theta$ is an open set.

The problem (2.15) is evidently a special case of our general framework; the $P^\nu$ again correspond to the empirical distribution functions and we have an unconstrained criterion function. We shall aim to remove both of these assumptions to get results valid for a whole class of probability measures $P^\nu$ estimating $P$, which contains the empirical probability measure connected with the original definition (2.15) of M-estimates, and for constrained estimates.

EXAMPLE 2.5 Stochastic optimization with incomplete information. Consider the following decision model of stochastic optimization:

Given a probability space $(\Xi, A, P)$, a random element $\xi$ on $\Xi$, a measurable function $f: R^n \times \Xi \to R$ and a set $S \subset R^n$,

$$\text{minimize } E\{f(x, \xi)\} = \int_{\Xi} f(x, \xi)\, P(d\xi) \quad \text{on the set } S \subset R^n. \tag{2.16}$$

A wide variety of stochastic optimization problems, e.g., stochastic programs with recourse or probability constrained models (see e.g. Dempster (1980), Ermoliev et al. (1985), Kall (1976), Prékopa (1973), Wets (1983)) fit into this abstract framework.

In many practical situations, however, the probability measure $P$ need not be known completely. One possibility for dealing with such a situation is to estimate the optimal solution $x^*$ of (2.16) by an optimal solution of the problem

$$\text{minimize } \int_{\Xi} f(x, \xi)\, P^\nu(d\xi) \quad \text{on the set } S \subset R^n,$$

where $P^\nu$ is a suitable estimate of $P$ based on the observed data. In this context, there are different possibilities to estimate or approximate $P$, and the use of the empirical distribution is only one of them. The case of $P$ belonging to a given parametric family of probability measures but with an unknown parameter vector was studied e.g. in Dupačová (1984a, b).

For problem (2.16), large dimensionality of the decision vector $x$ is typical. This circumstance, together with nondifferentiability (or even noncontinuity) of $f$ and with the presence of constraints, raises qualitatively new problems.

3. CONSISTENCY: CONVERGENCE OF OPTIMAL SOLUTIONS

From a conceptual viewpoint or for theoretical purposes, it is convenient as well as expedient to study problems of statistical estimation, as well as stochastic optimization problems with partial information, in the following general framework.

Let $(\Xi, A, P)$ be a probability space, with $\Xi$ (the support of $P$) a closed subset of a Polish space $X$, and $A$ the Borel sigma-field relative to $\Xi$; we may think of $\Xi$ as the set of possible values of the random element $\xi$ defined on the probability space of events $(\Omega, A', P')$. If $P$ is known, the problem is to:

$$\text{find } x^* \in R^n \text{ that minimizes } Ef(x), \tag{3.1}$$

where

$$Ef(x) := E\{f(x, \xi)\} = \int f(x, \xi)\, P(d\xi)$$

and

$$f: R^n \times \Xi \to \bar R$$

is a random lower semicontinuous function; we set

$$Ef(x) = +\infty$$

whenever $\xi \mapsto f(x, \xi)$ is not bounded above by a summable (extended real-valued) function. We refer to

$$\operatorname{dom} Ef := \{x \mid Ef(x) < \infty\}$$

as the effective domain of $Ef$. Points that do not belong to $\operatorname{dom} Ef$ cannot minimize $Ef$ and thus are effectively excluded from the optimization problem (3.1). Hence, the model makes specific provisions for the presence of constraints that may limit the choice of $x$. Note that by definition of the integral, we always have

$$\operatorname{dom} Ef \subset \{x \mid f(x, \xi) < \infty \ \text{a.s.}\}.$$
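The way constraints enter through the effective domain can be seen on a toy case. In the Python sketch below, the hypothetical criterion is the cost $x$ together with the requirement $x \ge \xi$, encoded by the value $+\infty$, and $\xi$ has a finite-support distribution with strictly positive probabilities; all names and numbers are assumptions of the example.

```python
import math

def f(x, xi):
    """Hypothetical criterion: cost x, with the constraint x >= xi encoded by +inf."""
    return x if x >= xi else math.inf

def Ef(x, support, probs):
    """Expectation of f(x, .) for a finite-support distribution with positive probabilities."""
    return sum(p * f(x, xi) for xi, p in zip(support, probs))

support = [1.0, 2.0, 3.0]   # possible values of xi
probs = [0.25, 0.25, 0.5]   # their probabilities (all strictly positive)
```

Here `Ef` is finite exactly for $x \ge 3$, the points at which $f(x, \xi)$ is finite for every possible value of $\xi$, so the effective domain reproduces the constraint that $x$ must dominate $\xi$ almost surely.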

An extended real-valued function $h: R^n \to \bar R = [-\infty, \infty]$ is said to be proper if $h > -\infty$ and not identically $+\infty$; it is lower semicontinuous (l.sc.) at $x$ if for any sequence $\{x^k\}_{k=1}^{\infty}$ converging to $x$

$$\liminf_{k \to \infty} h(x^k) \ge h(x),$$

where the quantities involved could be $\infty$ or $-\infty$. The extended real-valued function $f$ defined on $R^n \times \Xi$ is a random lower semicontinuous function if

$$\text{for all } \xi \in \Xi, \ f(\cdot, \xi) \text{ is l.sc.,} \tag{3.3i}$$

$$f \text{ is } B^n \otimes A\text{-measurable,} \tag{3.3ii}$$

where $B^n$ is the Borel sigma-field on $R^n$. This concept, under the name of "normal integrand", was introduced by Rockafellar (1976), as a generalization of Carathéodory integrands, to handle problems in the Calculus of Variations and Optimal Control Theory. When dealing with problems of that type, as well as stochastic optimization problems such as (3.1), the traditional tools of functional analysis are no longer quite appropriate. The classical geometrical approach that associates functions with their graph must be abandoned in favor of a new geometrical viewpoint that associates functions with their "epigraphs" (or hypographs); for more about the motivation and the underlying principles of the epigraphical approach consult Rockafellar and Wets (1984). The epigraph of a function $h: R^n \to \bar R$ is the set

$$\operatorname{epi} h = \{(x, \alpha) \in R^n \times R \mid h(x) \le \alpha\}.$$

Rockafellar (1976) shows that $f: R^n \times \Xi \to \bar R$ is a random l.sc. function if and only if

$$\text{the multifunction } \xi \mapsto \operatorname{epi} f(\cdot, \xi) \text{ is nonempty, closed-valued,} \tag{3.4i}$$

$$\text{the multifunction } \xi \mapsto \operatorname{epi} f(\cdot, \xi) \text{ is measurable;} \tag{3.4ii}$$

recall that a multifunction $\xi \mapsto \Gamma(\xi): \Xi \rightrightarrows R^{n+1}$ is measurable if for all closed sets $F \subset R^{n+1}$

$$\Gamma^{-1}(F) := \{\xi \in \Xi \mid \Gamma(\xi) \cap F \ne \emptyset\} \in A;$$

for further details about measurable multifunctions see Rockafellar (1976), Castaing and Valadier (1976), and the bibliography of Wagner (1977) supplemented by Ioffe (1978). We shall use repeatedly the following result due to Yankov, von Neumann, and Kuratowski and Ryll-Nardzewski.

PROPOSITION 3.1 Theorem of Measurable Selections. If $\Gamma: \Xi \rightrightarrows R^n$ is a closed-valued measurable multifunction, then there exists at least one measurable selector, i.e. a measurable function $x: \operatorname{dom} \Gamma \to R^n$ such that for all $\xi \in \operatorname{dom} \Gamma$, $x(\xi) \in \Gamma(\xi)$, where

$$\operatorname{dom} \Gamma := \{\xi \in \Xi \mid \Gamma(\xi) \ne \emptyset\} = \Gamma^{-1}(R^n) \in A.$$

For a proof see Rockafellar (1976), for example. As immediate consequences of the definition (3.3) of random l.sc. functions, the equivalence with the conditions (3.4) and the preceding proposition, we have:

PROPOSITION 3.2 Let $f: R^n \times \Xi \to \bar R$ be a random l.sc. function. Then for any $A$-measurable function $x: \Xi \to R^n$, the function

$$\xi \mapsto f(x(\xi), \xi) \ \text{is } A\text{-measurable.}$$

Moreover, the infimal function

$$\xi \mapsto \inf f(\cdot, \xi) := \inf_{x \in R^n} f(x, \xi)$$

is $A$-measurable, and the set of optimal solutions

$$\xi \mapsto \operatorname{argmin} f(\cdot, \xi) := \{x \mid f(x, \xi) = \inf f(\cdot, \xi)\}$$

is a closed-valued measurable multifunction from $\Xi$ into $R^n$, and this implies that there exists a measurable function

$$\xi \mapsto x^*(\xi): \operatorname{dom}\bigl(\operatorname{argmin} f(\cdot, \xi)\bigr) \to R^n$$

such that $x^*(\xi)$ minimizes $f(\cdot, \xi)$ whenever $\operatorname{argmin} f(\cdot, \xi) \ne \emptyset$.

For a succinct proof, see Section 3 of Rockafellar and Wets (1984).

If instead of $P$ we only have limited information available about $P$ (e.g. some knowledge about the shape of the distribution and a finite sample of values of $\xi$ or of a function of $\xi$), then to estimate $x^*$ we usually have to rely on the solution of an optimization problem that "approximates" (3.1), viz.

$$\text{find } x^\nu \in R^n \text{ that minimizes } E^\nu f(x), \tag{3.5}$$

where

$$E^\nu f(x) := E^\nu\{f(x, \xi)\} = \int f(x, \xi)\, P^\nu(d\xi).$$

The measure P v i s n o t n e c e s s a r i l y t h e empirical m e a s u r e , b u t more g e n e r a l l y t h e

(17)

"best" (in t e r m s of a given c r i t e r i o n ) a p p r o x i m a t e t o P on t h e b a s i s of t h e informa- tion available. A s more information i s c o l l e c t e d , w e could r e f i n e t h e approximation t o P a n d hopefully find a b e t t e r estimate of x

* .

To model this process, we rely on the following set-up: let $(Z, \mathcal{F}, \mu)$ be a sample space with $(\mathcal{F}^\nu)_{\nu=1}^\infty$ an increasing sequence of sigma-fields contained in $\mathcal{F}$. A sample $\zeta$, e.g. $\zeta = \{\xi^1, \xi^2, \ldots\}$ obtained by independent sampling of the values of $\xi$, leads us to a sequence $\{P^\nu(\cdot, \zeta),\ \nu = 1, \ldots\}$ of probability measures defined on $(\Xi, \mathcal{A})$. Since only the information collected up to stage $\nu$ can be used in the choice of $P^\nu$, we must also require that for all $A \in \mathcal{A}$, $\zeta \mapsto P^\nu(A, \zeta)$ is $\mathcal{F}^\nu$-measurable.

Since $P^\nu$ depends on $\zeta$, so does the approximate problem (3.5), and in particular its solution $x^\nu$. A sequence of estimators is (strongly) consistent if $\mu$-almost surely they converge to $x^*$; this, of course, implies weak consistency (convergence in probability).
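The set-up above can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes the hypothetical choices $f(x, \xi) = (x - \xi)^2$, $\xi$ uniform on $[0, 1]$, a closed constraint set $S = [0.6, 1]$, and $P^\nu$ equal to the empirical measure of the first $\nu$ observations. Under these assumptions $x^* = 0.6$ (the projection of the mean $0.5$ on $S$), and the function names `saa_estimate` and `run` are illustrative only.

```python
import random

def saa_estimate(sample, lower, upper):
    """Minimize the sample average of (x - xi)**2 over S = [lower, upper].

    The unconstrained minimizer is the sample mean; since the objective is a
    convex quadratic in x, the constrained minimizer is its projection on S.
    """
    mean = sum(sample) / len(sample)
    return min(max(mean, lower), upper)

def run(seed=0, sizes=(10, 100, 1000, 10000), lower=0.6, upper=1.0):
    """Return [(nu, x_nu)] where stage nu uses only the first nu observations."""
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(max(sizes))]  # xi ~ Uniform(0, 1)
    return [(nu, saa_estimate(draws[:nu], lower, upper)) for nu in sizes]

if __name__ == "__main__":
    for nu, x_nu in run():
        print(nu, x_nu)
```

Because each estimate uses only the first $\nu$ draws, it depends on the sample only through the information available at stage $\nu$, which mirrors the measurability requirement on $P^\nu$. The constraint is active at the solution, a situation that an open-$S$ assumption would rule out.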

The following results extend the classical Consistency Theorem of Wald (1940) and the extensions by Huber (1967) to the more general setting laid out above. Consistency is obtained by relying on assumptions that are weaker than those of Huber (1967), even in the unconstrained case. To do so, we rely on the theory of epi-convergence in conjunction with the theory of random sets (measurable multifunctions) and random l.sc. functions.

A sequence of functions $\{g^\nu : R^n \to \bar{R},\ \nu = 1, \ldots\}$ is said to epi-converge to $g : R^n \to \bar{R}$ if for all $x$ in $R^n$, we have

$\liminf_{\nu \to \infty} g^\nu(x^\nu) \ge g(x)$ for all $\{x^\nu\}_{\nu=1}^\infty$ converging to $x$, (3.7)

and

for some $\{x^\nu\}_{\nu=1}^\infty$ converging to $x$, $\limsup_{\nu \to \infty} g^\nu(x^\nu) \le g(x)$. (3.8)

Note that any one of these conditions implies that $g$ is lower semicontinuous. We then say that $g$ is the epi-limit of the $g^\nu$, and write $g = \operatorname{epi-lim}_{\nu \to \infty} g^\nu$. We refer to this type of convergence as epi-convergence, since it is equivalent to the set-convergence of the epigraphs. For more about epi-convergence and its properties, consult Attouch (1984). Our interest in epi-convergence stems from the fact that from a variational viewpoint it is the weakest type of convergence that possesses the following properties:

PROPOSITION 3.3 [Attouch and Wets (1981), Salinetti and Wets (1986)]. Suppose $\{g;\ g^\nu : R^n \to \bar{R},\ \nu = 1, \ldots\}$ is a collection of functions such that $g = \operatorname{epi-lim}_{\nu \to \infty} g^\nu$. Then

$\limsup_{\nu \to \infty} (\inf g^\nu) \le \inf g$, (3.9)

and, if $x^k \in \operatorname{argmin} g^{\nu_k}$ for some subsequence $\{\nu_k,\ k = 1, \ldots\}$ and $x = \lim_{k \to \infty} x^k$, it follows that

$x \in \operatorname{argmin} g$, and $\lim_{k \to \infty} (\inf g^{\nu_k}) = \inf g$;

so in particular if there exists a bounded set $D \subset R^n$ such that for some subsequence $\{\nu_k,\ k = 1, \ldots\}$,

$\operatorname{argmin} g^{\nu_k} \cap D \neq \emptyset$,

then the minimum of $g$ is attained at some point in the closure of $D$.

Moreover, if $\operatorname{argmin} g \neq \emptyset$, then $\lim_{\nu \to \infty} (\inf g^\nu) = \inf g$ if and only if $x \in \operatorname{argmin} g$ implies the existence of sequences $\{\varepsilon_\nu \ge 0,\ \nu = 1, \ldots\}$ and $\{x^\nu \in R^n,\ \nu = 1, \ldots\}$ with

$\lim_{\nu \to \infty} \varepsilon_\nu = 0$, and $\lim_{\nu \to \infty} x^\nu = x$,

such that for all $\nu = 1, \ldots$

$x^\nu \in \varepsilon_\nu\text{-}\operatorname{argmin} g^\nu := \{x \mid g^\nu(x) \le \varepsilon_\nu + \inf g^\nu\}$.
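The inequality between the limit of the infima and the infimum of the epi-limit can be strict when the minimizers are not confined to a bounded set. The sketch below is an illustration under assumed data, not taken from the paper: the hypothetical family $g^\nu(x) = \min\{x^2, (x - \nu)^2 - 1\}$ epi-converges to $g(x) = x^2$, because the second well sits at $x = \nu$ and escapes to infinity. The helper `grid_min` is a crude grid search introduced only for this example.

```python
def g_nu(x, nu):
    """g^nu(x) = min(x**2, (x - nu)**2 - 1): a second well of depth -1 at x = nu."""
    return min(x * x, (x - nu) ** 2 - 1.0)

def g(x):
    """Epi-limit of the g^nu: the well at x = nu moves off to infinity."""
    return x * x

def grid_min(func, lo, hi, step=0.01):
    """Return (minimum value, minimizer) of func over a uniform grid on [lo, hi]."""
    n = int(round((hi - lo) / step))
    points = [lo + i * step for i in range(n + 1)]
    best = min(points, key=func)
    return func(best), best

if __name__ == "__main__":
    for nu in (5, 10, 50):
        # Over a large window the minimizer follows the escaping well.
        print(nu, grid_min(lambda t: g_nu(t, nu), -5.0, 60.0))
        # Restricted to the bounded window [-2, 2], the minimizer stays at 0.
        print(nu, grid_min(lambda t: g_nu(t, nu), -2.0, 2.0))
    print("limit", grid_min(g, -5.0, 60.0))
```

Here $\inf g^\nu = -1$ for every $\nu$ while $\inf g = 0$, so (3.9) holds with strict inequality, and no bounded set $D$ meets $\operatorname{argmin} g^\nu$ along a subsequence.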

The next theorem, which proves the $\mu$-almost sure epi-convergence of expectation functionals, is built upon approximation results for stochastic optimization problems, first derived in the case $f(\cdot, \xi)$ convex (Theorem 3.3, Wets (1984)), and later for the locally Lipschitz case (Theorem 2.8, Birge and Wets (1986)). We work with the following assumptions.

ASSUMPTION 3.4 "Continuities" of $f$. The function $f : R^n \times \Xi \to (-\infty, \infty]$ with

$\operatorname{dom} f := \{(x, \xi) \mid f(x, \xi) < \infty\} = S \times \Xi$, $S \subset R^n$ closed and nonempty,

is such that for all $x \in S$,

$\xi \mapsto f(x, \xi)$ is continuous on $\Xi$,

and for all $\xi \in \Xi$,

$x \mapsto f(x, \xi)$ is lower semicontinuous on $R^n$

and locally lower Lipschitz on $S$, in the following sense: to any $x$ in $S$, there corresponds a neighborhood $V$ of $x$ and a bounded continuous function $\beta : \Xi \to R$ such that for all $x' \in V \cap S$ and $\xi \in \Xi$,

$f(x, \xi) - f(x', \xi) \le \beta(\xi)\, \|x - x'\|$. (3.10)

ASSUMPTION 3.5 Convergence in distribution. Given the sample space $(Z, \mathcal{F}, \mu)$ and an increasing sequence of sigma-fields $(\mathcal{F}^\nu)_{\nu=1}^\infty$ contained in $\mathcal{F}$, let

$P^\nu : \mathcal{A} \times Z \to [0, 1]$, $\nu = 1, \ldots$

be such that for all $\zeta \in Z$

$P^\nu(\cdot, \zeta)$ is a probability measure on $(\Xi, \mathcal{A})$,

and for all $A \in \mathcal{A}$

$\zeta \mapsto P^\nu(A, \zeta)$ is $\mathcal{F}^\nu$-measurable.

For $\mu$-almost all $\zeta$ in $Z$, the sequence

$\{P^\nu(\cdot, \zeta),\ \nu = 1, \ldots\}$ converges in distribution to $P$,

and with $P =: P^0(\cdot, \zeta)$, for all $x \in S$, the sequence $\{P^\nu(\cdot, \zeta)\}_{\nu=0}^\infty$ is $f(x, \cdot)$-tight (asymptotic negligibility), i.e. to every $x \in S$ and $\varepsilon > 0$ there corresponds a compact set $K_\varepsilon \subset \Xi$ such that for $\nu = 0, 1, \ldots$

$\int_{\Xi \setminus K_\varepsilon} |f(x, \xi)|\, P^\nu(d\xi, \zeta) < \varepsilon$, (3.11)

and

The assumption that

$\xi \mapsto \operatorname{dom} f(\cdot, \xi) := \{x \mid f(x, \xi) < \infty\} = S$

is constant, which is satisfied by all the examples in Section 2, may appear more restrictive than it actually is. Indeed, it is easy to see that

$\operatorname{dom} Ef = \bigcap_{\xi \in \Xi} \operatorname{dom} f(\cdot, \xi)$,

if $\Xi$ is the support of the measure $P$ and for all $x \in \bigcap_{\xi \in \Xi} \operatorname{dom} f(\cdot, \xi)$, the function $f(x, \cdot)$ is bounded above by a summable function. Then, with $S = \bigcap_{\xi \in \Xi} \operatorname{dom} f(\cdot, \xi)$ and

$f^+(x, \xi) := f(x, \xi)$ if $x \in S$, and $f^+(x, \xi) := +\infty$ otherwise,

we may as well work with $f^+$ instead of $f$, since $Ef^+ = Ef$, and now $\xi \mapsto \operatorname{dom} f^+(\cdot, \xi) = S$ is constant.

Assumption 3.4 implies that $f$ is a random lower semicontinuous function (normal integrand). Indeed, for all $\xi \in \Xi$, $f(\cdot, \xi)$ is proper and lower semicontinuous (3.3.i) and $(x, \xi) \mapsto f(x, \xi)$ is $\mathcal{B}^n \otimes \mathcal{A}$-measurable (3.3.ii) since for all $\alpha \in R$, $\operatorname{lev}_\alpha f := \{(x, \xi) \mid f(x, \xi) \le \alpha\}$ is closed.

To see this, suppose $\{(x^k, \xi^k)\}_{k=1}^\infty \subset \operatorname{lev}_\alpha f$ is a sequence converging to $(x, \xi)$; then from Assumption 3.4 we have that for $k$ sufficiently large, and all $\xi'$ in $\Xi$,

$f(x, \xi') \le f(x^k, \xi') + \beta(\xi')\, \|x - x^k\|$,

in particular

$f(x, \xi^k) \le \alpha + \bar{\beta}\, \|x - x^k\|$,

where $\bar{\beta} = \max_{\xi'} \beta(\xi')$ is finite, since $\beta(\cdot)$ is bounded. Now $\xi \mapsto f(x, \xi)$ is continuous on $\Xi$, thus taking limits as $k$ goes to $\infty$, we obtain

$f(x, \xi) \le \alpha + \bar{\beta} \lim_{k \to \infty} \|x - x^k\| = \alpha$,

i.e. $(x, \xi) \in \operatorname{lev}_\alpha f$. Since $f$ is a random l.sc. function it follows from Proposition 3.2 that

$\xi \mapsto \gamma(\xi) := \inf_{x \in R^n} f(x, \xi)$

is measurable. Thus condition (3.12) does not sneak in another measurability condition; it requires simply that the measurable function $\gamma$ be quasi-integrable.

Huber (1967), as well as others, see e.g. Ibragimov and Has'minski (1981), assumes that $S$ is open. Since constraints usually do not involve strict inequalities, this is an unnatural restriction, except when there are no constraints, i.e. $S = R^n$, in which case $S$ is also closed. In any case, whatever optimality results one may be able to prove with $S$ open, they remain valid when $S$ is replaced by its closure, assuming minimal continuity properties for the expectation functionals, but the converse does not hold.

To simplify notation we shall, whenever it is convenient, drop the explicit reference to the dependence on $\zeta$ of the probability measures $P^\nu$ and the resulting expectation functionals $E^\nu f$; nonetheless the reader should always be aware that all $\mu$-a.s. statements refer to the underlying probability space $(Z, \mathcal{F}, \mu)$. We begin by showing that $Ef$, as well as the $E^\nu f$, are well-defined functions.

LEMMA 3.6 Under Assumptions 3.4 and 3.5, there exists $Z_0 \in \mathcal{F}$, $\mu(Z_0) = 1$, such that for all $\zeta \in Z_0$, $Ef$ and $\{E^\nu f,\ \nu = 1, \ldots\}$ are proper lower semicontinuous functions such that

$S = \operatorname{dom} Ef = \operatorname{dom} E^\nu f(\cdot, \zeta)$,

on which the expectation functionals are finite.

PROOF Let us first fix $\zeta$, and assume that for this $\zeta$ all the conditions of Assumption 3.5 are satisfied. If $x \notin S$, then $f(x, \xi) = \infty$ for all $\xi$ in $\Xi$ and hence $Ef(x) = E^\nu f(x) = \infty$, i.e.,

$S \supset \operatorname{dom} Ef$, $S \supset \operatorname{dom} E^\nu f$.

With $P^0 = P$, for $x \in S$ and any $\varepsilon > 0$, there is a compact set $K_\varepsilon$ (Assumption 3.5) such that

as follows from (3.11) and the fact that $f(x, \cdot)$ is continuous and finite on $K_\varepsilon \subset \Xi$. Thus $E^\nu f(x) < \infty$.

The fact that $Ef > -\infty$ and $E^\nu f > -\infty$ follows directly from condition (3.12). It is also this condition that we use to show that the expectation functionals are lower semicontinuous, since it allows us to appeal to Fatou's Lemma to obtain: given $\{x^\nu\}_{\nu=1}^\infty$ a sequence converging to $x$,

$\liminf_{\nu \to \infty} Ef(x^\nu) \ge \int \liminf_{\nu \to \infty} f(x^\nu, \xi)\, P(d\xi) \ge \int f(x, \xi)\, P(d\xi) = Ef(x)$,

where the last inequality follows from the lower semicontinuity of $f(\cdot, \xi)$ at $x$. Of course, the same string of inequalities holds for all $\{P^\nu,\ \nu = 1, \ldots\}$.

Since the above holds for every $\nu$ $\mu$-almost surely on $Z$, the set

$Z_0 = \{\zeta \in Z \mid E^\nu f(\cdot, \zeta)$ is finite, l.sc. on $S$, for $\nu = 0, 1, \ldots\}$

is of measure 1. □

THEOREM 3.7 Suppose $\{E^\nu f,\ \nu = 1, \ldots\}$ is a sequence of expectation functionals defined by

$E^\nu f(x, \zeta) := \int f(x, \xi)\, P^\nu(d\xi, \zeta)$

and $Ef(x) = E\{f(x, \xi)\}$, such that $f$ and the collection $\{P;\ P^\nu,\ \nu = 1, \ldots\}$ satisfy Assumptions 3.4 and 3.5. Then, $\mu$-almost surely

$Ef = \operatorname{epi-lim}_{\nu \to \infty} E^\nu f = \operatorname{ptwse-lim}_{\nu \to \infty} E^\nu f$,

where $\operatorname{ptwse-lim}_{\nu \to \infty} E^\nu f$ denotes the pointwise limit.

PROOF The argument essentially follows that of Theorem 2.8 of Birge and Wets (1986), with minor modifications to take care of the slightly weaker assumptions and the fact that the expectation functionals depend on $\zeta$.

We begin by showing that $\mu$-almost surely $Ef$ is the pointwise limit of the $E^\nu f$. We fix $\zeta \in Z$, and assume that the conditions of Assumption 3.5 are satisfied for this particular $\zeta$.

Suppose $x \in S$, and set

$h(\xi) := f(x, \xi)$.

From condition (3.11), it follows that for all $\varepsilon > 0$, there is a compact set $K_\varepsilon$ such that for all $\nu$

$\int_{\Xi \setminus K_\varepsilon} |h(\xi)|\, P^\nu(d\xi) < \varepsilon$.

Let $\gamma_\varepsilon := \max_{\xi \in K_\varepsilon} |h(\xi)|$. We know that $\gamma_\varepsilon$ is finite since $K_\varepsilon$ is compact and $h$ is continuous on $\Xi$ (Assumption 3.4). Let $h^\varepsilon$ be a truncation of $h$, defined by

$h^\varepsilon(\xi) = h(\xi)$ if $|h(\xi)| \le \gamma_\varepsilon$,
$h^\varepsilon(\xi) = \gamma_\varepsilon$ if $h(\xi) > \gamma_\varepsilon$,
$h^\varepsilon(\xi) = -\gamma_\varepsilon$ if $h(\xi) < -\gamma_\varepsilon$.

The function $h^\varepsilon$ is bounded and continuous, and for all $\xi$ in $\Xi$

$|h^\varepsilon(\xi)| \le |h(\xi)|$.
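The truncation step can be sketched in a few lines. This is only an illustration under assumed data: the function $h(\xi) = \xi^3$ and the level $\gamma_\varepsilon = 8$ (its maximum modulus on the hypothetical compact set $K_\varepsilon = [-2, 2]$) are not from the paper, and `truncate` is an illustrative name.

```python
def truncate(h, gamma):
    """Return the truncation of h at level gamma >= 0.

    The result agrees with h wherever |h| <= gamma and is clamped to
    +gamma or -gamma elsewhere, so it is bounded and never exceeds h in modulus.
    """
    def h_trunc(xi):
        return max(-gamma, min(gamma, h(xi)))
    return h_trunc

if __name__ == "__main__":
    h = lambda xi: xi ** 3
    h_eps = truncate(h, 8.0)  # gamma = max of |h| on K = [-2, 2]
    for xi in (-3.0, -1.0, 0.5, 3.0):
        print(xi, h(xi), h_eps(xi))
```

Clamping preserves continuity, which is what allows convergence in distribution to be applied to the bounded continuous function $h^\varepsilon$ in place of $h$.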

Now, from the convergence in distribution of the $P^\nu$,

$\lim_{\nu \to \infty} \left[ \alpha_\nu^\varepsilon := \int_\Xi h^\varepsilon(\xi)\, P^\nu(d\xi) \right] = \int_\Xi h^\varepsilon(\xi)\, P(d\xi) =: \alpha^\varepsilon$.

Moreover, for all $\nu$

Now, let $\alpha_\nu := E^\nu f(x) = \int_\Xi h(\xi)\, P^\nu(d\xi)$.

We have that for all $\nu$

$|\alpha_\nu - \alpha_\nu^\varepsilon| = \left| \int_{\Xi \setminus K_\varepsilon} (h(\xi) - h^\varepsilon(\xi))\, P^\nu(d\xi) \right| < 2\varepsilon$,

and also

$|Ef(x) - \alpha^\varepsilon| < 2\varepsilon$.

These two last estimates, when used in conjunction with (3.13), yield: for all $\varepsilon > 0$ and $\nu$ sufficiently large,

$|Ef(x) - \alpha_\nu| < 6\varepsilon$.

Thus for all $x$ in $S$

$Ef(x) = \lim_{\nu \to \infty} E^\nu f(x) = \lim_{\nu \to \infty} \alpha_\nu$,

and since, by Lemma 3.6, $S = \operatorname{dom} Ef = \operatorname{dom} E^\nu f$, it means that $Ef = \operatorname{ptwse-lim}_{\nu \to \infty} E^\nu f$, and that condition (3.8) of epi-convergence is satisfied, since we can choose $\{x^\nu = x\}_{\nu=1}^\infty$ for the sequence converging to $x$.

There remains to verify condition (3.7) of epi-convergence. If $x \notin S$, then for every sequence $\{x^\nu\}_{\nu=1}^\infty$ converging to $x$, since $S$ is closed we have that $x^\nu \notin S$ for $\nu$ sufficiently large and hence $E^\nu f(x^\nu) = \infty$, which implies that

$\liminf_{\nu \to \infty} E^\nu f(x^\nu) = \infty \ge Ef(x) = \infty$.

If $x \in S$, and $\{x^\nu\}_{\nu=1}^\infty$ is a sequence converging to $x$, unless $x^\nu$ is in $S$ infinitely often, $\liminf_{\nu \to \infty} E^\nu f(x^\nu) = \infty$, and then condition (3.7) is trivially satisfied. So let us assume that $\{x^\nu\}_{\nu=1}^\infty \subset S$. For $\nu$ sufficiently large, from (3.10) it follows that there is a bounded continuous function $\beta$ such that

$f(x, \xi) - \beta(\xi)\, \|x - x^\nu\| \le f(x^\nu, \xi)$.

Integrating both sides with respect to $P^\nu$, and taking $\liminf_{\nu \to \infty}$, we obtain

$\lim_{\nu \to \infty} E^\nu f(x) - \lim_{\nu \to \infty} \beta^\nu\, \|x - x^\nu\| \le \liminf_{\nu \to \infty} E^\nu f(x^\nu)$,

where the $\beta^\nu = \int \beta(\xi)\, P^\nu(d\xi)$ converge to a finite limit since the $P^\nu$ converge in distribution to $P$, and by pointwise convergence of the $E^\nu f$ this yields

$Ef(x) \le \liminf_{\nu \to \infty} E^\nu f(x^\nu)$. □

To apply Propositions 3.2 and 3.3 in this context, we must show that the expectation functionals $\{E^\nu f,\ \nu = 1, \ldots\}$ are random l.sc. functions.

THEOREM 3.8 Under Assumptions 3.4 and 3.5, the expectation functionals

$E^\nu f : R^n \times Z \to \bar{R}$, for $\nu = 1, \ldots$,

are $\mu$-almost surely random lower semicontinuous functions, such that $\zeta \mapsto \operatorname{epi} E^\nu f(\cdot, \zeta)$ is $\mathcal{F}^\nu$-measurable.

PROOF Lemma 3.6 shows that there exists a set $Z_0 \subset Z$ of $\mu$-measure 1 such that for all $\zeta \in Z_0$, the multifunction

$\zeta \mapsto \operatorname{epi} E^\nu f(\cdot, \zeta) : Z_0 \rightrightarrows R^{n+1}$ is nonempty, closed-valued.

This is condition (3.4.i), thus there remains only to establish (3.4.ii), i.e.

$\zeta \mapsto \operatorname{epi} E^\nu f(\cdot, \zeta)$ is $\mathcal{F}^\nu$-measurable

for $\nu = 1, \ldots$. Theorem 3.7 proves that with respect to the topology of convergence in distribution, the map

$P^\nu \mapsto \operatorname{epi} E^\nu f$ is continuous.

Moreover, since $\zeta \mapsto P^\nu(A, \zeta)$ is $\mathcal{F}^\nu$-measurable for all $A \in \mathcal{A}$, it means that given any finite collection of closed sets $\{F_i \subset \Xi\}$ and scalars $\{\beta_i\} \subset [0, 1]$, the set

which means that the function

$\zeta \mapsto P^\nu(\cdot, \zeta) : Z \to \mathcal{P} := \{$probability measures on $(\Xi, \mathcal{A})\}$

is $\mathcal{F}^\nu$-measurable. To see this, observe that the "convergence in distribution" topology can be obtained from the base of open sets

see Billingsley (1968), that also generate the Borel field on $\mathcal{P}$. Thus

$\zeta \mapsto \operatorname{epi} E^\nu f(\cdot, \zeta)$

is the composition of a continuous function and an $\mathcal{F}^\nu$-measurable function, and hence is $\mathcal{F}^\nu$-measurable. □

In the proof of Theorem 3.8, we have used the continuity of the map $P^\nu \mapsto \operatorname{epi} E^\nu f$; in fact Theorem 3.7 only proves epi-convergence, without introducing explicitly the epi-topology for the space of lower semicontinuous functions. The fact that epi-convergence induces a topology on the space of l.sc. functions is well established, see for example Dolecki, Salinetti and Wets (1983) and Attouch (1984), and thus with this proviso, Theorem 3.7 proves the epi-continuity of the map $P^\nu \mapsto \operatorname{epi} E^\nu f$.

THEOREM 3.9 Consistency. Under Assumptions 3.4 and 3.5 we have that $\mu$-almost surely

$\limsup_{\nu \to \infty} (\inf E^\nu f) \le \inf Ef$. (3.14)

Moreover, there exists $Z_0 \in \mathcal{F}$ with $\mu(Z \setminus Z_0) = 0$, such that

(i) for all $\zeta \in Z_0$, any cluster point $\hat{x}$ of any sequence $\{x^\nu,\ \nu = 1, \ldots\}$ with $x^\nu \in \operatorname{argmin} E^\nu f(\cdot, \zeta)$ belongs to $\operatorname{argmin} Ef$ (i.e. is an optimal estimate),

(ii) for $\nu = 1, \ldots$,

$\zeta \mapsto \operatorname{argmin} E^\nu f(\cdot, \zeta) : Z_0 \rightrightarrows R^n$

is a closed-valued $\mathcal{F}^\nu$-measurable multifunction.

In particular, if there is a compact set $D \subset R^n$ such that for $\nu = 1, \ldots$

$(\operatorname{argmin} E^\nu f) \cap D$ is nonempty $\mu$-a.s., and

$\{x^*\} = \operatorname{argmin} Ef \cap D$,

then there exist $\{x^\nu : Z_0 \to R^n\}_{\nu=1}^\infty$, $\mathcal{F}^\nu$-measurable selections of $\{\operatorname{argmin} E^\nu f\}_{\nu=1}^\infty$, such that

$x^* = \lim_{\nu \to \infty} x^\nu(\zeta)$ for $\mu$-almost all $\zeta$,

and also

$\inf Ef = \lim_{\nu \to \infty} (\inf E^\nu f)$ $\mu$-a.s.

PROOF The inequality (3.14) immediately follows from (3.9) and the $\mu$-almost sure epi-convergence of the expectation functionals $E^\nu f$ to $Ef$ (Theorem 3.7), as does the assertion (i) about cluster points of optimal solutions (Proposition 3.2). The fact that $(\operatorname{argmin} E^\nu f)$ is a closed-valued $\mathcal{F}^\nu$-measurable multifunction follows from Theorem 3.8 and Proposition 3.2.

Now suppose $Z_0 \subset Z$ is such that $\mu(Z_0) = 1$, for all $\zeta \in Z_0$, $Ef = \operatorname{epi-lim}_{\nu \to \infty} E^\nu f$, and for all $\nu = 1, \ldots$, $(\operatorname{argmin} E^\nu f) \cap D$ is nonempty. For all $\nu$, the multifunction

$\zeta \mapsto (\operatorname{argmin} E^\nu f(\cdot, \zeta) \cap D) : Z_0 \rightrightarrows R^n$

is nonempty compact-valued, and $\mathcal{F}^\nu$-measurable; it is the intersection of two closed-valued measurable multifunctions, see Rockafellar (1976). Now for any $\zeta \in Z_0$, let $\{\bar{x}^\nu\}_{\nu=1}^\infty$ be any sequence in $R^n$ such that for all $\nu$,

$\bar{x}^\nu(\zeta) \in \operatorname{argmin} E^\nu f(\cdot, \zeta) \cap D$.

Then, any cluster point of the sequence is in $D$, since it is compact, and in $\operatorname{argmin} Ef$ as follows from Proposition 3.2. Actually, $x^* = \lim_{\nu \to \infty} \bar{x}^\nu$. To see this note that, if $x^*$ is not the limit point of the sequence, there exists a subsequence $\{\nu_k\}_{k=1}^\infty$ such that for some $\delta > 0$, and all $k = 1, \ldots$,

$\bar{x}^{\nu_k} \in \operatorname{argmin} E^{\nu_k} f \cap D$, and $\|x^* - \bar{x}^{\nu_k}\| > \delta$,

but this is contradicted by the fact that this subsequence, included in $D$, contains a further subsequence that is convergent, and whose limit must lie in $\operatorname{argmin} Ef \cap D = \{x^*\}$.

Now, for $\nu = 1, \ldots$, let $x^\nu : Z \to R^n$ be an $\mathcal{F}^\nu$-measurable selection of the $\mathcal{F}^\nu$-measurable multifunction $\zeta \mapsto (\operatorname{argmin} E^\nu f(\cdot, \zeta) \cap D)$, cf. Proposition 3.1. By the preceding argument, for all $\zeta \in Z_0$, where $\mu(Z_0) = 1$,

$x^* = \lim_{\nu \to \infty} x^\nu(\zeta)$,

and from Proposition 3.3, it then also follows that

$\lim_{\nu \to \infty} (\inf E^\nu f(\cdot, \zeta)) = \inf Ef = Ef(x^*)$

for all $\zeta \in Z_0$. □

It should be noted that, contrary to earlier work (see Wald (1940), Huber (1967)), we do not assume the uniqueness of the optimal solutions; at least in the case of the stochastic programming model introduced in Section 2, this would not be a natural assumption. Also, let us observe that we have not given here the most general possible version of the Consistency Theorem that could be obtained by relying on the tools introduced here. There are conditions that are necessary and sufficient for the convergence of infima (see Salinetti and Wets (1986), Robinson (1985)) that could be used here in conjunction with convergence results for measurable selections (Salinetti and Wets (1981)) to yield a slightly sharper theorem, but the conditions would be much harder to verify, and would be of very limited interest in this context. Also, since epi-convergence is of local character, we could reword our statements to obtain "local" consistency by restricting our attention to a neighborhood of some $x^*$ in $\operatorname{argmin} Ef$.

We conclude with an existence result. A function $h : R^n \to \bar{R}$ is inf-compact if for all $\alpha \in R$

$\operatorname{lev}_\alpha h := \{x \in R^n \mid h(x) \le \alpha\}$ is compact.

If $h$ is proper ($h > -\infty$, $\operatorname{dom} h \neq \emptyset$) and inf-compact, then $(\inf h)$ is finite and attained for some $x \in R^n$. For example, if $h = g + \psi_S$, where $g$ is continuous and $\psi_S$ is the indicator function of the nonempty compact set $S$ ($\psi_S(x) = 0$ if $x \in S$, and $\infty$ otherwise), then $h$ is inf-compact. Another sufficient condition is to have $g$ coercive. Inf-compactness is the most general condition that is verifiable under which existence can be established. The next proposition generalizes results of Wets (1973) and Hiriart-Urruty (1976). Essentially, we assume that $f(\cdot, \xi)$ is inf-compact with positive probability.
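The indicator-function example can be checked numerically in one dimension. The sketch below rests on assumptions chosen only for illustration: $g(x) = \sin 3x$, $S = [-1, 2]$, and a finite grid standing in for $R$; the names `indicator`, `h` and `level_set` are hypothetical helpers.

```python
import math

def indicator(x, lower, upper):
    """psi_S(x): 0 on the compact interval S = [lower, upper], +inf elsewhere."""
    return 0.0 if lower <= x <= upper else math.inf

def h(x, lower=-1.0, upper=2.0):
    """h = g + psi_S with the continuous (non-coercive) choice g(x) = sin(3x)."""
    return math.sin(3.0 * x) + indicator(x, lower, upper)

def level_set(func, alpha, grid):
    """Grid points of the level set {x : func(x) <= alpha}."""
    return [x for x in grid if func(x) <= alpha]

if __name__ == "__main__":
    grid = [i / 100.0 for i in range(-1000, 1001)]
    lev = level_set(h, 100.0, grid)
    print(min(lev), max(lev))        # the level set stays inside S
    print(min(grid, key=h))          # a grid minimizer, necessarily in S
```

Although $g$ alone has unbounded level sets, adding $\psi_S$ confines every level set of $h$ to $S$, which is the mechanism behind inf-compactness in this example.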

PROPOSITION 3.10 Under Assumptions 3.4 and 3.5, suppose that the following condition holds: there exists $A \in \mathcal{A}$ with $P(A) > 0$ (resp. $P^\nu(A) > 0$) such that for all $\alpha \in R$, the set

$\operatorname{lev}_\alpha f \cap (R^n \times A)$ is bounded.

Then $Ef$ is inf-compact (resp. $E^\nu f$ is $\mu$-a.s. inf-compact).

PROOF It clearly suffices to prove the proposition for $P$; the same argument applies for all $P^\nu$ $\mu$-a.s. Let

$\gamma(\xi) := \inf \{0,\ \inf_{x \in R^n} f(x, \xi)\}$.

The function $\gamma$ is measurable (Proposition 3.2) and $P$-summable, see (3.12). The function $f'$, defined by

$f'(x, \xi) := f(x, \xi) - \gamma(\xi)$,

is then nonnegative. Moreover $f' \ge f$ and thus

Set $\alpha_1 := \alpha / P(A)$ and let $A_1$ be the projection on $R^n$ of $\operatorname{lev}_{\alpha_1} f' \cap (R^n \times A)$. Then if $x \notin A_1$ and $\xi \in A$

and since $f'$ is nonnegative, with $\bar{\gamma} = E\{\gamma(\xi)\}$,

Hence $\operatorname{lev}_{\alpha + \bar{\gamma}} Ef \subset A_1$, a bounded set. To complete the proof it suffices to observe that from Lemma 3.6 we know that $\operatorname{lev}_\alpha Ef$ is closed since $Ef$ is lower semicontinuous, and this with the above implies that $\operatorname{lev}_{\alpha + \bar{\gamma}} Ef$ is compact for all $\alpha \in R$. □
