
A Descent Algorithm for Large-Scale Linearly Constrained Convex Nonsmooth Minimization


NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR

A DESCENT ALGORITHM FOR LARGE-SCALE LINEARLY CONSTRAINED CONVEX NONSMOOTH MINIMIZATION

Krzysztof C. Kiwiel

April 1984
CP-84-15

Collaborative Papers report work which has not been performed solely at the International Institute for Applied Systems Analysis and which has received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS
A-2361 Laxenburg, Austria


A Descent Algorithm for Large-Scale Linearly Constrained Convex Nonsmooth Minimization

Krzysztof C. Kiwiel

Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw, Poland.

Abstract. A descent algorithm is given for solving a large convex program obtained by augmenting the objective of a linear program with a (possibly nondifferentiable) convex function depending on relatively few variables. Such problems often arise in practice as deterministic equivalents of stochastic programming problems. The algorithm's search direction finding subproblems can be solved efficiently by the existing software for large-scale smooth optimization. The algorithm is both readily implementable and globally convergent.

AMS 1980 subject classification. Primary: 65K05. Secondary: 90C25.

Key words: Nonsmooth optimization, nondifferentiable programming, linear constraints, convex programming, descent methods, large-scale optimization.


1. Introduction

This paper presents a method for solving the following problem

$$\text{minimize } \langle c, y\rangle + f(x) \text{ over all } (y,x) \in R^{M+N} \text{ satisfying } Ay + Bx \le b, \tag{1.1}$$

where $c \in R^M$, $A$ is a $P \times M$ matrix, $B$ is a $P \times N$ matrix, $b \in R^P$, and $f: R^N \to R$ is a (possibly nondifferentiable) convex function. We suppose that the set of feasible points

$$S = \{(y,x) \in R^{M+N} : Ay + Bx \le b\}$$

is nonempty and bounded, and that at each $(y,x) \in S$ we can compute $f(x)$ and a certain subgradient $g_f(x) \in \partial f(x)$, i.e. an arbitrary element of the subdifferential $\partial f(x)$ of $f$ at $x$, on which we cannot impose any further restrictions.
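To make the oracle assumption concrete, the following minimal sketch takes $f$ to be a maximum of finitely many affine functions. This choice, the function name `max_affine_oracle`, and the sample data are assumptions made only for illustration; the paper allows any real-valued convex $f$. The oracle returns $f(x)$ and one (arbitrary) subgradient, namely the slope of an affine piece that attains the maximum.

```python
from typing import List, Sequence, Tuple


def max_affine_oracle(
    rows: Sequence[Sequence[float]],
    offsets: Sequence[float],
    x: Sequence[float],
) -> Tuple[float, List[float]]:
    """Return f(x) and one subgradient for f(x) = max_i (<a_i, x> + b_i).

    Hypothetical example oracle: any affine piece attaining the maximum
    supplies a valid subgradient, so the first such piece is returned.
    """
    values = [
        sum(a_k * x_k for a_k, x_k in zip(a, x)) + b
        for a, b in zip(rows, offsets)
    ]
    i_best = max(range(len(values)), key=values.__getitem__)
    return values[i_best], list(rows[i_best])


# Illustrative data (not from the paper): three affine pieces on R^2.
ROWS = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0]]
OFFSETS = [0.0, 0.0, -1.0]
value, subgradient = max_affine_oracle(ROWS, OFFSETS, [0.5, 0.25])
```

The method described in this paper only ever calls such an oracle at trial points; it never needs the whole subdifferential.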

Problems of the form (1.1) are often encountered in practice, especially as deterministic equivalents of two-stage stochastic programming problems [K1], [NW1], [W1]. In many applications the number $M$ of "linear" variables $y_i$ is much larger than the number $N$ of "nonlinear" variables $x_i$, and the matrices $A$ and $B$ are sparse (have relatively few nonzero entries). In such cases problem (1.1) can be solved by the existing algorithms for large-scale optimization (e.g. MINOS [MS1]) if $f$ is differentiable. In the nondifferentiable large-scale case, only a few algorithms have been proposed [BW~], and they frequently assume the knowledge of the full subdifferential $\partial f(x)$ at each $x$.

The method presented in this paper modifies one given in [K3] to make use of the special structure of problem (1.1). It is a feasible point method of descent in the sense of generating successive points in $S$ with nonincreasing objective values.

To deal with nondifferentiability of $f$, at each iteration a piecewise linear (polyhedral) approximation to $f$ is constructed from at most $N+2$ subgradients of $f$ calculated previously at certain trial points. A search direction is found by solving a quadratic programming subproblem obtained by replacing $f$ in (1.1) by its polyhedral approximation augmented with a simple quadratic term. Then a line search finds the next approximation to a solution and the next trial point. The two-point line search is employed to detect discontinuities in the gradient of $f$.

We show that the method is globally convergent under no additional assumptions. We may add that the method will find a solution in a finite number of iterations if $f$ is polyhedral and certain technical conditions are satisfied (see [K2]). From lack of space, we shall pursue this subject elsewhere.
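The polyhedral approximation mentioned above can be sketched as the pointwise maximum of linearizations collected at trial points. The sketch below is only an illustration under assumptions: the helper name `polyhedral_model`, the bundle layout (trial point, function value, subgradient), and the test function $f(x) = |x_1| + |x_2|$ are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# One linearization: (trial point, f at the trial point, a subgradient there).
Linearization = Tuple[Sequence[float], float, Sequence[float]]


def polyhedral_model(bundle: List[Linearization], x: Sequence[float]) -> float:
    """Evaluate max_j [ f(t_j) + <g_j, x - t_j> ] over the stored linearizations."""
    return max(
        f_j + sum(g_k * (x_k - t_k) for g_k, x_k, t_k in zip(g_j, x, t_j))
        for t_j, f_j, g_j in bundle
    )


def abs_sum(x: Sequence[float]) -> float:
    """Example nonsmooth convex function f(x) = |x_1| + |x_2| (an assumption)."""
    return sum(abs(x_k) for x_k in x)


# Two linearizations of abs_sum, taken at the trial points (1, 1) and (-1, 2).
BUNDLE: List[Linearization] = [
    ([1.0, 1.0], 2.0, [1.0, 1.0]),
    ([-1.0, 2.0], 3.0, [-1.0, 1.0]),
]
model_value = polyhedral_model(BUNDLE, [0.5, 0.5])
```

By convexity every linearization lies below $f$, so the model never overestimates $f$ and is exact at the trial points; the quadratic term in the subproblem then keeps the next trial point near the region where the model is accurate.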

The method is implementable in the sense of requiring bounded storage and a finite number of simple operations per iteration. For problems with large sparse matrices $A$ and $B$ and relatively few nonlinear variables $x_i$, the method can use MINOS [MS1] for solving its quadratic programming subproblems. In fact, an efficient implementation of the method would require modifying MINOS to exploit the fact that consecutive subproblems retain the original constraints of (1.1), differ only in a few auxiliary linear constraints on $x$, have simple terms quadratic in $x$ as the only nonlinearities in their objectives, etc. It would be interesting to perform the necessary numerical experimentation, but we have not had the means to do so.

Other descent methods for solving problem (1.1) can be found in [DV1], [~4], [LSB1], [~1], [~2], [x'~I] and [s~ill]. None of their search direction finding subproblems can be solved efficiently by the available software when problem (1.1) is large. Therefore, we hope that our method could compete with the existing algorithms.

The method is derived and stated in Section 2. Its global convergence is established in Section 3, where we also discuss the case of an unbounded feasible set $S$. Finally, we have a conclusion section.

We shall use the following notation and terminology. $R^M$ and $R^N$ denote the $M$- and $N$-dimensional Euclidean spaces with the usual inner products $\langle \cdot, \cdot \rangle$ and the associated norms $|\cdot|$, respectively. We use $x_i$ to denote the $i$-th component of the vector $x$. Superscripts are used to denote different vectors, e.g. $x^1$ and $x^2$. All vectors are column vectors. However, for convenience a column vector in $R^{M+N}$ is sometimes denoted by $(y,x)$ even though $y$ and $x$ are column vectors in $R^M$ and $R^N$, respectively. For any $x \in R^N$ and $\epsilon \ge 0$,

$$\partial_\epsilon f(x) = \{ g \in R^N : f(\tilde x) \ge f(x) + \langle g, \tilde x - x \rangle - \epsilon \ \text{ for all } \tilde x \in R^N \}$$

denotes the $\epsilon$-subdifferential of $f$ at $x$. We denote by $\partial f(x)$ the set $\partial_0 f(x)$, i.e. the ordinary subdifferential. Note that $f$ is continuous and the mapping $(x,\epsilon) \to \partial_\epsilon f(x)$ is locally bounded, because $f$ is real-valued and convex on $R^N$ (see, e.g. [DV1]).
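As a small worked illustration of the $\epsilon$-subdifferential (the example function is an assumption chosen for simplicity, not one used in the paper), take $N = 1$, $f(x) = |x|$ and the point $x = 1$. A number $g$ belongs to $\partial_\epsilon f(1)$ exactly when $|\tilde x| - g\tilde x \ge 1 - g - \epsilon$ for all $\tilde x$. The left-hand side is unbounded below unless $|g| \le 1$, and for $|g| \le 1$ its minimum over $\tilde x$ is $0$, attained at $\tilde x = 0$. Hence

$$\partial_\epsilon f(1) = \{ g : |g| \le 1,\ g \ge 1 - \epsilon \} = [\max\{-1,\, 1-\epsilon\},\ 1].$$

For $\epsilon = 0$ this reduces to the ordinary subdifferential $\partial f(1) = \{1\}$, and for $\epsilon \ge 2$ it equals $[-1, 1]$, the set of all slopes of supporting lines of $|\cdot|$.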

2. The Method

Given a starting point $z^1 = (y^1, x^1) \in S$, the algorithm described below generates sequences of points $z^k = (y^k, x^k)$ in $S$, search directions $d^k = (d_y^k, d_x^k)$ in $R^{M+N}$ and stepsizes $t_L^k$ in $\{0, 1\}$, related by $z^{k+1} = z^k + t_L^k d^k$ for $k = 1, 2, \ldots$. The sequence $\{z^k\}$ is intended to converge to a solution of problem (1.1).

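Let $F(z) = \langle c, y\rangle + f(x)$ denote the objective of problem (1.1) at $z = (y,x)$. At the $k$-th iteration the method has subgradients $g^j = g_f(\tilde x^j)$, $j \in J^k$, computed at earlier trial points $\tilde x^j$, together with the linearization values $f_j^k = f(\tilde x^j) + \langle g^j, x^k - \tilde x^j\rangle$. They define the polyhedral approximations

$$\hat f^k(x) = \max\{ f_j^k + \langle g^j, x - x^k\rangle : j \in J^k \}, \qquad \hat F^k(z) = \langle c, y\rangle + \hat f^k(x),$$

and the search direction $d^k$ is obtained from the following problem: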

$$\text{minimize } \hat F^k(z^k + d) + \tfrac{1}{2}|d_x|^2 \text{ over all } d = (d_y, d_x) \text{ satisfying } z^k + d \in S, \tag{2.1}$$

where the penalty term $|d_x|^2/2$ serves to keep $\tilde x^{k+1} = x^k + d_x^k$ in the region where $\hat f^k$ is a close approximation to $f$, so that $\hat F^k(\cdot)$ is close to $F(\cdot)$ at $\tilde z^{k+1} = z^k + d^k$. Clearly, $d^k$ may be found from the solution $(d_y^k, d_x^k, u^k) \in R^{M+N+1}$ to the following $k$-th quadratic programming subproblem

$$\begin{aligned}
&\text{minimize } \langle c, d_y\rangle + u + \tfrac{1}{2}|d_x|^2 \text{ over all } (d_y, d_x, u) \in R^{M+N+1} \\
&\text{satisfying } f_j^k + \langle g^j, d_x\rangle \le u \ \text{ for } j \in J^k, \\
&\phantom{\text{satisfying }} A(y^k + d_y) + B(x^k + d_x) \le b.
\end{aligned} \tag{2.2}$$

Moreover, $u^k = \hat f^k(x^k + d_x^k)$, so we may interpret

$$v^k = \hat F^k(z^k + d^k) - F(z^k) = \langle c, d_y^k\rangle + u^k - f(x^k) \tag{2.3}$$

as an approximate derivative of $F$ at $z^k$ in the direction $d^k$.
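The passage from (2.1) to (2.2) is the usual epigraph reformulation. The following short derivation is a sketch under the assumption that $\hat F^k(z) = \langle c, y\rangle + \hat f^k(x)$ with $\hat f^k(x^k + d_x) = \max\{ f_j^k + \langle g^j, d_x\rangle : j \in J^k \}$, which is what the constraints of (2.2) and the identity (2.3) suggest:

$$\hat F^k(z^k + d) + \tfrac{1}{2}|d_x|^2 = \langle c, y^k\rangle + \langle c, d_y\rangle + \min\{ u : f_j^k + \langle g^j, d_x\rangle \le u,\ j \in J^k \} + \tfrac{1}{2}|d_x|^2 .$$

Dropping the constant $\langle c, y^k\rangle$ and minimizing jointly over $(d_y, d_x, u)$ gives the objective of (2.2), while $z^k + d \in S$ is exactly $A(y^k + d_y) + B(x^k + d_x) \le b$. At the solution the auxiliary variable is as small as its constraints allow, so $u^k = \hat f^k(x^k + d_x^k)$, and subtracting $F(z^k) = \langle c, y^k\rangle + f(x^k)$ from $\hat F^k(z^k + d^k)$ yields the right-hand side of (2.3).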

It will be convenient to describe the $P$ linear constraints of problem (1.1) in terms of $P$ affine functions $h_i : R^{M+N} \to R$ such that

$$S = \{ (y,x) \in R^{M+N} : h_i(y,x) \le 0 \ \text{ for } i \in I \},$$

where $I = \{1, \ldots, P\}$. Then subproblem (2.2) takes on the form

$$\begin{aligned}
&\text{minimize } \langle c, d_y\rangle + u + \tfrac{1}{2}|d_x|^2 \text{ over all } (d_y, d_x, u) \in R^{M+N+1} \\
&\text{satisfying } f_j^k + \langle g^j, d_x\rangle \le u \ \text{ for } j \in J^k, \\
&\phantom{\text{satisfying }} h_i^k + \langle \nabla_y h_i, d_y\rangle + \langle \nabla_x h_i, d_x\rangle \le 0 \ \text{ for } i \in I
\end{aligned} \tag{2.4}$$

with $h_i^k = h_i(y^k, x^k)$ for $i \in I$, since

$$h_i(y^k + d_y, x^k + d_x) = h_i(y^k, x^k) + \langle \nabla_y h_i, d_y\rangle + \langle \nabla_x h_i, d_x\rangle$$

for all $(d_y, d_x)$, because each $h_i$ is affine.
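The affine expansion used above can be checked numerically. In this sketch each $h_i$ is taken to be row $i$ of $Ay + Bx - b$, so that $\nabla_y h_i$ and $\nabla_x h_i$ are the $i$-th rows of $A$ and $B$; the function names and the sample numbers are hypothetical and serve only to illustrate the identity.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product of two vectors of equal length."""
    return sum(u_k * v_k for u_k, v_k in zip(u, v))


def h_row(a_row: Sequence[float], b_row: Sequence[float], b_i: float,
          y: Sequence[float], x: Sequence[float]) -> float:
    """Affine constraint function h_i(y, x) = <A_i, y> + <B_i, x> - b_i."""
    return dot(a_row, y) + dot(b_row, x) - b_i


def h_row_shifted(a_row: Sequence[float], b_row: Sequence[float], b_i: float,
                  y: Sequence[float], x: Sequence[float],
                  d_y: Sequence[float], d_x: Sequence[float]) -> float:
    """Right-hand side of the expansion: h_i^k + <grad_y h_i, d_y> + <grad_x h_i, d_x>."""
    return h_row(a_row, b_row, b_i, y, x) + dot(a_row, d_y) + dot(b_row, d_x)


# Illustrative data (M = 2 linear variables, N = 1 nonlinear variable).
A_ROW, B_ROW, B_I = [1.0, 2.0], [3.0], 4.0
Y, X = [1.0, 0.0], [2.0]
D_Y, D_X = [0.5, -1.0], [1.0]

direct = h_row(A_ROW, B_ROW, B_I,
               [y_k + d_k for y_k, d_k in zip(Y, D_Y)],
               [x_k + d_k for x_k, d_k in zip(X, D_X)])
expanded = h_row_shifted(A_ROW, B_ROW, B_I, Y, X, D_Y, D_X)
```

Because the expansion is exact, the constraints of (2.4) are linear in $(d_y, d_x)$ with the same coefficient rows at every iteration; only the constants $h_i^k$ change.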

Having motivated the search direction finding subproblems, we shall now state the method in detail and then comment on its rules.

Algorithm 2.1

Step 0 (Initialization). Select a starting point $z^1 = (y^1, x^1) \in S$, a final accuracy tolerance $\epsilon_s \ge 0$ and a line search parameter $m \in (0,1)$. Set $J^1 = \{1\}$, $\tilde z^1 = (\tilde y^1, \tilde x^1) = z^1$, $g^1 = g_f(\tilde x^1)$ and $f_1^1 = f(\tilde x^1)$. Set the counters $k = 1$, $l = 0$ and $k(0) = 1$.

Step 1 (Direction finding). Find the solution $(d_y^k, d_x^k, u^k)$ to subproblem (2.4), and Lagrange multipliers $\lambda_j^k$, $j \in J^k$, and $\mu_i^k$, $i \in I$, of (2.4) such that the set

$$\hat J^k = \{ j \in J^k : \lambda_j^k \ne 0 \}$$

satisfies $|\hat J^k| \le N+1$. Set $d^k = (d_y^k, d_x^k)$ and compute $v^k$ by (2.3).

Step 2 (Stopping criterion). If $v^k \ge -\epsilon_s$, terminate; otherwise, continue.

Step 3 (Line search). Set $\tilde z^{k+1} = (\tilde y^{k+1}, \tilde x^{k+1}) = z^k + d^k$. If

$$F(\tilde z^{k+1}) \le F(z^k) + m v^k \tag{2.5}$$

set $t_L^k = 1$ (serious step), set $k(l+1) = k+1$ and increase $l$ by 1; otherwise, i.e. if (2.5) does not hold, set $t_L^k = 0$ (null step). Set $z^{k+1} = (y^{k+1}, x^{k+1}) = z^k + t_L^k d^k$.

Step 4 (Linearization updating). Set $J^{k+1} = \hat J^k \cup \{k+1\}$. Set $g^{k+1} = g_f(\tilde x^{k+1})$ and

$$\begin{aligned}
f_{k+1}^{k+1} &= f(\tilde x^{k+1}) + \langle g^{k+1}, x^{k+1} - \tilde x^{k+1}\rangle, \\
f_j^{k+1} &= f_j^k + \langle g^j, x^{k+1} - x^k\rangle \ \text{ for } j \in \hat J^k.
\end{aligned} \tag{2.6}$$

Increase $k$ by 1 and go to Step 1.
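The steps above can be sketched on a deliberately tiny instance. Everything in the following code is an illustrative assumption rather than the author's implementation: there are no linear variables ($M = 0$), a single nonlinear variable ($N = 1$) restricted to an interval, all linearizations are kept instead of selecting $\hat J^k$ from Lagrange multipliers, the serious-step test is taken to be $F(\tilde z^{k+1}) \le F(z^k) + m v^k$, and the one-dimensional subproblem is solved by enumerating candidate points instead of calling a quadratic programming code such as MINOS.

```python
from typing import Callable, List, Tuple

Bundle = List[Tuple[float, float]]  # pairs (f_j^k, g^j) for scalar x


def solve_1d_subproblem(bundle: Bundle, x: float, lo: float, hi: float) -> Tuple[float, float]:
    """Minimize max_j(f_j + g_j*d) + d*d/2 subject to lo <= x + d <= hi.

    In one dimension the minimizer is an interval endpoint, a stationary
    point d = -g_j of one piece, or a kink where two pieces intersect,
    so it suffices to compare these (clipped) candidates.
    """
    d_lo, d_hi = lo - x, hi - x

    def clip(d: float) -> float:
        return min(max(d, d_lo), d_hi)

    def model(d: float) -> float:
        return max(f_j + g_j * d for f_j, g_j in bundle)

    candidates = [d_lo, d_hi] + [clip(-g_j) for _, g_j in bundle]
    for i, (f_i, g_i) in enumerate(bundle):
        for f_j, g_j in bundle[i + 1:]:
            if g_i != g_j:
                candidates.append(clip((f_j - f_i) / (g_i - g_j)))
    d = min(candidates, key=lambda t: model(t) + 0.5 * t * t)
    return d, model(d)


def descent_1d(f: Callable[[float], float], subgrad: Callable[[float], float],
               x1: float, lo: float, hi: float,
               m: float = 0.1, eps: float = 1e-8, max_iter: int = 100) -> float:
    """Toy version of the iteration for minimizing f over [lo, hi]."""
    x = x1
    bundle: Bundle = [(f(x), subgrad(x))]             # Step 0
    for _ in range(max_iter):
        d, u = solve_1d_subproblem(bundle, x, lo, hi)  # Step 1
        v = u - f(x)                                   # (2.3) with no linear variables
        if v >= -eps:                                  # Step 2
            break
        trial = x + d                                  # Step 3
        t = 1.0 if f(trial) <= f(x) + m * v else 0.0   # assumed acceptance test
        x_next = x + t * d
        g_new = subgrad(trial)                         # Step 4, update as in (2.6)
        bundle = [(f_j + g_j * (x_next - x), g_j) for f_j, g_j in bundle]
        bundle.append((f(trial) + g_new * (x_next - trial), g_new))
        x = x_next
    return x


def example_f(x: float) -> float:
    """Example objective max(-x, 2x - 3), minimized at x = 1 (an assumption)."""
    return max(-x, 2.0 * x - 3.0)


def example_subgrad(x: float) -> float:
    """One subgradient of example_f."""
    return -1.0 if x < 1.0 else 2.0


x_final = descent_1d(example_f, example_subgrad, 4.0, -5.0, 5.0)
```

Keeping every linearization is harmless at this size but would violate the bounded-storage property of the method; the multiplier-based selection in Step 1 is what keeps the number of stored subgradients at most $N+2$.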

A few remarks on the algorithm are in order.

For problems of interest to us, subproblems (2.4) will have relatively few nonlinear variables ($N \ll M$) and large, but sparse constraint matrices. Such subproblems can be solved by MINOS [MS1] in a finite number of iterations; moreover, MINOS will automatically find at most $N+1$ nonzero Lagrange multipliers $\lambda_j^k$ for the first constraints of (2.4), since these constraints involve only $N+1$ variables.

In Step 2 we always have

$$F(z) \ge F(z^k) + v^k - |v^k|^{1/2} |x - x^k| \ \text{ for all } z = (y,x) \in S, \tag{2.7}$$

and hence

$$F(z^k) \le \min\{ F(z) : z \in S \} - v^k + |v^k|^{1/2} \max\{ |x - x^k| : (y,x) \in S \}.$$

This will be proved in the next section. The above estimates justify the stopping criterion of the method.

Step 3 is always entered with $v^k < 0$. The trial point $\tilde z^{k+1}$ is accepted as the next iterate only if this decreases significantly the objective value. Otherwise the algorithm stays at $z^{k+1} = z^k$ (a null step), but the new subgradient information collected at $\tilde z^{k+1}$ will aid in finding a better next search direction, since $k+1 \in J^{k+1}$. Of course, $\{z^k\} \subset S$, because $\tilde z^{k+1} = z^k + d^k \in S$ for all $k$.

We may add that if there are no linear variables in problem (1.1) ($M = 0$), then Algorithm 2.1 becomes similar to the method of [K3].

3. Convergence

In this section we show that the algorithm generates a minimizing sequence $\{z^k\} \subset S$, i.e. $F(z^k) \to \min\{ F(z) : z \in S \}$; moreover, there exists $\bar z = (\bar y, \bar x)$ in the set of solutions of problem (1.1) such that $x^k \to \bar x$ and $y^k \to \bar y$ along some infinite set $K \subset \{1, 2, \ldots\}$. We assume, of course, that the final accuracy tolerance $\epsilon_s$ is set to zero. Our analysis will draw on the results in [K2], [K3].

We start by analyzing the following dual to the $k$-th subproblem

$$\begin{aligned}
\text{subject to } \ &\lambda_j \ge 0 \ \text{ for } j \in J^k, \quad \sum_{j \in J^k} \lambda_j = 1, \\
&\mu_i \ge 0 \ \text{ for } i \in I,
\end{aligned} \tag{3.1}$$

where

$$\alpha_j^k = f(x^k) - f_j^k \ \text{ for } j \in J^k.$$

Lemma 3.1. (i) The Lagrange multipliers $(\lambda^k, \mu^k)$ of (2.4) solve (3.1) and yield the unique part $(d_x^k, u^k)$ of the solution $(d_y^k, d_x^k, u^k)$ of (2.4) by

where

(ii) The optimal value $w^k$ of (3.1) satisfies

and one has
