A Descent Algorithm for Large-Scale Linearly Constrained Convex Nonsmooth Minimization


NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR

A DESCENT ALGORITHM FOR LARGE-SCALE LINEARLY CONSTRAINED CONVEX NONSMOOTH MINIMIZATION

Krzysztof C. Kiwiel

April 1984
CP-84-15

Collaborative Papers report work which has not been performed solely at the International Institute for Applied Systems Analysis and which has received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS
A-2361 Laxenburg, Austria


A Descent Algorithm for Large-Scale Linearly Constrained Convex Nonsmooth Minimization

Krzysztof C. Kiwiel

Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw, Poland.

Abstract. A descent algorithm is given for solving a large convex program obtained by augmenting the objective of a linear program with a (possibly nondifferentiable) convex function depending on relatively few variables. Such problems often arise in practice as deterministic equivalents of stochastic programming problems. The algorithm's search direction finding subproblems can be solved efficiently by the existing software for large-scale smooth optimization. The algorithm is both readily implementable and globally convergent.

AMS 1980 subject classification. Primary: 65K05. Secondary: 90C25.

Key words: Nonsmooth optimization, nondifferentiable programming, linear constraints, convex programming, descent methods, large-scale optimization


1. Introduction

This paper presents a method for solving the following problem:

$$\text{minimize } \langle c, y\rangle + f(x) \text{ over all } (y,x) \in R^M \times R^N \text{ satisfying } Ay + Bx \le b, \tag{1.1}$$

where $c \in R^M$, $A$ is a $P \times M$ matrix, $B$ is a $P \times N$ matrix, $b \in R^P$, and $f: R^N \to R$ is a (possibly nondifferentiable) convex function.

We suppose that the set of feasible points

$$S = \{(y,x) \in R^{M+N} : Ay + Bx \le b\}$$

is nonempty and bounded, and that at each $(y,x) \in S$ we can compute $f(x)$ and a certain subgradient $g_f(x) \in \partial f(x)$, i.e. an arbitrary element of the subdifferential $\partial f(x)$ of $f$ at $x$ on which we cannot impose any further restrictions.
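The oracle assumption above can be illustrated with a minimal sketch. The choice $f(x) = \sum_i |x_i|$ and the names `l1_oracle` and `objective` are hypothetical and serve only as an example of a convex nondifferentiable function for which one subgradient is cheap to compute.

```python
def l1_oracle(x):
    """Return f(x) = sum_i |x_i| and one (arbitrary) subgradient of f at x.

    Illustrative choice of f only; at a kink (x_i == 0) any value in [-1, 1]
    would be a valid component, and 0.0 is picked here.
    """
    value = sum(abs(xi) for xi in x)
    g = [1.0 if xi > 0 else -1.0 if xi < 0 else 0.0 for xi in x]
    return value, g


def objective(c, y, x, oracle):
    """Evaluate <c, y> + f(x), the objective of problem (1.1)."""
    fx, _ = oracle(x)
    return sum(ci * yi for ci, yi in zip(c, y)) + fx


value, g = l1_oracle([1.0, -2.0, 0.0])
total = objective([2.0], [3.0], [1.0, -2.0, 0.0], l1_oracle)
```

The method only ever calls such an oracle at feasible trial points, so nothing beyond a function value and a single subgradient is required.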

Problems of the form (1.1) are often encountered in practice, especially as deterministic equivalents of two-stage stochastic programming problems [K1], [NW1], [W1]. In many applications the number $M$ of "linear" variables $y_i$ is much larger than the number $N$ of "nonlinear" variables $x_i$, and the matrices $A$ and $B$ are sparse (have relatively few nonzero entries). In such cases problem (1.1) can be solved by the existing algorithms for large-scale optimization (e.g. MINOS [MS1]) if $f$ is differentiable. In the nondifferentiable large-scale case, only a few algorithms have been proposed [NW1], and they frequently assume the knowledge of the full subdifferential $\partial f(x)$ at each $x$.

The method presented in this paper modifies one given in [K3] to make use of the special structure of problem (1.1). It is a feasible point method of descent in the sense of generating successive points in $S$ with nonincreasing objective values.


To deal with nondifferentiability of $f$, at each iteration a piecewise linear (polyhedral) approximation to $f$ is constructed from at most $N+2$ subgradients of $f$ calculated previously at certain trial points. A search direction is found by solving a quadratic programming subproblem obtained by replacing $f$ in (1.1) by its polyhedral approximation augmented with a simple quadratic term. Then a line search finds the next approximation and the next trial point. The two-point line search is employed to detect discontinuities in the gradient of $f$.
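A piecewise linear approximation of this kind can be sketched as the pointwise maximum of linearizations of $f$. The function name `model_value` and the bundle layout (trial point, function value, subgradient) are assumptions made for illustration; the example uses $f(x) = |x|$ in one dimension.

```python
def model_value(bundle, x):
    """Evaluate the polyhedral model max_j { f(x_j) + <g_j, x - x_j> } at x.

    bundle is a list of triples (x_j, f_j, g_j): a trial point, the value of f
    there, and one subgradient of f there (hypothetical storage layout).
    """
    return max(
        fj + sum(gi * (xi - xji) for gi, xi, xji in zip(gj, x, xj))
        for xj, fj, gj in bundle
    )


# Two linearizations of f(x) = |x|, taken at x = -1 and x = 2.
bundle = [([-1.0], 1.0, [-1.0]), ([2.0], 2.0, [1.0])]
at_zero = model_value(bundle, [0.0])
at_three = model_value(bundle, [3.0])
```

Because each linearization underestimates a convex $f$, the model never exceeds $f$, and it agrees with $f$ at every point where a linearization was taken.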

We show that the method is globally convergent under no additional assumptions. We may add that the method will find a solution in a finite number of iterations if $f$ is polyhedral and certain technical conditions are satisfied (see [K2]). From lack of space, we shall pursue this subject elsewhere.

The method is implementable in the sense of requiring bounded storage and a finite number of simple operations per iteration. For problems with large sparse matrices $A$ and $B$ and relatively few nonlinear variables $x_i$, the method can use MINOS [MS1] for solving its quadratic programming subproblems. In fact, an efficient implementation of the method would require modifying MINOS to exploit the fact that consecutive subproblems retain the original constraints of (1.1), differ only in a few auxiliary linear constraints on $x$, have simple terms quadratic in $x$ as the only nonlinearities in their objectives, etc. It would be interesting to perform the necessary numerical experimentation, but we have not had the means to do so.

Other descent methods for solving problem (1.1) can be found in [DV1], [?4], [LSB1], [?1], [?2], [?] and [?]. None of their search direction finding subproblems can be solved efficiently by the available software when problem (1.1) is large. Therefore, we hope that our method could compete with the existing algorithms.

The method is derived and stated in Section 2. Its global convergence is established in Section 3, where we also discuss the case of an unbounded feasible set $S$. Finally, we have a conclusion section.

We shall use the following notation and terminology. $R^M$ and $R^N$ denote the $M$- and $N$-dimensional Euclidean spaces with the usual inner products $\langle\cdot,\cdot\rangle$ and the associated norms $|\cdot|$, respectively. We use $x_i$ to denote the $i$-th component of the vector $x$. Superscripts are used to denote different vectors, e.g. $x^1$ and $x^2$. All vectors are column vectors. However, for convenience a column vector in $R^{M+N}$ is sometimes denoted by $(y,x)$ even though $y$ and $x$ are column vectors in $R^M$ and $R^N$, respectively. For any $x \in R^N$ and $\varepsilon \ge 0$,

$$\partial_\varepsilon f(x) = \{ g \in R^N : f(\tilde x) \ge f(x) + \langle g, \tilde x - x\rangle - \varepsilon \text{ for all } \tilde x \in R^N \}$$

denotes the $\varepsilon$-subdifferential of $f$ at $x$. We denote by $\partial f(x)$ the set $\partial_0 f(x)$, i.e. the ordinary subdifferential. Note that $f$ is continuous and the mapping $(x,\varepsilon) \to \partial_\varepsilon f(x)$ is locally bounded, because $f$ is real-valued and convex on $R^N$ (see, e.g. [DV1]).
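As a small worked illustration of the $\varepsilon$-subdifferential (an example chosen here, not one from the paper), take $f(x) = |x|$ on $R$ and a point $x > 0$. The defining inequality must hold for every comparison point, which forces $|g| \le 1$ and then reduces to a single condition at the origin.

```latex
% Illustrative example: f(x) = |x| on R, evaluated at a point x > 0.
% g is an eps-subgradient iff |t| >= |x| + g (t - x) - eps for all real t.
% The infimum over t of |t| - g t is 0 when |g| <= 1 and -infinity otherwise.
\begin{align*}
\partial_\varepsilon f(x)
  &= \{\, g \in [-1,1] : x - g x \le \varepsilon \,\} \\
  &= [\,\max\{-1,\ 1 - \varepsilon/x\},\ 1\,],
  \qquad x > 0,\ \varepsilon \ge 0 .
\end{align*}
% For eps = 0 this is the ordinary subdifferential {1};
% for eps >= 2x it is the whole interval [-1, 1].
```

The set grows continuously with $\varepsilon$ and stays bounded, which is the local boundedness property noted above.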

2. The Method

Given a starting point $z^1 = (y^1,x^1) \in S$, the algorithm described below generates sequences of points $z^k = (y^k,x^k)$ in $S$, search directions $d^k = (d_y^k,d_x^k)$ in $R^{M+N}$ and stepsizes $t_L^k$ in $\{0,1\}$, related by $z^{k+1} = z^k + t_L^k d^k$ for $k=1,2,\ldots$. The sequence $\{z^k\}$ is intended to converge to a solution of problem (1.1).


$$\text{minimize } \hat F^k(z^k + d) + \tfrac12 |d_x|^2 \text{ over all } d = (d_y,d_x) \text{ satisfying } z^k + d \in S, \tag{2.1}$$

where the penalty term $|d_x|^2/2$ serves to keep $\tilde x^{k+1} = x^k + d_x^k$ in the region where $\hat f^k$ is a close approximation to $f$, so that $\hat F^k(\cdot)$ is close to $F(\cdot)$ at $\tilde z^{k+1} = z^k + d^k$. Clearly, $d^k$ may be found from the solution $(d_y^k,d_x^k,u^k) \in R^{M+N+1}$ to the following $k$-th quadratic programming subproblem:

$$\begin{aligned}
&\text{minimize } \langle c, d_y\rangle + u + \tfrac12 |d_x|^2 \text{ over all } (d_y,d_x,u) \in R^{M+N+1} \\
&\text{satisfying } f_j^k + \langle g^j, d_x\rangle \le u \text{ for } j \in J^k, \\
&\phantom{\text{satisfying }} A(y^k + d_y) + B(x^k + d_x) \le b.
\end{aligned} \tag{2.2}$$

Moreover,

so we may interpret

$$v^k = \hat F^k(z^k + d^k) - F(z^k) = \langle c, d_y^k\rangle + u^k - f(x^k) \tag{2.3}$$

as an approximate derivative of $F$ at $z^k$ in the direction $d^k$.

It will be convenient to describe the $P$ linear constraints of problem (1.1) in terms of $P$ affine functions $h_i: R^{M+N} \to R$ such that

$$S = \{(y,x) \in R^{M+N} : h_i(y,x) \le 0 \text{ for } i \in I\},$$

where $I = \{1,\ldots,P\}$. Then subproblem (2.2) takes on the form

$$\begin{aligned}
&\text{minimize } \langle c, d_y\rangle + u + \tfrac12 |d_x|^2 \text{ over all } (d_y,d_x,u) \in R^{M+N+1} \\
&\text{satisfying } f_j^k + \langle g^j, d_x\rangle \le u \text{ for } j \in J^k, \\
&\phantom{\text{satisfying }} h_i^k + \langle \nabla_y h_i, d_y\rangle + \langle \nabla_x h_i, d_x\rangle \le 0 \text{ for } i \in I
\end{aligned} \tag{2.4}$$

with $h_i^k = h_i(y^k,x^k)$ for $i \in I$, since

$$h_i(y^k + d_y, x^k + d_x) = h_i(y^k,x^k) + \langle \nabla_y h_i, d_y\rangle + \langle \nabla_x h_i, d_x\rangle$$

for all $(d_y,d_x)$, because each $h_i$ is affine.
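The affine identity can be checked numerically with a short sketch. It assumes the natural choice $h_i(y,x) = \langle A_i, y\rangle + \langle B_i, x\rangle - b_i$, where $A_i$ and $B_i$ are the $i$-th rows of $A$ and $B$, so that $\nabla_y h_i = A_i$ and $\nabla_x h_i = B_i$; the function names and the small data set are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def constraint_values(A, B, b, y, x):
    """h_i(y, x) = <A_i, y> + <B_i, x> - b_i for every row i."""
    return [dot(Ai, y) + dot(Bi, x) - bi for Ai, Bi, bi in zip(A, B, b)]


def linearized_constraints(A, B, b, y, x, dy, dx):
    """h_i^k + <grad_y h_i, dy> + <grad_x h_i, dx>, with gradients A_i and B_i."""
    h = constraint_values(A, B, b, y, x)
    return [hi + dot(Ai, dy) + dot(Bi, dx) for hi, Ai, Bi in zip(h, A, B)]


# Hypothetical data with M = 2 linear variables, N = 1 nonlinear variable, P = 2 rows.
A = [[1.0, 2.0], [0.0, -1.0]]
B = [[3.0], [1.0]]
b = [4.0, 0.0]
y, x = [1.0, 0.0], [0.5]
dy, dx = [-1.0, 1.0], [0.25]

shifted = constraint_values(A, B, b, [1.0 - 1.0, 0.0 + 1.0], [0.5 + 0.25])
linear = linearized_constraints(A, B, b, y, x, dy, dx)
```

Both lists agree, which is why the constraint rows of the subproblem never change between iterations; only the right-hand sides $h_i^k$ move with $z^k$.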


Having motivated the search direction finding subproblems, we shall now state the method in detail, commenting on its rules below.

Algorithm 2.1

Step 0 (Initialization). Select a starting point $z^1 = (y^1,x^1) \in S$, a final accuracy tolerance $\varepsilon_s \ge 0$ and a line search parameter $m \in (0,1)$. Set $J^1 = \{1\}$, $\tilde z^1 = (\tilde y^1,\tilde x^1) = z^1$, $g^1 = g_f(\tilde x^1)$ and $f_1^1 = f(\tilde x^1)$. Set the counters $k=1$, $l=0$ and $k(0)=1$.

Step 1 (Direction finding). Find the solution $(d_y^k,d_x^k,u^k)$ to subproblem (2.4), and Lagrange multipliers $\lambda_j^k$, $j \in J^k$, and $\mu_i^k$, $i \in I$, of (2.4) such that the set

$$\hat J^k = \{ j \in J^k : \lambda_j^k \ne 0 \}$$

satisfies $|\hat J^k| \le N+1$. Set $d^k = (d_y^k,d_x^k)$ and compute $v^k$ by (2.3).

Step 2 (Stopping criterion). If $v^k \ge -\varepsilon_s$, terminate; otherwise, continue.

Step 3 (Line search). Set $\tilde z^{k+1} = (\tilde y^{k+1},\tilde x^{k+1}) = z^k + d^k$. If

$$F(\tilde z^{k+1}) \le F(z^k) + m v^k, \tag{2.5}$$

set $t_L^k = 1$ (serious step), set $k(l+1) = k+1$ and increase $l$ by 1; otherwise, i.e. if (2.5) does not hold, set $t_L^k = 0$ (null step). Set $z^{k+1} = (y^{k+1},x^{k+1}) = z^k + t_L^k d^k$.

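Step 4 (Linearization updating). Set $J^{k+1} = \hat J^k \cup \{k+1\}$. Set $g^{k+1} = g_f(\tilde x^{k+1})$,

$$\begin{aligned}
f_{k+1}^{k+1} &= f(\tilde x^{k+1}) + \langle g^{k+1}, x^{k+1} - \tilde x^{k+1}\rangle, \\
f_j^{k+1} &= f_j^k + \langle g^j, x^{k+1} - x^k\rangle \quad \text{for } j \in \hat J^k.
\end{aligned} \tag{2.6}$$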

Increase $k$ by 1 and go to Step 1.

A few remarks on the algorithm are in order.


For problems of interest to us, subproblems (2.4) will have relatively few nonlinear variables ($N \ll M$) and large but sparse constraint matrices. Such subproblems can be solved by MINOS [MS1] in a finite number of iterations; moreover, MINOS will automatically find at most $N+1$ nonzero Lagrange multipliers $\lambda_j^k$ for the first constraints of (2.4), since these constraints involve only $N+1$ variables.

In Step 2 we always have

$$F(z) \ge F(z^k) + v^k - |v^k|^{1/2}\,|x - x^k| \quad \text{for all } z = (y,x) \in S, \tag{2.7}$$

and hence

$$F(z^k) \le \min\{F(z) : z \in S\} - v^k + \ldots$$

This will be proved in the next section. The above estimates justify the stopping criterion of the method.
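To see how estimates of this type support the stopping rule, here is a short worked rearrangement. It assumes the first estimate has the form $F(z) \ge F(z^k) + v^k - |v^k|^{1/2}|x - x^k|$ on $S$ and that $v^k \le 0$; the constant $D$ is a hypothetical bound on $|x - x^k|$ over $S$ (finite because $S$ is bounded), and the resulting bound is a sketch rather than a formula quoted from the paper.

```latex
% Assumed: F(z) >= F(z^k) + v^k - |v^k|^{1/2} |x - x^k| for all z = (y,x) in S,
% v^k <= 0, and |x - x^k| <= D for all (y,x) in S (D is a hypothetical constant).
\begin{align*}
F(z^k) - F(z)
  &\le -v^k + |v^k|^{1/2}\,|x - x^k| \\
  &\le |v^k| + |v^k|^{1/2} D
  \qquad \text{for all } z \in S .
\end{align*}
% If the method stops with v^k >= -eps_s, then |v^k| <= eps_s and
% F(z^k) - min{ F(z) : z in S } <= eps_s + eps_s^{1/2} D.
```

Under these assumptions a small $|v^k|$ certifies that $z^k$ is nearly optimal, with the gap vanishing as $\varepsilon_s \to 0$.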

Step 3 is always entered with $v^k < 0$. The trial point $\tilde z^{k+1}$ is accepted as the next iterate $z^{k+1}$ only if this decreases significantly the objective value. Otherwise the algorithm stays at $z^{k+1} = z^k$ (a null step), but the new subgradient information collected at $\tilde z^{k+1}$ will aid in finding a better next search direction, since $k+1 \in J^{k+1}$. Of course, $\{z^k\} \subset S$, because $\tilde z^{k+1} = z^k + d^k \in S$ for all $k$.

We may add that if there are no linear variables in problem (1.1) ($M=0$), then Algorithm 2.1 becomes similar to the method of [K3].
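The steps of Algorithm 2.1 can be sketched in the simplest special case $M = 0$, $N = 1$, with $S$ an interval $[lo, hi]$, where the subproblem can be solved exactly by enumeration instead of a quadratic programming code. Everything named here (`solve_subproblem`, `descent_1d`, the activity tolerance, the test function $|x - 1|$) is hypothetical, and the serious-step test $f(\tilde x) \le f(x) + m v$ is an assumed form of the line search rule, so this is an illustration and not the author's implementation.

```python
def solve_subproblem(x, bundle, lo, hi):
    """Minimize max_j(f_j + g_j*d) + d*d/2 over d with lo <= x + d <= hi.

    bundle holds pairs (f_j, g_j): linearization value at x and slope.
    In one dimension the minimizer is an interval end, a vertex d = -g_j,
    or a kink where two pieces cross, so checking these candidates is exact.
    """
    def model(d):
        return max(fj + gj * d for fj, gj in bundle)

    candidates = [lo - x, hi - x] + [-gj for _, gj in bundle]
    for i, (fi, gi) in enumerate(bundle):
        for fj, gj in bundle[i + 1:]:
            if gi != gj:
                candidates.append((fi - fj) / (gj - gi))
    feasible = [min(max(d, lo - x), hi - x) for d in candidates]
    d = min(feasible, key=lambda t: model(t) + 0.5 * t * t)
    return d, model(d)


def descent_1d(f, subgrad, x, lo, hi, eps=1e-9, m=0.1, max_iter=200):
    """Sketch of Algorithm 2.1 for M = 0, N = 1 (assumed line search rule)."""
    bundle = [(f(x), subgrad(x))]                    # Step 0
    for _ in range(max_iter):
        d, u = solve_subproblem(x, bundle, lo, hi)   # Step 1
        v = u - f(x)                                 # (2.3) without linear variables
        if v >= -eps:                                # Step 2
            break
        trial = x + d                                # Step 3
        t = 1.0 if f(trial) <= f(x) + m * v else 0.0  # assumed serious-step test
        x_next = x + t * d
        # Step 4: keep at most N + 1 = 2 active pieces (extreme slopes suffice in 1-D).
        active = [(fj, gj) for fj, gj in bundle if fj + gj * d >= u - 1e-9]
        low = min(active, key=lambda p: p[1])
        high = max(active, key=lambda p: p[1])
        kept = [low] if low[1] == high[1] else [low, high]
        g_new = subgrad(trial)
        bundle = [(fj + gj * (x_next - x), gj) for fj, gj in kept]
        bundle.append((f(trial) + g_new * (x_next - trial), g_new))
        x = x_next
    return x


def f(x):
    return abs(x - 1.0)


def subgrad(x):
    return 1.0 if x >= 1.0 else -1.0


x_from_right = descent_1d(f, subgrad, 4.0, -5.0, 5.0)
x_from_left = descent_1d(f, subgrad, -3.5, -5.0, 5.0)
```

On this example the iterates take unit serious steps toward the kink, then a null step adds the linearization from the other side, after which the model is exact near the minimizer and the stopping test fires.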

3. Convergence

In this section we show that the algorithm generates a minimizing sequence $\{z^k\} \subset S$, i.e. $F(z^k) \downarrow \min\{F(z) : z \in S\}$; moreover, there exists $\bar z = (\bar y,\bar x)$ in the set of solutions of problem (1.1) such that $x^k \to \bar x$ and $y^k \xrightarrow{K} \bar y$ for some infinite set $K \subset \{1,2,\ldots\}$. We assume, of course, that the final accuracy tolerance $\varepsilon_s$ is set to zero. Our analysis will dwell on the results in [K2], [K3].

We start by analyzing the following dual to the $k$-th subproblem:

$$\text{subject to } \lambda_j \ge 0 \text{ for } j \in J^k, \quad \sum_{j \in J^k} \lambda_j = 1, \quad \mu_i \ge 0 \text{ for } i \in I, \tag{3.1}$$

where $\alpha_j^k = f(x^k) - f_j^k$ for $j \in J^k$.
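As a sketch of where a dual of this type comes from, standard Lagrangian duality can be applied to subproblem (2.4). The symbol $p$ and the exact arrangement of terms below are assumptions made for illustration and are not taken from the text.

```latex
% Lagrangian of (2.4) with multipliers lambda_j >= 0 and mu_i >= 0:
\begin{align*}
L &= \langle c, d_y\rangle + u + \tfrac12 |d_x|^2
   + \sum_{j \in J^k} \lambda_j \bigl( f_j^k + \langle g^j, d_x\rangle - u \bigr)
   + \sum_{i \in I} \mu_i \bigl( h_i^k + \langle \nabla_y h_i, d_y\rangle
   + \langle \nabla_x h_i, d_x\rangle \bigr).
\end{align*}
% Minimizing over u, d_y and d_x gives the conditions
\begin{align*}
\sum_{j \in J^k} \lambda_j = 1, \qquad
c + \sum_{i \in I} \mu_i \nabla_y h_i = 0, \qquad
d_x = -p, \quad
p = \sum_{j \in J^k} \lambda_j g^j + \sum_{i \in I} \mu_i \nabla_x h_i ,
\end{align*}
% and, after adding the constant f(x^k) and changing sign, a dual of the form
\begin{align*}
\text{minimize } \tfrac12 |p|^2 + \sum_{j \in J^k} \lambda_j \alpha_j^k
   - \sum_{i \in I} \mu_i h_i^k ,
\qquad \alpha_j^k = f(x^k) - f_j^k .
\end{align*}
```

In this form the linearization errors $\alpha_j^k$ and the constraint slacks $-h_i^k$ are both nonnegative when $z^k \in S$ and the linearizations underestimate $f$, so every term of the assumed dual objective is nonnegative.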

Lemma 3.1. (i) The Lagrange multipliers $(\lambda^k,\mu^k)$ of (2.4) solve (3.1) and yield the unique part $(d_x^k,u^k)$ of the solution $(d_y^k,d_x^k,u^k)$ of (2.4) by

where

(ii) The optimal value $w^k$ of (3.1) satisfies

and one has
