NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR
AN ACCELERATED METHOD FOR MINIMIZING A CONVEX FUNCTION OF TWO VARIABLES
F.A. PAIZEROVA April 1985 CP-85-17
Collaborative Papers report work which has not been performed solely at the International Institute for Applied Systems Analysis and which has received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work.
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS
A-2361 Laxenburg, Austria
PREFACE
In this paper the author considers the problem of minimizing a convex function of two variables without computing the derivatives or (in the nondifferentiable case) the subgradients of the function, and suggests two algorithms for doing this.
Such algorithms could form an integral part of new methods for minimizing a convex function of many variables based on the solution of a two-dimensional minimization problem at each step (rather than on line-searches, as in most existing algorithms).
This is a contribution to research on nonsmooth optimization currently underway in System and Decision Sciences Program Core.
A.B. KURZHANSKI
Chairman
System and Decision Sciences Program
AN ACCELERATED METHOD FOR MINIMIZING A CONVEX FUNCTION OF TWO VARIABLES
F.A. Paizerova
A method for minimizing a convex continuously differentiable function of two variables was proposed in [1], where it was shown that its rate of convergence is geometric with coefficient 0.9543. We shall describe two modifications of this method with improved convergence rates.
Let $Z \in E_2$, and let a function f be convex and continuously differentiable on $E_2$. Assume that we know that a minimum point of f is contained in a convex quadrilateral ABCD. The area of this quadrilateral is called the uncertainty area.
Let R be the point of intersection of the diagonals of the quadrilateral. Let us choose four points M, N, Q, P on the intervals AC and BD which are all at the same distance $\varepsilon$ from R (where $\varepsilon > 0$ is fixed). Now let us compute the function f at these points and at the point R (see Figure 1).
Case 1.
$$f(Q) \ge f(R), \quad f(P) \ge f(R) \qquad (1)$$
$$f(M) \ge f(R), \quad f(N) \ge f(R) \qquad (2)$$
In this case R is (within $\varepsilon$-accuracy) a minimum point of f on AC and BD, and then by the properties of continuously differentiable functions the point R is a minimum point of f on ABCD (to within the given accuracy $\varepsilon$) and the process terminates.
Case 2. If inequality (1) is satisfied but inequality (2) is not, then R is a minimum point of f on BD. If f(M) < f(R) then
$$f(Z) > f(R) \quad \forall Z \in BDC$$
and therefore a minimum point of f lies within the triangle ABD.
If f(N) < f(R) then
$$f(Z) > f(R) \quad \forall Z \in ABD$$
and a minimum point of f lies within the triangle BDC.
Fig. 1 Fig. 2
Case 3. If inequality (2) is satisfied but (1) is not then we argue analogously.
These three cases were discussed in [1] and are treated in the same way here. The difference between our method and that of [1] is demonstrated in the following Case 4.
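The case distinction can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the function names are hypothetical, and it assumes that M and N are the probe points on AC (towards A and C respectively) and that Q and P are the probe points on BD (towards B and D), so that inequality (1) tests BD and inequality (2) tests AC.

```python
import math

def cross(p, q):
    """2-D cross product p x q."""
    return p[0] * q[1] - p[1] * q[0]

def diagonal_intersection(A, B, C, D):
    """Intersection R of the diagonals AC and BD of a convex quadrilateral."""
    u = (C[0] - A[0], C[1] - A[1])   # direction of AC
    v = (D[0] - B[0], D[1] - B[1])   # direction of BD
    w = (B[0] - A[0], B[1] - A[1])
    t = cross(w, v) / cross(u, v)    # A + t*u lies on the line BD
    return (A[0] + t * u[0], A[1] + t * u[1])

def step_towards(R, X, eps):
    """Point at distance eps from R in the direction of X."""
    d = math.hypot(X[0] - R[0], X[1] - R[1])
    return (R[0] + eps * (X[0] - R[0]) / d, R[1] + eps * (X[1] - R[1]) / d)

def classify_case(f, A, B, C, D, eps):
    """Return 1, 2, 3 or 4 according to which of inequalities (1), (2) hold."""
    R = diagonal_intersection(A, B, C, D)
    # Assumed labelling: M, N on AC (towards A, C); Q, P on BD (towards B, D).
    M, N = step_towards(R, A, eps), step_towards(R, C, eps)
    Q, P = step_towards(R, B, eps), step_towards(R, D, eps)
    fR = f(R)
    ineq1 = f(Q) >= fR and f(P) >= fR   # R is an eps-minimum on BD
    ineq2 = f(M) >= fR and f(N) >= fR   # R is an eps-minimum on AC
    if ineq1 and ineq2:
        return 1
    if ineq1:
        return 2
    if ineq2:
        return 3
    return 4
```

Only five function values are needed per classification, and no derivatives are used, which is the point of the method.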
Case 4. Suppose that neither inequality (1) nor inequality (2) is satisfied. Then there exist two points (say, M and Q) such that
$$f(M) < f(R), \quad f(Q) < f(R).$$
It follows from the convexity of f that
$$f(Z) > f(R) \quad \forall Z \in DRC.$$
Let us draw the line VW which passes through the point R and is parallel to the line DC. On the interval VW let us choose two points G and H at a distance $\varepsilon$ from R. If $f(H) \ge f(R)$ and $f(G) \ge f(R)$ then R is (within $\varepsilon$-accuracy) a minimum point of the function f(Z) on the line VW (see [2]) and since f(M) < f(R) then
$$f(Z) > f(R) \quad \forall Z \in VWCD.$$
This case was also discussed in [1]. The case left to be discussed is the one where either f(H) < f(R) or f(G) < f(R).
At this point our method diverges from the method described in [1]. We will suggest two modifications of this method. For the sake of argument assume that f(H) < f(R).
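The construction of the chord VW and of the probe points G and H can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names are hypothetical, V is taken to lie on AD and W on BC, and G is placed towards V and H towards W (the text does not state which probe lies on which side).

```python
import math

def cross(p, q):
    """2-D cross product p x q."""
    return p[0] * q[1] - p[1] * q[0]

def line_intersect(P, d, Q, e):
    """Point where the line P + t*d meets the line Q + s*e (assumed non-parallel)."""
    t = cross((Q[0] - P[0], Q[1] - P[1]), e) / cross(d, e)
    return (P[0] + t * d[0], P[1] + t * d[1])

def chord_parallel_to_DC(R, A, B, C, D):
    """End points V (on AD) and W (on BC) of the chord through R parallel to DC."""
    d = (C[0] - D[0], C[1] - D[1])
    V = line_intersect(R, d, A, (D[0] - A[0], D[1] - A[1]))
    W = line_intersect(R, d, B, (C[0] - B[0], C[1] - B[1]))
    return V, W

def probes_on_chord(R, V, W, eps):
    """Probe points at distance eps from R: G towards V, H towards W (assumed labelling)."""
    def step(X):
        n = math.hypot(X[0] - R[0], X[1] - R[1])
        return (R[0] + eps * (X[0] - R[0]) / n, R[1] + eps * (X[1] - R[1]) / n)
    return step(V), step(W)
```

The same `line_intersect` helper serves for every later chord (FF1, KL), since each is a line through R parallel to a segment that is already known.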
1. First modification. It is assumed that
Then (see Figure 1)
$$f(Z) > f(R) \quad \forall Z \in VRCD.$$
Moreover,
$$f(Z) > f(R) \quad \forall Z \in VCD.$$
Let us draw the line $FF_1$ which passes through the point R and is parallel to the line VC. On the interval $FF_1$ let us choose two points T and S at a distance $\varepsilon$ from R. If $f(T) \ge f(R)$ and $f(S) \ge f(R)$ then R is (within $\varepsilon$-accuracy) a minimum point of f on $FF_1$ and
$$f(Z) > f(R) \quad \forall Z \in FF_1CD.$$
If f(S) < f(R) then
$$f(Z) > f(R) \quad \forall Z \in FRCD$$
and furthermore
$$f(Z) > f(R) \quad \forall Z \in FCD.$$
As a result we get the quadrilateral ABCF which contains a minimum point of the function f. Let us compute the ratio of the areas of the quadrilaterals ABCF and ABCD.
Assume that
Let h be the height of the triangle ABC. Then
Here $S_{ABC}$ is the area of the triangle ABC. We have
Let us define $h_2$. Since
we have
This leads to
$$S_{FVC} = S_{VRC} = \frac{1}{2}\, RC \cdot h_2 = \frac{a\, a_1}{2(1+a_1)^2}\, AC \cdot h.$$
Hence, the ratio of the area of the quadrilateral ABCF to the area of the quadrilateral ABCD is
Since
$$\frac{a_1(2+a_1)}{(1+a_1)^2} \ge \frac{a(2+a)}{(1+a)^2} \quad \text{if } a_1 \ge a,$$
this result implies
If we decrease the uncertainty area as shown in Figure 2, similar arguments lead us again to (4). If at some step it turns out that $RD/RB = a \le a_0$ (where $a_0$ will be defined later) then we draw a line passing through D and parallel to AC, and then extend AB and BC until they intersect this line (see Figure 3). Instead of the quadrilateral ABCD let us take the triangle $A_1BC_1$. In the case of a quadrilateral we had four lines passing through R. In the case of a triangle we take the point of intersection of its medians (the point $R_1$) instead of R.
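The area ratio for the first modification can be checked numerically. The sketch below rests on assumptions that are not legible in the text: it takes $a = RD/RB$ and $a_1 = RC/RA$, places R at the origin, and compares the ratio measured with the shoelace formula against the closed form $1 - \frac{a\,a_1(2+a_1)}{(1+a)(1+a_1)^2}$, which is a reconstruction from the geometry rather than a formula quoted from the paper. All function names are hypothetical.

```python
def cross(p, q):
    """2-D cross product p x q."""
    return p[0] * q[1] - p[1] * q[0]

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

def line_intersect(P, d, Q, e):
    """Point where the line P + t*d meets the line Q + s*e (assumed non-parallel)."""
    t = cross(sub(Q, P), e) / cross(d, e)
    return (P[0] + t * d[0], P[1] + t * d[1])

def shoelace(poly):
    """Area of a simple polygon given as a list of vertices."""
    s = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def measured_ratio(a, a1):
    """Area(ABCF) / Area(ABCD) for a model quadrilateral with RD/RB = a, RC/RA = a1."""
    R = (0.0, 0.0)
    A, C = (-1.0, 0.0), (a1, 0.0)
    B, D = (0.0, 1.0), (0.0, -a)
    AD = sub(D, A)
    V = line_intersect(R, sub(C, D), A, AD)   # VW parallel to DC, V on AD
    F = line_intersect(R, sub(C, V), A, AD)   # FF1 parallel to VC, F on AD
    return shoelace([A, B, C, F]) / shoelace([A, B, C, D])

def closed_form_ratio(a, a1):
    """Reconstructed closed form for the same ratio (an assumption, see above)."""
    return 1.0 - a * a1 * (2.0 + a1) / ((1.0 + a) * (1.0 + a1) ** 2)
```

For the square ($a = a_1 = 1$) both functions give 0.625, so one step of this kind removes three eighths of the uncertainty area.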
Fig. 4 Fig. 3
If a minimum point of f is not contained in the quadrilateral $KBFR_1$ (Fig. 3) then we draw the line VW passing through $R_1$ and parallel to the line $A_1C_1$. On the interval VW let us choose two points G and H at a distance $\varepsilon$ from $R_1$. If
$$f(G) \ge f(R_1) \quad \text{and} \quad f(H) \ge f(R_1)$$
then $R_1$ is (within $\varepsilon$-accuracy) a minimum point of f on VW and
$$f(Z) > f(R_1) \quad \forall Z \in VBW.$$
Consider the case $f(H) < f(R_1)$. Then we conclude that
and furthermore,
$$f(Z) > f(R_1) \quad \forall Z \in VBF.$$
Thus, we have a new quadrilateral $A_1VFC_1$ which contains a minimum point.
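The triangle step can be illustrated with a small area check. This is a hedged sketch: it uses the standard fact that a line through the centroid parallel to one side meets the other two sides two thirds of the way from the opposite vertex, and it assumes (the figure is not available) that F is the midpoint of $BC_1$. The function names are hypothetical.

```python
def shoelace(poly):
    """Area of a simple polygon given as a list of vertices."""
    s = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def along(P, Q, t):
    """Point P + t*(Q - P)."""
    return (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))

def centroid_construction(A1, B, C1):
    """Return (R1, V, W, F) for the triangle A1 B C1."""
    R1 = ((A1[0] + B[0] + C1[0]) / 3.0, (A1[1] + B[1] + C1[1]) / 3.0)
    V = along(B, A1, 2.0 / 3.0)   # chord through R1 parallel to A1C1 meets BA1 here
    W = along(B, C1, 2.0 / 3.0)   # and meets BC1 here
    F = along(B, C1, 0.5)         # assumption: F is the midpoint of BC1
    return R1, V, W, F
```

Under these assumptions the triangle VBW is four ninths of $A_1BC_1$ and the quadrilateral $A_1VFC_1$ is two thirds of it, independently of the shape of the triangle.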
Let us define the ratio of the area of the quadrilateral $A_1VFC_1$ and the quadrilateral ABCD. Let h be the height of the triangle ABC. We have
Hence,
$$S_{A_1VFC_1} = \frac{1}{3}(1+a)\, A_1C_1 \cdot h$$
and
Let us consider the case where the triangle $A_1R_1C_1$ (see Fig. 4) does not contain a minimum point of f. Let us draw the line VW passing through the point $R_1$ and parallel to the line $A_1C_1$, and argue as above. Let $VBC_1$ be a triangle which contains a minimum point of f. We get
and the ratio of the area of the new triangle $VBC_1$ and the quadrilateral ABCD is $\frac{2}{3}(1+a)$, i.e. (5) holds again. If $a \le a_0 \approx 0.335$, then we must construct a triangle since it guarantees a greater decrease in the uncertainty area. The quantity $a_0$ is then a solution of the equation
The convergence of this modification of the method from [1] is geometric with the rate
Fig. 5 Fig. 6
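The threshold $a_0 \approx 0.335$ quoted for the first modification can be reproduced numerically under two assumptions, neither of which is legible in the text: that the quadrilateral estimate with $a_1 = a$ is $1 - \frac{a^2(2+a)}{(1+a)^3}$ and that the triangle estimate is $\frac{2}{3}(1+a)$. The sketch below solves for the value of a at which the two estimates coincide; the function names are hypothetical.

```python
def quadrilateral_bound(a):
    """Assumed area-ratio estimate when the quadrilateral is kept (a1 = a)."""
    return 1.0 - a * a * (2.0 + a) / (1.0 + a) ** 3

def triangle_bound(a):
    """Assumed area-ratio estimate when the quadrilateral is enlarged to a triangle."""
    return 2.0 * (1.0 + a) / 3.0

def solve_a0(lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for the root of quadrilateral_bound(a) - triangle_bound(a) on [lo, hi]."""
    def g(a):
        return quadrilateral_bound(a) - triangle_bound(a)
    # g is positive at lo = 0 and negative at hi = 1, and decreasing in between.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Below the root the triangle estimate is the smaller of the two, which agrees with the statement that a triangle must be constructed when a does not exceed $a_0$.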
2. Second modification. Let us again (see Fig. 5) assume that f(M) < f(R). Then
$$f(Z) > f(R) \quad \forall Z \in VRCD.$$
Furthermore,
$$f(Z) > f(R) \quad \forall Z \in VCD.$$
Let us draw the line $FF_1$ passing through R and parallel to the line VC. On the interval $FF_1$ let us choose two points T and S at a distance $\varepsilon$ from R. If $f(T) \ge f(R)$ and $f(S) \ge f(R)$ then R is (within $\varepsilon$-accuracy) a minimum point of f on $FF_1$ and
Let
Then
$$f(Z) > f(R) \quad \forall Z \in FRCD$$
and furthermore
Now let us again draw the line KL passing through R and parallel to FC and proceed as above. As a result we get the new quadrilateral ABCK which contains a minimum point of f. Now let us compute the ratio of the areas of the new quadrilateral ABCK and the quadrilateral ABCD. Assume that
Let h be the height of the triangle ABC. It follows from the computations above that
Let us find $h_3$. Since $S_{AFC} = \frac{1}{2}\, AC \cdot h_3$ and
we have
Therefore
a - a , a - a , .I
The ratio of the areas of the new quadrilateral ABCK and the quadrilateral ABCD is
Since
it follows from (6) that
If we decrease the uncertainty area as shown in Fig. 6, we again obtain the same relation (7).
Let (see Fig. 7)
$$f(H) < f(R).$$
Then
$$f(Z) > f(R) \quad \forall Z \in VRCD$$
and furthermore
$$f(Z) > f(R) \quad \forall Z \in VCD.$$
Let us draw the line $FF_1$ passing through the point R and parallel to the line VC. On the interval $FF_1$ let us choose two points T and S at a distance $\varepsilon$ from R. If $f(T) \ge f(R)$ and $f(S) \ge f(R)$ then R is (within $\varepsilon$-accuracy) a minimum point of f on $FF_1$ and
Let
$$f(T) < f(R).$$
Then
Fig. 7 Fig. 8
Let us again draw the line KL passing through R and parallel to the line $VF_1$ and argue as above. As a result we get a new quadrilateral $ABF_1K$ which contains a minimum point of f. Find the ratio of the areas of the quadrilaterals $ABF_1K$ and ABCD.
Assume that
The triangles DRC and ABR are similar since $\frac{DC}{AB} = a$ and DC is parallel to AB. We have
The line VW is parallel to the line DC by construction. Thus, $VW \parallel AB$. The triangles ABD and VRD are also similar since the corresponding angles are equal. Therefore
Analogously the fact that the triangles BCD and BWR are similar implies that
Therefore VR = WR and $\angle ARV = \angle CRW$. We have $VV_1 = WW_1$. The line $FF_1$ is parallel to the line VC by construction. Since the triangles VWC and $RWF_1$ are similar, we have
Hence,
We have
From the computations above it follows that
$$RC = \frac{a_1}{1+a_1}\, AC, \qquad h_2 = \frac{a}{1+a_1}\, h, \qquad S_{VCD} = \frac{a\, a_1}{2(1+a_1)}\, AC \cdot h.$$
Thus,
Then
The ratio of the areas of the new quadrilateral $ABF_1K$ and the quadrilateral ABCD is
(since $a_1 = a$).
If we decrease the uncertainty area as shown in Fig. 8 then we again have (8).
The estimate (8) is worse than (7).
In the case
we always have an estimate better than (8). If at some step
then we enlarge the quadrilateral to a triangle and instead of the quadrilateral ABCD we take the triangle $A_1BC_1$ (Fig. 9).
Fig. 9 Fig. 10
Let $R_1$ be the point of intersection of the medians of the triangle $A_1BC_1$. Let there be no minimum point of f in the quadrilateral $KBFR_1$. Then let us draw the line VW passing through the point $R_1$ and parallel to the line $A_1C_1$. On the interval VW choose two points G and H at a distance $\varepsilon$ from $R_1$. If $f(G) \ge f(R_1)$ and $f(H) \ge f(R_1)$ then $R_1$ is (within $\varepsilon$-accuracy) a minimum point of f on VW and
$$f(Z) > f(R_1) \quad \forall Z \in VBW.$$
In the case $f(H) < f(R_1)$ we have
and moreover
Let us draw the line $V_1F_1$ passing through the point $R_1$ and parallel to the line VF, and argue analogously. Let a quadrilateral $A_1V_1F_1C_1$ be obtained which contains a minimum point of f. Let h be the height of the triangle ABC. We have
$$S_{VBF} = \frac{1}{6}(1+a)\, A_1C_1 \cdot h, \qquad S_{VFR_1} = \frac{1}{36}(1+a)\, A_1C_1 \cdot h.$$
The ratio of the new quadrilateral $A_1V_1F_1C_1$ and the quadrilateral ABCD is
If we decrease the triangle as shown in Fig. 10, then the ratio of the areas of the new triangle $FBC_1$ and the quadrilateral ABCD is
The estimate (9) is worse than the estimate (10).
If
then it is necessary to construct a triangle. The quantity $a_0$ is a solution of the equation
This modification of the method displays geometric convergence with a rate $q \approx 0.8425$.
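To give a feeling for what the improved rate means, the following sketch counts how many steps are needed to shrink the uncertainty area by a given factor. It assumes that each quoted rate q acts as a per-step bound on the area ratio; the function name is hypothetical and the cost per step (number of function evaluations) is not modelled.

```python
import math

def steps_needed(q, shrink):
    """Smallest n with q**n <= shrink, for 0 < q < 1 and 0 < shrink < 1."""
    n = math.ceil(math.log(shrink) / math.log(q))
    # Correct for floating-point rounding at the boundary in either direction.
    while q ** n > shrink:
        n += 1
    while n > 0 and q ** (n - 1) <= shrink:
        n -= 1
    return n

# Rates quoted in the text: 0.9543 for the method of [1], 0.8425 for the second modification.
steps_original = steps_needed(0.9543, 1e-6)
steps_modified = steps_needed(0.8425, 1e-6)
```

Because the step count scales like $1/\log(1/q)$, the second modification needs fewer than a third as many steps as the original method for the same reduction of the uncertainty area.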
1. V.F. Demyanov. "On minimizing a convex function on a plane", Zh. Vychisl. Mat. Mat. Fiz. 16(1) (1976) 247-251.
2. D.J. Wilde. Optimum Seeking Methods. Prentice-Hall Intern. Series in the Physical