Analogues of line search techniques - Stochastic Quasigradient Methods and their Implications

The decision as to whether (and how) to change the step size may be based on the values of the scalar product of adjacent step directions. If we have

$(v^{s-1}, v^s) > 0$,

then this may be a sign that regular behavior prevails over stochastic behavior, the function is decreasing in the step direction and the step size should be increased. Due to stochastic effects the function will very often increase rather than decrease, but in the long run the number of bad choices will be less than the number of correct decisions. Analogously, if this inequality does not hold then the step size should be decreased. The rule for changing the step size is thus basically as follows:

where the values of $a_1$, $a_2$, $a_3$ (recommended values: $a_1$ between 0.4 and 0.8, $1 < a_2 < 1.3$ and $0.7 \le a_3 < 1$) should be chosen before starting the iterations. It is also advisable to have upper and lower bounds on the step size to avoid divergence.
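The idea of growing or shrinking the step according to the sign of the scalar product of adjacent step directions can be sketched as follows. This is only an illustration under stated assumptions, not the package's implementation: the function name `update_step_size`, the default factors, and the bounds `rho_min` and `rho_max` are hypothetical, and the parameter $a_1$ is left out of the sketch.

```python
def update_step_size(rho, v_prev, v_curr, a2=1.2, a3=0.8,
                     rho_min=1e-6, rho_max=10.0):
    """Grow the step if adjacent directions agree, shrink it otherwise.

    a2 (> 1) and a3 (< 1) are illustrative values inside the recommended
    ranges; rho_min and rho_max are hypothetical safeguards against
    divergence and against a vanishing step.
    """
    # Scalar product of the previous and current step directions.
    dot = sum(p * c for p, c in zip(v_prev, v_curr))
    factor = a2 if dot > 0 else a3
    # Clamp the new step size to the allowed interval.
    return min(rho_max, max(rho_min, rho * factor))
```

Calling `update_step_size(1.0, [1, 0], [1, 1])` enlarges the step because the two directions form an acute angle, while opposite directions reduce it.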

Sometimes it is convenient to normalize the vectors of step directions, i.e., $\|v^s\| = 1$. The lower bound may decrease as the iterations proceed. This method may also be applied to the choice of a vector step size, treating some (or all) variables or groups of variables separately. A number of different methods based on the use of scalar products of adjacent step directions to control the step size have been developed by Uriasiev [19], Pflug [16], and Ruszczynski and Syski [20].
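A vector step size with normalized directions might look like the sketch below. It is an assumption-laden illustration: the names `normalize` and `update_vector_step`, the default factors, and the bounds are hypothetical, and each variable is treated as its own group, so the scalar product reduces to a product of single components.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(c * c for c in v))
    return list(v) if norm == 0.0 else [c / norm for c in v]

def update_vector_step(rho, v_prev, v_curr, a2=1.2, a3=0.8,
                       rho_min=1e-6, rho_max=10.0):
    """Apply the sign test separately to every coordinate.

    rho is a list with one step size per variable; the factors and the
    bounds are illustrative values, not those of the package.
    """
    new_rho = []
    for r, p, c in zip(rho, v_prev, v_curr):
        factor = a2 if p * c > 0 else a3
        new_rho.append(min(rho_max, max(rho_min, r * factor)))
    return new_rho
```

Grouping several variables together would replace the single product `p * c` by the scalar product over the components of that group.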

The interactive stochastic optimization package implemented at IIASA (STO) is based on the same ideas as the package for stochastic and nondifferentiable optimization developed in Kiev (NDO). It allows the user to choose between interactive and automatic modes and makes available the stochastic quasigradient methods described in Sections 2 and 3. In the interactive mode the program offers the user the opportunity to change the step parameters and the methods by which the step size and step direction are chosen during the course of the iterations. The user can also stop the iterative process and obtain a more precise estimate of the value of the objective function before continuing. The package is written in FORTRAN-77.

Before initiating the optimization process the user has to:

(i) Provide a subroutine UF which calculates the value of the function $f(x, \omega)$ for fixed $x$ and $\omega$ and, optionally, a subroutine UG which computes the gradient $f_x(x, \omega)$ of this function; the function evaluation subroutine should be of the form:

      FUNCTION UF(N,X)
      DIMENSION X(N)
C     Calculation of f(x,w)
      RETURN
      END

Here N is the dimension of the vector of variables X. (Note that the implementation on the IIASA VAX actually requires the subroutine to be entered in lower-case letters rather than capitals.) A description of the subroutine which calculates a quasigradient is given later in this paper.

(ii) Compile these subroutines with the source code to obtain an executable module.

(iii) Provide at least one of the following additional data files:

- algorithm control file (used only in the non-interactive option)
- parameter file (used only in the interactive option)
- initial data file (should always be present)

All of these files are described in some detail later in the paper.

The optimization process can then begin. The program first asks the user a series of questions regarding the required mode (interactive or automatic), method of step size regulation, choice of step direction, etc. These questions appear on the monitor and should be answered from the keyboard or by reference to a data file. We shall represent the dialogue as follows:

    Question?    *Answer*

with the user's response given in italics. The first question is

    Interactive mode? reply yes or no    *yes/no*

To choose the interactive option the user should type in yes (or y); to select the automatic option he should answer no (or n). In the latter case the program would ask no further questions, but would read all the necessary information from the algorithm control file (which is usually numbered 2; under UNIX conventions its name is fort.2). The iterative process would then begin, terminating after 10,000 iterations if no other stopping criterion is fulfilled. The algorithm control file must contain answers to all of the following questions except those concerned either with dialogue during the iterations or with the parameter file (such questions are marked with an asterisk below). This file is given a name only for ease of reference; the important thing for the user is its number.

Assume now that the user has chosen the interactive option by answering yes to the first question. The program then asks

    parameter file?    *(number)*

The user should respond either with the number of the file of default parameters or with the number of the file in which the current values of the algorithm parameters are stored. The file of default parameters is provided with the program and has the name fort.12 (under UNIX conventions); thus, to refer the program to the default file the user should answer 12. The purpose of this file is to help the user to set the values of algorithm parameters in the ensuing dialogue and also to store such improved values as may be discovered by the user through trial and error. If the user assigns the algorithm parameters any values other than those in the default file, the new values become the default values in subsequent runs of the program. This file is optional.

The program then asks

    read parameter file? reply yes or no    *yes/no*

The answer yes implies that the file specified in the previous question exists, and that default parameter values are stored in this file. In this case, when asking the user about parameter values, the program will read the default option in the parameter file and reproduce it on the screen together with the question. If the user accepts this default value he should respond with 0 (zero); otherwise he should enter his own value, which will become the new default value.

The answer no means that no default values are available at the moment. In this case the program will form a new default file (labeled with the number given as an answer to the previous question); its contents will be based on the user's answers to future questions. This new default file, once formed, can be used in subsequent runs.
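The convention for default values (a reply of 0 accepts the stored default, any other reply replaces it) can be illustrated with a short sketch. The function `resolve_parameter` and the dictionary standing in for the parameter file are hypothetical; the actual file layout used by the package is not assumed here.

```python
def resolve_parameter(name, reply, defaults):
    """Return the value to use for `name` and update `defaults` in place.

    A reply of 0 accepts the stored default; any other reply is used as
    the parameter value and becomes the new default for later runs.
    """
    if reply == 0:
        if name not in defaults:
            raise KeyError(f"no default stored for {name!r}")
        return defaults[name]
    defaults[name] = reply
    return reply
```

For example, with `defaults = {"a2": 1.2}`, a reply of 0 returns 1.2, whereas a reply of 1.1 returns 1.1 and overwrites the stored value.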

The next question is

    number of variables?    *(number)*

to which the user should respond with the dimension of the vector of variables $x$. He is then asked

    Initial data file?    *(number)*

and should reply with the number of the initial data file. This file should contain the following elements (in exactly this order):

- The initial point, which should be a sequence of numbers separated by commas or other delimiters.
- Any additional data required by subroutines UF or UG, if such data exists and the user chooses to put it in the initial data file (optional).
- Information about the constraints (described in more detail below).

The program then asks

    step size regulation?    *is*

Here *is* is a positive integer from the set {1,2,3,4,6,7}, where the different values of *is* correspond to different ways of choosing the step size. (The integer 5 is reserved for an option currently under development.)

1 Adaptive automatic step size regulation (24) based on algorithm performance function (22) and function estimate (18).

2 Manual step size regulation based on algorithm performance function (22) and function estimate (18).

3 Adaptive automatic step size regulation (24) using algorithm performance measure (22) and a function estimate based on a finite number of

1 A direct observation of a stochastic quasigradient is available, and the user has to specify a subroutine UG to calculate it:

where C(N) is an observation of a stochastic quasigradient.

2 Central finite-difference approximation of the gradient as in (11).

3 The $\xi^s$ are calculated using random search techniques (12).

4 Forward finite-difference approximation of the initial observations $\xi^s$ as in (10).

5 Central finite-difference approximation of the gradient as in (11). All observations of the function used in one observation of $\xi^s$ are made with the same values of random parameters $\omega$.

6 The $\xi^s$ are calculated using random search techniques (12). All observations of the function used in one observation of $\xi^s$ are made with the same values of random parameters $\omega$.

7 Forward finite-difference approximation of the initial observations $\xi^s$ as in (10). All observations of the function used in one observation of $\xi^s$ are made with the same values of random parameters $\omega$.
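The three kinds of approximation listed above can be sketched in their textbook forms. The exact expressions (10)-(12) are not reproduced here, so the formulas below are standard stand-ins rather than the package's own. The function names, the argument `delta` for the approximation step, and the explicit `omega` argument are assumptions made for the illustration. Passing one `omega` to every evaluation corresponds to the variants in which all observations share the same random parameters.

```python
import random

def forward_difference(f, x, delta, omega):
    """Forward differences: one base observation plus one per coordinate."""
    base = f(x, omega)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += delta
        grad.append((f(shifted, omega) - base) / delta)
    return grad

def central_difference(f, x, delta, omega):
    """Central differences: two observations per coordinate."""
    grad = []
    for i in range(len(x)):
        plus, minus = list(x), list(x)
        plus[i] += delta
        minus[i] -= delta
        grad.append((f(plus, omega) - f(minus, omega)) / (2.0 * delta))
    return grad

def random_search(f, x, delta, omega, rng):
    """Difference quotient along one random direction h, scaled by h."""
    h = [rng.uniform(-1.0, 1.0) for _ in x]
    shifted = [xi + delta * hi for xi, hi in zip(x, h)]
    slope = (f(shifted, omega) - f(x, omega)) / delta
    return [slope * hi for hi in h]
```

The forward and central schemes need a number of function observations proportional to the dimension, while the random search estimate uses only two observations per direction, at the price of a noisier estimate.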

Note that for id1 = 5, 6, 7 all observations of the function used in one observation of $\xi^s$ are made with the same values of random parameters $\omega$. In this case the user should write a function UF which supports this feature as follows:

      FUNCTION UF(N,X)
      DIMENSION X(N)
      COMMON/OMEG/LO,MO
C     If LO=1 and MO=1 then obtain new values of random
C     factors w and set MO=0.
C     Make an observation of the function at point x.
      RETURN
      END
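The role of the flags LO and MO can be illustrated with a small sketch. It is a hypothetical Python stand-in, not the FORTRAN routine itself: the class `SharedOmega`, the Gaussian random factor, and the quadratic objective are assumptions, as is the behavior when LO differs from 1 (a fresh draw on every call).

```python
import random

class SharedOmega:
    """Python stand-in for the flags kept in COMMON/OMEG/ (LO, MO)."""
    def __init__(self, seed=0):
        self.lo = 1       # 1: reuse random factors within one estimate
        self.mo = 1       # 1: a fresh draw is due on the next call
        self.omega = 0.0
        self.rng = random.Random(seed)

def uf(x, state):
    """Observe f(x, omega); redraw omega only when both flags are set."""
    if state.lo == 1 and state.mo == 1:
        state.omega = state.rng.gauss(0.0, 1.0)
        state.mo = 0      # later calls reuse omega until MO is reset
    elif state.lo != 1:
        # Assumption: without the shared-omega option every call redraws.
        state.omega = state.rng.gauss(0.0, 1.0)
    # Hypothetical objective: a quadratic shifted by the random factor.
    return sum((xi - state.omega) ** 2 for xi in x)
```

The calling code would set `state.mo = 1` before each new quasigradient observation, so that all function values inside one finite-difference estimate share a single draw of the random factor.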

The second figure id2 determines the point at which observations are made:

id2 Definition

1 The initial direction is calculated at the current point $x^s$.

2 The initial direction is calculated at a point chosen randomly from among those in the neighborhood of the current point $x^s$.

The value of id3 defines the way in which the step in a finite-difference or random search approximation of $\xi^s$ is chosen:

id3 Definition

1 The approximation step is fixed. The observations of the objective function at point $x^s$ originally used to obtain gradient observations $\xi^s$ are not used to update the estimate of the function employed for step size regulation.

2 The ratio $\delta_s/\rho_s$ of the step in the finite-difference approximation to the step size of the algorithm is fixed (see (10)-(12)). The observations of the objective function at point $x^s$ originally used to obtain gradient observations $\xi^s$ are not used to update the estimate of the function employed for step size regulation.

3 The approximation step is fixed. The observations described for id3 = 1,2 above are used to update the current estimate of the objective function.

4 The ratio $\delta_s/\rho_s$ of the step in the finite-difference approximation to the step size of the algorithm is fixed (see (10)-(12)). The observations described for id3 = 1,2 above are used to update the current estimate of the objective function.

The fourth figure id4 defines the type of averaging applied to the observations $\xi^s$.

id4 Definition

1 No averaging.

2 Number of samples >

The value of id5 specifies the way in which the final step direction $v^s$ is obtained from previous values of $v^s$ and from $\xi^s$.

id5 Definition

1 No previous information is used. The final vector $v^s$ is simply set equal to $\xi^s$.

2 (9) is used.
