The decision as to whether (and how) to change the step size may be based on the values of the scalar product of adjacent step directions. If we have $(\xi^{s-1}, \xi^{s}) > 0$, then this may be a sign that regular behavior prevails over stochastic behavior, the function is decreasing in the step direction and the step size should be increased. Due to stochastic effects the function will very often increase rather than decrease, but in the long run the number of bad choices will be less than the number of correct decisions. Analogously, if this inequality does not hold then the step size should be decreased. The rule for changing the step size is thus basically as follows:

[displayed rule not recoverable from the source]

where the values of $a_1$, $a_2$, $a_3$ (recommended values $a_1 \approx 0.4$-$0.8$, $1 < a_2 < 1.3$ and $0.7 \le a_3 < 1$) should be chosen before starting the iterations. It is also advisable to have upper and lower bounds on the step size to avoid divergence. Sometimes it is convenient to normalize the vectors of step directions, i.e., $\|\xi^{s}\| = 1$. The lower bound may decrease as the iterations proceed. This method may also be applied to the choice of a vector step size, treating some (or all) variables or groups of variables separately. A number of different methods based on the use of scalar products of adjacent step directions to control the step size have been developed by Uriasiev [19], Pflug [16], and Ruszczynski and Syski [20].
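As an illustration only, the sign test on the scalar product of adjacent step directions can be sketched in Python. The sketch assumes that the step size is multiplied by a2 when the product is positive and by a3 otherwise, and that the result is clipped to fixed bounds. The function name, the default values and the bounds rho_min and rho_max are hypothetical, and the parameter a1 is left out because its exact role is not spelled out in the text above.

```python
def update_step_size(rho, prev_dir, cur_dir, a2=1.2, a3=0.8,
                     rho_min=1e-6, rho_max=10.0):
    """Grow the step when adjacent directions agree, shrink it otherwise.

    Assumed rule (not taken from the paper): multiply by a2 > 1 if the
    scalar product is positive, by a3 < 1 if it is not.
    """
    # Scalar product of the two adjacent step directions.
    agreement = sum(p * c for p, c in zip(prev_dir, cur_dir))
    factor = a2 if agreement > 0.0 else a3
    # Upper and lower bounds keep the step size from diverging or vanishing.
    return min(rho_max, max(rho_min, rho * factor))
```

With these hypothetical defaults, two directions pointing the same way enlarge a unit step to 1.2, and two opposite directions reduce it to 0.8.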
The interactive stochastic optimization package implemented at IIASA (STO) is based on the same ideas as the package for stochastic and nondifferentiable optimization developed in Kiev (NDO). It allows the user to choose between interactive and automatic modes and makes available the stochastic quasigradient methods described in Sections 2 and 3. In the interactive mode the program offers the user the opportunity to change the step parameters and the methods by which the step size and step direction are chosen during the course of the iterations. The user can also stop the iterative process and obtain a more precise estimate of the value of the objective function before continuing. The package is written in FORTRAN-77.
Before initiating the optimization process the user has to:
(i) Provide a subroutine UF which calculates the value of the function $f(x,\omega)$ for fixed $x$ and $\omega$ and, optionally, a subroutine UG which computes the gradient $f_x(x,\omega)$ of this function; the function evaluation subroutine should be of the form:

      FUNCTION UF(N,X)
      DIMENSION X(N)
C     Calculation of f(x,omega); the result is assigned to UF
      RETURN
      END

Here N is the dimension of the vector of variables X. (Note that the implementation on the IIASA VAX actually requires the subroutine to be entered in lower-case letters rather than capitals.) A description of the subroutine which calculates a quasigradient is given later in this paper.
(ii) Compile these subroutines with the source code to obtain an executable module.
(iii) Provide at least one of the following additional data files:
- algorithm control file (used only in the non-interactive option)
- parameter file (used only in the interactive option)
- initial data file (should always be present)
All of these files are described in some detail later in the paper.
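To make the roles of the two user-supplied subroutines concrete, here is a minimal Python analogue. It is only a sketch: the test function $f(x,\omega) = \sum_i (x_i - \omega_i)^2$ with standard normal $\omega$, the lower-case names uf and ug, and the fixed seed are all assumptions chosen for illustration, and the actual package expects FORTRAN-77 routines.

```python
import random

_rng = random.Random(0)  # fixed seed, only to make the sketch reproducible


def uf(n, x):
    """Analogue of UF: one noisy observation f(x, omega) at the point x."""
    omega = [_rng.gauss(0.0, 1.0) for _ in range(n)]  # draw the random factors
    return sum((x[i] - omega[i]) ** 2 for i in range(n))  # hypothetical f


def ug(n, x):
    """Analogue of the optional UG: the gradient f_x(x, omega) for a fresh omega."""
    omega = [_rng.gauss(0.0, 1.0) for _ in range(n)]
    return [2.0 * (x[i] - omega[i]) for i in range(n)]
```

For this hypothetical function the expected value of f at x = 0 equals the dimension n, so averaging many calls of uf(1, [0.0]) should give a number close to 1.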
The optimization process can then begin. The program first asks the user a series of questions regarding the required mode (interactive or automatic), method of step size regulation, choice of step direction, etc. These questions appear on the monitor and should be answered from the keyboard or by reference to a data file. We shall represent the dialogue as follows:
    Question?    Answer
with the user's response given in italics. The first question is
    Interactive mode? reply yes or no    yes/no
To choose the interactive option the user should type in yes (or y); to select the automatic option he should answer no (or n). In the latter case the program would ask no further questions, but would read all the necessary information from the algorithm control file (which is usually numbered 2; under UNIX conventions its name is fort.2). The iterative process would then begin, terminating after 10,000 iterations if no other stopping criterion is fulfilled. The algorithm control file must contain answers to all of the following questions except those concerned either with dialogue during the iterations or with the parameter file (such questions are marked with an asterisk below). This file is given a name only for ease of reference; the important thing for the user is its number.
Assume now that the user has chosen the interactive option by answering yes to the first question. The program then asks
    parameter file?    (number)
The user should respond either with the number of the file of default parameters or with the number of the file in which the current values of the algorithm parameters are stored. The file of default parameters is provided with the program and has the name fort.12 (under UNIX conventions); thus, to refer the program to the default file the user should answer 12. The purpose of this file is to help the user to set the values of algorithm parameters in the ensuing dialogue and also to store such improved values as may be discovered by the user through trial and error. If the user assigns the algorithm parameters any values other than those in the default file, the new values become the default values in subsequent runs of the program. This file is optional.
The program then asks
    read parameter file? reply yes or no    yes/no
The answer yes implies that the file specified in the previous question exists, and that default parameter values are stored in this file. In this case, when asking the user about parameter values, the program will read the default option in the parameter file and reproduce it on the screen together with the question. If the user accepts this default value he should respond with 0 (zero); otherwise he should enter his own value, which will become the new default value.
The answer no means that no default values are available at the moment. In this case the program will form a new default file (labeled with the number given as an answer to the previous question); its contents will be based on the user's answers to future questions. This new default file, once formed, can be used in subsequent runs.
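The convention for accepting or overriding defaults can be sketched as follows. This is a hypothetical Python model, not the package's code: the dictionary stands in for the parameter file, and the function name and the parameter names used in the example are assumptions.

```python
def resolve_parameter(defaults, name, response):
    """Return the value to use for `name` and keep the stored defaults current.

    A response of 0 accepts the stored default, if there is one; any other
    response is used as given and becomes the new default.
    """
    if response == 0 and name in defaults:
        return defaults[name]
    defaults[name] = response  # the user's own value replaces the default
    return response
```

For example, with defaults = {"a2": 1.2}, the response 0 returns 1.2, while the response 1.1 returns 1.1 and stores it for later runs. One consequence of this convention, at least in the sketch, is that a parameter cannot be set to exactly zero once a default exists.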
The next question is
    number of variables?    (number)
to which the user should respond with the dimension of the vector of variables $x$. He is then asked
    Initial data file?    (number)
and should reply with the number of the initial data file. This file should contain the following elements (in exactly this order):
- The initial point, which should be a sequence of numbers separated by commas or other delimiters.
- Any additional data required by subroutines UF or UG if such data exists and the user chooses to put it in the initial data file (optional).
- Information about the constraints (described in more detail below).
The program then asks
    step size regulation?    is
Here the value is must be a positive integer from the set {1,2,3,4,6,7}, where the different values of is correspond to different ways of choosing the step size. (The integer 5 is reserved for an option currently under development.)
1  Adaptive automatic step size regulation (24) based on algorithm performance function (22) and function estimate (18).
2  Manual step size regulation based on algorithm performance function (22) and function estimate (18).
3  Adaptive automatic step size regulation (24) using algorithm performance measure (22) and a function estimate based on a finite number of
1  A direct observation of a stochastic quasigradient is available, and the user has to specify a subroutine UG to calculate it; in that subroutine C(N) is an observation of a stochastic quasigradient.
2  Central finite-difference approximation of the gradient as in (11).
3  The $\xi^{s}$ are calculated using random search techniques (12).
4  Forward finite-difference approximation of the initial observations $\xi^{s}$ as in (10).
5  Central finite-difference approximation of the gradient as in (11). All observations of the function used in one observation of $\xi^{s}$ are made with the same values of random parameters $\omega$.
6  The $\xi^{s}$ are calculated using random search techniques (12). All observations of the function used in one observation of $\xi^{s}$ are made with the same values of random parameters $\omega$.
7  Forward finite-difference approximation of the initial observations $\xi^{s}$ as in (10). All observations of the function used in one observation of $\xi^{s}$ are made with the same values of random parameters $\omega$.
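The forward and central finite-difference options can be illustrated with a short Python sketch. Formulas (10) and (11) lie outside this section, so the textbook coordinate-wise forms are assumed here; the function names and the argument delta (the approximation step) are hypothetical.

```python
def forward_difference(f, x, delta):
    """Forward differences: (f(x + delta*e_i) - f(x)) / delta for each i."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += delta
        grad.append((f(shifted) - fx) / delta)
    return grad


def central_difference(f, x, delta):
    """Central differences: (f(x + delta*e_i) - f(x - delta*e_i)) / (2*delta)."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += delta
        down[i] -= delta
        grad.append((f(up) - f(down)) / (2.0 * delta))
    return grad
```

The forward form needs n + 1 function observations per gradient estimate and the central form 2n. When f is a noisy observation, the difference quotients amplify the noise by a factor of order 1/delta unless all observations share the same random parameters.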
Note that for id1 = 5, 6, 7 all observations of the function used in one observation of $\xi^{s}$ are made with the same values of random parameters $\omega$. In this case the user should write a function UF which supports this feature as follows:

      FUNCTION UF(N,X)
      DIMENSION X(N)
      COMMON/OMEG/LO,MO
C     If LO=1 and MO=1 then obtain new values of the random
C     factors omega and set MO=0.
C     Make an observation of the function at point x.
      RETURN
      END
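The flag logic in this template can be mimicked in Python as a rough sketch. Several points are assumptions rather than facts from the text: that LO = 0 means a fresh $\omega$ on every call, that the calling program resets MO to 1 before each new gradient observation, and the test function itself. The class and attribute names are hypothetical; in the FORTRAN version the two flags live in the COMMON block OMEG.

```python
import random


class NoisyFunction:
    """Noisy observation f(x, omega) that can reuse omega between calls."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.lo = 0        # analogue of LO: 1 means omega is shared across calls
        self.mo = 1        # analogue of MO: 1 means a new omega is due
        self.omega = None

    def __call__(self, x):
        # Draw new random factors unless sharing is on and omega is current.
        if self.lo != 1 or self.mo == 1 or self.omega is None:
            self.omega = [self.rng.gauss(0.0, 1.0) for _ in x]
            if self.lo == 1:
                self.mo = 0  # later calls reuse omega until MO is reset
        return sum((xi - wi) ** 2 for xi, wi in zip(x, self.omega))
```

With lo = 1, repeated calls at the same point return identical values until mo is set back to 1, which is what lets a finite-difference quotient cancel most of the noise.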
The second figure id2 determines the point at which observations are made:
id2  Definition
1    The initial direction is calculated at the current point $x^{s}$.
2    The initial direction is calculated at a point chosen randomly from among those in the neighborhood of the current point $x^{s}$.
The value of id3 defines the way in which the step in a finite-difference or random search approximation of $\xi^{s}$ is chosen:
id3  Definition
1    The approximation step is fixed. The observations of the objective function at point $x^{s}$ originally used to obtain gradient observations $\xi^{s}$ are not used to update the estimate of the function employed for step size regulation.
2    The ratio $\delta_s/\rho_s$ of the step in the finite-difference approximation to the step size of the algorithm is fixed (see (10)-(12)). The observations of the objective function at point $x^{s}$ originally used to obtain gradient observations $\xi^{s}$ are not used to update the estimate of the function employed for step size regulation.
3    The approximation step is fixed. The observations described for id3 = 1, 2 above are used to update the current estimate of the objective function.
4    The ratio $\delta_s/\rho_s$ of the step in the finite-difference approximation to the step size of the algorithm is fixed (see (10)-(12)). The observations described for id3 = 1, 2 above are used to update the current estimate of the objective function.
The fourth figure id4 defines the type of averaging used to obtain $\xi^{s}$ from observations $\xi^{s,i}$.
id4  Definition
1    No averaging, $\xi^{s} = \xi^{s,i}$, $i = 1$.
2    Number of samples > 1.
The value of id5 specifies the way in which the final step direction $v^{s}$ is obtained from previous values of $v^{s}$ and from $\xi^{s}$.
id5  Definition
1    No previous information is used. The final vector $v^{s}$ is simply set equal to $\xi^{s}$.
2    (9) is used.
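The averaging option id4 = 2 can be pictured with the following Python sketch. It assumes a plain arithmetic mean over independent observations, which the text does not state explicitly; the function name and the callable argument sample are hypothetical.

```python
def averaged_observation(sample, x, num_samples):
    """Average num_samples quasigradient observations taken at the point x.

    `sample(x)` is assumed to return one observation as a list of floats.
    With num_samples == 1 this reduces to the no-averaging case.
    """
    total = [0.0] * len(x)
    for _ in range(num_samples):
        observation = sample(x)
        total = [t + g for t, g in zip(total, observation)]
    return [t / num_samples for t in total]
```

Averaging m independent observations reduces the variance of the estimate by a factor of m, at the cost of m times as many function or gradient evaluations per iteration.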