
(b) possibilities of choosing $\rho_s$ as a function of $(x^0, \ldots, x^s)$ in order to decrease the value of the objective function (6.82). This can be done by using adaptive ways of choosing $\rho_s$ (interactive or automatic), as described in Section 6.6. It leads to different nonlinear estimates, in contrast to the estimate (6.84), which is a linear function of the observations.

Problems of estimating the moments

$$E\theta^t,\quad E|\theta|^t,\quad E(\theta - E\theta)^t$$

may also be formulated as minimization problems:

$$F_1^0(x) = E\|x - \theta^t\|^2,\quad F_2^0(x) = E\|x - |\theta|^t\|^2,\quad F_3^0(x) = E\|x - (\theta - E\theta)^t\|^2,$$

where for the sake of simplicity we denote

$$\theta^t = (\theta_1^t, \ldots, \theta_n^t),\quad |\theta|^t = (|\theta_1|^t, \ldots, |\theta_n|^t),\quad (\theta - E\theta)^t = ((\theta_1 - E\theta_1)^t, \ldots, (\theta_n - E\theta_n)^t).$$

The stochastic gradients of these functions are:

$$\xi^1(s) = 2\bigl(x^s - (\theta^{s+1})^t\bigr),\quad \xi^2(s) = 2\bigl(x^s - |\theta^{s+1}|^t\bigr),$$

$$\xi^3(s) = 2\Bigl(x^s - \Bigl(\theta^{s+1} - \frac{1}{n}\sum_{k=1}^{n}\theta^{s+1+k}\Bigr)^{t}\Bigr).$$
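As a concrete illustration, the SQG iteration for the first of these moment problems can be sketched numerically. The sketch below is not from the original text: the distribution of $\theta$, the dimension, and the step size $\rho_s = 1/(2(s+1))$ are illustrative assumptions. With this particular step size the iterates reduce to the running sample mean of $\theta^t$, i.e., the classical moment estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def sqg_moment(sample, t, n_iter):
    """Minimize F(x) = E||x - theta^t||^2 by the SQG iteration
    x^{s+1} = x^s - rho_s * xi(s) with xi(s) = 2(x^s - (theta^{s+1})^t).
    With rho_s = 1/(2(s+1)) the iterates reduce to the running sample
    mean of theta^t, the classical moment estimate."""
    x = np.zeros_like(sample())          # x^0 = 0
    for s in range(n_iter):
        theta = sample()                 # fresh observation theta^{s+1}
        xi = 2.0 * (x - theta**t)        # stochastic gradient xi(s)
        x = x - xi / (2.0 * (s + 1))     # step rho_s = 1/(2(s+1))
    return x

# Assumed model: theta ~ N(mu, I) componentwise, so E theta^2 = mu^2 + 1
mu = np.array([1.0, -2.0])
est = sqg_moment(lambda: mu + rng.standard_normal(2), t=2, n_iter=20000)
print(est)   # should be close to [2.0, 5.0]
```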

(ii) Suppose now that we have the information $E\theta = V(z)|_{z=z^*}$, where $V(z)$ is a given function and $z^*$ is an unknown vector. Then $z^*$ minimizes the function

$$E\|V(z) - \theta\|^2.$$

(iii) If we have information about the density $p(y, z^*)$ of the observations $\theta$ with respect to a measure $\mu(dy)$, then it can be shown that $z^*$ maximizes the function

$$E \ln p(x, \theta) = \int \ln p(x, y)\, p(y, z^*)\, \mu(dy).$$

These problems are reformulations of well-known principles: least squares, i.e., minimization of the function

$$\frac{1}{N}\sum_{k=1}^{N}\|V(z) - \theta^k\|^2,$$

and maximum likelihood, i.e., maximization of the function

$$\frac{1}{N}\sum_{k=1}^{N}\ln p(x, \theta^k).$$

This gives us a good opportunity to apply SQG methods.
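The least-squares principle above can indeed be attacked directly by SQG. The sketch below is an illustration, not from the original text: it assumes a hypothetical scalar model with $V(z) = e^z$ and bounded observations, and uses the stochastic gradient $\xi(s) = 2V'(z^s)\,(V(z^s) - \theta^{s+1})$ with steps $\rho_s = 0.2/(s+1)$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scalar model E theta = V(z*) with V(z) = exp(z):
# z* minimizes E|V(z) - theta|^2, and a stochastic gradient at z^s is
# xi(s) = 2 V'(z^s)(V(z^s) - theta^{s+1}) = 2 e^{z^s}(e^{z^s} - theta^{s+1}).
z_star = np.log(2.0)                        # chosen so E theta = exp(z*) = 2
z = 0.0                                     # initial point z^0
for s in range(50000):
    theta = 2.0 + rng.uniform(-1.0, 1.0)    # observation with E theta = 2
    xi = 2.0 * np.exp(z) * (np.exp(z) - theta)   # stochastic gradient
    z = z - 0.2 * xi / (s + 1)              # steps rho_s = 0.2/(s+1)
print(z)   # should approach z* = ln 2 ~ 0.693
```

Because $V$ is nonlinear, the resulting estimate of $z^*$ is a nonlinear function of the observations, unlike the plain sample mean.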

The problems mentioned above are problems of pure estimation. Very often the main reasons for estimation and identification are control or optimization. In some cases, the tasks of optimization and estimation can be separated, and optimization is performed after estimation. However, in problems of adaptation it is usually necessary to optimize and estimate simultaneously. For instance, optimization cannot be separated from estimation if the observation of unknown parameters depends on the current value of the control variables.

Optimization tasks arising in such an environment require the development of new optimization techniques, which have much in common with the minimization of time-varying functions: nonstationary optimization (see Section 6.4).

Consider an illustrative example: minimization of the differentiable function

$$F^0(x) = \psi(x, z^*),\quad x \in R^n,$$

where $z^*$ is a vector of unknown parameters. At each iteration $s = 0, 1, \ldots$, an observation $\theta^s$ is available which has the form of a direct observation of the parameter vector $z^*$:

$$E\theta^s = z^*,\quad s = 0, 1, \ldots \qquad (6.85)$$

The problem is to create a sequence $\{x^s\}_{s=0}^{\infty}$ which converges to the set of optimal solutions. Note that $F^0(x)$ cannot be optimized directly because of the unknown parameters $z^*$.

However, at iteration $s$ we can obtain a statistical estimate $z^s$ such that $z^s \to z^*$ with probability 1, and a sequence of functions $F^0(x, s) = \psi(x, z^s)$ such that $F^0(x, s) \to F^0(x)$ with probability 1 as $s \to \infty$. Notice that at iteration $s$ only the function $F^0(x, s)$ is available.

Therefore we are led to procedures of nonstationary optimization:

$$x^{s+1} = x^s - \rho_s F_x^0(x^s, s),\quad s = 0, 1, \ldots, \qquad (6.86)$$

$$F_x^0(x, s) = \psi_x(x, z^s).$$
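Procedure (6.86) can be sketched numerically. The example below rests on illustrative assumptions not in the text: $\psi(x, z) = \|x - z\|^2$ (so the optimal solution is $x^* = z^*$), with the estimate $z^s$ taken as the running mean of the direct observations, so that $z^s \to z^*$ with probability 1 as in (6.85).

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustration of (6.86) with psi(x, z) = ||x - z||^2, whose minimizer
# in x is z, so the optimal solution is x* = z*.  The estimate z^s is
# the running mean of the direct observations theta^0, ..., theta^s,
# hence z^s -> z* with probability 1, and F0_x(x, s) = 2(x - z^s).
z_star = np.array([1.0, 2.0])
z_bar = np.zeros(2)                            # estimate z^s
x = np.zeros(2)                                # iterate x^s
for s in range(20000):
    theta = z_star + rng.standard_normal(2)    # direct observation, E theta^s = z*
    z_bar += (theta - z_bar) / (s + 1)         # update the sample mean
    x = x - 2.0 * (x - z_bar) / (s + 2)        # x^{s+1} = x^s - rho_s F0_x(x^s, s)
print(x)   # should approach x* = z* = [1, 2]
```

Estimation and optimization proceed in lockstep: neither finishes before the other starts, which is exactly the adaptive setting described above.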

In the case of stochastic programming problems, $z^*$ may correspond to the vector of unknown parameters of the probability measure $P(\cdot, z^*)$:

$$\psi(x, z^*) = \int f^0(x, \omega)\, P(d\omega, z^*).$$

If $\{z^s\}$ is a sequence of estimates with $z^s \to z^*$ with probability 1, then we are led to procedures of the following type:

$$x^{s+1} = x^s - \rho_s \xi^0(s),$$

where $\xi^0(s)$ is an estimate of $F_x^0(x, s)$ at $x = x^s$,

$$F^0(x, s) = \psi(x, z^s) = \int f^0(x, \omega)\, P(d\omega, z^s).$$

For instance, similarly to Section 6.7,

$$\xi^0(s) = f_x^0(x^s, \omega^s),$$

where $\omega^s$ is a sample of $\omega$, independent of the past observations, drawn from the nonstationary distribution $P(\cdot, z^s)$.
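A minimal sketch of this sampling scheme follows, under illustrative assumptions not in the text: $f^0(x, \omega) = \|x - \omega\|^2$ and $P(\cdot, z) = N(z, I)$, so that $\psi(x, z)$ is minimized in $x$ at $z$ and the overall solution is $x^* = z^*$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed: f0(x, w) = ||x - w||^2 and P(., z) = N(z, I), so
# psi(x, z) = E||x - w||^2 is minimized in x at z, and the problem's
# solution is x* = z*.  Each iteration draws w^s from the nonstationary
# distribution P(., z^s) built from the current estimate z^s.
z_star = np.array([-1.0, 3.0])
z_s = np.zeros(2)
x = np.zeros(2)
for s in range(20000):
    theta = z_star + rng.standard_normal(2)    # direct observation of z*
    z_s += (theta - z_s) / (s + 1)             # z^s -> z* with probability 1
    w = z_s + rng.standard_normal(2)           # sample w^s ~ P(., z^s)
    xi = 2.0 * (x - w)                         # xi0(s) = f0_x(x^s, w^s)
    x = x - xi / (2.0 * (s + 1))               # rho_s = 1/(2(s+1))
print(x)   # should approach x* = z* = [-1, 3]
```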

We can also use more complicated estimates (similar to (6.52), (6.53)). More difficult problems arise when the $\theta^s$, $s = 0, 1, \ldots$, are not direct observations of the vector $z^*$; in other words, when, instead of the relationship (6.85), we have the following (see [20], [75], [76]):

$$E\{\theta^s \mid x^s\} = p(x^s, z^*),$$

which may depend on the current approximate solution $x^s$. Since we do not know $z^*$ in advance, a procedure of type (6.86) that directly solves the optimization problem and simultaneously estimates $z^*$ is needed again.

References

[1] Yu.M. Ermoliev and Z.V. Nekrylova, "The Method of Stochastic Subgradients and its Applications". Notes, Seminar on the Theory of Optimal Solutions. Academy of Sciences of the U.S.S.R., Kiev (1967).

[2] Yu.M. Ermoliev and N.Z. Shor, "Method of random walk for the two-stage problem and its generalization", Kibernetika, 1(1968).

[3] Yu.M. Ermoliev, "On the stochastic quasi-gradient method and stochastic quasi-Fejer sequences", Kibernetika, 2(1969).

[4] Yu.M. Ermoliev, "General problem of stochastic programming", Kibernetika, 3(1971).

[5] Yu.M. Ermoliev, Stochastic Programming Methods. Moscow: Nauka (1976).

[6] A.M. Gupal and L.G. Bajenov, "Stochastic linearization", Kibernetika, 1(1972).

[7] Yu.M. Ermoliev and A.I. Jastremskiy, Stochastic Models and Methods in Economic Planning. Moscow: Nauka (1979).

[8] A.M. Gupal, "On the stochastic programming problem with constraints", Kibernetika, 6(1974).

[9] A.M. Gupal, "Methods of almost-differentiable function minimization", Kibernetika, 1(1977).

[10] A.M. Gupal, Stochastic Methods of Nonsmooth Optimization. Kiev: Naukova Dumka (1979).

[11] E.A. Nurminski, "Convergence conditions of algorithms of stochastic programming", Kibernetika, 3(1973).

[12] E.A. Nurminski, "Quasigradient method for solving problems of nonlinear programming", Kibernetika, 1(1973).

[13] E.A. Nurminski, Numerical Methods for Solving Deterministic and Stochastic Minimax Problems. Kiev: Naukova Dumka (1979).

[14] Yu.M. Ermoliev and E.A. Nurminski, "Limit extremal problems", Kibernetika, 1(1973).

[15] A.M. Gupal, "Optimization method for problems with time-varying functions", Kibernetika, 2(1974).

[16] E.A. Nurminski, "The problem of nonstationary optimization", Kibernetika, 2(1977).

[17] P.I. Vertchenko, "Limit Extremum Problems of Stochastic Optimization", Abstract of dissertation, Press of the Institute of Cybernetics, Kiev (1977).

[18] A.A. Gaivoronskiy, "Methods of Stochastic Nonstationary Optimization". Collection: Operations Research and Systems Reliability. Press of the Institute of Cybernetics, Kiev (1978).

[19] A.A. Gaivoronskiy, "Nonstationary stochastic programming problems", Kibernetika, 4(1978).

[20] A.A. Gaivoronskiy and Yu.M. Ermoliev, "Stochastic optimization and simultaneous parameter estimation", Izvestia Academii Nauk SSSR, Technicheskaya Kibernetika, 4(1979).

[21] E.A. Nurminski and P.I. Verchenko, "On a convergence of saddle-point algorithms", Kibernetika, 3(1977).

[22] A.N. Golodnikov, "Finding of Optimal Distribution Function in Stochastic Programming Problems", Abstract of dissertation, Press of the Institute of Cybernetics, Kiev (1979).

[23] Yu.M. Ermoliev and Yu.M. Kaniovskiy, "Asymptotic behavior of stochastic programming methods with permanent step-size multiplier", USSR Computational Mathematics and Mathematical Physics, 2(1979).

[24] Yu.M. Kaniovskiy, P.S. Knopov, and Z.V. Nekrylova, Limiting Theorems of Stochastic Programming Processes. Kiev: Naukova Dumka (1980).

[25] L.G. Bajenov and A.M. Gupal, "Stochastic analog of conjugate gradients method", Kibernetika, 1(1972).

[26] A.M. Gupal, "Stochastic method of feasible directions of nondifferentiable optimization", Kibernetika, 2(1978).

[27] Yu.M. Ermoliev, "Stochastic models and methods of optimization", Kibernetika, 4(1975).

[28] Yu.M. Ermoliev and E.A. Nurminski, "Extremum Problems in Statistics and Numerical Methods of Stochastic Programming". Collection: Some Problems of Systems Control and Modeling. Press of the Institute of Mathematics, Ukrainian Academy of Sciences, Kiev (1973).

[29] Yu.M. Ermoliev, "The stochastic problem of optimal control", Kibernetika, 1(1972).

[30] Yu.M. Ermoliev and A.M. Gupal, "The linearization method in nondifferentiable optimization", Kibernetika, 1(1978).

[31] A.M. Gupal and V.P. Norkin, "Method of discontinuous optimization", Kibernetika, 2(1977).

[32] Yu.M. Ermoliev and E.A. Nurminski, "Stochastic quasigradient algorithms for minimax problems". In: M. Dempster (ed.), Proceedings of the International Conference on Stochastic Programming. London: Academic Press (1980).

[33] Yu.M. Ermoliev, V.P. Gulenko, and T.I. Tsarenko, Finite-Difference Method in Optimal Control. Kiev: Naukova Dumka (1978).

[34] N.Z. Shor, "Application of the Gradient Method for the Solution of Network Transportation Problems". Notes, Scientific Seminar on Theory and Applications of Cybernetics and Operations Research. Kiev: Academy of Sciences USSR (1962).

[35] Yu.M. Ermoliev, "Methods of solution of nonlinear extremal problems", Kibernetika, 4(1966).

[36] B.T. Poljak, "A general method for solving extremal problems", Soviet Mathematics Doklady, 8(1967).

[37] P. Wolfe and M.L. Balinski (eds.), Nondifferentiable Optimization. Mathematical Programming Study 3, North-Holland Publishing Co. (1975).

[38] N.Z. Shor, The Methods of Nondifferentiable Optimization and their Applications. Kiev: Naukova Dumka (1979).

[39] L.G. Bajenov, "Convergence conditions of almost-differentiable function minimization", Kibernetika, 4(1972).

[40] B.T. Poljak, "Convergence and rate of convergence of iterative stochastic algorithms", Automation and Remote Control, 12(1976).

[41] V.Ya. Katkovnik, Linear Estimation and Stochastic Optimization Problems. Moscow: Nauka (1976).

[42] L.A. Rastrigin, Extremal Control Systems. Moscow: Nauka (1974).

[43] V.V. Fjedorov, Numerical Methods of Maximin Problems. Moscow: Nauka (1979).

[44] B.T. Poljak, "Nonlinear programming methods in the presence of noise", Mathematical Programming, 14(1978).

[45] M.T. Wasan, Stochastic Approximation. Cambridge Tracts in Mathematics and Mathematical Physics 58. Cambridge: Cambridge University Press (1969).

[46] V. Fabian, "Stochastic Approximation of Constrained Minima". In: Transactions of the 4th Prague Conference on Information Theory, Statistical Decision Functions and Random Processes, 1965 (Prague, 1967).

[47] H.J. Kushner, "Stochastic approximation algorithms for constrained optimization problems", Annals of Statistics, 2(4)(1974).

[48] H.J. Kushner and T. Gavin, "Stochastic approximation type methods for