
The developments reported in this chapter include:

the formulation of standard forms for the convex stochastic optimisation problem and for the stochastic NUM problem (Problem 5.2 and the discussions of Sections 5.2.2 and 5.2.3),

the suggestion of a setup which allows one to show the convergence of some optimisation methods in stochastic environments (i.e. Conditions 5.1, 5.2, 5.3, Definitions 5.1, 5.2, and associated results such as Lemmas 5.1, 5.2, Proposition 5.1, and Result 5.1),

the suggestion of a stochastic optimisation algorithm based on scaled gradient projections (Algorithm 5.1 and Definition 5.4) and of a distributed, cyclic version of the algorithm (Algorithm 5.2 and Definition 5.5),

discussions and results on the conditions of convergence of the above-mentioned algorithms (Lemma 5.3, Proposition 5.2, Assumption 5.2, Results 5.2, 5.3, and 5.4).


is extremely sensitive to the choice of these step-sizes. It was explained in Chapter 4 that step-size selection is a well-known issue of gradient methods in general, for which it is widely accepted that `off-line' step-size policies, i.e. step-size sequences decided before running the optimisation algorithm, are inefficient over the full length of the optimisation process. Even the supposedly asymptotically optimal step-size policy of stochastic approximation methods, which can be implemented either using information unknown a priori, such as the curvature of the function, or using a more sophisticated technique called averaging, may be harmful in the beginning of the optimisation process and perform poorly in practice.
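
For concreteness, a Robbins-Monro recursion with an off-line step-size sequence of the form a/n, together with iterate averaging, might be sketched as follows; the quadratic objective, its noisy gradient measurements, and all parameter values are illustrative assumptions, not taken from this chapter:

```python
import random

def noisy_gradient(x, x_star=3.0, noise=1.0):
    # Unbiased measurement of the gradient of f(x) = (x - x_star)**2 / 2,
    # corrupted by Gaussian noise (an assumed observation model).
    return (x - x_star) + random.gauss(0.0, noise)

def robbins_monro(steps=10000, a=1.0, seed=0):
    # Off-line policy: the step sizes a/n are fixed before the run starts.
    random.seed(seed)
    x = 0.0
    total = 0.0
    for n in range(1, steps + 1):
        x -= (a / n) * noisy_gradient(x)
        total += x
    # Return the last iterate and the running (Polyak-style) average.
    return x, total / steps

last, avg = robbins_monro()
```

Both returned values approach the minimiser (3.0 here) only slowly; in early iterations the 1/n steps can be far too small or too large for the local geometry, which is the inefficiency discussed above.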

In deterministic optimisation problems, where the studied function is known at each point of the feasible set, the convergence of gradient methods can be considerably accelerated using `on-line' step-size policies based on established techniques discussed in Section 3.4.4. These techniques include step-size selection by approximate line search, which ensures monotonic convergence to a solution, and second-order scaling of the displacement directions (as in the Newton method). In the stochastic optimisation problem, however, the unavailability of an expression for the considered function makes it difficult to combine scaling techniques with stochastic approximation algorithms without introducing a large amount of a priori information (e.g. the curvature of the unknown function). One objective of this chapter is to extend some efficient `deterministic' optimisation techniques to a class of stochastic optimisation problems frequently treated by stochastic approximation methods, yet enjoying useful properties unexploited by procedures of the Robbins-Monro type. We are concerned in particular with the dual of the convex stochastic optimisation problem, where the (possibly twice) continuous differentiability of the random `measurements' of the dual function potentially allows for an accelerated optimisation process.
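
A minimal sketch of such an on-line policy in the deterministic case, namely a gradient projection step with Armijo backtracking on a box-constrained scalar problem, could look as follows; the objective, the feasible interval, and the parameter values are illustrative assumptions:

```python
def project(x, lo=0.0, hi=2.0):
    # Euclidean projection onto the feasible box [lo, hi].
    return max(lo, min(hi, x))

def f(x):
    # Known objective; its unconstrained minimiser (3.0) lies outside the box,
    # so the constrained solution sits on the boundary at 2.0.
    return (x - 3.0) ** 2

def grad(x):
    return 2.0 * (x - 3.0)

def projected_gradient(x=0.0, iters=50, beta=0.5, sigma=1e-4):
    for _ in range(iters):
        g = grad(x)
        t = 1.0
        # Armijo backtracking along the projection arc: shrink t until a
        # sufficient-decrease condition holds.
        while True:
            x_new = project(x - t * g)
            if f(x_new) <= f(x) + sigma * g * (x_new - x):
                break
            t *= beta
        x = x_new
    return x
```

Because the step size is chosen on-line from function values, the method makes monotonic progress without any pre-committed step-size schedule; this is exactly what the unavailability of f rules out in the stochastic setting.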

In the present chapter, it is assumed that a sequence of models for the unknown function can be generated randomly via simulation or measurement, and that this sequence converges in some sense, with probability one, to the considered function, sometimes called the true function. Taking the sequence of models as references, it is then possible to apply efficient iterative algorithms to the models in place of the true function. The particularity of this approach is that the optimisation of the considered function is potentially accelerated based on properties of inexact models for the function. Exploring the relevance of the approach and deriving conditions of convergence to a solution of the problem are the objectives of the developments of this chapter.

An early approach to the problem of stochastic optimisation based on convergent sequences of models for the function, suggested in [War90] and reported in Appendix D.4.1, considers stochastic optimisation algorithm prototypes coupled with a given model-learning technique, typically the computation at each step of the expectation of a growing, predetermined number of Monte-Carlo simulations, and derives sufficient conditions for such algorithms to converge on compact sets in some nonstandard fashion. Also, these convergence conditions are concerned with statistical properties of the stochastic optimisation algorithms, and the convergence results are based on probabilistic inference.
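
The model-learning technique just described, in which the n-th model averages a growing number of Monte-Carlo samples, might be sketched as follows; the random objective, the seeding scheme, and the crude grid minimisation are illustrative assumptions, not the construction of [War90]:

```python
import random

def sample_objective(x, u):
    # One Monte-Carlo measurement of the objective at x, driven by the
    # random draw u; the true function is E[(x - U)^2] with U ~ Uniform(0, 2),
    # whose minimiser is E[U] = 1.
    return (x - u) ** 2

def model(x, n, seed=0):
    # n-th model: empirical mean of n samples. By the law of large numbers
    # it converges with probability one to the true function as n grows.
    rng = random.Random(seed)
    return sum(sample_objective(x, rng.uniform(0.0, 2.0)) for _ in range(n)) / n

def minimise_model(n, seed=0):
    # Optimise the n-th model in place of the true function
    # (here by brute-force grid search over [0, 2]).
    grid = [i / 100.0 for i in range(201)]
    return min(grid, key=lambda x: model(x, n, seed))
```

As n grows, the minimiser of the n-th model drifts toward the true minimiser; the cost of the growing sample size at every step is one motivation for the structural, model-independent convergence conditions developed below.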

In this study we follow a different approach, where each candidate optimisation method is regarded as a mapping taking as arguments a function model and a point of the feasible set, and yielding a new point (or a set of points) of the feasible set. A stochastic optimisation algorithm is then modelled as the combination of such a mapping with a given sequence of function models. The sufficient conditions for convergence suggested herein rely strictly on structural properties of the mappings, considered independently of the model generation procedures. Our results are derived by set and function analysis and extend to any model-learning procedure capable of generating, with probability one, sequences of models converging in some sense to the true function (e.g. compact convergence, or compact convergence of the gradients). A similar approach to the stochastic optimisation problem was considered in the framework of [SW96], reported in Appendix D.4.2, where conditions for convergence in stochastic environments are derived for optimisation mappings which satisfy a certain closure property. Although this closure property is known to hold for steepest gradient methods combined with exact line search, it is generally not met in practice by the class of scaled gradient projection methods explored in Chapter 4. The interest of the convergence conditions provided in the present work is that they extend to gradient projection methods under certain conditions which are discussed in this chapter.
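
The view of an optimisation method as a mapping from a (model, point) pair to a new feasible point, composed with a sequence of models converging to the true function, can be sketched as follows; the gradient-projection mapping, the box constraint, and the vanishing model perturbation are illustrative assumptions:

```python
def project(x, lo=0.0, hi=4.0):
    # Euclidean projection onto the feasible box [lo, hi].
    return max(lo, min(hi, x))

def gradient_projection_map(model_grad, x, step=0.5):
    # The optimisation mapping: takes a function model (here through its
    # gradient) and a feasible point, and returns a new feasible point.
    # It knows nothing about how the model sequence is generated.
    return project(x - step * model_grad(x))

def make_model_grad(n):
    # Gradient of the n-th model: the true gradient 2*(x - 2) plus a
    # perturbation vanishing at rate 1/n, so the model gradients converge
    # to the true gradient.
    return lambda x: 2.0 * (x - 2.0) + 1.0 / n

def run(x=0.0, iters=200):
    # Stochastic optimisation algorithm = mapping composed with the
    # model sequence, one model per iteration.
    for n in range(1, iters + 1):
        x = gradient_projection_map(make_model_grad(n), x)
    return x
```

Convergence here hinges only on structural properties of `gradient_projection_map` and on the models converging to the true function, mirroring the separation between mapping and model-generation procedure argued for above.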

Outline and notation

Section 5.2 formulates a convex instance of the stochastic optimisation problem. We discuss the particular case of separable stochastic networks and recall the principles of stochastic approximation methods. Section 5.3 introduces the approach of basing the constrained optimisation of a function, measurable only through stochastic observations, on the construction of a sequence of models for the function. A stochastic optimisation setup is then provided in the shape of conditions for the convergence of optimisation mappings. In Section 5.4, the suggested setup is successively tested on a scaled gradient projection algorithm and on a cyclic implementation of that algorithm.

The previous notations related to vectors and matrices still hold for this