The Experimental Design of an Observational Network: Optimization Algorithms of an Exchange Type


Working Paper


V. V. Fedorov

October 1986 WP-86-62

International Institute for Applied Systems Analysis

A-2361 Laxenburg, Austria


NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHOR


Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.


Preface

For many years, designers of environmental monitoring systems have faced the problem of optimal allocation of resources for observational networks (see, for instance, Munn, 1981): where, how frequently, and what characteristics have to be measured or observed in order to obtain data that will be sufficient for prognoses or warnings. From the 1970s to the early 1980s, a number of heuristic approaches appeared in the "environmental" literature. Most of them are based on the analysis of space and time correlation structures (usually historical time series are used for their estimation) of the observed entities, with subsequent sieving to keep the less correlated (and hopefully most informative) observational points.

Different procedures for "sieving" have been used in applications, viz., forward and backward versions with various objective functions. These procedures have led to reasonably good results; however, no accurate mathematical analysis was undertaken.

In the present paper, a new approach to the optimal allocation of an observational network is proposed and some iterative numerical procedures are considered.

The approach is essentially based on the theory of the optimal design of regression experiments (Ermakov, ed., 1983). Using classical results from the theory of moment spaces, the author investigates the properties of optimal allocations and the convergence of the numerical procedures to optimal solutions.

Prof. M. Antonovsky (Environment Program)


1. Introduction

In this paper, numerical procedures of the "exchange" type for the construction of continuous optimal designs with restricted measures (see definitions in Fedorov, 1986; Wynn, 1982) are considered. The "exchange"-type procedures are based on a simple heuristic idea: at every step, delete "bad" (less informative) points and include "good" (most informative) ones.

Before giving an accurate mathematical formulation of the problem, and to illuminate the place of the results in experimental practice, let us start with two simple hypothetical examples. "Real" examples, where the considered approach seems to be appropriate, can be found, for instance, in Munn, 1981.

Example 1. Let $X$ be an area where $N$ observational stations have to be located. An optimal (or, at least, admissible) location depends upon the models describing the system "object under analysis - observational techniques".

The regression models

$$y_i = \eta(x_i, \theta) + \varepsilon_i, \quad i = 1, \ldots, N, \tag{1}$$

are commonly used in experimental practice. Here $y_i$ is the result of an observation at the $i$-th station, $\eta(x, \theta)$ is an a priori given function, $\theta$ is a vector of parameters to be estimated, and $\varepsilon_i$ is an error which one believes to be random (a more detailed specification will be given later). The optimal location of stations has to provide the minimum of some measure of deviation of the estimates $\hat{\theta}$ from the true values of $\theta$.

For sufficiently large $N$ the location of stations can be approximately described by some distribution function $\xi(dx)$, and one needs to find an optimal $\xi^*(dx)$. If $X$ is not uniform, then one comes to the restriction that the share $N(\Delta X)/N$ of stations in any given part $\Delta X$ cannot exceed some prescribed level. In terms of distribution functions, it means that

$$\xi(dx) \le \Psi(dx), \tag{2}$$

where $\Psi(dx)$ is defined by an experimenter. Here is the crucial feature of the problem considered in this paper.

Example 2. Let some characteristic $y_i$ be observed for members of a sample of size $N$. Every $i$-th member of this sample can be chosen from a group labelled by variables $x_i$. If the sampling is randomized, then the observed characteristic $y_i$ can be described by some distribution depending on $x_i$ and $\theta$.

In many cases, after some manipulations, the initial model can be reduced to (1), where $\eta(x_i, \theta)$ is an average characteristic of the $i$-th group and $\varepsilon_i$ reflects the variation within this group. The size of any group (or the number of units available for sampling) is normally bounded. Passing to a continuous version of the design problem, one can easily repeat the considerations of the previous example and come to model (1), (2).

In what follows, it will be assumed that in model (1), (2):

- the response function is a linear function of the unknown parameters, i.e., $\eta(x, \theta) = \theta^T f(x)$, and the functions $f(x)$ are given;

- the errors $\varepsilon_i$ are independent and $E[\varepsilon_i^2] = 1$ (or $E[\varepsilon_i^2] = \lambda(x_i)$, where $\lambda(x)$ is known; this case can be easily transformed to the previous one).

As usual, some objective function $\Phi$ defined on the space of $m \times m$ information matrices

$$M(\xi) = \int_X f(x) f^T(x)\, \xi(dx)$$

will describe the quality (or accuracy) of a design $\xi$; $M^{-1}(\xi)$ is a normalized variance-covariance matrix of the least squares estimators of the parameters $\theta$.

The purpose of optimum design of experiments is to find

$$\xi^* = \operatorname{Arg} \min_{\xi} \Phi[M(\xi)], \tag{3}$$

$$\int_X \xi(dx) = 1, \qquad \xi(dx) \le \Psi(dx). \tag{4}$$

Constraint (4) is what distinguishes this design problem from the standard approaches. By analogy with the theory of moment spaces (compare with Krein and Nudelman, 1973, Ch. VII), a solution of (3) and (4) will be called a "$(\Phi, \Psi)$-optimal design". In practice, $\Psi(dx)$ restricts the number of observations in a given space element $dx$ (see the examples).

Optimization problem (1) and (2) was considered by Wynn, 1982 and Gaivoronsky, 1985. To some extent, they translated a number of classical results from the theory of moment spaces into the language of experimental design. Gaivoronsky also analyzed the convergence of the iterative procedure for optimal design construction based on the traditional idea of steepest descent (see, for instance, Ermakov (ed.), 1983; Wu and Wynn, 1978)

where the design $\xi$ has to satisfy (4) and some additional linear constraints:

$$\int_X q(x)\, \xi(dx) \le c.$$

Wynn briefly discussed a number of heuristic numerical procedures based on some results from the theory of moment spaces.

The main objective of this paper is to consider iterative procedures of the exchange type which extensively use the nature of optimal designs for problem (3), (4) and therefore promise to be more efficient than the ones mentioned above.

General properties of optimal designs are discussed in Section 2. Section 3 deals with the formulation and basic analysis of the iterative procedure and its modifications.

2. Characterization of $(\Phi, \Psi)$-optimal Designs

In this section, the properties of optimal designs will be discussed only to the extent sufficient for the analysis of the proposed iterative procedures. More details can be found in Wynn, 1982.

The set of assumptions used later is the following:

a) $X$ is compact, $X \subset R^l$;

b) $f(x) \in R^m$ are continuous functions in $X$;

c) $\Psi(x)$ is atomless;

d) there exists $c < \infty$ such that

$$\Xi_c(\Psi) = \{\xi : \Phi[M(\xi)] \le c < \infty,\ \xi \in \Xi(\Psi)\} \ne \emptyset,$$

where $\Xi(\Psi)$ is the set of designs satisfying (4);

e) $\Phi(M)$ is a convex function of $M$;

c') $\Psi(x)$ has a continuous density $\psi(x)$;

f) the derivatives $\partial \Phi / \partial M = \dot{\Phi}$ exist and are bounded for all designs satisfying (d).

Let $\bar{\Xi}(\Psi)$ be the set of measures $\xi(dx)$ which either coincide with $\Psi(dx)$ or are equal to 0.

Theorem 1. If assumptions (a)-(e) hold, then there exists an optimal design $\xi^* \in \bar{\Xi}(\Psi)$.

Proof. The existence of an optimal design follows from (d)-(e) and the compactness of the set of information matrices. The compactness of the latter is provided by (a) and (b). The fact that at least one optimal design has to belong to $\bar{\Xi}(\Psi)$ is a corollary of Liapounov's Theorem on the range of a vector measure (see, for instance, Karlin and Studden, 1966, Ch. VIII; Wynn, 1982).

Note 1. Liapounov's Theorem leads to another result which can be useful in applications: for any design $\xi \in \Xi(\Psi)$ there is a design $\bar{\xi} \in \bar{\Xi}(\Psi)$ such that $M(\bar{\xi}) = M(\xi)$.

A function $\varphi(x, \xi)$ is said to separate sets $X_1$ and $X_2$ if there is a constant $C$ such that $\varphi(x, \xi) \le C$ (a.e. $\Psi$) on $X_1$ and $\varphi(x, \xi) \ge C$ (a.e. $\Psi$) on $X_2$; (a.e. $\Psi$) means "almost everywhere with respect to the measure $\Psi$".

Theorem 2. If assumptions (a)-(f) hold, then a necessary and sufficient condition that $\xi^* \in \bar{\Xi}(\Psi)$ is $(\Phi, \Psi)$-optimal is that $\varphi(x, \xi^*)$ separates two sets: $X^* = \operatorname{supp} \xi^*$ and $X \setminus X^*$.

This theorem was first formulated by Wynn, 1982, but its proof was not perfect. Therefore, we give a new one, which is also more illuminating for the formulation and analysis of the numerical procedures.

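Proof. Necessity. Consider two designs: $\xi^*$ and $\xi \in \bar{\Xi}(\Psi)$. Let

$$D = \operatorname{supp} \xi^* \setminus \operatorname{supp} \xi, \qquad E = \operatorname{supp} \xi \setminus \operatorname{supp} \xi^*.$$

Assume that $\xi^*$ is $(\Phi, \Psi)$-optimal. Then for any design $\xi$ (see (f)):

$$0 \le \int_X \varphi(x, \xi^*)\, \xi(dx). \tag{7}$$

From the definition of $\varphi(x, \xi)$:

$$\int_X \varphi(x, \xi^*)\, \xi^*(dx) = 0,$$

and, therefore, for any $E$ and $D$:

$$\int_E \varphi(x, \xi^*)\, \Psi(dx) \ge \int_D \varphi(x, \xi^*)\, \Psi(dx). \tag{8}$$

This proves necessity.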

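Sufficiency. Consider designs $\xi^*$ and $\xi \in \bar{\Xi}(\Psi)$ satisfying (7) and (8), and assume now that $\xi^*$ is nonoptimal, i.e.

$$\Phi[M(\xi^*)] \ge \Phi[M(\xi)] + \delta, \quad \delta > 0, \tag{9}$$

where $\xi$ is now $(\Phi, \Psi)$-optimal. Let $\bar{\xi} = (1 - \alpha)\, \xi^* + \alpha\, \xi$, $\alpha > 0$. Then the convexity of $\Phi$ leads to the inequality:

$$\Phi[M(\bar{\xi})] \le (1 - \alpha)\, \Phi[M(\xi^*)] + \alpha\, \Phi[M(\xi)] \le (1 - \alpha)\, \Phi[M(\xi^*)] + \alpha \{\Phi[M(\xi^*)] - \delta\} = \Phi[M(\xi^*)] - \alpha \delta. \tag{10}$$

Assumption (f) and inequality (8) lead to the inequality

$$\Phi[M(\bar{\xi})] = \Phi[M(\xi^*)] + \alpha \int_X \varphi(x, \xi^*)\, \xi(dx) + o(\alpha) = \Phi[M(\xi^*)] + \alpha \Big( \int_E \varphi(x, \xi^*)\, \Psi(dx) - \int_D \varphi(x, \xi^*)\, \Psi(dx) \Big) + o(\alpha) \ge \Phi[M(\xi^*)] + o(\alpha), \tag{11}$$

where $E$ and $D$ describe the difference between the supporting sets for $\xi^*$ and $\xi$.

When $\alpha \to 0$, the comparison of (10) and (11) gives a contradiction. This completes the proof.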

Note 1. If instead of (c) one uses (c'), then a necessary and sufficient condition can be formulated in the form of the following inequality:

$$\max_{x \in X^*} \varphi(x, \xi^*) \le \min_{x \in X \setminus X^*} \varphi(x, \xi^*). \tag{12}$$

Note 2. If (e) is complemented by (f), then $\varphi(x, \xi) = \gamma(x, \xi) - \operatorname{tr} \dot{\Phi}(\xi) M(\xi)$, where $\gamma(x, \xi) = f^T(x)\, \dot{\Phi}(\xi)\, f(x)$, and (12) can be converted to

$$\max_{x \in X^*} \gamma(x, \xi^*) \le \min_{x \in X \setminus X^*} \gamma(x, \xi^*). \tag{13}$$
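Condition (13) is easy to check numerically once $X$ is discretized. The following is a minimal sketch under assumptions that are not part of the paper: the D-criterion $\Phi(M) = -\ln \det M$ (so that $\dot{\Phi} = -M^{-1}$ and $\gamma(x, \xi) = -f^T(x) M^{-1}(\xi) f(x)$), the model $f(x) = (1, x)^T$ on $X = [-1, 1]$, and a uniform $\Psi$ of total mass 2 split into 200 cells, of which a design occupies exactly 100. All function and variable names are hypothetical.

```python
"""Check the separation condition (13) for a discretized restricted design.

Assumptions (not from the paper): D-criterion Phi(M) = -ln det M, so that
gamma(x, xi) = -f(x)^T M^{-1} f(x); f(x) = (1, x); X = [-1, 1] split into
200 cells of equal Psi-mass; a design occupies exactly 100 cells.
"""

N_CELLS = 200
CELLS = [-1.0 + (i + 0.5) * 2.0 / N_CELLS for i in range(N_CELLS)]  # midpoints


def moments(support):
    """Entries (m1, m2) of M = [[1, m1], [m1, m2]] for equal cell weights."""
    w = 1.0 / len(support)
    m1 = sum(CELLS[i] for i in support) * w
    m2 = sum(CELLS[i] ** 2 for i in support) * w
    return m1, m2


def gamma(x, m1, m2):
    """gamma(x, xi) = -f^T M^{-1} f for f = (1, x) and the D-criterion."""
    det = m2 - m1 * m1
    return -(m2 - 2.0 * m1 * x + x * x) / det


def separates(support):
    """True if max gamma on the support <= min gamma off the support."""
    m1, m2 = moments(support)
    inside = max(gamma(CELLS[i], m1, m2) for i in support)
    outside = min(gamma(CELLS[i], m1, m2)
                  for i in range(N_CELLS) if i not in support)
    return inside <= outside


tails = set(range(50)) | set(range(150, 200))   # cells with the largest |x|
centre = set(range(50, 150))                    # cells with the smallest |x|
print(separates(tails), separates(centre))
```

Under these assumptions the design concentrated on the two tails of the interval satisfies the grid analogue of (13), while the design concentrated in the centre does not.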


3. Numerical Procedure of Exchange Type

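Theorem 2 gives a hint on how to construct optimal designs numerically: if for some given design $\xi$ one can find a couple of sets $D \subset \operatorname{supp} \xi$ and $E \subset X \setminus \operatorname{supp} \xi$ of equal $\Psi$-measure such that

$$\int_E \gamma(x, \xi)\, \Psi(dx) < \int_D \gamma(x, \xi)\, \Psi(dx),$$

then it is hoped that the design $\bar{\xi}$ with $\operatorname{supp} \bar{\xi} = \operatorname{supp} \xi \setminus D \cup E$ will be "better" than $\xi$. Repetition of this procedure can lead to an optimal design.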

A number of algorithms based on this idea can easily be invented. In this paper one of the simplest algorithms is considered in detail; a thorough consideration of the others from this cluster is a matter of routine technique.

In what follows, the fulfillment of (c') is assumed.

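Algorithm. Let

$$\lim_{s \to \infty} \delta_s = 0, \qquad \lim_{s \to \infty} \sum_{i=1}^{s} \delta_i = \infty, \qquad \lim_{s \to \infty} \sum_{i=1}^{s} \delta_i^2 = k < \infty. \tag{14}$$

Step a. There is a design $\xi_s \in \bar{\Xi}(\Psi)$. Two sets $D_s$ and $E_s$ with equal measures,

$$\int_{D_s} \Psi(dx) = \int_{E_s} \Psi(dx) = \delta_s,$$

and including, correspondingly, the points

$$x_{1s} = \operatorname{Arg} \max_{x \in X_{1s}} \gamma(x, \xi_s) \quad \text{and} \quad x_{2s} = \operatorname{Arg} \min_{x \in X_{2s}} \gamma(x, \xi_s), \tag{15}$$

where $X_{1s} = \operatorname{supp} \xi_s$ and $X_{2s} = X \setminus X_{1s}$, have to be found.

Step b. The design $\xi_{s+1}$ with the supporting set

$$\operatorname{supp} \xi_{s+1} = X_{1(s+1)} = X_{1s} \setminus D_s \cup E_s \tag{16}$$

is constructed.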
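Steps a and b can be sketched on a grid as follows. This is only an illustration under assumptions that are not part of the paper: the D-criterion $\Phi(M) = -\ln \det M$ with $\gamma(x, \xi) = -f^T(x) M^{-1}(\xi) f(x)$, the model $f(x) = (1, x)^T$ on $X = [-1, 1]$, a uniform $\Psi$ split into 200 cells of equal mass, and sets $D_s$, $E_s$ that consist of a single cell each (so $\delta_s$ is fixed by the grid instead of vanishing as in (14)). Because the cell size is finite, a swap is kept only if it lowers $\Phi$. All names are hypothetical.

```python
"""Exchange-type iteration for a discretized restricted design (a sketch).

Assumptions (not from the paper): D-criterion Phi(M) = -ln det M with
gamma(x, xi) = -f(x)^T M^{-1} f(x); f(x) = (1, x); X = [-1, 1] split into
200 cells of equal Psi-mass; D_s and E_s are single cells, and a swap is
kept only if it lowers Phi.
"""
import math

N_CELLS = 200
CELLS = [-1.0 + (i + 0.5) * 2.0 / N_CELLS for i in range(N_CELLS)]  # midpoints


def moments(support):
    """Entries (m1, m2) of M = [[1, m1], [m1, m2]] for equal cell weights."""
    w = 1.0 / len(support)
    m1 = sum(CELLS[i] for i in support) * w
    m2 = sum(CELLS[i] ** 2 for i in support) * w
    return m1, m2


def phi(support):
    """D-criterion Phi = -ln det M of the design supported on `support`."""
    m1, m2 = moments(support)
    return -math.log(m2 - m1 * m1)


def gamma(x, m1, m2):
    """gamma(x, xi) = -f^T M^{-1} f for f = (1, x)."""
    return -(m2 - 2.0 * m1 * x + x * x) / (m2 - m1 * m1)


def exchange(support, max_steps=1000):
    """Repeat steps a and b until no single-cell swap helps."""
    support = set(support)
    for _ in range(max_steps):
        m1, m2 = moments(support)
        # Step a: least informative cell inside, most informative cell outside.
        worst_in = max(support, key=lambda i: gamma(CELLS[i], m1, m2))
        best_out = min((i for i in range(N_CELLS) if i not in support),
                       key=lambda i: gamma(CELLS[i], m1, m2))
        if gamma(CELLS[worst_in], m1, m2) <= gamma(CELLS[best_out], m1, m2):
            break  # grid analogue of the separation condition holds
        # Step b: exchange the two cells, keeping the support size fixed.
        candidate = (support - {worst_in}) | {best_out}
        if phi(candidate) >= phi(support):
            break  # finite cell size: the swap no longer improves Phi
        support = candidate
    return support


start = set(range(100))            # all mass on the left half of X
final = exchange(start)
print(phi(start), phi(final))
```

Starting from a design on the left half of the interval, the iteration moves cells from the middle of the support to the far end of $X$ and stops near the two-tail design; the guard on $\Phi$ replaces the vanishing sequence $\delta_s$, which a fixed grid cannot reproduce.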

Iterative procedure (14)-(16) is based on the approximation ($\delta_s \to 0$):

$$\Phi[M(\xi_{s+1})] \approx \Phi[M(\xi_s)] + \delta_s\, [\gamma(x_{2s}, \xi_s) - \gamma(x_{1s}, \xi_s)].$$

The analysis of the iterative procedure (14)-(16) becomes simpler if

(g) for any design $\xi \in \bar{\Xi}(\Psi)$: $|M(\xi)| \ge \varepsilon > 0$.

This assumption is not very restrictive. If, for instance, $\psi(x) \ge q > 0$ and the functions $f(x)$ are linearly independent on any open finite measure subset of $X$, then (g) is valid.

For most optimality criteria, (g) leads to the fulfillment of the following inequalities:

for any $\xi \in \bar{\Xi}(\Psi)$.

Otherwise (17) is supposed to be included in (g).

Theorem 3. If assumptions (a), (b), (c'), (e)-(g) hold, then

$$\lim_{s \to \infty} \Phi[M(\xi_s)] = \inf_{\xi} \Phi[M(\xi)] = \Phi^*.$$

Proof. The approach is standard for optimization theory (in the statistical literature see, for instance, Wu and Wynn, 1978). Therefore, some elementary considerations will be omitted.

Expanding (see (g) and (17)) by a Taylor series in $\delta_s$ gives:

$$\Phi[M(\xi_{s+1})] = \Phi[M(\xi_s)] + \delta_s\, [\gamma(x_{2s}, \xi_s) - \gamma(x_{1s}, \xi_s)] + K_s \delta_s^2,$$

where $|K_s| \le K_4 = K_4(K_1, K_2, K_3)$. Due to this inequality and (14) the sequence $\{S_{2s} = \sum_{i=0}^{s} K_i \delta_i^2\}$ converges. By definition:

$$\gamma(x_{2s}, \xi_s) - \gamma(x_{1s}, \xi_s) \le 0,$$

and, therefore, the sequence

$$S_{1s} = \sum_{i=0}^{s} \delta_i\, [\gamma(x_{2i}, \xi_i) - \gamma(x_{1i}, \xi_i)]$$

monotonically decreases.

From (g) and (19):

$$K_1 \ge \Phi[M(\xi_{s+1})] = \Phi[M(\xi_0)] + S_{1s} + S_{2s} \ge \Phi^*,$$

which leads to the boundedness of $S_{1s}$. Subsequently, the monotonicity of $\{S_{1s}\}$ provides its convergence and the convergence of $\{\Phi[M(\xi_s)]\}$.

Assume that

$$\lim_{s \to \infty} \Phi[M(\xi_s)] = \Phi^* + a, \quad a > 0. \tag{20}$$

Then, from Theorem 2 and assumptions (b), (c') it follows that

$$\gamma(x_{2s}, \xi_s) - \gamma(x_{1s}, \xi_s) \le -b < 0$$

and

$$\lim_{s \to \infty} S_{1s} \le -b \lim_{s \to \infty} \sum_{i=0}^{s} \delta_i = -\infty, \qquad \lim_{s \to \infty} \Phi[M(\xi_s)] = -\infty. \tag{21}$$

The contradiction between (20) and (21) proves the theorem.

Note 1. In (14)-(16), there is some uncertainty in the choice of $D_s$ and $E_s$. Somehow, they have to be located around $x_{1s}$ and $x_{2s}$. When $\psi(x) = \text{const}$ (and one arrives at this case by the transformation $d\tilde{x} = \psi(x)\, dx$), then $x_{1s}$ and $x_{2s}$ could be the "geometrical" centers of $D_s$ and $E_s$.

Note 2. The iterative procedure can be more effective (especially in the first steps) if there is a possibility to easily find

and

subject to

Note 3. When $\delta_s$ is sufficiently small and

$$\int_{D_s} f(x) f^T(x)\, \psi(x)\, dx \approx f(x_{1s}) f^T(x_{1s})\, \delta_s,$$

then the calculations in (14)-(16) can be simplified if one uses the following recursion formula (see, for instance, Fedorov, 1972):

$$(M \pm \delta f f^T)^{-1} = \left( I \mp \frac{\delta M^{-1} f f^T}{1 \pm \delta f^T M^{-1} f} \right) M^{-1}.$$
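The recursion formula can be checked numerically against a direct inversion. The sketch below does this for a small symmetric matrix; the values of `M`, `F` and `DELTA` are hypothetical and chosen only for illustration, and the helper names are not from the paper.

```python
"""Numerical check of the rank-one inverse update from Note 3.

Hypothetical 2x2 example: the matrix M, the vector F and the step DELTA are
illustrative values, not taken from the paper.
"""

M = [[1.0, 0.2], [0.2, 0.5]]   # symmetric positive definite "information matrix"
F = [1.0, 0.7]                 # f(x) at the point being added or deleted
DELTA = 0.05                   # measure of the exchanged set


def inv2(a):
    """Direct inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def updated_inverse(m_inv, f, delta, sign):
    """Inverse of M + sign*delta*f f^T computed from M^{-1} (sign = +1 or -1).

    For symmetric M the formula (I - c M^{-1} f f^T) M^{-1} equals
    M^{-1} - c u u^T with u = M^{-1} f and c = sign*delta / (1 + sign*delta*f^T u).
    """
    u = [m_inv[i][0] * f[0] + m_inv[i][1] * f[1] for i in range(2)]
    d = f[0] * u[0] + f[1] * u[1]
    c = sign * delta / (1.0 + sign * delta * d)
    return [[m_inv[i][j] - c * u[i] * u[j] for j in range(2)] for i in range(2)]


def max_error(sign):
    """Largest entrywise gap between the update formula and direct inversion."""
    changed = [[M[i][j] + sign * DELTA * F[i] * F[j] for j in range(2)]
               for i in range(2)]
    direct = inv2(changed)
    fast = updated_inverse(inv2(M), F, DELTA, sign)
    return max(abs(direct[i][j] - fast[i][j])
               for i in range(2) for j in range(2))


print(max_error(+1), max_error(-1))
```

In an exchange step one update with the lower sign (deleting $D_s$) and one with the upper sign (including $E_s$) replace a full inversion of $M(\xi_{s+1})$.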

The modified version of the algorithm presented in Note 2 gives a hint for the construction of

Algorithm 2.

Step a. The same as (22), but instead of (23)

(no constraints on the sizes of $D_s$ and $E_s$!).

Step b. Coincides with step b of Algorithm 1.

This algorithm seems to be rather promising for changing the structure of an initial design $\xi_0$ rapidly, but it allows some oscillation regimes, at least in principle.

The author failed to prove its convergence. Probably some combination of both considered algorithms (for instance, the majorization of (24) by some vanishing sequence $\delta_s$) could be useful.

4. Exchange algorithm in the standard design problem

The possibility of applying algorithms similar to (14)-(16) to design problem (3) (without constraint (4)) was somehow overlooked in design theory. Atwood (1973) proposed a very similar algorithm, but based on (5) and therefore handling all supporting points in the design $\xi_s$.

The simplest analogue of (14)-(16) can be formulated as follows:

Step a. There is a design $\xi_s$. Two points

$$x_{1s} = \operatorname{Arg} \max_{x \in X_s} \gamma(x, \xi_s) \quad \text{and} \quad x_{2s} = \operatorname{Arg} \min_{x \in X} \gamma(x, \xi_s), \tag{25}$$

where $X_s = \operatorname{supp} \xi_s$, have to be found.

Step b.

$$\xi_{s+1} = \xi_s - \delta_s\, \xi(x_{1s}) + \delta_s\, \xi(x_{2s}), \tag{26}$$

where $\xi(x)$ is a design with one supporting point $x$.

The sequence $\{\delta_s\}$ can be chosen as in (14). The convergence of the algorithm can be proven similarly to Theorem 3.
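The two steps can be sketched for a design on a finite grid as follows. The sketch rests on assumptions that are not part of the paper: the D-criterion with $\gamma(x, \xi) = -f^T(x) M^{-1}(\xi) f(x)$, the model $f(x) = (1, x)^T$, a 21-point grid on $[-1, 1]$, the sequence $\delta_s = 1/(s + 2)$, and a cap on the moved mass so that no weight becomes negative. All names are hypothetical.

```python
"""Exchange iteration for the unconstrained design problem (a sketch).

Assumptions (not from the paper): D-criterion with
gamma(x, xi) = -f(x)^T M^{-1} f(x); f(x) = (1, x); X is a 21-point grid on
[-1, 1]; delta_s = 1/(s + 2); the moved mass is capped by the weight
available at the deleted point.
"""

GRID = [-1.0 + 0.1 * i for i in range(21)]


def moments(weights):
    """Entries (m1, m2) of M = [[1, m1], [m1, m2]] for a weighted design."""
    m1 = sum(w * x for w, x in zip(weights, GRID))
    m2 = sum(w * x * x for w, x in zip(weights, GRID))
    return m1, m2


def gamma(x, m1, m2):
    """gamma(x, xi) = -f^T M^{-1} f for f = (1, x)."""
    return -(m2 - 2.0 * m1 * x + x * x) / (m2 - m1 * m1)


def exchange(weights, n_steps=300):
    """Move mass from the worst supporting point to the best point of X."""
    w = list(weights)
    for s in range(n_steps):
        m1, m2 = moments(w)
        support = [i for i, wi in enumerate(w) if wi > 0.0]
        # Step a: least informative supporting point, most informative point.
        i1 = max(support, key=lambda i: gamma(GRID[i], m1, m2))
        i2 = min(range(len(GRID)), key=lambda i: gamma(GRID[i], m1, m2))
        # Step b: shift at most delta_s of mass from x_1s to x_2s.
        step = min(1.0 / (s + 2), w[i1])
        w[i1] -= step
        w[i2] += step
    return w


final = exchange([1.0 / 21] * 21)
m1, m2 = moments(final)
print(m2 - m1 * m1)
```

For this model the D-optimal design puts weight 1/2 on each end point, for which $\det M = 1$; starting from the uniform design, the printed determinant should be close to that value.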

It is worthwhile noting that the convergence of procedure (25), (26) in the discrete case (when $\delta_s = K/N$, $K \le N - 1$, where $N$ is the total number of observations) is questionable, because the proof of Theorem 3 is essentially based on the fact that $\delta_s \to 0$.

REFERENCES

Atwood, C.L. (1973) Sequences Converging to D-Optimal Designs of Experiments, Ann. Statist., 1, 342-352.

Ermakov, S., ed. (1983) Mathematical Theory of the Design of Experiments (in Russian), Moscow, Nauka, p. 386.

Fedorov, V. (1986) Optimal Design of Experiments: Numerical Methods, WP-86-55, Laxenburg, Austria, International Institute for Applied Systems Analysis.

Gaivoronsky, A. (1985) Stochastic Optimization Techniques for Finding Optimal Submeasures, WP-85-28, Laxenburg, Austria, International Institute for Applied Systems Analysis.

Karlin, S. and W.J. Studden (1966) Tchebycheff Systems: With Applications in Analysis and Statistics, New York, Wiley & Sons, p. 586.

Krein, M.G. and A.A. Nudelman (1973) Markov Moment Problem and Extremal Problems, Moscow, Nauka, p. 552.

Munn, R.E. (1981) The Design of Air Quality Monitoring Networks, Macmillan Publishers Ltd, London, p. 109.

Wu, C.F. and H. Wynn (1978) The Convergence of General Step-Length Algorithms for Regular Optimum Design Criteria, Ann. Statist., 6, 1273-1285.

Wynn, H. (1982) Optimum Submeasures With Applications To Finite Population Sampling, in "Statistical Decision Theory and Related Topics III", 2, Academic Press, New York, pp. 485-495.
