Design of Experiments under Constraints


NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR

DESIGN OF EXPERIMENTS UNDER CONSTRAINTS

A. Gaivoronski
V. Fedorov

January 1984 WP-84-8

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS 2361 Laxenburg, Austria


PREFACE

This paper is the result of collaboration between the System and Decision Sciences Area (SDS) and the Adaptive Resource Policy Project (ARP). It addresses the problem of optimal experimental design. This problem arises in adaptive policy making at the stage of estimating a model's parameters. It can be considered as an optimization problem in which both the objective function and the constraints depend upon probabilistic measures. Methods for dealing with such problems have recently been developed in SDS. In this paper, these methods are applied to optimal experimental design, which allows us to get nontrivial results both in statistics and in optimization theory.

Andrzej Wierzbicki
Chairman
System and Decision Sciences Area


DESIGN OF EXPERIMENTS UNDER CONSTRAINTS

A. Gaivoronski and V. Fedorov

INTRODUCTION

It is a specific feature of applied systems analysis that the organization and implementation of experiments is a very difficult and expensive process.

Any change in controllable variables (for instance, in agriculture, health service, economic experiments, etc.) can lead to significant expense or to some kind of loss which cannot be measured in currency units. Therefore, it is necessary to have methods of experimental design which take into account this side of experimental research. These methods were partly developed in the traditional theory of optimal design (see, for example, Fedorov 1972 and Silvey 1980). In the traditional approach it is usually assumed only that the controllable variables belong to some given set (the so-called operability region). In this paper we try to analyze the experimental design problem under more sophisticated constraints.

From the mathematical point of view, we deal with the design of experiments which are described by a linear regression model:

$$y_i = \theta_t^T f(x_i) + \varepsilon_i, \quad i = 1, \dots, N, \tag{1}$$

where $f(x)$ is an $(m \times 1)$-vector of known basis functions, $x_i$ describes the conditions of the $i$-th measurement, $\theta$ is an $(m \times 1)$-vector of unknown parameters, the subscript $t$ stands for the true value of these parameters, $i$ is the index of the measurement, $y_i \in R^1$ is the result of the $i$-th measurement, and $\varepsilon_i$ is the random error with zero mean and the same variance for all measurements, which can be set equal to 1 by appropriate scaling; moreover, all errors are uncorrelated.

For model (1) it is natural to use the best linear unbiased estimates (see Rao 1968)

$$\hat\theta = M^{-1} Y, \tag{2}$$

where $M = \sum_{i=1}^{N} f(x_i) f^T(x_i)$, $Y = \sum_{i=1}^{N} f(x_i) y_i$, and $M$ is supposed to be regular. It is well known that the variance matrix (which defines the precision of the estimator $\hat\theta$) equals

$$D(\hat\theta) = M^{-1}. \tag{3}$$

The matrix $M$ is called the information matrix. It is clear from (2) and (3) that the matrix $M$ is defined completely by the set $\{x_i\}_1^N$. If at some points $x_i$ there are $r_i$ measurements, then this matrix is defined by the set

$$\xi_N = \{x_i, r_i\}_{i=1}^{n}, \quad \sum_{i=1}^{n} r_i = N,$$

which is usually called the design, and the points $x_i$ are its supporting points. If one can control or choose the values of $x_i$, then it is sensible to look for optimal designs.
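The estimator (2) and the information matrix can be illustrated with a small numerical sketch. The straight-line basis $f(x) = (1, x)^T$, the four measurement points, and the observed values below are hypothetical choices made only for this illustration; exact rational arithmetic is used so that the result can be checked by hand.

```python
from fractions import Fraction

def f(x):
    """Basis functions of a straight-line model: f(x) = (1, x)^T (illustrative choice)."""
    return [Fraction(1), Fraction(x)]

def information_matrix(xs):
    """M = sum_i f(x_i) f(x_i)^T, one term per measurement."""
    m = len(f(xs[0]))
    M = [[Fraction(0)] * m for _ in range(m)]
    for x in xs:
        fx = f(x)
        for a in range(m):
            for b in range(m):
                M[a][b] += fx[a] * fx[b]
    return M

def moment_vector(xs, ys):
    """Y = sum_i f(x_i) y_i."""
    m = len(f(xs[0]))
    Y = [Fraction(0)] * m
    for x, y in zip(xs, ys):
        fx = f(x)
        for a in range(m):
            Y[a] += fx[a] * y
    return Y

def solve(M, Y):
    """Solve M theta = Y by Gauss-Jordan elimination; assumes M is regular."""
    n = len(M)
    A = [list(M[i]) + [Y[i]] for i in range(n)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Hypothetical data: two measurements at x = -1 and two at x = +1.
xs = [-1, -1, 1, 1]
ys = [1, 3, 5, 7]

M = information_matrix(xs)
Y = moment_vector(xs, ys)
theta_hat = solve(M, Y)  # estimate (2): theta_hat = M^{-1} Y
```

With these data the information matrix is diagonal, so the two parameter estimates are uncorrelated and, by (3), each has variance 1/4.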

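The design $\xi_N^*$ is optimal if

$$\xi_N^* = \operatorname{Arg}\min_{\xi_N} \Phi[M(\xi_N)], \tag{4}$$

where $\Phi$ is some precision measure; for instance, it can be $|M^{-1}|$, $\operatorname{tr} M^{-1}$ or $\operatorname{tr} A M^{-1}$ (for details, see Fedorov 1972 and Silvey 1980).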

To specify the extremal problem (4) one should describe (or make some assumptions about) the properties of the function $\Phi$ and the admissible set of designs $\xi_N$. In traditional experimental design theory, this set is defined through constraints on the supporting points: $x \in X \subset R^k$, where $X$ is the "operability" region.

The results of this paper are essentially connected with additional constraints. Namely, we suggest that, together with the previous constraint, one can deal with constraints (5) on the losses caused by the measurements. In (5), the constraint functions describe the losses incurred when a measurement is made at point $x$, and these losses are usually nonnegative.

As in the traditional case, it is convenient to introduce, instead of $M(\xi_N)$, a normalized information matrix:

$$M(\xi) = N^{-1} M(\xi_N) = \sum_{i=1}^{n} p_i f(x_i) f^T(x_i), \quad p_i = r_i / N,$$

and deal with the function

$$\Psi[M(\xi)] = \Phi[N M(\xi)].$$

Using this new notation, the extremal problem (4)-(5) can be presented in the following form:

$$\xi_N^* = \operatorname{Arg}\min_{\xi} \Psi[M(\xi)], \quad \sum_{i=1}^{n} p_i \varphi(x_i) \le 0, \quad \sum_{i=1}^{n} p_i = 1. \tag{6}$$

APPROXIMATE OPTIMAL DESIGN

Extremal problem (6) is discrete ($p_i = r_i / N$) and its solution is quite difficult in any practical situation. But when $N$ is sufficiently large one can hope that a "continuous" design (in which $p_i$ is allowed to take any value between 0 and 1) can be a good approximation of an exact (discrete) design (compare with Fedorov 1972; Silvey 1980). Moreover, it is convenient to describe a design not by the set of "weights" $\{p_i\}_1^n$, but by an arbitrary probabilistic measure $\xi(dx)$ with the supporting set $X$. Of course, it can happen that some optimal design is described by a continuous measure, which is not convenient in practice. But it will be shown later that it is always possible to find a design with the same information matrix, but with a finite number of supporting points.
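The passage from exact to continuous designs can be sketched numerically. In the sketch below a design is a list of (point, weight) pairs; the basis $f(x) = (1, x)^T$, the loss function $\varphi(x) = x^2 - C$ and the value of `C` are hypothetical and serve only to show how the normalized information matrix and the averaged constraint of (6) are evaluated.

```python
from fractions import Fraction

C = Fraction(1, 2)  # hypothetical bound on the average loss

def f(x):
    """Illustrative basis f(x) = (1, x)^T."""
    return [Fraction(1), Fraction(x)]

def loss(x):
    """Hypothetical centred loss phi(x) = x^2 - C; feasibility reads sum_i p_i phi(x_i) <= 0."""
    return Fraction(x) ** 2 - C

def normalized_information_matrix(design):
    """M(xi) = sum_i p_i f(x_i) f(x_i)^T for design = [(x_i, p_i), ...]."""
    m = len(f(design[0][0]))
    M = [[Fraction(0)] * m for _ in range(m)]
    for x, p in design:
        fx = f(x)
        for a in range(m):
            for b in range(m):
                M[a][b] += p * fx[a] * fx[b]
    return M

def average_loss(design):
    """Left-hand side of the averaged constraint: sum_i p_i phi(x_i)."""
    return sum(p * loss(x) for x, p in design)

def is_feasible(design):
    """Weights form a probability distribution and the averaged constraint holds."""
    weights_ok = all(p >= 0 for _, p in design) and sum(p for _, p in design) == 1
    return weights_ok and average_loss(design) <= 0

# Exact design with N = 4 measurements and r = (1, 2, 1), so p_i = r_i / N.
exact = [(-1, Fraction(1, 4)), (0, Fraction(2, 4)), (1, Fraction(1, 4))]
# Continuous designs: the weights need not be multiples of 1/N.
continuous = [(-1, Fraction(1, 5)), (0, Fraction(3, 5)), (1, Fraction(1, 5))]
uniform = [(-1, Fraction(1, 3)), (0, Fraction(1, 3)), (1, Fraction(1, 3))]
```

Under these assumptions `exact` and `continuous` satisfy the constraint, while `uniform` puts too much weight on the costly end points and is infeasible.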


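For continuous designs, (6) can be rewritten in the following way:

$$\xi^* = \operatorname{Arg}\min_{\xi} \Psi[M(\xi)], \quad M(\xi) = \int_X f(x) f^T(x)\, \xi(dx), \tag{7}$$

$$\int_X \varphi(x)\, \xi(dx) \le 0, \quad \int_X \xi(dx) = 1. \tag{8}$$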

In the sequel we shall need the following assumptions:

(a) The set $X$ is compact.

(b) The functions $f(x)$ and $\varphi(x)$ are continuous on $X$.

(c) $\Psi[M]$ is a convex function.

(d) There exists $Q$ such that $\{\xi : \Psi[M(\xi)] \le Q < \infty, \ \int_X \varphi(x)\, \xi(dx) \le 0\} = \Xi(Q) \ne \emptyset$.

(e) For any $\xi \in \Xi(Q)$ and $\bar\xi \in \Xi$, where $\Xi$ is the set of designs satisfying (8),

$$\Psi[(1 - \alpha) M(\xi) + \alpha M(\bar\xi)] = \Psi[M(\xi)] + \alpha \int_X \psi(x, \xi)\, \bar\xi(dx) + r(\alpha, \xi, \bar\xi),$$

where $r(\alpha, \xi, \bar\xi) = o(\alpha)$.

THEOREM 1. If conditions (a) and (b) hold, then for any design $\xi \in \Xi$ there can always be found a design $\tilde\xi \in \Xi$ with the same information matrix [$M(\xi) = M(\tilde\xi)$], the same value of the cost function [$\int_X \varphi(x)\, \xi(dx) = \int_X \varphi(x)\, \tilde\xi(dx)$], and containing no more than $\frac{m(m+1)}{2} + l + 1$ supporting points.

Proof. Since any matrix $M(\xi)$ is symmetric, it is completely described by $m(m+1)/2$ elements. Therefore both $M(\xi)$ and the cost vector $\int_X \varphi(x)\, \xi(dx)$ can be described by a vector of dimension $k = \frac{m(m+1)}{2} + l$. By definition, the set $S^*$ of the corresponding vectors is the convex hull of the set $S = \{s(x), x \in X\} \subset R^k$, where

$$s^T(x) = [f_\alpha(x) f_\beta(x), \ \alpha \ge \beta; \ \varphi_\gamma(x)], \quad \alpha, \beta = 1, \dots, m, \quad \gamma = 1, \dots, l.$$

Due to Caratheodory's theorem, any point $s$ from $S^*$ can be represented in the form

$$s = \sum_{i=1}^{k+1} p_i s_i,$$

where $s_i \in S$, $p_i \ge 0$, $\sum_{i=1}^{k+1} p_i = 1$. This fact proves the theorem.
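The vector $s(x)$ used in the proof can be made concrete with a short sketch. The two designs, the basis $f(x) = (1, x)^T$ (so $m = 2$) and the single loss function $\varphi(x) = x^2 - 1/2$ (so $l = 1$) are hypothetical; the sketch only illustrates that two designs with different supports can share the same averaged vector, and hence the same information matrix and cost, and that the smaller support respects the bound of Theorem 1. It does not implement a general Caratheodory reduction.

```python
from fractions import Fraction

def s_vector(x, f, phi):
    """s(x): products f_a(x) f_b(x) for a >= b, followed by the l loss components."""
    fx = f(x)
    m = len(fx)
    products = [fx[a] * fx[b] for a in range(m) for b in range(a + 1)]
    return products + list(phi(x))

def design_vector(design, f, phi):
    """Weighted average of s(x_i); it encodes both M(xi) and the cost of xi."""
    k = len(s_vector(design[0][0], f, phi))
    total = [Fraction(0)] * k
    for x, p in design:
        sx = s_vector(x, f, phi)
        for j in range(k):
            total[j] += p * sx[j]
    return total

def max_support_points(m, l):
    """Bound of Theorem 1: k + 1 with k = m(m+1)/2 + l."""
    return m * (m + 1) // 2 + l + 1

# Hypothetical setting: m = 2 basis functions, l = 1 loss function.
f = lambda x: [Fraction(1), Fraction(x)]
phi = lambda x: [Fraction(x) ** 2 - Fraction(1, 2)]

five_point = [
    (Fraction(-1), Fraction(1, 8)),
    (Fraction(-1, 2), Fraction(1, 4)),
    (Fraction(0), Fraction(1, 4)),
    (Fraction(1, 2), Fraction(1, 4)),
    (Fraction(1), Fraction(1, 8)),
]
three_point = [
    (Fraction(-1), Fraction(3, 16)),
    (Fraction(0), Fraction(5, 8)),
    (Fraction(1), Fraction(3, 16)),
]

same_vector = design_vector(five_point, f, phi) == design_vector(three_point, f, phi)
```

Here both designs have zero mean and second moment 3/8, so they are indistinguishable as far as the information matrix and the cost are concerned.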

THEOREM 2.

I. If the conditions (a)-(e) hold, then a necessary and sufficient condition for a design $\xi^*$ to be optimal is fulfillment of the inequality

$$\min_{\xi \in \Xi} \int_X \psi(x, \xi^*)\, \xi(dx) \ge 0. \tag{9}$$

II. The set of optimal designs is convex.

Proof. The inequality (9) follows from assumption (e) and from the fact that a necessary and sufficient condition for $M^*$ to be the solution of the minimization problem $\min \Psi[M]$, where $\Psi$ is a convex function, is that $\Psi$ is nondecreasing along any feasible direction (compare, for instance, with Whittle 1973 and Fedorov 1981). The convexity of the set of optimal designs is an obvious consequence of the convexity of the function $\Psi$.

Remark. If there are no constraints (8), then

$$\min_{\xi} \int_X \psi(x, \xi^*)\, \xi(dx) = \min_{x \in X} \psi(x, \xi^*)$$

and Theorem 2 coincides with the well-known "equivalence theorem" from traditional experimental design theory (see, for instance, Fedorov and Malyutov 1972; Kiefer 1974; Whittle 1973).
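In the unconstrained case the check reduces to minimizing $\psi(x, \xi^*)$ over $X$, which can be sketched on a grid. The sketch assumes the D-criterion, for which $\psi(x, \xi) = m - f^T(x) M^{-1}(\xi) f(x)$ (the expression quoted in Example 1), the straight-line basis $f(x) = (1, x)^T$ and $X = [-1, 1]$; the two candidate designs are hypothetical.

```python
from fractions import Fraction

def f(x):
    """Illustrative basis f(x) = (1, x)^T, so m = 2."""
    return [Fraction(1), Fraction(x)]

def information_matrix(design):
    """M(xi) = sum_i p_i f(x_i) f(x_i)^T for design = [(x_i, p_i), ...]."""
    M = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
    for x, p in design:
        fx = f(x)
        for a in range(2):
            for b in range(2):
                M[a][b] += p * fx[a] * fx[b]
    return M

def inverse_2x2(M):
    """Closed-form inverse of a regular 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def psi(x, design):
    """psi(x, xi) = m - f(x)^T M(xi)^{-1} f(x) with m = 2 (D-criterion)."""
    Minv = inverse_2x2(information_matrix(design))
    fx = f(x)
    quad = sum(fx[a] * Minv[a][b] * fx[b] for a in range(2) for b in range(2))
    return 2 - quad

def min_psi_on_grid(design, steps=20):
    """Grid approximation of min over x in [-1, 1] of psi(x, xi)."""
    grid = [Fraction(-1) + Fraction(2 * i, steps) for i in range(steps + 1)]
    return min(psi(x, design) for x in grid)

endpoints = [(-1, Fraction(1, 2)), (1, Fraction(1, 2))]
interior = [(Fraction(-1, 2), Fraction(1, 2)), (Fraction(1, 2), Fraction(1, 2))]

endpoint_check = min_psi_on_grid(endpoints)  # nonnegative: the condition holds
interior_check = min_psi_on_grid(interior)   # negative: the condition fails
```

For the end-point design $\psi(x, \xi) = 1 - x^2 \ge 0$ on $[-1, 1]$, while for the interior design $\psi(x, \xi) = 1 - 4x^2$ becomes negative near the ends of the interval, so only the first design passes the test.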

According to Theorem 2 we should solve problem (9) in order to check particular designs for optimality. This problem is much easier than the initial one because it is linear with respect to $\xi$. However, it still remains an optimization problem over probabilistic measures, and further attempts should be made to reduce it to a more tractable one.

This can be done by applying duality results for optimization problems in which the objective function depends on probabilistic measures (Ermoliev 1970; Ermoliev and Nedeva 1982; Ermoliev, Gaivoronski, and Nedeva 1983).

THEOREM 3.

Suppose that conditions (a)-(e) hold and the function $\psi(x, \xi^*)$ is continuous with respect to $x$. Then

1. $$\min_{\xi \in \Xi} \int_X \psi(x, \xi^*)\, \xi(dx) = \max_{u \in U^+} \phi(u),$$

where

$$\phi(u) = \min_{x \in X} [\psi(x, \xi^*) + u^T \varphi(x)], \quad U^+ = \{u : u \in R^l, \ u \ge 0\}.$$

2. For any $\bar\xi$ such that

$$\int_X \psi(x, \xi^*)\, \bar\xi(dx) = \min_{\xi \in \Xi} \int_X \psi(x, \xi^*)\, \xi(dx)$$

there exists $\bar u$ such that $\phi(\bar u) = \max_{u \in U^+} \phi(u)$, where $\bar\xi$ has a support set belonging to

$$X(\bar u) = \{x : x \in X, \ \phi(\bar u) = \psi(x, \xi^*) + \bar u^T \varphi(x)\}.$$

3. Among the solutions of (9) there always exists one with no more than $l + 1$ supporting points.

This theorem is actually a restatement of Theorem 1 from the paper by Ermoliev, Gaivoronski, and Nedeva (1983). It reduces problem (9) to a finite-dimensional minimax problem.

Therefore, in Theorem 2 the inequality (9) can be replaced by the following one:

$$\max_{u \in U^+} \min_{x \in X} [\psi(x, \xi^*) + u^T \varphi(x)] \ge 0, \tag{10}$$

which is more similar to the "traditional" condition. In what follows the notation $q(x, u, \xi) = \psi(x, \xi) + u^T \varphi(x)$ will be used.
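Condition (10) is a finite-dimensional max-min problem and can be explored on grids. The following sketch assumes the D-criterion for the straight-line model $f(x) = (1, x)^T$ on $X = [-1, 1]$, a candidate design with diagonal information matrix $\operatorname{diag}(1, \mu_2)$, and a single hypothetical loss function $\varphi(x) = x^2 - C$; the names `C`, `second_moment` and both grids are illustrative choices, and a grid search is only a rough stand-in for an exact solution of (10).

```python
from fractions import Fraction

C = Fraction(1, 2)  # hypothetical bound on the second moment of the design

def psi(x, second_moment):
    """D-criterion psi for f(x) = (1, x) and M(xi) = diag(1, second_moment)."""
    return 1 - x * x / second_moment

def phi(x):
    """Hypothetical scalar loss (l = 1): phi(x) = x^2 - C."""
    return x * x - C

def q(x, u, second_moment):
    """q(x, u, xi) = psi(x, xi) + u * phi(x)."""
    return psi(x, second_moment) + u * phi(x)

def dual_value(u, second_moment, x_grid):
    """Inner minimisation over x in condition (10)."""
    return min(q(x, u, second_moment) for x in x_grid)

def max_min(second_moment, x_grid, u_grid):
    """Outer maximisation over u >= 0; returns (value, maximising u)."""
    return max((dual_value(u, second_moment, x_grid), u) for u in u_grid)

x_grid = [Fraction(i, 10) for i in range(-10, 11)]  # grid on X = [-1, 1]
u_grid = [Fraction(i, 4) for i in range(0, 17)]     # grid on 0 <= u <= 4

# Candidate whose second moment equals C: the constraint is active.
value_active, u_star = max_min(C, x_grid, u_grid)
# Candidate with a smaller second moment: the constraint has slack.
value_slack, _ = max_min(Fraction(1, 4), x_grid, u_grid)
```

With these choices the first candidate reaches the value 0 at $u = 1/C$, so (10) holds, whereas the second candidate gives a negative value for every $u$ on the grid and therefore fails the test.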

Let $u^*$ be a solution of (10) and let all constraints from (8) be active; i.e.,

$$\int_X \varphi(x)\, \xi^*(dx) = 0. \tag{11}$$

Otherwise one can consider problem (7) with fewer, and only active, constraints.

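THEOREM 4.

If $\int_{X'} \xi^*(dx) = \gamma > 0$, then the function $q(x, u^*, \xi^*)$ achieves zero on the set $X'$.

Proof. Let us suppose that, at least on some set $X'' \subset X'$ with $\int_{X''} \xi^*(dx) > 0$,

$$q(x, u^*, \xi^*) > 0. \tag{12}$$

Then, due to (10) and (12):

$$\int_X q(x, u^*, \xi^*)\, \xi^*(dx) > 0.$$

But at the same time

$$\int_X q(x, u^*, \xi^*)\, \xi^*(dx) = \int_X \psi(x, \xi^*)\, \xi^*(dx) + u^{*T} \int_X \varphi(x)\, \xi^*(dx) = 0,$$

because for any design $\xi$

$$\int_X \psi(x, \xi)\, \xi(dx) = 0$$

due to condition (e), and the second summand equals zero due to (11).

This contradiction proves the theorem.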

Remark. If the design $\xi^*$ contains a finite number of supporting points $\{x_i^*\}_{i=1}^{n}$, then for all of them

$$q(x_i^*, u^*, \xi^*) = 0, \quad i = 1, \dots, n.$$

Of course Theorems 2 and 3 cannot provide prescriptions for constructing designs in general, but very often they help in understanding some of their essential features.

Example 1. Let us consider the design problem for one-dimensional polynomial regression:

$$f^T(x) = (1, x, \dots, x^{m-1}), \quad |x| \le 1, \tag{13}$$

with the D-criterion of optimality:

$$\Psi[M(\xi)] = -\ln |M(\xi)|, \tag{14}$$

and with constraints of the form (8).

Let us suppose that the $\varphi(x)$ are continuous functions on the interval $|x| \le 1$ and that the system

$$1, x, \dots, x^{2m-2}, \varphi_1(x), \dots, \varphi_l(x) \tag{15}$$

is a Chebyshev system on the same interval.

It is easy to check that the conditions (a)-(e) are fulfilled and that the results of Theorems 2 and 4 hold here. For D-optimal designs one has $\psi(x, \xi) = m - f^T(x) M^{-1}(\xi) f(x)$ (see, for instance, Fedorov 1972). In our case $f^T(x) = (1, x, \dots, x^{m-1})$, and therefore

$$q(x, u, \xi) = m - \sum_{\alpha, \beta = 1}^{m} [M^{-1}(\xi)]_{\alpha\beta}\, x^{\alpha + \beta - 2} + u^T \varphi(x).$$

In other words, the function $q(x, u, \xi)$ is a linear combination of the functions (15). It is known that a linear combination, with some non-zero coefficients, of $s$ functions which form a Chebyshev system can have no more than $s$ roots. Therefore the function $q(x, u, \xi)$ has no more than $(2m + l)$ roots and has no more than $m + \frac{l}{2}$ (if $l$ is even) or $m + \frac{l+1}{2}$ (if $l$ is odd) minima on the interval $|x| \le 1$. But in accordance with Theorem 3, the function $q(x, u^*, \xi^*)$ should attain its lower bound at the supporting points of an optimal design. So their number cannot exceed $m + \frac{l}{2}$ if $l$ is even, or $m + \frac{l+1}{2}$ if $l$ is odd, which is much less than the upper bound from Theorem 1.
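The gap between the two bounds on the number of supporting points is easy to tabulate. The sketch below simply evaluates the bound $m(m+1)/2 + l + 1$ of Theorem 1 and the bound $m + l/2$ ($l$ even) or $m + (l+1)/2$ ($l$ odd) derived in Example 1; the function names and the chosen values of $m$ and $l$ are illustrative.

```python
def theorem1_bound(m, l):
    """General bound from Theorem 1: m(m+1)/2 + l + 1 supporting points."""
    return m * (m + 1) // 2 + l + 1

def chebyshev_bound(m, l):
    """Bound from Example 1: m + l/2 for even l, m + (l+1)/2 for odd l."""
    return m + (l // 2 if l % 2 == 0 else (l + 1) // 2)

# (m, l, bound of Theorem 1, bound of Example 1) for a few illustrative cases.
rows = [
    (m, l, theorem1_bound(m, l), chebyshev_bound(m, l))
    for m in (2, 3, 5)
    for l in (1, 2, 3)
]

for m, l, general, chebyshev in rows:
    print(f"m={m} l={l}: Theorem 1 bound {general}, Example 1 bound {chebyshev}")
```

For instance, with $m = 5$ and $l = 3$ the general bound allows 19 points, while the Chebyshev argument leaves at most 7.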

Example 2. Let us now apply the simplest version of (13), (14) with $m = 2$ (simple linear regression), but with the following constraint:

$$\int x^2\, \xi(dx) \le c, \quad \text{i.e.,} \quad \varphi(x) = x^2 - c. \tag{16}$$

In this case, system (15) is not a Chebyshev one, and therefore the previous result does not apply.

According to Theorem 2 and the symmetry of constraint (16), the information matrix $M(\xi)$ for any optimal design should be diagonal. For a diagonal matrix $M(\xi)$ one has:

$$q(x, u, \xi) = 1 - \frac{x^2}{\int x^2\, \xi(dx)} + u (x^2 - c).$$

It is evident that $q(x, u^*, \xi^*) \equiv 0$ when $u^* = c^{-1}$ and

$$\int x^2\, \xi^*(dx) = c. \tag{17}$$

Therefore if a design $\xi^*$ which satisfies (17) can be found, then according to Theorem 2 it will be an optimal design. In fact (17) describes a family of distributions with the given second moment and it is not difficult to find some members of it. For instance, the following two designs:

and

belong to this family. We recall that in the traditional case (only one constraint: $|x| \le 1$), the optimal design problem has a unique solution:

$$\xi^* = \{x_1 = -1, \ x_2 = 1; \ p_1 = p_2 = 1/2\}.$$
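The non-uniqueness described in Example 2 can be checked numerically. The sketch below assumes that the constraint bounds the second moment of the design, $\int x^2\, \xi(dx) \le c$, which is how the text characterizes the family (17); the value `C = 1/4` and the two symmetric designs are hypothetical illustrations and are not the designs given in the paper.

```python
from fractions import Fraction

C = Fraction(1, 4)  # hypothetical bound on the second moment (below 1)

def moments(design):
    """First and second moments of a design given as [(x_i, p_i), ...]."""
    mean = sum(p * x for x, p in design)
    second = sum(p * x * x for x, p in design)
    return mean, second

def det_information(design):
    """|M(xi)| for f(x) = (1, x): second moment minus squared mean."""
    mean, second = moments(design)
    return second - mean * mean

def q(x, u, design):
    """q = psi + u * (x^2 - C); valid for zero-mean designs (diagonal M)."""
    _, second = moments(design)
    return 1 - x * x / second + u * (x * x - C)

two_point = [(Fraction(-1, 2), Fraction(1, 2)), (Fraction(1, 2), Fraction(1, 2))]
three_point = [
    (Fraction(-1), Fraction(1, 8)),
    (Fraction(0), Fraction(3, 4)),
    (Fraction(1), Fraction(1, 8)),
]
# Unconstrained D-optimal design for the straight line on [-1, 1].
unconstrained = [(Fraction(-1), Fraction(1, 2)), (Fraction(1), Fraction(1, 2))]

grid = [Fraction(i, 10) for i in range(-10, 11)]
u_star = 1 / C  # multiplier u* = 1/c from the text

q_vanishes = all(
    q(x, u_star, design) == 0 for design in (two_point, three_point) for x in grid
)
```

Both hypothetical designs have zero mean and second moment `C`, so they share the same information matrix and make $q$ vanish identically, while the unconstrained optimum has a larger determinant but violates the assumed constraint.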


REFERENCES

Ermoliev, Y. 1970. Methods for Stochastic Programming in Randomized Strategies. Vol. 1:3-7. Moscow: Kibernetika.

Ermoliev, Y. and C. Nedeva. 1982. Stochastic Optimization Problems with Partially Known Distribution Functions. CP-82-60. Laxenburg, Austria: International Institute for Applied Systems Analysis.

Ermoliev, Y., A. Gaivoronski, and C. Nedeva. 1983. Stochastic Optimization Problems with Incomplete Information About Distribution Functions. WP-83-13. Laxenburg, Austria: International Institute for Applied Systems Analysis.

Fedorov, V.V. and M.B. Malyutov. 1972. Optimal Designs in Regression Problems. Math. Oper. and Statist. (B) 4:300-324.

Fedorov, V.V. 1972. Theory of Optimal Experiments. New York: Academic Press.

Fedorov, V.V. 1981. Active regression experiments, edited by V.V. Penenko. Mathematical Methods in the Design of Experiments. Pp 19-73. Novosibirsk: Nauka. (In Russian.)

Kiefer, J. 1974. General Equivalence Theory for Optimum Design. Ann. Statist. 2:849-879.

Rao, C.R. 1968. Linear Statistical Inference and Its Applications. New York: John Wiley and Sons, Inc.

Silvey, S.D. 1980. Optimal Design. Chapman and Hall.

Whittle, P. 1973. Some General Points in the Theory of Optimal Experimental Design. JRSS (B) 35:123-130.