NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR
DESIGN OF EXPERIMENTS UNDER CONSTRAINTS
A. Gaivoronski
V. Fedorov
January 1984 WP-84-8
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS 2361 Laxenburg, Austria
PREFACE
This paper was done in collaboration between the System and Decision Sciences Area (SDS) and the Adaptive Resource Policy Project (ARP). It addresses the problem of optimal experimental design. This problem arises in adaptive policy making at the stage of estimating a model's parameters. It can be considered as an optimization problem with both objective functions and constraints dependent upon probabilistic measures. Methods for dealing with such problems have recently been developed in SDS. In this paper, these methods are applied to optimal experimental design, which allows us to get nontrivial results both in statistics and in optimization theory.
Andrzej Wierzbicki Chairman
System and Decision Sciences Area
CONTENTS
1. Introduction
2. Approximate Optimal Design
3. References
DESIGN OF EXPERIMENTS UNDER CONSTRAINTS

A. Gaivoronski and V. Fedorov
INTRODUCTION
It is a specific feature of applied systems analysis that the organization and implementation of experiments is a very difficult and expensive process. Any change in controllable variables (for instance, in agricultural, health service, or economic experiments) can lead to significant expense or to some kind of loss which cannot be measured in currency units. Therefore, it is necessary to have methods of experimental design which take this side of experimental research into account. Such methods were partly developed in the traditional theory of optimal design (see, for example, Fedorov 1972 and Silvey 1980). In the traditional approach it is usually assumed only that the controllable variables belong to some given set (the so-called operability region). In this paper we try to analyze the experimental design problem under more sophisticated constraints.
From the mathematical point of view, we deal with the design of experiments which are described by a linear regression model:

$$y_i = \theta_t^T f(x_i) + \varepsilon_i, \qquad i = 1, \dots, N, \tag{1}$$

where $f(x)$ is an $(m \times 1)$-vector of known basis functions, $x_i$ describes the conditions of the $i$-th measurement, $\theta$ is an $(m \times 1)$-vector of unknown parameters, the subscript $t$ stands for the true value of these parameters, $i$ is the number of the measurement, $y_i \in R^1$ is the result of the $i$-th measurement, and $\varepsilon_i$ is the random error with zero mean and the same variance for all measurements, which can be taken equal to 1 by appropriate scaling; moreover, all errors are uncorrelated.
For model (1) it is natural to use the best linear unbiased estimates (see Rao 1968):

$$\hat\theta = M^{-1} Y, \tag{2}$$

where

$$M = \sum_{i=1}^{N} f(x_i) f^T(x_i), \qquad Y = \sum_{i=1}^{N} f(x_i)\, y_i,$$

and $M$ is supposed to be regular. It is well known that the variance matrix (which defines the precision of the estimator $\hat\theta$) equals

$$D(\hat\theta) = M^{-1}. \tag{3}$$

The matrix $M$ is called the information matrix. It is clear from (2) and (3) that the matrix $M$ is defined completely by the set $\{x_i\}_1^N$. If at some points $x_i$ there are $r_i$ measurements, then this matrix is defined by the set

$$\xi_N = \{x_i, p_i\}_1^n, \qquad p_i = r_i / N, \qquad \sum_{i=1}^{n} r_i = N,$$

which is usually called the design, and the points $x_i$ are its supporting points. If one can control or choose the values of $x_i$, then it is sensible to look for optimal designs.
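As a small numerical illustration of (1)-(3), the sketch below computes the information matrix and the best linear unbiased estimate for a straight-line model. The basis $f(x) = (1, x)$, the measurement points, and the noise-free data are illustrative assumptions and are not taken from the paper; the function names are hypothetical.

```python
from collections import Counter


def f(x):
    """Assumed basis functions of a straight-line model: f(x) = (1, x)."""
    return (1.0, x)


def information_matrix(points):
    """M = sum_i f(x_i) f(x_i)^T over all measurement points (2x2 case)."""
    M = [[0.0, 0.0], [0.0, 0.0]]
    for x in points:
        fx = f(x)
        for a in range(2):
            for b in range(2):
                M[a][b] += fx[a] * fx[b]
    return M


def blue(points, ys):
    """Return (theta_hat, M^{-1}) with theta_hat = M^{-1} Y, Y = sum_i f(x_i) y_i."""
    M = information_matrix(points)
    Y = [sum(f(x)[a] * y for x, y in zip(points, ys)) for a in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("information matrix is singular")
    inv = [[M[1][1] / det, -M[0][1] / det],
           [-M[1][0] / det, M[0][0] / det]]
    theta_hat = [inv[a][0] * Y[0] + inv[a][1] * Y[1] for a in range(2)]
    return theta_hat, inv


# Six measurements at three distinct points (replications r = 2, 1, 3).
points = [-1.0, -1.0, 0.0, 1.0, 1.0, 1.0]
ys = [1.0 + 2.0 * x for x in points]  # noise-free data, theta_t = (1, 2)

theta_hat, variance_matrix = blue(points, ys)
# The design: supporting points with weights p_i = r_i / N.
design = {x: r / len(points) for x, r in Counter(points).items()}
print(theta_hat, design)
```

The matrix depends on the data only through the supporting points and their replication counts, which is why the design alone determines the precision of the estimate.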
The design $\xi_N^*$ is optimal if

$$\xi_N^* = \operatorname{Arg}\min_{\xi_N} \Phi[M(\xi_N)], \tag{4}$$

where $\Phi$ is some precision measure; for instance, it can be $|M^{-1}|$, $\operatorname{tr} M^{-1}$, or $\operatorname{tr} A M^{-1}$ (for details, see Fedorov 1972 and Silvey 1980).

To specify the extremal problem (4) one should describe (or make some assumptions on) the properties of the function $\Phi$ and the admissible set of designs $\xi_N$. In traditional experimental design theory, this set is defined through constraints on the supporting points: $x \in X \subset R^k$, where $X$ is the "operability" region.
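To make the precision measures concrete, the following sketch evaluates the determinant criterion $|M^{-1}|$ and the trace criterion $\operatorname{tr} M^{-1}$ for two candidate designs on the operability region $X = [-1, 1]$. It assumes the straight-line basis $f(x) = (1, x)$ and works with the per-measurement (weight-normalized) matrix; both designs and all names are illustrative.

```python
def normalized_information_matrix(design):
    """sum_i p_i f(x_i) f(x_i)^T for f(x) = (1, x); design maps point -> weight."""
    m1 = sum(p * x for x, p in design.items())
    m2 = sum(p * x * x for x, p in design.items())
    return [[1.0, m1], [m1, m2]]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def d_criterion(M):
    """|M^{-1}| = 1 / |M| (smaller is better)."""
    return 1.0 / det2(M)


def a_criterion(M):
    """tr M^{-1} = (M11 + M22) / |M| for a 2x2 matrix (smaller is better)."""
    return (M[0][0] + M[1][1]) / det2(M)


endpoints = {-1.0: 0.5, 1.0: 0.5}
three_point = {-1.0: 1 / 3, 0.0: 1 / 3, 1.0: 1 / 3}

for name, design in (("endpoints", endpoints), ("three_point", three_point)):
    M = normalized_information_matrix(design)
    print(name, d_criterion(M), a_criterion(M))
```

Under both criteria the two-point design at the ends of the interval is the better of these two candidates, which matches the classical unconstrained result for a straight line.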
The results of this paper are essentially connected with additional constraints. Namely, we suppose that, together with the previous constraint, one can deal with the following constraints:

In (5), the constraint functions describe some losses incurred when a measurement is done at point $x$, and usually they are positive.

As in the traditional case, it is convenient to introduce, instead of the information matrix $M$ defined above, a normalized information matrix:

$$M(\xi_N) = N^{-1} \sum_{i=1}^{n} r_i f(x_i) f^T(x_i) = \sum_{i=1}^{n} p_i f(x_i) f^T(x_i),$$

and to deal with the function $\Psi[M(\xi_N)]$, the precision measure expressed through this normalized matrix.
Using this new notation, the extremal problem (4)-(5) can be presented in the following form:

$$\xi_N^* = \operatorname{Arg}\min_{\xi_N} \Psi[M(\xi_N)], \qquad \sum_{i=1}^{n} p_i \varphi(x_i) \le 0, \qquad \sum_{i=1}^{n} p_i = 1. \tag{6}$$

APPROXIMATE OPTIMAL DESIGN
Extremal problem (6) is discrete ($p_i = r_i/N$) and its solution is quite difficult in any practical situation. But when $N$ is sufficiently large one can hope that a "continuous" design (when $p_i$ is allowed to take any value between 0 and 1) can be a good approximation of an exact (discrete) design (compare with Fedorov 1972; Silvey 1980). Moreover, it is convenient to describe a design not by the set of "weights" $\{p_i\}_1^n$, but by an arbitrary probabilistic measure $\xi(dx)$ with the supporting set $X$. Of course, it can happen that some optimal design is described by a continuous measure, which is not convenient in practice. But it will be shown later that it is always possible to find a design with the same information matrix, but with a finite number of supporting points.

For continuous designs, (6) can be rewritten in the following way:

$$\xi^* = \operatorname{Arg}\min_{\xi} \Psi[M(\xi)], \qquad M(\xi) = \int_X f(x) f^T(x)\,\xi(dx), \tag{7}$$

$$\int_X \varphi(x)\,\xi(dx) \le 0, \qquad \int_X \xi(dx) = 1. \tag{8}$$
In the sequel we shall need the following assumptions:

(a) The set $X$ is compact.
(b) The functions $f(x)$ and $\varphi(x)$ are continuous on $X$.
(c) $\Psi[M]$ is a convex function.
(d) There exists $Q$ such that $\{\xi : \Psi[M(\xi)] \le Q < \infty,\ \int_X \varphi(x)\,\xi(dx) \le 0\} = \Xi(Q) \ne \emptyset$.
(e) For any $\xi \in \Xi(Q)$ and $\bar\xi \in \Xi$, where $\Xi$ is the set of designs satisfying (8),

$$\Psi[(1-\alpha) M(\xi) + \alpha M(\bar\xi)] = \Psi[M(\xi)] + \alpha \int_X \psi(x, \xi)\,\bar\xi(dx) + r(\alpha, \xi, \bar\xi),$$

where $r(\alpha, \xi, \bar\xi) = o(\alpha)$.

THEOREM 1. If conditions (a) and (b) hold, then for any design $\xi \in \Xi$ there can always be found a design $\tilde\xi \in \Xi$ with the same information matrix [$M(\xi) = M(\tilde\xi)$], the same value of the cost function [$\int_X \varphi(x)\,\xi(dx) = \int_X \varphi(x)\,\tilde\xi(dx)$], and containing no more than $m(m+1)/2 + l + 1$ supporting points.

Proof. Since any matrix $M(\xi)$ is symmetric, it is completely described by $m(m+1)/2$ elements. Therefore both $M(\xi)$ and $\int_X \varphi(x)\,\xi(dx)$ can be described by a vector of dimension $k = m(m+1)/2 + l$. By definition, the set $S^*$ of the corresponding vectors is the convex hull of the set $S = \{s(x),\ x \in X\} \subset R^k$, where

$$s^T(x) = [f_\alpha(x) f_\beta(x),\ \alpha \ge \beta;\ \varphi_\gamma(x)], \qquad \alpha, \beta = 1, \dots, m, \quad \gamma = 1, \dots, l.$$

Due to Caratheodory's theorem, any point $s$ from $S^*$ can be represented in the form

$$s = \sum_{i=1}^{k+1} p_i s_i,$$

where $s_i \in S$, $p_i \ge 0$, $\sum_{i=1}^{k+1} p_i = 1$. This fact proves the theorem.
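The statement of Theorem 1 can be checked on a toy case. The sketch below assumes the straight-line basis $f(x) = (1, x)$ with no cost constraints ($l = 0$), so the vector describing a design reduces to its moments $(1, E x, E x^2)$. The two designs are illustrative choices, not taken from the paper: a five-point design and a three-point design that share the same moment vector and hence the same information matrix.

```python
def moment_vector(design):
    """(E[1], E[x], E[x^2]): the distinct entries of M(xi) for f(x) = (1, x)."""
    return (
        sum(design.values()),
        sum(p * x for x, p in design.items()),
        sum(p * x * x for x, p in design.items()),
    )


def support_bound(m, l):
    """Bound m(m+1)/2 + l + 1 on the number of supporting points (Theorem 1)."""
    return m * (m + 1) // 2 + l + 1


# Uniform weights on five points: second moment 0.5.
five_point = {-1.0: 0.2, -0.5: 0.2, 0.0: 0.2, 0.5: 0.2, 1.0: 0.2}
# Three points chosen to reproduce the same moments.
three_point = {-1.0: 0.25, 0.0: 0.5, 1.0: 0.25}

same_matrix = all(
    abs(a - b) < 1e-12
    for a, b in zip(moment_vector(five_point), moment_vector(three_point))
)
print(same_matrix, len(three_point), support_bound(2, 0))
```

Here the five-point design exceeds the bound of four points for $m = 2$, $l = 0$, while the three-point design with the same information matrix stays within it.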
THEOREM 2.

I. If the conditions (a)-(e) hold, then a necessary and sufficient condition for a design $\xi^*$ to be optimal is fulfillment of the inequality

$$\min_{\xi \in \Xi} \int_X \psi(x, \xi^*)\,\xi(dx) \ge 0. \tag{9}$$

II. The set of optimal designs is convex.

Proof. The inequality (9) follows from assumption (e) and from the fact that a necessary and sufficient condition for $M^*$ to be the solution of the minimization problem $\min \Psi[M]$, where $\Psi$ is a convex function, is that $\Psi$ be nondecreasing along any feasible direction (compare, for instance, with Whittle 1973 and Fedorov 1981). The convexity of the set of optimal designs is an obvious consequence of the convexity of the function $\Psi$.
Remark. If there are no constraints (8), then

$$\min_{\xi} \int_X \psi(x, \xi^*)\,\xi(dx) = \min_{x \in X} \psi(x, \xi^*),$$

and Theorem 2 coincides with the well-known "equivalence theorem" from traditional experimental design theory (see, for instance, Fedorov and Malyutov 1972; Kiefer 1974; Whittle 1973).
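In the unconstrained case this condition is easy to check numerically. The sketch below uses the D-criterion function $\psi(x, \xi) = m - f^T(x) M^{-1}(\xi) f(x)$ quoted in Example 1, with the assumed straight-line basis $f(x) = (1, x)$ on $X = [-1, 1]$, and evaluates $\min_x \psi$ on a grid. The grid size and both candidate designs are illustrative assumptions.

```python
def moments(design):
    """First and second moments of a design given as {point: weight}."""
    m1 = sum(p * x for x, p in design.items())
    m2 = sum(p * x * x for x, p in design.items())
    return m1, m2


def psi(x, design):
    """psi(x, xi) = m - f(x)^T M(xi)^{-1} f(x) for f(x) = (1, x), m = 2."""
    m1, m2 = moments(design)
    det = m2 - m1 * m1  # determinant of M = [[1, m1], [m1, m2]]
    quad = (m2 - 2.0 * m1 * x + x * x) / det  # f^T M^{-1} f
    return 2.0 - quad


GRID = [-1.0 + 2.0 * i / 200 for i in range(201)]  # grid on X = [-1, 1]


def min_psi(design):
    """Grid approximation of min over X of psi(x, xi)."""
    return min(psi(x, design) for x in GRID)


endpoints = {-1.0: 0.5, 1.0: 0.5}
three_point = {-1.0: 1 / 3, 0.0: 1 / 3, 1.0: 1 / 3}

print(min_psi(endpoints))    # non-negative: the optimality condition holds
print(min_psi(three_point))  # negative: the design is not D-optimal
```

For the endpoint design $\psi(x, \xi) = 1 - x^2$, which is non-negative on $X$ and vanishes at the supporting points; for the three-point design the minimum is negative, so the condition fails.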
According to Theorem 2 we should solve problem (9) in order to check particular designs for optimality. This problem is much easier than the initial one because it is linear with respect to $\xi$. However, it still remains an optimization problem with respect to probabilistic measures, and further attempts should be made to reduce it to a more tractable one.

This can be done by applying duality results for optimization problems in which the objective function depends on probabilistic measures (Ermoliev 1970; Ermoliev and Nedeva 1982; Ermoliev, Gaivoronski, and Nedeva 1983).
THEOREM 3. Suppose that conditions (a)-(e) hold and that the function $\psi(x, \xi^*)$ is continuous. Then

1. $$\min_{\xi \in \Xi} \int_X \psi(x, \xi^*)\,\xi(dx) = \max_{u \in U^+} \phi(u),$$

where

$$\phi(u) = \min_{x \in X}\,[\psi(x, \xi^*) + u^T \varphi(x)], \qquad U^+ = \{u : u \in R^l,\ u_j \ge 0,\ j = 1, \dots, l\}.$$

2. For any $\bar\xi$ such that

$$\int_X \psi(x, \xi^*)\,\bar\xi(dx) = \min_{\xi \in \Xi} \int_X \psi(x, \xi^*)\,\xi(dx)$$

there exists $\bar u$ such that $\phi(\bar u) = \max_{u \in U^+} \phi(u)$, where $\bar\xi$ has a support set belonging to

$$X(\bar u) = \{x : x \in X,\ \phi(\bar u) = \psi(x, \xi^*) + \bar u^T \varphi(x)\}.$$

3. Among the solutions of (9) there always exists one with no more than $l + 1$ supporting points.
This theorem is actually a restatement of Theorem 1 from the paper by Ermoliev, Gaivoronski, and Nedeva (1983). It reduces problem (9) to a finite-dimensional minimax problem.

Therefore, in Theorem 2 the inequality (9) can be replaced by the following one:

$$\max_{u \in U^+} \min_{x \in X}\,[\psi(x, \xi^*) + u^T \varphi(x)] \ge 0, \tag{10}$$

which is more similar to the "traditional" condition. In the following, the notation $q(x, u, \xi) = \psi(x, \xi) + u^T \varphi(x)$ will be used.

Let $u^*$ be a solution of (10) and let all constraints from (8) be active, i.e.,

$$\int_X \varphi(x)\,\xi^*(dx) = 0. \tag{11}$$

In the opposite case one can consider problem (7) with fewer, and only active, constraints.
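THEOREM 4. If $\int_{X'} \xi^*(dx) \ge \gamma > 0$, then the function $q(x, u^*, \xi^*)$ achieves zero on the set $X'$.

Proof. Let us suppose that, on the contrary, at least on some such set $X'$:

$$q(x, u^*, \xi^*) > 0, \qquad x \in X'. \tag{12}$$

Then, due to (10) and (12):

$$\int_X q(x, u^*, \xi^*)\,\xi^*(dx) > 0.$$

But at the same time

$$\int_X q(x, u^*, \xi^*)\,\xi^*(dx) = \int_X \psi(x, \xi^*)\,\xi^*(dx) + u^{*T} \int_X \varphi(x)\,\xi^*(dx) = 0,$$

because for any design $\xi$

$$\int_X \psi(x, \xi)\,\xi(dx) = 0$$

due to condition (e), and the second summand equals zero due to (11).

This contradiction proves the theorem.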
Remark. If the design $\xi^*$ contains a finite number of supporting points $\{x_i^*\}_1^n$, then for all of them,

$$q(x_i^*, u^*, \xi^*) = 0, \qquad i = 1, \dots, n.$$

Of course, Theorems 2 and 3 cannot provide prescriptions for the construction of designs in general, but very often they help in understanding some of their essential features.
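Example 1. Let us consider the design problem for one-dimensional polynomial regression:

$$\eta(x, \theta) = \sum_{\alpha=1}^{m} \theta_\alpha x^{\alpha-1}, \qquad |x| \le 1, \tag{13}$$

with the D-criterion of optimality:

$$\Psi[M(\xi)] = -\ln |M(\xi)|, \tag{14}$$

and with the following constraints:

$$\int \varphi_j(x)\,\xi(dx) \le 0, \qquad j = 1, \dots, l.$$

Let us suppose that $\varphi_j(x)$ are continuous functions on the interval $|x| \le 1$ and that the system

$$1,\ x,\ \dots,\ x^{2m-2},\ \varphi_1(x),\ \dots,\ \varphi_l(x) \tag{15}$$

is a Chebyshev system on the same interval.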
It is easy to check that conditions (a)-(e) are fulfilled and that the results of Theorems 2 and 4 hold here. For D-optimal designs one has $\psi(x, \xi) = m - f^T(x) M^{-1}(\xi) f(x)$ (see, for instance, Fedorov 1972). In our case $f^T(x) = (1, x, \dots, x^{m-1})$, and therefore

$$q(x, u, \xi) = m - \sum_{\alpha, \beta = 1}^{m} [M^{-1}(\xi)]_{\alpha\beta}\, x^{\alpha+\beta-2} + \sum_{j=1}^{l} u_j \varphi_j(x).$$

In other words, the function $q(x, u, \xi)$ is a linear combination of the functions (15). It is known that a linear combination, with some non-zero coefficients, of $s$ functions which form a Chebyshev system can have no more than $s$ roots. Therefore the function $q(x, u, \xi)$ has no more than $(2m + l)$ roots and has no more than $m + l/2$ (if $l$ is even) or $m + (l+1)/2$ (if $l$ is odd) minima on the interval $|x| \le 1$. But in accordance with Theorem 3, the function $q(x, u^*, \xi^*)$ should reach its lower bound at the supporting points of an optimal design. So their number cannot exceed $m + l/2$ if $l$ is even, or $m + (l+1)/2$ if $l$ is odd, which is much less than the upper bound from Theorem 1.
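The gap between the two bounds is easy to tabulate. The sketch below compares the general bound of Theorem 1, $m(m+1)/2 + l + 1$, with the bound of this example as we read it from the text, namely $m + l/2$ for even $l$ and $m + (l+1)/2$ for odd $l$. The function names are hypothetical, and the second formula should be treated as an assumption about the intended expression.

```python
def theorem1_bound(m, l):
    """General bound on the number of supporting points: m(m+1)/2 + l + 1."""
    return m * (m + 1) // 2 + l + 1


def chebyshev_bound(m, l):
    """Bound from Example 1 (as read here): m + l/2 or m + (l+1)/2."""
    return m + l // 2 if l % 2 == 0 else m + (l + 1) // 2


# Tabulate both bounds for a few model sizes m and constraint counts l.
for m in (2, 3, 5):
    for l in (0, 1, 2):
        print(m, l, chebyshev_bound(m, l), theorem1_bound(m, l))
```

With no constraints ($l = 0$) the second bound reduces to $m$ supporting points, which agrees with the classical result for D-optimal polynomial designs.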
Example 2. Let us now apply the simplest version of (13), (14) with $m = 2$ (simple linear regression), but with the following constraints.

In this case, system (15) is not a Chebyshev one, and therefore the previous result cannot be applied.
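The simple linear regression case can be explored numerically. The sketch below rests on an assumption: the constraint (16) is not legible in this copy, so we take it to be a second-moment bound $\int x^2\,\xi(dx) \le c$ on $|x| \le 1$, that is $\varphi(x) = x^2 - c$. This reading is suggested by the paper's own remarks about $u^* = c^{-1}$ and about a family of distributions with a given second moment. The value $c = 0.25$, the two designs, and all names are illustrative and not taken from the paper.

```python
C = 0.25  # assumed bound on the second moment (hypothetical value)
GRID = [-1.0 + 2.0 * i / 200 for i in range(201)]  # grid on |x| <= 1


def first_moment(design):
    return sum(p * x for x, p in design.items())


def second_moment(design):
    return sum(p * x * x for x, p in design.items())


def q(x, u, mu2, c):
    """q = psi + u * phi for a design with M = diag(1, mu2)."""
    psi = 1.0 - x * x / mu2  # 2 - f^T M^{-1} f with f(x) = (1, x)
    phi = x * x - c          # assumed constraint function
    return psi + u * phi


def min_q(u, mu2=C, c=C):
    """Grid approximation of min over X of q(x, u, xi)."""
    return min(q(x, u, mu2, c) for x in GRID)


# Two different symmetric designs with the same second moment C.
two_point = {-0.5: 0.5, 0.5: 0.5}
three_point = {-1.0: 0.125, 0.0: 0.75, 1.0: 0.125}

for design in (two_point, three_point):
    print(first_moment(design), second_moment(design))

print(min_q(1.0 / C))            # q vanishes identically at u = 1/c
print(min_q(3.0), min_q(5.0))    # other multipliers give a negative minimum
```

Under this assumption both designs have the information matrix $\mathrm{diag}(1, c)$, and $u = 1/c$ is the multiplier for which $\min_x q$ reaches zero, so the optimal design is not unique.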
According to Theorem 2 and the symmetry of constraints (16), the information matrix $M(\xi)$ for any optimal design should be diagonal. For a diagonal matrix $M(\xi)$ one has:

It is evident that $q(x, u, \xi) \equiv 0$ when $u^* = c^{-1}$ and

Therefore, if a design $\xi^*$ which satisfies (17) can be found, then according to Theorem 2 it will be an optimal design. In fact, (17) describes a family of distributions with a given second moment, and it is not difficult to find some members of it. For instance, the following two designs:

and

belong to this family. Recall that in the traditional case (only one constraint: $|x| \le 1$), the optimal design problem has a unique solution:

$$\xi^* = \{x_1 = -1,\ x_2 = 1;\ p_1 = p_2 = 1/2\}.$$

REFERENCES
Ermoliev, Y. 1970. Methods for Stochastic Programming in Randomized Strategies. Vol. 1:3-7. Moscow: Kibernetika.
Ermoliev, Y. and C. Nedeva. 1982. Stochastic Optimization Problems with Partially Known Distribution Functions. CP-82-60. Laxenburg, Austria: International Institute for Applied Systems Analysis.
Ermoliev, Y., A. Gaivoronski, and C. Nedeva. 1983. Stochastic Optimization Problems with Incomplete Information About Distribution Functions. WP-83-13. Laxenburg, Austria: International Institute for Applied Systems Analysis.
Fedorov, V.V. and M.B. Malyutov. 1972. Optimal Designs in Regression Problems. Math. Oper. and Statist. (B) 4:300-324.
Fedorov, V.V. 1972. Theory of Optimal Experiments. New York: Academic Press.
Fedorov, V.V. 1981. Active regression experiments, edited by V.V. Penenko. Mathematical Methods in the Design of Experiments. Pp 19-73. Novosibirsk: Nauka. (In Russian.)
Kiefer, J. 1974. General Equivalence Theory for Optimum Design. Ann. Statist. 2:849-879.
Rao, C.R. 1968. Linear Statistical Inference and Its Applications. New York: John Wiley and Sons, Inc.
Silvey, S.D. 1980. Optimal Design. Chapman and Hall.
Whittle, P. 1973. Some General Points in the Theory of Optimal Experimental Design. JRSS (B) 35:123-130.