NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR
DUALITY OF OPTIMAL DESIGNS FOR MODEL DISCRIMINATING AND PARAMETER ESTIMATION PROBLEMS
V. Fedorov
V. Khaborov
September 1984
WP-84-77
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS 2361 Laxenburg, Austria
I am grateful to Lucy Tomsits for editing and typing this paper.
Attempts to find relations between different criteria of optimality have a long history dating back to the fifties (Kiefer, 1958; Stone, 1958). This paper mainly deals with the analysis of relations between the most widespread criteria used in estimation problems and some criteria for discriminating experiments which belong to the T-criteria family (Atkinson and Fedorov, 1975).
CONTENTS
1. INTRODUCTION
2. EQUIVALENCY OF DIFFERENT DESIGN CRITERIA
3. SOME PROPERTIES OF OPTIMAL DESIGNS
4. NUMERICAL PROCEDURES
DUALITY OF OPTIMAL DESIGNS FOR MODEL DISCRIMINATING AND PARAMETER ESTIMATION PROBLEMS
V. Fedorov and V. Khaborov
1. INTRODUCTION
The main object of this paper is the optimal design of experiments which can be described by linear regression models:
$$y_{ij} = \vartheta_t^T f(x_i) + \varepsilon_{ij} = \eta(x_i, \vartheta_t) + \varepsilon_{ij}. \tag{1}$$
The vector $x_i$ describes the conditions under which the $i$-th set of observations is made ($j = 1, \dots, n_i$; $i = 1, \dots, n$; $\sum_{i=1}^{n} n_i = N$), and the values of its components can be chosen (controlled) by an experimenter: $x_i \in X \subset R^k$, where $X$ is a compact set. The components of the vector $\vartheta \in R^m$ are unknown parameters, and the subscript "t" indicates their true values. The components of the vector $f(x_i)$ are given basis functions, which are continuous on the compact set $X$.
The errors $\varepsilon_{ij}$ are assumed to be random, independent and identically distributed with zero mean and finite variance which, without loss of generality, can be chosen equal to 1. These assumptions on the errors are sufficient when the estimation problem is under consideration, but for discriminating experiments the normality of their distribution will be assumed in what follows. Some more general situations can be considered similarly to (Fedorov, 1980; Denisov, Fedorov, and Khaborov, 1981).
The set of values
$$\xi_N = \{x_i, p_i\}_{i=1}^{n}, \qquad p_i = n_i / N,$$
is a design of an experiment. The fractions $p_i$ can be considered as measures prescribed to the points $x_i$, and in experimental practice variations of these measures must be proportional to $N^{-1}$.
To deal with discrete measures in optimization problems, one has to apply very complicated mathematical techniques. The problem can be essentially simplified if the discreteness is neglected and any probability measure $\xi(dx)$ on $X$ is considered as some experimental design. The corresponding designs are called approximate or continuous. In this paper, they will be referred to as "designs".
For a comparatively large $N$, it is not a problem to construct an appropriate discrete approximation of any measure $\xi(dx)$, especially if one takes into account that for almost all widely used criteria of optimality there exist optimal designs with a finite number of supporting points (points of concentration of the measure $\xi(dx)$; see for instance, Fedorov, 1972; and section 3 of this paper). Formally, the construction of optimal designs can be considered as an optimization problem in the space of probability measures:
$$\xi^* = \mathrm{Arg}\,\inf_{\xi} \Psi(\xi), \tag{2}$$
where the optimality criterion $\Psi$ is defined by the objectives of an experimenter and is usually a convex function of $\xi$.
In the parameter estimation problem, the dependence of $\Psi$ on $\xi$ can be expressed through the elements of Fisher's information matrix:
$$\Psi(\xi) = \Psi[M(\xi)],$$
where
$$M(\xi) = \int_X f(x) f^T(x)\,\xi(dx).$$
In the regular case this matrix is the inverse of the normalized dispersion (variance-covariance) matrix of $\hat{\vartheta}$, the (best linear unbiased) estimator of $\vartheta$.
In the case of discriminating (or, more accurately, model testing) experiments, the structure of $\Psi$ is slightly more complicated. For instance, when there are two rival models $\eta_1(x, \vartheta_1)$ and $\eta_2(x, \vartheta_2)$, the design problem can be described by the following optimization problem (Atkinson and Fedorov, 1975):
$$\xi^* = \mathrm{Arg}\,\sup_{\xi}\,\inf_{\vartheta_2 \in \Omega_2} \int_X [\eta_1(x, \vartheta_1) - \eta_2(x, \vartheta_2)]^2\,\xi(dx). \tag{3}$$
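Very often some nondegenerate regression function $\eta(x, \vartheta)$ is compared with the zero hypothesis, and in this case (3) transforms into the simpler problem
$$\xi^* = \mathrm{Arg}\,\sup_{\xi} \Phi(\xi), \tag{4}$$
where
$$\Phi(\xi) = \inf_{\vartheta \in \Omega} \gamma(\xi, \vartheta), \qquad \gamma(\xi, \vartheta) = \int_X \eta^2(x, \vartheta)\,\xi(dx),$$
which will be mainly considered in the following sections.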
2. EQUIVALENCY OF DIFFERENT DESIGN CRITERIA
In this section, the equivalency between some criteria corresponding to model testing experiments and to experiments oriented to parameter estimation will be analyzed. The majority of the results are based on well-known results from the theory of extrema of quadratic forms.
1. Let us start with the most evident and simple case, when an experimenter is interested in some linear combination $c^T\vartheta$ of the unknown parameters. For an interpolation or extrapolation problem, $c = f(x_0)$, where $x_0$ is the point of interest. If he wants to estimate $c^T\vartheta$, the criterion
$$\Psi(\xi) = c^T M^-(\xi) c, \tag{5}$$
where "$-$" denotes a pseudo-inverse matrix, can be used. If the significance of $c^T\vartheta_t$ is tested, then
$$\Phi(\xi) = \inf_{(c^T\vartheta)^2 \ge 1} \int_X \eta^2(x, \vartheta)\,\xi(dx). \tag{6}$$
It is easy to check that in (6) any positive constant can be taken instead of 1 without influencing the optimal design if $\eta(x, \vartheta)$ depends linearly on $\vartheta$. A similar fact holds for the criteria considered below, and it will be used without further comment.
It is natural to assume that
$$\inf_{\xi} c^T M^-(\xi) c < \infty \tag{7}$$
for any pseudo-inverse matrix. Of course, if (7) holds, then for any optimal design $\xi^*$ (very often the solution of (2) is not unique), $c^T M^-(\xi^*) c < \infty$, or in other words, we assume that $c^T\vartheta$ is estimable in the experiments defined by $\xi^*$. It is useful to note that a necessary and sufficient condition for the estimability of $c^T\vartheta$ is the following equality (see for instance, Rao, 1973):
$$c^T (I - M^-(\xi) M(\xi)) = 0 \tag{8}$$
for any pseudo-inverse matrix. The designs satisfying (8) will be called regular.
Consider now criterion (6) in more detail. Due to (1), one has
$$\int_X \eta^2(x, \vartheta)\,\xi(dx) = \vartheta^T M(\xi)\vartheta,$$
and (6) transforms to
$$\Phi(\xi) = \inf_{(c^T\vartheta)^2 \ge 1} \vartheta^T M(\xi)\vartheta. \tag{9}$$
It is obvious that all optimal designs $\xi^*$ for (9) coincide with the optimal designs for the simpler problem
$$\inf_{c^T\vartheta = 1} \vartheta^T M(\xi)\vartheta.$$
Taking into account condition (8) and using the standard Lagrangian technique, one gets
$$\inf_{c^T\vartheta = 1} \vartheta^T M(\xi)\vartheta = [c^T M^-(\xi) c]^{-1},$$
with the infimum attained at
$$\vartheta^* = M^-(\xi) c \,/\, c^T M^-(\xi) c.$$
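The Lagrangian result can be checked numerically: the infimum of $\vartheta^T M\vartheta$ over $c^T\vartheta = 1$ is the reciprocal of $c^T M^- c$, so minimizing (5) and maximizing (6) select the same designs. The following is a minimal sketch, assuming a hypothetical three-point design for quadratic regression on $[-1, 1]$ and an extrapolation point $x_0 = 1.5$; the function name `c_criterion_pair` and all numerical values are illustrative and do not come from the paper.

```python
import numpy as np

def c_criterion_pair(M, c):
    """Return (c^T M^- c, minimizer theta*, theta*^T M theta*)."""
    M_pinv = np.linalg.pinv(M)            # Moore-Penrose pseudo-inverse
    psi = float(c @ M_pinv @ c)           # estimation criterion (5)
    theta_star = M_pinv @ c / psi         # Lagrangian solution, c^T theta* = 1
    phi = float(theta_star @ M @ theta_star)  # testing criterion at theta*
    return psi, theta_star, phi

# Hypothetical design: points x_i with weights p_i (an assumption for illustration)
x = np.array([-1.0, 0.0, 1.0])
p = np.array([0.25, 0.5, 0.25])
F = np.vander(x, 3, increasing=True)      # rows are f(x_i)^T = (1, x_i, x_i^2)
M = F.T @ (p[:, None] * F)                # information matrix M(xi)

c = np.array([1.0, 1.5, 1.5 ** 2])        # c = f(x0) with x0 = 1.5
psi, theta_star, phi = c_criterion_pair(M, c)
print(psi, phi, psi * phi)
```

Any other $\vartheta$ normalized so that $c^T\vartheta = 1$ should give a value of $\vartheta^T M\vartheta$ no smaller than `phi`, which is easy to confirm by random sampling.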
From the last equation, it immediately follows that regular optimal designs (in other words, the solutions of (2)) are the same for both criteria (5) and (6). In this sense these criteria are equivalent (compare with Kiefer's equivalence theorem, Fedorov, 1972). The equivalency property is useful in several respects:
- It assures an experimenter that he can solve two statistical problems simultaneously;
- In the numerical construction of optimal designs, it makes it possible to choose the most convenient algorithm, because, depending on $f(x)$, $X$ and $c$, either optimization problem (5) or (6) can be simpler;
- In the theoretical analysis of optimal designs, it is sometimes convenient to switch between (5) and (6).
2. If in the model testing case there is some prior information on the parameters $\vartheta_t$ described by a prior distribution function $F_0(d\vartheta)$, then it is reasonable to use the mean of the noncentrality parameter as a criterion of optimality:
$$\Phi(\xi) = \int\!\!\int_X \eta^2(x, \vartheta)\,\xi(dx)\,F_0(d\vartheta). \tag{12}$$
If the distribution $F_0(d\vartheta)$ has a dispersion matrix equal to $D_0$, then (12) can be transformed to
$$\Phi(\xi) = \mathrm{tr}\,[M(\xi) D_0]. \tag{13}$$
In practice, knowledge of $D_0$ is problematic, and one can relax this demand and assume only that the determinant of the dispersion matrix is known to be no less than $d > 0$. In this case, the criterion
$$\Phi(\xi) = \inf_{|D_0| \ge d} \mathrm{tr}\,[M(\xi) D_0] \tag{14}$$
can be of interest. If the matrix $M(\xi)$ is nonsingular, then the infimum in (14) can be found easily (compare with Fedorov, 1981):
$$\inf_{|D_0| \ge d} \mathrm{tr}\,[M(\xi) D_0] = m\, d^{1/m}\, |M(\xi)|^{1/m}. \tag{15}$$
Evidently the maximization of (15) is equivalent to the maximization of $|M(\xi)|$, or in other words, criterion (14) is equivalent to the D-criterion:
$$\Psi(\xi) = |M^{-1}(\xi)|. \tag{16}$$
This criterion is one of the most widely used criteria in the estimation problem. Some properties of D-optimal designs connected with model testing were discussed by Kiefer (1958) and Stone (1958). The above result gives an additional explanation of the relation between the D-criterion and the model testing problem. In the next section an even more striking example illuminating this relation will be considered.
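The link between the mean noncentrality parameter and the D-criterion can be illustrated numerically. The sketch below assumes that the mean noncentrality takes the form $\mathrm{tr}[M D_0]$ (as it does for a linear model and a zero-mean prior) and checks that its minimum over dispersion matrices with $|D_0| = d$ equals $m\,(d\,|M|)^{1/m}$, attained at $D_0$ proportional to $M^{-1}$. The matrix `M`, the bound `d`, and the helper names are hypothetical.

```python
import numpy as np

def min_trace_given_det(M, d):
    """Minimum of tr(M D) over positive definite D with det(D) = d, and a minimizer."""
    m = M.shape[0]
    scale = (d * np.linalg.det(M)) ** (1.0 / m)
    D_star = scale * np.linalg.inv(M)     # minimizer is proportional to M^{-1}
    return m * scale, D_star

def random_dispersion(m, d, rng):
    """Random positive definite matrix rescaled so that its determinant equals d."""
    B = rng.normal(size=(m, m))
    D = B @ B.T + 0.1 * np.eye(m)
    return D * (d / np.linalg.det(D)) ** (1.0 / m)

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
M = A @ A.T + np.eye(3)                   # hypothetical nonsingular information matrix
d = 2.0                                   # hypothetical lower bound on |D_0|

value, D_star = min_trace_given_det(M, d)
trials = [np.trace(M @ random_dispersion(3, d, rng)) for _ in range(1000)]
print(value, min(trials))
```

Since the minimum depends on the design only through $|M|$, maximizing it over designs is the same as maximizing the determinant of the information matrix.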
3. Let us start with a very natural criterion for the model testing problem:
$$\Phi(\xi) = \inf_{\vartheta:\ \sup_{x \in U} \varphi^2(x, \vartheta) \ge 1} \int_X \eta^2(x, \vartheta)\,\xi(dx), \tag{17}$$
which in the linear case, $\varphi(x, \vartheta) = q^T(x)\vartheta$, takes the form
$$\Phi(\xi) = \inf_{\sup_{x \in U} (q^T(x)\vartheta)^2 \ge 1} \vartheta^T M(\xi)\vartheta.$$
It is not difficult to check the following chain of equalities:
$$\Phi(\xi) = \inf_{x \in U}\ \inf_{(q^T(x)\vartheta)^2 \ge 1} \vartheta^T M(\xi)\vartheta = \Big[\sup_{x \in U} q^T(x) M^{-1}(\xi) q(x)\Big]^{-1},$$
where, of course, $M^{-1}(\xi)$ exists for any design with $\Phi(\xi) > 0$ or $\Psi(\xi) = |M^{-1}(\xi)| < \infty$. The first equality follows from the inclusion
$$\{\vartheta : (q^T(x)\vartheta)^2 \ge 1\} \subset \{\vartheta : \sup_{x' \in U} (q^T(x')\vartheta)^2 \ge 1\}, \qquad x \in U;$$
the second one is a corollary of the result of section 1.
The criterion
$$\Psi(\xi) = \sup_{x \in U} q^T(x) M^{-1}(\xi) q(x)$$
belongs to the family of g-criteria (see for instance, Fedorov (1981)).
When $U = X$ and $q(x) = f(x)$, one can get an even stronger result, because the criteria $|M(\xi)|^{-1}$ and $\sup_{x \in X} f^T(x) M^{-1}(\xi) f(x)$ are equivalent in the case of continuous designs due to the Kiefer-Wolfowitz theorem (see for example, Fedorov (1972)). This fact immediately leads to the equivalency of (16) and (17).
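The Kiefer-Wolfowitz equivalence can be observed on a small example. The sketch below uses quadratic regression, $f^T(x) = (1, x, x^2)$ on $[-1, 1]$, for which the design with equal weights at $-1, 0, 1$ is the well-known D-optimal design; the four-point competitor design and the helper names are assumptions made only for comparison. For the D-optimal design the supremum of $f^T(x) M^{-1}(\xi) f(x)$ equals $m = 3$, while the competitor has both a smaller determinant and a larger supremum.

```python
import numpy as np

def info_matrix(x, p, m):
    """M(xi) = sum_i p_i f(x_i) f(x_i)^T for polynomial basis f = (1, x, ..., x^(m-1))."""
    F = np.vander(np.asarray(x, dtype=float), m, increasing=True)
    return F.T @ (np.asarray(p, dtype=float)[:, None] * F)

def variance_function(x_grid, M, m):
    """d(x, xi) = f(x)^T M^{-1} f(x) evaluated on a grid."""
    F = np.vander(np.asarray(x_grid, dtype=float), m, increasing=True)
    return np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)

m = 3
grid = np.linspace(-1.0, 1.0, 2001)

M_opt = info_matrix([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], m)
d_opt = variance_function(grid, M_opt, m)

# Hypothetical competitor design, used only for comparison
M_alt = info_matrix([-1.0, -0.5, 0.5, 1.0], [0.25] * 4, m)
d_alt = variance_function(grid, M_alt, m)

print(np.linalg.det(M_opt), d_opt.max())
print(np.linalg.det(M_alt), d_alt.max())
```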
4. The equivalency of some criteria can be established with the help of the well-known result on eigenvalues of matrices (Rao, 1973). Let $M$ be a symmetric matrix and $C$ be a positive definite matrix. If $\lambda_1 \ge \dots \ge \lambda_m$ are the roots of $|M - \lambda C| = 0$, then
$$\inf_{\vartheta} \frac{\vartheta^T M \vartheta}{\vartheta^T C \vartheta} = \lambda_m.$$
From this relation, the equivalency of the following two criteria immediately follows:
$$\Phi(\xi) = \inf_{\vartheta^T C \vartheta \ge 1} \vartheta^T M(\xi)\vartheta$$
and
$$\Psi(\xi) = \lambda_1[B^T M^{-1}(\xi) B], \qquad C = B B^T.$$
When $C = I_m$, $\Psi(\xi)$ is the popular E-criterion of design theory.
The results of sections 1-4 can be summarized in the following theorem.
THEOREM 1. The following criteria are equivalent on the set of regular designs:
1) $c^T M^-(\xi) c$ and $\inf_{(c^T\vartheta)^2 \ge \delta} \gamma(\xi, \vartheta)$;
2) $|M^{-1}(\xi)|$ and $\inf_{|D_0| \ge d} \int \gamma(\xi, \vartheta)\,F_0(d\vartheta)$;
3) $\sup_{x \in U} q^T(x) M^{-1}(\xi) q(x)$ and $\inf_{\sup_{x \in U} (q^T(x)\vartheta)^2 \ge \delta} \gamma(\xi, \vartheta)$;
4) $\lambda_1[B^T M^{-1}(\xi) B]$ and $\inf_{\vartheta^T B B^T \vartheta \ge \delta} \gamma(\xi, \vartheta)$,
where $\delta > 0$.
3. SOME PROPERTIES OF OPTIMAL DESIGNS
Theorem 1 allows some new results on the properties of optimal designs to be obtained, and illuminates some of the known results, both for parameter estimation and for model testing problems. In applications, the number of supporting points in an optimal design is of prime interest: the smaller the number, the simpler it is to realize the corresponding optimal design in practice.
The results on the number of supporting points can be obtained by switching between the following two theorems (see for example, Stone (1958); Fedorov (1972); Denisov, Fedorov, and Khaborov (1981)).
THEOREM 2. In design problem (2) there exists an optimal design containing no more than $m(m + 1)/2$ supporting points.
THEOREM 3. In design problem (4) there exists an optimal design containing no more than $(m + 1)$ supporting points if $\Omega$ is a compact and convex set. If additionally in (4) at least $k$ of the constraints are active for optimal designs, then there exists an optimal design containing no more than $(m - k + 1)$ supporting points.
It should be noted that if the conditions of Theorem 3 are fulfilled, it gives a stronger result than Theorem 2.
Example 1
Consider the first case from Theorem 1, concerned with the extrapolation problem. There exist some results on the number of supporting points in this case which are rather complicated to prove and rely significantly on the structure of the basis functions $f(x)$ (see for instance, Fedorov (1972)). From Theorem 3, it follows that for the criteria under consideration there exists an optimal design containing no more than $m$ supporting points. To get this result, it is necessary to take into account that the design problem
$$\sup_{\xi}\ \inf_{(c^T\vartheta)^2 \ge \delta} \gamma(\xi, \vartheta)$$
is equivalent to
$$\sup_{\xi}\ \inf_{c^T\vartheta \ge \delta} \gamma(\xi, \vartheta)$$
due to the evenness of both functions $(c^T\vartheta)^2$ and $\gamma(\xi, \vartheta)$, and that for any design $\xi$ the constraint $c^T\vartheta \ge \delta$ is active.
It is useful to note that the result does not depend on the dimension of $X$.
Example 2
In spite of the similarity of the model testing criteria from points 3 and 4 of Theorem 1 to the one considered previously, it is not possible to get analogous results here. The fact of the matter is that the sets $\sup_{x \in U} (f^T(x)\vartheta)^2 \ge \delta$ and $\vartheta^T B B^T \vartheta \ge \delta$ are not convex, and therefore the result of Theorem 3 cannot be applied. Naturally, the result of Theorem 2 remains true, but the bound $n_0 = m(m + 1)/2$ for the number of supporting points is not very efficient and often cannot satisfy an experimenter. In these cases, a more detailed analysis can be done with the help of the so-called equivalence theorems. These theorems can be formulated (see Fedorov, 1980) for both sets of optimality criteria (for parameter estimation and model testing problems).
THEOREM 4 (estimation problem). A necessary and sufficient condition for a design $\xi^*$ to be optimal is the fulfillment of the inequality
$$\min_{x \in X} \varphi(x, \xi^*) \ge \mathrm{tr}\Big[M(\xi^*)\,\frac{\partial \Psi}{\partial M}\Big],$$
where
$$\varphi(x, \xi) = f^T(x)\,\frac{\partial \Psi}{\partial M}\,f(x)$$
and the derivative is evaluated at $M(\xi)$. If $\int_{X'} \xi^*(dx) > 0$, then the function $\varphi(x, \xi^*)$ achieves its lower bound on the set $X'$. Naturally, the existence of the derivatives $\partial \Psi / \partial M$ is assumed.
THEOREM 5 (model testing problem). A necessary and sufficient condition for a design $\xi^*$ to be optimal is the existence of a measure $\mu^*(d\vartheta)$ such that
$$\tilde{\gamma}(x, \xi^*) \le \Phi(\xi^*),$$
where
$$\tilde{\gamma}(x, \xi^*) = \int \eta^2(x, \vartheta)\,\mu^*(d\vartheta)$$
and the measure $\mu^*$ is defined on the set
$$\Omega(\xi^*) = \{\vartheta : \vartheta \in \Omega,\ \gamma(\xi^*, \vartheta) = \Phi(\xi^*)\}.$$
If $\int_{X'} \xi^*(dx) > 0$, then the function $\tilde{\gamma}(x, \xi^*)$ achieves its upper bound on the set $X'$.
Note that in Theorem 5 the convexity of $\Omega$ is not assumed.
Consider the polynomial regression ($f^T(x) = (1, x, \dots, x^{m-1})$, $|x| \le 1$) and prove in two different ways that the number of supporting points in the optimal design for case 3 from Theorem 1 equals $m$. Let us start with Theorem 4, repeating the well-known approach (see for instance, Fedorov (1972)). It is more convenient to put here $\Psi(\xi) = -\ln|M(\xi)|$. In this case, $-\varphi(x, \xi) = f^T(x) M^{-1}(\xi) f(x)$, and it is a polynomial of degree $2m - 2$. Evidently, this polynomial can achieve its maximum on the interval $|x| \le 1$ at no more than $m$ points. So, due to Theorem 4, the number of supporting points $n_0$ cannot exceed $m$. If $n_0 < m$, then $|M(\xi)| = 0$. Therefore, for the optimal design, $n_0 = m$.
Now apply Theorem 5 to get the same result. It is clear that the function $\tilde{\gamma}(x, \xi)$ is a polynomial of degree less than or equal to $2m - 2$. Repeating the last part of the previous proof, one gets $n_0 = m$.
Consider now case 4 from Theorem 1 with $B = I$ for the sake of simplicity. Theorem 4 cannot be used here without additional considerations because, generally, optimal designs can have a multiple largest eigenvalue $\lambda_1[M^{-1}(\xi^*)]$, and the function $\Psi(M)$ is nondifferentiable in this case. But Theorem 5 works here and, similarly to the previous case, one gets $n_0 = m$ for one-dimensional polynomial regression of degree $m - 1$.
Note that Theorem 5 becomes particularly convenient when $\mathrm{rank}\, B < m$, where in most cases one faces the nondifferentiability of $\Psi[M]$ due to the possible singularity of the information matrix $M(\xi^*)$.
4. NUMERICAL PROCEDURES
Theorem 1 enables one to choose between fundamentally different algorithms for the numerical construction of optimal designs. The first set of algorithms is based on Theorem 4, and their description can be found in [Fedorov (1972); Silvey (1980)]. The algorithms related to model testing criteria were described in [Atkinson and Fedorov (1975); Denisov, Fedorov and Khaborov (1981)].
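An algorithm of the first kind can be sketched as follows. This is a minimal illustration for the D-criterion on a finite candidate grid, assuming the standard vertex-direction iteration: at each step, mass is moved towards the point where $f^T(x) M^{-1}(\xi) f(x)$ is largest, with the step length that maximizes the determinant along that direction. It is not necessarily the exact procedure of the cited references; the function name `d_optimal_weights`, the grid, the iteration limit and the tolerance are all assumptions.

```python
import numpy as np

def d_optimal_weights(F, n_iter=5000, tol=1e-4):
    """Vertex-direction iteration for a D-optimal design on a candidate grid.

    F has one row f(x_i)^T per candidate point; returns the design weights.
    """
    n, m = F.shape
    w = np.full(n, 1.0 / n)                       # start from the uniform design
    for _ in range(n_iter):
        M = F.T @ (w[:, None] * F)                # information matrix M(xi)
        d = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)
        j = int(np.argmax(d))                     # point with largest variance function
        if d[j] - m < tol:                        # equivalence-theorem stopping rule
            break
        alpha = (d[j] - m) / (m * (d[j] - 1.0))   # step maximizing det along the direction
        w = (1.0 - alpha) * w
        w[j] += alpha
    return w

# Quadratic regression on [-1, 1]; the continuous D-optimal design has det M = 4/27
grid = np.linspace(-1.0, 1.0, 101)
F = np.vander(grid, 3, increasing=True)
w = d_optimal_weights(F)
M = F.T @ (w[:, None] * F)
det_ratio = np.linalg.det(M) / (4.0 / 27.0)
print(det_ratio, grid[w > 0.05])
```

The stopping rule uses the fact that a design is D-optimal exactly when the largest value of $f^T(x) M^{-1}(\xi) f(x)$ equals $m$, so the gap `d[j] - m` measures how far the current design is from optimality.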
REFERENCES
Atkinson, A., and V. Fedorov (1975) The Design of Experiments for Discriminating Between Two Rival Models. Biometrika 62:57-70.
Denisov, V., V. Fedorov, and V. Khaborov (1981) Tchebysheff's Approximation in the Problem of Asymptotical Locally-Optimal Designs Construction for Discriminating Experiments, in "Linear and Nonlinear Parameterization in Experimental Design Problems". Problems in Cybernetics, V. Fedorov and V. Nalimov (eds.), pp. 3-10 (in Russian). Moscow.
Fedorov, V. (1972) Theory of Optimal Experiments. New York: Academic Press.
Fedorov, V. (1980) Design of Model Testing Experiments, in Symposia Mathematica, pp. 171-180. Bologna.
Fedorov, V. (1980) Convex Design Theory. Math. Operationsforsch. Statist., Ser. Statist. 11:403-413.
Fedorov, V. (1981) Active Regression Experiments, in Mathematical Methods in Experimental Design, edited by V. Penenko, pp. 19-73. Novosibirsk: Nauka (in Russian).
Kiefer, J. (1958) On the Nonrandomized Optimality and Randomized Nonoptimality of Symmetrical Designs. Ann. Math. Statist. 29:675-699.
Rao, C.R. (1973) Linear Statistical Inference and its Applications. New York: J. Wiley and Sons.
Silvey, S.D. (1980) Optimal Design. London: Chapman and Hall.
Stone, M. (1958) Application of a Measure of Information to the Design and Comparison of Regression Experiments. Ann. Math. Statist. 29:55-70.