Duality of Optimal Designs for Model Discriminating and Parameter Estimation Problems

NOT FOR QUOTATION WITHOUT PERMISSION OF THE AUTHOR

V. Fedorov

V. Khaborov

September 1984 WP-84-77

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS 2361 Laxenburg, Austria


I am grateful to Lucy Tomsits for editing and typing this paper.

Attempts to find relations between different criteria of optimality have a long history going back to the fifties (Kiefer, 1958; Stone, 1958). This paper deals mainly with the analysis of relations between the most widespread criteria used in estimation problems and some criteria for discriminating experiments which belong to the family of T-criteria (Atkinson and Fedorov, 1975).

CONTENTS

1. INTRODUCTION

2. EQUIVALENCY OF DIFFERENT DESIGN CRITERIA

3. SOME PROPERTIES OF OPTIMAL DESIGNS

4. NUMERICAL PROCEDURES


1. INTRODUCTION

The main object of this paper is the optimal design of experiments which can be described by linear regression models:

$$y_{ij} = \eta(x_i, \vartheta_t) + \varepsilon_{ij} = \vartheta_t^T f(x_i) + \varepsilon_{ij}. \qquad (1)$$

The vector $x_i$ describes the conditions under which the $i$-th set of observations is made ($j = 1, \dots, n_i$; $i = 1, \dots, n$; $\sum_{i=1}^{n} n_i = N$), and the values of its components can be chosen (controlled) by an experimenter: $x_i \in X \subset \mathbb{R}^k$, where $X$ is a compact set. The components of the vector $\vartheta \in \mathbb{R}^m$ are unknown parameters, and the subscript "t" indicates their true values. The components of the vector $f(x_i)$ are given basis functions, which are continuous on the compact set $X$.

The errors $\varepsilon_{ij}$ are assumed to be random, independent and identically distributed, with zero mean and finite variance which, without loss of generality, can be taken equal to 1. These assumptions on the errors are sufficient when the estimation problem is under consideration, but for discriminating experiments the normality of their distribution will be assumed in what follows. Some more general situations can be considered similarly to (Fedorov, 1980; Denisov, Fedorov, and Khaborov, 1981).

The set of values

$$\xi_N = \{x_i, p_i\}_{i=1}^{n}, \qquad p_i = n_i / N,$$

is a design of an experiment. The fractions $p_i$ can be considered as measures assigned to the points $x_i$, and in experimental practice variations of these measures must be multiples of $N^{-1}$.

Dealing with discrete measures in optimization problems requires very complicated mathematical techniques. The problem can be simplified essentially if the discreteness is neglected and any probability measure $\xi(dx)$ on $X$ is considered as an experimental design.

Such designs are called approximate or continuous. In this paper, they will be referred to simply as "designs".
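To illustrate how a continuous design relates to an exact one, the following sketch rounds the weights $p_i$ of a design to integer numbers of observations $n_i$ with $\sum n_i = N$. The largest-remainder rule and the name `round_design` are assumptions made for this illustration only; the paper does not prescribe a rounding procedure.

```python
from fractions import Fraction


def round_design(weights, n_total):
    """Round design weights p_i to integer counts n_i that sum to n_total."""
    quotas = [Fraction(p) * n_total for p in weights]
    counts = [int(q) for q in quotas]  # floor, because quotas are non-negative
    leftover = n_total - sum(counts)
    # Hand the remaining observations to the points with the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# A three-point design with equal weights, realized with N = 10 observations.
weights = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
counts = round_design(weights, 10)
```

Each realized fraction $n_i / N$ differs from $p_i$ by less than $N^{-1}$, which is why the approximation improves as $N$ grows.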

For a comparatively large $N$, it is not a problem to construct an appropriate discrete approximation of any measure $\xi(dx)$, especially if one takes into account that for almost all widely used criteria of optimality there exist optimal designs with a finite number of supporting points (points where the measure $\xi(dx)$ is concentrated; see, for instance, Fedorov, 1972, and Section 3 of this paper). Formally, the construction of optimal designs can be considered as an optimization problem in the space of probability measures:

$$\xi^* = \operatorname{Arg}\inf_{\xi} \Psi(\xi), \qquad (2)$$

where the optimality criterion $\Psi$ is defined by the objectives of an experimenter and is usually a convex function of $\xi$.

In the parameter estimation problem, the dependence of $\Psi$ on $\xi$ can be expressed through the elements of Fisher's information matrix:

$$\Psi(\xi) = \Psi[M(\xi)],$$

where

$$M(\xi) = \int_X f(x) f^T(x)\, \xi(dx).$$

In the regular case this matrix is the inverse of the normalized dispersion (variance-covariance) matrix:

$$M^{-1}(\xi) = N\, \mathrm{E}\big[(\hat\vartheta - \vartheta_t)(\hat\vartheta - \vartheta_t)^T\big],$$

where $\hat\vartheta$ is the (best linear unbiased) estimator of $\vartheta$. In the case of discriminating (or, more accurately, model testing) experiments, the structure of $\Psi$ is slightly more complicated. For instance, when there are two rival models,

$$\eta_1(x, \vartheta_1) \quad \text{and} \quad \eta_2(x, \vartheta_2),$$

the design problem can be described by the following optimization problem (Atkinson and Fedorov, 1975):

$$\xi^* = \operatorname{Arg}\sup_{\xi} \inf_{\vartheta_2 \in \Omega_2} \int_X [\eta_1(x, \vartheta_1) - \eta_2(x, \vartheta_2)]^2\, \xi(dx). \qquad (3)$$

Very often some nondegenerate regression function $\eta(x, \vartheta)$ is compared with the zero hypothesis, and in this case (3) reduces to the simpler problem

$$\xi^* = \operatorname{Arg}\sup_{\xi} \Phi(\xi), \qquad (4)$$

where

$$\Phi(\xi) = \inf_{\vartheta \in \Omega} \int_X \eta^2(x, \vartheta)\, \xi(dx),$$

which will be mainly considered in the following sections.
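The quantities introduced in this section can be computed directly for a design with finite support. The sketch below is an illustration under stated assumptions: the quadratic polynomial model, the three-point design and the helper names `information_matrix` and `quad_form` are chosen here and are not taken from the paper. It builds $M(\xi)$ and checks that, for a linear model, the integral of $\eta^2(x, \vartheta)$ over the design equals $\vartheta^T M(\xi) \vartheta$.

```python
from fractions import Fraction as F


def f(x, m):
    """Basis functions of polynomial regression: (1, x, ..., x^(m-1))."""
    return [x ** k for k in range(m)]


def information_matrix(points, weights, m):
    """M(xi) = sum_i p_i f(x_i) f(x_i)^T for a design with finite support."""
    M = [[F(0)] * m for _ in range(m)]
    for x, p in zip(points, weights):
        fx = f(x, m)
        for a in range(m):
            for b in range(m):
                M[a][b] += p * fx[a] * fx[b]
    return M


def quad_form(M, theta):
    """theta^T M theta."""
    m = len(theta)
    return sum(theta[a] * M[a][b] * theta[b] for a in range(m) for b in range(m))


# Assumed example: quadratic regression on {-1, 0, 1} with equal weights.
points = [F(-1), F(0), F(1)]
weights = [F(1, 3)] * 3
M = information_matrix(points, weights, m=3)

theta = [F(1), F(-2), F(1, 2)]
# Integral of eta^2(x, theta) over the design, computed point by point.
direct = sum(p * sum(t * fk for t, fk in zip(theta, f(x, 3))) ** 2
             for x, p in zip(points, weights))
```

Exact rational arithmetic is used so that `direct` and `quad_form(M, theta)` can be compared without rounding error.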

2. EQUIVALENCY OF DIFFERENT DESIGN CRITERIA

In this section, the equivalency between some criteria corresponding to model testing experiments and to experiments oriented to parameter estimation will be analyzed. The majority of the results are based on well-known facts from the theory of extrema of quadratic forms.

1. Let us start with the most evident and simple case, when an experimenter is interested in some linear combination $c^T\vartheta$ of unknown parameters. For an interpolation or extrapolation problem $c = f(x_0)$, where $x_0$ is the point of interest. If he wants to estimate $c^T\vartheta$, then the criterion

$$\Psi(\xi) = c^T M^-(\xi) c, \qquad (5)$$

where "$-$" denotes a pseudo-inverse matrix, can be used. If the significance of $c^T\vartheta_t$ is tested, then

$$\Phi(\xi) = \inf_{(c^T\vartheta)^2 \ge 1} \int_X \eta^2(x, \vartheta)\, \xi(dx). \qquad (6)$$

It is easy to check that in (6) any positive constant can be taken instead of 1 without affecting the optimal design if $\eta(x, \vartheta)$ depends linearly on $\vartheta$. A similar fact holds for the criteria considered below and will be used without comment.

It is natural to assume that

$$\inf_{\xi} c^T M^-(\xi) c < \infty \qquad (7)$$

for any pseudo-inverse matrix. Of course, if (7) holds, then $c^T M^-(\xi^*) c < \infty$ for any optimal design $\xi^*$ (very often the solution of (2) is not unique); in other words, we assume that $c^T\vartheta$ is estimable in the experiments defined by $\xi^*$. It is useful to note that a necessary and sufficient condition for the estimability of $c^T\vartheta$ is the equality (see, for instance, Rao, 1973)

$$c^T [I - M^-(\xi) M(\xi)] = 0 \qquad (8)$$

for any pseudo-inverse matrix. Designs satisfying (8) will be called regular.

Consider now criterion (6) in more detail. Due to (1), one has

$$\int_X \eta^2(x, \vartheta)\, \xi(dx) = \vartheta^T M(\xi) \vartheta,$$

and (6) transforms to

$$\Phi(\xi) = \inf_{(c^T\vartheta)^2 \ge 1} \vartheta^T M(\xi) \vartheta. \qquad (9)$$

It is obvious that all optimal designs $\xi^*$ for (9) coincide with the optimal designs for the simpler problem

$$\inf_{c^T\vartheta = 1} \vartheta^T M(\xi) \vartheta. \qquad (10)$$

Taking into account condition (8) and using the standard Lagrangian technique, one gets

$$\inf_{c^T\vartheta = 1} \vartheta^T M(\xi) \vartheta = [c^T M^-(\xi) c]^{-1}, \qquad (11)$$

with the infimum attained at

$$\vartheta^* = M^-(\xi) c\, [c^T M^-(\xi) c]^{-1}.$$

From the last equation it immediately follows that the regular optimal designs (in other words, the solutions of (2)) are the same for criteria (5) and (6). In this sense these criteria are equivalent (compare with Kiefer's equivalence theorem, Fedorov, 1972). The equivalency property is useful in several respects:

- It helps an experimenter by assuring him that he can solve two statistical problems simultaneously;

- In the numerical construction of optimal designs, it makes it possible to choose the most convenient algorithm, because, depending on $f(x)$, $X$ and $c$, either optimization problem (5) or (6) can be simpler;

- In the theoretical analysis of optimal designs, it is sometimes convenient to switch between (5) and (6).

2. If in the model testing case there is some prior information on the parameters $\vartheta_t$, described by a prior distribution function $F_0(d\vartheta)$, then it is reasonable to use the mean of the noncentrality parameter as a criterion of optimality:

$$\Phi(\xi) = \int \vartheta^T M(\xi) \vartheta\, F_0(d\vartheta). \qquad (12)$$

If the distribution $F_0(d\vartheta)$ has a dispersion matrix equal to $D_0$, then (12) can be transformed to

$$\Phi(\xi) = \operatorname{tr} M(\xi) D_0. \qquad (13)$$

In practice, knowledge of $D_0$ is problematic, and one can relax this requirement and assume only that the determinant of the dispersion matrix is known to be no less than some $d > 0$. In this case, the criterion

$$\Phi(\xi) = \inf_{|D| \ge d} \operatorname{tr} M(\xi) D \qquad (14)$$

can be of interest. If the matrix $M(\xi)$ is nonsingular, then the infimum in (14) can be found easily (compare with Fedorov, 1981):

$$\Phi(\xi) = m\, [d\, |M(\xi)|]^{1/m}. \qquad (15)$$

Evidently, the maximization of (15) is equivalent to the maximization of $|M(\xi)|$, or, in other words, criterion (14) is equivalent to the D-criterion:

$$\Psi(\xi) = |M^{-1}(\xi)|. \qquad (16)$$

This criterion is one of the most widely used criteria in the estimation problem. Some properties of D-optimal designs connected with model testing were discussed by Kiefer (1958) and Stone (1958). The above result gives an additional explanation of the relation between the D-criterion and the model testing problem. In the next item an even more striking example illustrating this relation will be considered.
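The closed form for the infimum of $\operatorname{tr} M D$ over dispersion matrices with $|D| \ge d$ can be checked numerically. The sketch below is an illustration under assumptions: $m = 2$, a fixed nonsingular $M$, and random candidate matrices $D$ rescaled so that $|D| = d$. The candidate minimiser $D^* = (d\,|M|)^{1/m} M^{-1}$ used in the code follows from the arithmetic-geometric mean inequality and is stated here as a known fact, not quoted from the paper.

```python
import math
import random

M = [[1.0, 0.5],
     [0.5, 0.5]]  # nonsingular information matrix, m = 2 (assumed example)
d = 2.0           # lower bound on the determinant of D
m = 2

det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]
bound = m * (d * det_M) ** (1.0 / m)  # closed form m (d |M|)^(1/m)


def trace_MD(D):
    """tr(M D) for 2x2 matrices."""
    return sum(M[a][b] * D[b][a] for a in range(m) for b in range(m))


# Candidate minimiser D* = (d |M|)^(1/m) M^{-1}; its determinant equals d.
scale = (d * det_M) ** (1.0 / m)
D_star = [[scale * M[1][1] / det_M, -scale * M[0][1] / det_M],
          [-scale * M[1][0] / det_M, scale * M[0][0] / det_M]]

# Random positive definite matrices D = L L^T, rescaled so that det D = d.
random.seed(0)
trials = []
for _ in range(1000):
    a, b, c = (random.uniform(-1, 1) for _ in range(3))
    D = [[a * a + b * b, b * c], [b * c, c * c]]  # L = [[a, b], [0, c]]
    det_D = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    if det_D < 1e-12:
        continue  # skip numerically singular draws
    s = math.sqrt(d / det_D)
    trials.append(trace_MD([[s * D[i][j] for j in range(m)] for i in range(m)]))
```

None of the random trials falls below `bound`, while `D_star` attains it, which is consistent with the criterion depending on the design only through $|M(\xi)|$.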

3. Let us start with a very natural criterion for the model testing problem:

$$\Phi(\xi) = \inf_{\sup_{x \in U} \varphi^2(x, \vartheta) \ge 1} \int_X \eta^2(x, \vartheta)\, \xi(dx), \qquad (17)$$

which in the linear case takes the form

$$\Phi(\xi) = \inf_{\sup_{x \in U} [q^T(x)\vartheta]^2 \ge 1} \vartheta^T M(\xi) \vartheta.$$

It is not difficult to check the following chain of equalities:

$$\Phi(\xi) = \inf_{x \in U}\ \inf_{[q^T(x)\vartheta]^2 \ge 1} \vartheta^T M(\xi) \vartheta = \inf_{x \in U} [q^T(x) M^{-1}(\xi) q(x)]^{-1} = \Big[\sup_{x \in U} q^T(x) M^{-1}(\xi) q(x)\Big]^{-1},$$

where, of course, $M^{-1}(\xi)$ exists for any design with $\Phi(\xi) > 0$ or $\Psi(\xi) < \infty$.

The first equality follows from the inclusion

$$\{\vartheta : [q^T(x)\vartheta]^2 \ge 1\} \subset \Big\{\vartheta : \sup_{x' \in U} [q^T(x')\vartheta]^2 \ge 1\Big\}, \qquad x \in U;$$

the second one is a corollary of the result of item 1.

The criterion

$$\Psi(\xi) = \sup_{x \in U} q^T(x) M^{-1}(\xi) q(x)$$

belongs to the family of G-criteria (see, for instance, Fedorov (1981)).

When $U = X$ and $q(x) = f(x)$, one can get an even stronger result, because the criteria $|M^{-1}(\xi)|$ and $\sup_{x \in X} f^T(x) M^{-1}(\xi) f(x)$ are equivalent in the case of continuous designs due to the Kiefer-Wolfowitz theorem (see, for example, Fedorov (1972)). This fact leads immediately to the equivalency of (16) and (17).
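The Kiefer-Wolfowitz relation can be illustrated on a case where the D-optimal design is known. The sketch below assumes quadratic regression on $[-1, 1]$ and the design with equal weights at $-1, 0, 1$ (the standard D-optimal design for this model); the helper names and the evaluation grid are choices made for this illustration. It checks that $\sup_x f^T(x) M^{-1}(\xi) f(x) = m = 3$ and that the supremum is attained exactly at the supporting points.

```python
from fractions import Fraction as F


def inverse(A):
    """Gauss-Jordan inverse of a nonsingular matrix with exact Fraction entries."""
    n = len(A)
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def f(x):
    """Basis functions of quadratic regression."""
    return [F(1), x, x * x]


points, weights = [F(-1), F(0), F(1)], [F(1, 3)] * 3
M = [[sum(p * f(x)[a] * f(x)[b] for x, p in zip(points, weights)) for b in range(3)]
     for a in range(3)]
M_inv = inverse(M)


def variance(x):
    """d(x, xi) = f(x)^T M^{-1}(xi) f(x)."""
    fx = f(x)
    return sum(fx[a] * M_inv[a][b] * fx[b] for a in range(3) for b in range(3))


grid = [F(k, 100) for k in range(-100, 101)]
values = [variance(x) for x in grid]
top = max(values)
maximisers = [x for x, v in zip(grid, values) if v == top]
```

For this design the variance function is $3 - \tfrac{9}{2}x^2 + \tfrac{9}{2}x^4$, a polynomial of degree $2m - 2 = 4$ that touches the level $m = 3$ only at the three supporting points.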

4. The equivalency of some criteria can be established with the help of a well-known result on eigenvalues of matrices (Rao, 1973). Let $M$ be a symmetric matrix and $C$ be a positive definite matrix. If $\lambda_1 \ge \dots \ge \lambda_m$ are the roots of $|M - \lambda C| = 0$, then

$$\inf_{\vartheta} \frac{\vartheta^T M \vartheta}{\vartheta^T C \vartheta} = \lambda_m.$$

From this relation, the equivalency of the following two criteria immediately follows:

$$\Phi(\xi) = \inf_{\vartheta^T C \vartheta \ge 1} \vartheta^T M(\xi) \vartheta$$

and

$$\Psi(\xi) = \lambda_{\max}[C M^{-1}(\xi)]$$

(for $C = B B^T$ the latter coincides with $\lambda_{\max}[B^T M^{-1}(\xi) B]$). When $C = I_m$, $\Psi(\xi)$ is the popular E-criterion of design theory.
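The eigenvalue relation used in this item can be verified directly for $m = 2$. In the sketch below the matrices $M$ and $C$ are arbitrary assumed examples, the roots of $|M - \lambda C| = 0$ are found from the explicit quadratic equation, and the ratio $\vartheta^T M \vartheta / \vartheta^T C \vartheta$ is scanned over directions $\vartheta = (\cos t, \sin t)$.

```python
import math

M = [[1.0, 0.5], [0.5, 0.5]]  # symmetric (assumed example)
C = [[2.0, 0.3], [0.3, 1.0]]  # positive definite (assumed example)

# For 2x2 matrices |M - lambda C| = a*lambda^2 - b*lambda + c0.
a = C[0][0] * C[1][1] - C[0][1] ** 2
b = M[0][0] * C[1][1] + M[1][1] * C[0][0] - 2 * M[0][1] * C[0][1]
c0 = M[0][0] * M[1][1] - M[0][1] ** 2
disc = math.sqrt(b * b - 4 * a * c0)
lam_max, lam_min = (b + disc) / (2 * a), (b - disc) / (2 * a)


def ratio(t):
    """Ratio of quadratic forms in the direction (cos t, sin t)."""
    th = (math.cos(t), math.sin(t))
    num = sum(th[i] * M[i][j] * th[j] for i in range(2) for j in range(2))
    den = sum(th[i] * C[i][j] * th[j] for i in range(2) for j in range(2))
    return num / den


# The ratio is invariant to the scale and sign of theta, so t in [0, pi) suffices.
ratios = [ratio(math.pi * k / 100000) for k in range(100000)]
```

The smallest scanned ratio agrees with `lam_min` and the largest with `lam_max` up to the resolution of the scan.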

The results of items 1-4 can be summarized in

THEOREM 1. The following criteria are equivalent on the set of regular designs:

1) $c^T M^-(\xi) c$ and $\inf_{(c^T\vartheta)^2 \ge \delta} \Phi(\xi, \vartheta)$;

2) $|M^{-1}(\xi)|$ and $\inf_{|D| \ge \delta} \int \Phi(\xi, \vartheta)\, F_0(d\vartheta)$;

3) $\sup_{x \in U} q^T(x) M^{-1}(\xi) q(x)$ and $\inf_{\sup_{x \in U} [q^T(x)\vartheta]^2 \ge \delta} \Phi(\xi, \vartheta)$;

4) $\lambda_{\max}[B^T M^{-1}(\xi) B]$ and $\inf_{\vartheta^T B B^T \vartheta \ge \delta} \Phi(\xi, \vartheta)$,

where $\delta > 0$ and $\Phi(\xi, \vartheta) = \int_X \eta^2(x, \vartheta)\, \xi(dx)$.

3. SOME PROPERTIES OF OPTIMAL DESIGNS

Theorem 1 allows some new results on the properties of optimal designs to be obtained, and sheds light on some known results, both for parameter estimation and for model testing problems. In applications, the number of supporting points in an optimal design is of prime interest: the smaller this number, the simpler it is to realize the corresponding optimal design in practice.

Results on the number of supporting points can be obtained by switching between the following two theorems (see, for example, Stone (1958); Fedorov (1972); Denisov, Fedorov, and Khaborov (1981)).

THEOREM 2. In design problem (2) there exists an optimal design containing no more than $m(m+1)/2$ supporting points.

THEOREM 3. In design problem (4) there exists an optimal design containing no more than $m + 1$ supporting points if $\Omega$ is a compact and convex set. If, additionally, at least $k$ of the constraints in (4) are active for optimal designs, then there exists an optimal design containing no more than $m - k + 1$ supporting points.

It should be noted that, when the conditions of Theorem 3 are fulfilled, it gives a stronger result than Theorem 2.

Example 1

Consider the first case from Theorem 1, concerned with the extrapolation problem. There exist some results on the number of supporting points in this case, which are rather complicated to prove and rely significantly on the structure of the basis functions $f(x)$ (see, for instance, Fedorov (1972)). From Theorem 3 it follows that for the criteria under consideration there exists an optimal design containing no more than $m$ supporting points. To get this result, it is necessary to take into account that the design problem

$$\sup_{\xi} \inf_{(c^T\vartheta)^2 \ge \delta} \Phi(\xi, \vartheta)$$

is equivalent to

$$\sup_{\xi} \inf_{c^T\vartheta \ge \delta} \Phi(\xi, \vartheta)$$

due to the evenness of both functions $(c^T\vartheta)^2$ and $\Phi(\xi, \vartheta)$, and that for any design $\xi$ the constraint $c^T\vartheta \ge \delta$ is active.

It is useful to note that the result does not depend on the dimension of $x$.

Example 2

In spite of the similarity of the model testing criteria from items 3 and 4 of Theorem 1 to the one considered previously, it is not possible to get analogous results here. The point is that the sets $\{\vartheta : \sup_{x \in X} [f^T(x)\vartheta]^2 \ge \delta\}$ and $\{\vartheta : \vartheta^T B B^T \vartheta \ge \delta\}$ are not convex, and therefore the result of Theorem 3 cannot be applied. Naturally, the result of Theorem 2 remains true, but the bound $n_0 = m(m+1)/2$ for the number of supporting points is not very efficient and often cannot satisfy an experimenter. In these cases, a more detailed analysis can be done with the help of the so-called equivalence theorems. These theorems can be formulated (see Fedorov, 1980) for both sets of optimality criteria (for parameter estimation and for model testing problems).

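THEOREM 4 (estimation problem). A necessary and sufficient condition for a design $\xi^*$ to be optimal is the fulfillment of the inequality

$$\varphi(x, \xi^*) \ge \operatorname{tr} M(\xi^*) \frac{\partial \Psi}{\partial M}\Big|_{M = M(\xi^*)}, \qquad x \in X,$$

where

$$\varphi(x, \xi) = f^T(x)\, \frac{\partial \Psi}{\partial M}\, f(x).$$

If $\int_{X'} \xi^*(dx) > 0$, then the function $\varphi(x, \xi^*)$ achieves its lower bound on the set $X'$. Naturally, the existence of the derivatives $\partial \Psi / \partial M$ is assumed.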

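THEOREM 5 (model testing problem). A necessary and sufficient condition for a design $\xi^*$ to be optimal is the existence of a measure $\mu^*(d\vartheta)$ such that

$$\varphi(x, \xi^*) \le \Phi(\xi^*), \qquad x \in X,$$

where

$$\varphi(x, \xi^*) = \int \eta^2(x, \vartheta)\, \mu^*(d\vartheta),$$

and the measure $\mu^*$ is defined on the set

$$\Omega(\xi^*) = \Big\{\vartheta : \vartheta \in \Omega,\ \int_X \eta^2(x, \vartheta)\, \xi^*(dx) = \Phi(\xi^*)\Big\}.$$

If $\int_{X'} \xi^*(dx) > 0$, then the function $\varphi(x, \xi^*)$ achieves its upper bound on the set $X'$.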

Note that in Theorem 5 the convexity of $\Omega$ is not assumed.

Consider polynomial regression ($f^T(x) = (1, x, \dots, x^{m-1})$, $|x| \le 1$) and prove in two different ways that the number of supporting points in the optimal design for case 3 of Theorem 1 equals $m$. Let us start with Theorem 4, repeating the well-known approach (see, for instance, Fedorov (1972)). It is more convenient to put here $\Psi(\xi) = -\ln |M(\xi)|$. In this case $-\varphi(x, \xi) = f^T(x) M^{-1}(\xi) f(x)$, which is a polynomial of degree $2m - 2$.

Evidently, this polynomial can achieve its maximum on the interval $|x| \le 1$ at no more than $m$ points. So, due to Theorem 4, the number of supporting points $n_0$ cannot exceed $m$. If $n_0 < m$, then $|M(\xi)| = 0$. Therefore $n_0 = m$ for the optimal design.

Now apply Theorem 5 to get the same result. It is clear that the function $\varphi(x, \xi)$ is a polynomial of degree less than or equal to $2m - 2$. Repeating the last part of the previous proof, one gets $n_0 = m$.

Consider now case 4 of Theorem 1 with $B = I$ for the sake of simplicity. Theorem 4 cannot be used here without additional considerations, because in general the largest eigenvalue $\lambda_{\max}[M^{-1}(\xi^*)]$ for an optimal design can be nonunique (multiple), and the function $\Psi(M)$ is nondifferentiable in this case.

But Theorem 5 works here and, similarly to the previous case, one gets $n_0 = m$ for one-dimensional polynomial regression of degree $m - 1$.

Note that Theorem 5 becomes particularly convenient when $\operatorname{rank} B < m$, where in most cases one faces the nondifferentiability of $\Psi[M]$ due to the possible singularity of the information matrix $M(\xi^*)$.

4. NUMERICAL PROCEDURES

Theorem 1 enables one to choose between principally different algorithms for the numerical construction of optimal designs. The first set of algorithms is based on Theorem 4, and their description can be found in Fedorov (1972) and Silvey (1980). The algorithms related to model testing criteria were described in Atkinson and Fedorov (1975) and Denisov, Fedorov, and Khaborov (1981).
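As an illustration of the first group of algorithms, the sketch below implements one simple first-order (vertex-direction) iteration for the D-criterion on a finite grid. It is a minimal sketch and not the procedure of any of the cited works: the quadratic model, the grid on $[-1, 1]$, the stopping tolerance and all function names are assumptions of this example. At each step mass is moved towards the point where $f^T(x) M^{-1}(\xi) f(x)$ is largest, with the step length that maximizes $|M|$ along that direction.

```python
def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (float entries)."""
    n = len(A)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def f(x):
    return [1.0, x, x * x]  # quadratic regression, m = 3 (assumed model)


def info_matrix(grid, w):
    m = len(f(grid[0]))
    return [[sum(p * f(x)[a] * f(x)[b] for x, p in zip(grid, w)) for b in range(m)]
            for a in range(m)]


def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def d_optimal(grid, iterations=2000, tol=1e-3):
    m = len(f(grid[0]))
    w = [1.0 / len(grid)] * len(grid)  # start from the uniform design
    for _ in range(iterations):
        M_inv = inverse(info_matrix(grid, w))
        d = [sum(fx[a] * M_inv[a][b] * fx[b] for a in range(m) for b in range(m))
             for fx in map(f, grid)]
        k = max(range(len(grid)), key=d.__getitem__)
        if d[k] - m < tol:  # at the optimum the largest variance equals m
            break
        alpha = (d[k] - m) / (m * (d[k] - 1.0))  # step maximizing |M| along this direction
        w = [(1.0 - alpha) * p for p in w]
        w[k] += alpha
    return w


grid = [k / 10.0 for k in range(-10, 11)]
w = d_optimal(grid)
M_final = info_matrix(grid, w)
```

For this model the determinant of `M_final` approaches $4/27$, the value attained by the design with equal weights at $-1, 0, 1$; the weights on the remaining grid points decay only gradually, which is typical for first-order schemes of this kind.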


REFERENCES

Atkinson, A., and V. Fedorov (1975) The Design of Experiments for Discriminating Between Two Rival Models. Biometrika 62:57-70.

Denisov, V., V. Fedorov, and V. Khaborov (1981) Tchebysheff's Approximation in the Problem of Asymptotical Locally-Optimal Designs Construction for Discriminating Experiments, in "Linear and Nonlinear Parameterization in Experimental Design Problems". Problems in Cybernetics, V. Fedorov and V. Nalimov (eds.), pp. 3-10 (in Russian). Moscow.

Fedorov, V. (1972) Theory of Optimal Experiments. New York: Academic Press.

Fedorov, V. (1980) Design of Model Testing Experiments, in Symposia Mathematica, pp. 171-180. Bologna.

Fedorov, V. (1980) Convex Design Theory. Math. Operationsforsch. Statist., Ser. Statist. 11:403-413.

Fedorov, V. (1981) Active Regression Experiments, in Mathematical Methods in Experimental Design, edited by V. Penenko, pp. 19-73. Novosibirsk: Nauka (in Russian).

Kiefer, J. (1958) On the Nonrandomized Optimality and Randomized Nonoptimality of Symmetrical Designs. Ann. Math. Statist. 29:675-699.

Rao, C.R. (1973) Linear Statistical Inference and Its Applications. New York: J. Wiley and Sons.

Silvey, S.D. (1980) Optimal Design. London: Chapman and Hall.

Stone, M. (1958) Application of a Measure of Information to the Design and Comparison of Regression Experiments. Ann. Math. Statist. 29:55-70.
