Unobserved Heterogeneity and Dependent Covariates: A State-Space Model of Infant Growth and Mortality


NOT FOR QUOTATION WITHOUT THE PERMISSION OF THE AUTHOR

UNOBSERVED HETEROGENEITY AND DEPENDENT COVARIATES: A STATE-SPACE MODEL OF INFANT GROWTH AND MORTALITY

Andrew Foster

December 1985 WP-85-87

This research was conducted in conjunction with a summer research seminar on heterogeneity dynamics, under the direction of James W. Vaupel and Anatoli I. Yashin, in the Population Program at IIASA led by Nathan Keyfitz. Funding was provided by the American Academy of Arts and Sciences.

Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS 2361 Laxenburg, Austria


Foreword

A group of eleven Ph.D. candidates from seven countries (Robin Cowan, Andrew Foster, Nedka Gateva, William Hodges, Arno Kitts, Eva Lelievre, Fernando Rajulton, Lucky Tedrow, Marc Tremblay, John Wilmoth, and Zeng Yi) worked together at IIASA from June 17 through September 6, 1985, in a seminar on population heterogeneity. The seminar was led by the two of us with the help of Nathan Keyfitz, leader of the Population Program, and Bradley Gambill, Dianne Goodwin, and Alan Bernstein, researchers in the Population Program, as well as the occasional participation of guest scholars at IIASA, including Michael Stoto, Sergei Scherbov, Joel Cohen, Frans Willekens, Vladimir Crechuha, and Geert Ridder. Susanne Stock, our secretary, and Margaret Traber managed the seminar superbly.

Each of the eleven students in the seminar succeeded in writing a report on the research they had done. With only one exception, the students evaluated the seminar as "very productive"; the exception thought it was "productive". The two of us agree: the quality of the research produced exceeded our expectations and made the summer a thoroughly enjoyable experience. We were particularly pleased by the interest and sparkle displayed in our daily, hour-long colloquium, and by the spirit of cooperation all the participants, both students and more senior researchers, displayed in generously sharing ideas and otherwise helping each other.

A prize, the Peccei Fellowship, is awarded to the summer scientists who have excelled both in their own research and in helping other summer scientists with their research. This fellowship enables a summer scientist to return to IIASA for three months the following year. Three Peccei Fellowships were awarded this summer: Andrew Foster in the population seminar was one of the winners. This working paper summarizes the innovative research Foster carried out at IIASA. The research is not only mathematically sophisticated and demographically significant, but also policy relevant, quite an achievement.

James W. Vaupel Anatoli I. Yashin


Acknowledgments

I would like to acknowledge my debt to fellow participants in the Population Program of the YSSP. Special thanks go to William Hodges and to the leaders of the summer program, James W. Vaupel and Anatoli I. Yashin. Thanks also to Susanne Stock, who did the final formatting of the manuscript.


Abstract

A policy-oriented model of child growth and mortality is developed in the context of a stochastic state space model. The model incorporates unobserved heterogeneity as an unmeasured covariate which affects both mortality and an observed time varying covariate. It is demonstrated using Monte Carlo simulations that a model ignoring this unobserved heterogeneity will give biased parameter estimates; parameters are found to be unbiased if a model which allows for an unobserved variable is estimated. Monte Carlo simulations are then used to test the robustness of the model to misspecification of the distribution of the unobserved covariate. Estimates of the change in child survival are obtained using dynamic equations derived from a Kolmogorov-Fokker-Planck (KFP) equation. It is shown that the model which ignores unobserved heterogeneity produces incorrect estimates of the change in mortality that would result if certain types of mortality intervention programs were implemented.



Andrew Foster

Graduate Group in Demography, University of California at Berkeley

2234 Piedmont Avenue Berkeley, California 94720

USA

1. Introduction

The work to date considering the effect of unobserved heterogeneity on parameter estimates of hazard models is quite diverse in terms of the applications, methodologies, and specifications considered. Nonetheless, one assumption has been retained in virtually all cases: unmeasured covariates are assumed to be independent of the measured covariates. From a theoretical perspective this limitation is certainly justified: it is not at all surprising that results should be biased when an unobserved variable is correlated with the observed variables. From the perspective of someone who wishes to use parameter estimates to inform policy, however, the independence assumption may be less desirable. In the first place, if the variables really are independent then the biased estimate may be more informative to the policy maker even when an unbiased estimate can be obtained. Conversely, if the assumption of independence is invalid then a model of heterogeneity which assumes independence may obscure important relationships and lead to inappropriate policy conclusions.

In this paper we develop a model of child mortality which incorporates unobserved heterogeneity which is related to an observed covariate. The model is presented in the context of the stochastic space model introduced by Woodbury and Manton (1977) and recently extended to include unobserved heterogeneity (Yashin, Manton, and Vaupel, 1985; Yashin, 1984). Before presenting the details of the model, however, we provide a justification for its basic features by referring to recent work on infant and child mortality in the demographic and epidemiological literature. Monte Carlo simulations are then used to compare parameter estimates from a model which ignores the unmeasured covariate of mortality with estimates from a model which incorporates the dependence of the measured and unmeasured covariates. The robustness of results to misspecification of the distribution of unobserved covariates is also explored. Finally, equations describing the dynamic properties of the moments of the distribution of the covariates of mortality are used to generate predictions of the impact on mortality of various policy alternatives using the parameters generated in the two models.

2. A General Model of Infant and Child Mortality

A general model of child mortality should consider three types of factors. First, mortality will be influenced by observable variables which tend to be fixed over the first few years of a child's life. The most typical example of such a covariate in a demographic study would be mother's education, but other covariates describing the general features of a child's household or community, such as whether there is a latrine or running water, will also be of this type. This kind of covariate is ignored in the model considered here, but when analyzing data, fixed covariates may be included in a relatively straightforward fashion.

Second, mortality will be affected by time varying covariates that are generated by a random process which is influenced by the values of other covariates as well as the age of the child. A typical example of such a covariate would be an anthropometric measure such as weight or some transformation of variables such as weight for age.

In an estimated model one might wish to include height, which is thought to be a measure of long term nutritional deprivation (stunting), and weight for height, which is thought to be a measure of short term deprivation (wasting). (See, for example, Mosely 1985.) In order to properly consider the effect of such covariates on mortality it is necessary to have longitudinal studies with anthropometric data taken at relatively short intervals, perhaps monthly.

Finally, from the point of view of this paper, it is desirable to have measures of time varying covariates which are, for any individual, stationary (in the sense of ARMA models) throughout the period of observation. For example, the measure $w^* = \ln(w / w_s)$, where $w_s$ is a standard weight for age taken from an appropriate schedule, may be approximately stationary. In practice one might want to try a number of transformations to see which transformation most closely approximates stationarity.
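The log-ratio transformation described above can be sketched in a few lines. This is only an illustration: the function name, the symbol chosen for the standard weight, and the sample weights are hypothetical and are not taken from any growth schedule or from the paper.

```python
import math

def log_relative_weight(weight_kg, standard_weight_kg):
    """Return ln(w / w_std) for one measurement; both inputs must be positive."""
    if weight_kg <= 0 or standard_weight_kg <= 0:
        raise ValueError("weights must be positive")
    return math.log(weight_kg / standard_weight_kg)

# Hypothetical monthly weights and reference weights (kg), for illustration only.
observed = [3.1, 3.9, 4.6, 5.2]
standard = [3.3, 4.3, 5.2, 5.9]

# A child exactly on the standard curve would have a value of zero every month.
w_star = [log_relative_weight(w, s) for w, s in zip(observed, standard)]
```

A series that hovers around a constant level after this transformation is a candidate for the stationarity the model requires; a series that drifts suggests trying another transformation.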

The third set of factors affecting mortality consists of underlying unobserved covariates. These factors may, in principle, be either fixed or changing, but estimation is much more difficult if one does not believe that the unobserved covariates are fixed. Using the language of Vaupel et al. (1979) we may refer to a fixed unobserved covariate as frailty; however, the term frailty means something quite different for children growing up in a rural village of a poor country than it does, for example, for adults in developed countries. In the present context it is desirable to use the term frailty to consider not only unmeasured physical and mental attributes of an individual, but also the unmeasured components of the "disease environment" faced by a child. The motivation for considering frailty in this light can best be explained by a short review of a comparative study of child growth and survival in the rural areas of Costa Rica and Guatemala (Mata, 1973, 1985).

The basic subject of Mata's study is the interaction between disease and nutrition. While it is generally agreed that poorly nourished children are more likely to succumb to infections than their well nourished counterparts, it is also thought that the level of nourishment of a child is likely to depend on the patterns of infection he faces. Mata's results demonstrate convincingly the importance of this second factor. The infant mortality rate in Cauqué, Guatemala was observed to be more than 6 times as high as that in Puriscal, Costa Rica, a circumstance which is attributable, for the most part, to the quite favorable public health environment in Puriscal, particularly the almost universal vaccination coverage, piped water, and use of latrines and toilets. It was also observed that children of Cauqué had lower birth weights and less favorable growth than their counterparts in Puriscal.

Despite these very significant differences in the two towns, Mata demonstrated that diets of pregnant women and children in Cauqué were quite similar in quality and quantity to the diets in Puriscal. As such it seems that the low weights of the Cauqué children, as well as the high level of mortality, resulted not from inadequate feeding but from high levels of infection. Using the terminology discussed above, we would say that the children of Cauqué are especially frail because they face a less favorable disease environment. Unfavorable growth and higher mortality are both symptoms of this frailty.

In Mata's comparative study the disease environment in each town is reasonably well described by the series of variables considered. It is likely that, at least at the level of aggregation considered (the towns themselves), the unobserved component of the disease environment is relatively small. In general, however, there will not be sufficient information for the model considered here, where we wish to distinguish between the disease environment faced by different individuals rather than different towns.

At this point, the motivation for a model incorporating unobserved heterogeneity which is correlated with the observed variable becomes evident. Suppose we consider a policy which has the effect of raising a child's birth weight but no effect on the disease environment in which he spends his early years. If the birth weight is directly related to the mortality rate, and if both mortality and birth weight are affected by the disease environment, then a model which ignores the disease environment will overestimate the effect of the policy.

Before developing the model itself, it is helpful to consider the results of Trussell and Richards (1985), who used Heckman and Singer semi-parametric distributions to fit infant mortality data from Korea using several different time dependent hazard functions and sub-samples of the population. They found that parameter estimates for the mortality data were quite sensitive to the specification of the model as well as the sample considered. Noting that parameter estimates for fertility data from Korea were robust to changing specifications, they suggested that the volatility of the mortality model estimates may have resulted from the fact that relatively few closed intervals were observed in the mortality data.

A partial explanation is possible. When one considers a population in which few intervals are closed, there is relatively little scope for selection to affect results and therefore estimates of heterogeneity are likely to be unstable. Of course this fact taken by itself should not be a problem: one would expect that a maximum likelihood procedure would simply suggest that the addition of support points does not increase explanatory power. The authors found that they could add up to three support points and retain significance.

One explanation would attribute the observed results to the age pattern of mortality, which typically exhibits steep declines in the first few months or even weeks of life. While this drop in part may be attributed to heterogeneity, there are reasons to believe that the underlying curve is also reasonably steep. Since the parametric curves used by the authors are not able to capture this rapid decline well and still give a good fit to the remaining age specific mortality rates, an additional support point is required. Since the number of individuals dying in the first few months is small, the position of the support point is likely to be quite sensitive to changes in the sample; covariate parameter estimates may therefore also be sensitive. Since different curves have different abilities to capture the rapid drop in mortality curves, we would also expect results to be sensitive to the form used for the observed hazard function.

With this perspective it seems likely that the model developed here will not be subject to the same degree of instability encountered by Trussell and Richards. First, any population for which this model will be applicable is likely to have higher mortality than the Korean population Trussell and Richards considered. Since more events are observed there is more opportunity to observe selection. If the estimation of unobserved heterogeneity depends on observing a selection effect, as it normally does, a population with higher mortality is likely to provide more stable results. A second fact is that the higher mortality populations that might be studied with the model discussed here are also likely to exhibit less rapid declines of mortality than the Korean population. The tendency for higher mortality populations to exhibit mortality curves which are relatively flat can be attributed to the prevalence of exogenous or environmentally determined mortality in high mortality populations in addition to the endogenous mortality which is experienced in all populations. The relative smoothness of infant mortality curves in high mortality populations arises because exogenous infant mortality is concentrated at later ages than is endogenous mortality.

A far more important reason why the model developed here is not likely to exhibit the same instability is that, unlike the more usual models of unobserved heterogeneity, the model being developed here provides estimates using factors other than the selection effect. If a substantial portion of the population survives to the end of the survey, as will be the case even in the highest mortality populations, it is essential to be able to rely on factors other than selection to estimate the underlying distributions of heterogeneity.

3. Model Specification

With this background we can construct the model that is the focus of the paper. As mentioned above, it is helpful to present the model in the context of the stochastic space framework first discussed by Woodbury and Manton (1977), especially when one wishes to consider the impact of various alternative policies on child survival, because the dynamics of the state variables over time may be represented by a Kolmogorov-Fokker-Planck equation (KFP).

The basic model considers an $n$ dimensional state space through which individuals are postulated to move over time. Movement is governed in part by deterministic and in part by stochastic processes. Moreover, associated with each point in this space is a certain hazard of mortality.¹ Unfortunately, as analyzers of longitudinal survey data we are able to observe only $n - m$ dimensions; thus we must be content with analyzing the projections of the true movement of an individual onto this lower dimensional space.

The movement of an individual in this state space can be described using two equations, the equation of motion and the mortality equation. If $x$ is an $n$ dimensional vector and $\mu(t, x)$ is the mortality of an individual at point $x$ at time $t$, then the two equations can be written as follows:

where the term $d\xi$ arises from a Wiener process. If $f_t(x)$ is the joint density function of $x$ and the probability of the event that an individual survives to time $t$, then the following KFP equation governs the changing distribution of $x$ over time (Woodbury and Manton, 1977):

In practice it is quite helpful to put several restrictions on the equations of motion. If we assume that $A(t, x)$ is linear in $x$, $B$ is independent of $x$, and $C(t, x)$ is a proportional hazard which is quadratic in $x$, then it will be the case that if the initial distribution of $x$ is Gaussian then the distribution conditional on survival will be Gaussian at any future time. The value of a model which retains the distributional shape of $x$ has been discussed by a number of authors (see, for example, Hougaard, 1982). In this paper the result is particularly helpful when we wish to make predictions about the effects on mortality of changing parameter values.
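For orientation, one general form consistent with the drift $A$, the diffusion coefficient $B$, and the hazard $C$ named in this section can be sketched as follows. This is an assumption based on the standard diffusion-with-mortality setup, not a transcription of the displayed equations; the notation $d\xi_t$ for the Wiener increment, $\mu$ for the hazard, and $f_t(x)$ for the survival-weighted density is chosen here for illustration.

```latex
\begin{align}
dx_t &= A(t, x_t)\,dt + B(t, x_t)\,d\xi_t, \tag{1}\\
\mu(t, x) &= C(t, x), \tag{2}\\
\frac{\partial f_t(x)}{\partial t}
  &= -\sum_{i} \frac{\partial}{\partial x_i}\bigl[A_i(t, x)\, f_t(x)\bigr]
     + \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial x_i\,\partial x_j}
       \bigl[(B B^{\top})_{ij}(t, x)\, f_t(x)\bigr]
     - \mu(t, x)\, f_t(x).
\end{align}
```

In the last line the first two terms are the usual drift and diffusion terms of a Fokker-Planck equation, and the final term removes probability mass at the rate at which individuals at $x$ die. Under the restrictions discussed above, $B$ no longer depends on $x$.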

If $x$ consists of one observed and one unobserved variable, and $A$ is linear and $C$ is quadratic only in the unobserved variable, then it will no longer necessarily be the case that the observed variable will be Gaussian; however, the unobserved variable conditional on the values of the observed variable up to time $t$ will be Gaussian (Yashin et al., 1985). This is the basic model used in the estimation portion of the paper. It was selected because it is less restrictive than the completely Gaussian model discussed above and yet still has a likelihood function that can be evaluated without resorting to numerical integration.

¹ Actually, Woodbury and Manton's model considered three levels of space, in the first of which mortality was entirely deterministic. For simplicity we focus on the second and third levels of space.

Using this general framework we can describe the process of growth and mortality as follows: Assume that the transformed weight variable ($w$) follows a stationary stochastic process conditional on frailty ($y$) and that the frailty of an individual is fixed. Since mortality and weight are not likely to change very much over the course of a month, and since the chances of dying in the one month period are quite small, the continuous equations of motion are quite representative of the discrete equations that will be used in the analysis. Equations (1) and (2) become

where $e_t \sim N(0, 1)$. In order to make $w$ stationary conditional on $y$, we also need to specify the initial conditions:

and

where

and $v_1 = \dfrac{b^2}{1 - (1 + a_{11})^2}$.

The distributions have been specified as normal to take advantage of the distributional properties of the general model.

There are several features of the above relations which are worthy of note. First, $a_{11}$ is assumed to be negative. This assumption is based on the phenomenon of "catch up growth" which is well recognized in the literature on human growth and nutrition (for example, Martorell et al., 1979). The idea is that children who fall below their appropriate weight for age due to some random environmental shock tend to experience more rapid growth in order to "catch up" with the appropriate time path of growth. If $a_{11}$ is greater than $-1$, which it ought to be in practice, then the equation for $\Delta w$ conditional on $y$ can be written as a stationary AR(1) process with a mean $m_1$.
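The stationarity claim can be checked numerically with a small sketch. It assumes, purely for illustration, that the monthly change in weight is a pull of strength a11 toward a mean plus a shock of standard deviation b, so that the deviation from the mean is multiplied by (1 + a11) each month. Under that assumption the variance converges to b^2 / (1 - (1 + a11)^2). The parameter values below are hypothetical.

```python
def stationary_variance(a11, b):
    """Closed-form limiting variance of an AR(1) process with coefficient 1 + a11."""
    phi = 1.0 + a11
    if not abs(phi) < 1.0:
        raise ValueError("stationarity requires -2 < a11 < 0")
    return b * b / (1.0 - phi * phi)

def iterate_variance(a11, b, v0, steps):
    """Propagate the variance month by month: v <- (1 + a11)^2 * v + b^2."""
    phi = 1.0 + a11
    v = v0
    for _ in range(steps):
        v = phi * phi * v + b * b
    return v

a11, b = -0.3, 0.1          # hypothetical values with -1 < a11 < 0
v_limit = stationary_variance(a11, b)
v_iter = iterate_variance(a11, b, v0=0.0, steps=200)
```

Starting the initial weight at the limiting variance, rather than at an arbitrary value, is what keeps the simulated series stationary from the first month onward.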

The dependence of $m_1$ on the frailty of individuals is incorporated in order to account for the fact that the biological tendency for catch up growth may be counteracted by continued adverse environmental conditions. Children exposed to high levels of infection may continue to lose ground with respect to a standard growth path.

The relation between the distribution of $w_0$ and frailty is a direct result of the assumption of stationarity of the process for each individual with a fixed frailty $y$. This assumption is not only helpful in practice, it is also plausible. As mentioned above in the discussion of Mata, children from an adverse environment have lower birth weights as well as less favorable growth.

The assumption that frailty is fixed for an individual also deserves some comment. While perhaps the most important reason for fixing frailty is that a model of varying frailty faces serious identification problems, the assumption does not seem unreasonable in the context being considered. It is unlikely that there will be substantial changes in the disease environment faced by a child over the course of a few years. Even when there are changes (as a result, perhaps, of a development project on the one hand, or an epidemic on the other), it is likely that the changes will be shared by all the children in a village. If the ranking of frailty remains unaffected then the model is likely to still fit reasonably well.

One final note concerns the specification of the mortality function, which has been assumed fixed in each one month period. The mortality curve is thus not assumed to be a true Gompertz but a discrete approximation of the Gompertz. As long as the parameter $c_2$ is not large in absolute value this assumption will have little effect on the estimated parameters.

4. Monte Carlo Simulations

Before considering the results of the Monte Carlo simulations a short note on the approach that was used is in order. First, values were assigned to the various parameters of the model. Although the assumed values are not necessarily close to what would be observed if the model were fit to data, an attempt was made to choose coefficients that produced an observed pattern of weight dynamics and mortality that was at least representative of what is observed in practice. Second, each simulated birth was assigned a random frailty drawn from a normal distribution. Using this value of frailty, an appropriate initial weight could be drawn for that individual conditional on his frailty. An iterative procedure was then used to simulate the series of weights and the death time of that child. At each time a probability of death was calculated conditional on $t$, $y$, and $w_t$. A random number was drawn from a uniform distribution to see if the individual survived the month. If he did, then a value of $e_t$ was drawn from a normal distribution and weight in the $t + 1$ period was calculated. Surviving individuals were censored at 30 months. The series of weights and the death time of each individual were then saved. The value of $y$ was not saved since $y$ represents information which would not be available in collected data that one wished to analyze.
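The simulation steps above can be sketched as follows. The functional forms are assumptions made for illustration, since the paper's display equations are not reproduced here: frailty is drawn around 1, the mean weight is taken to be proportional to frailty, and the monthly hazard is taken to be c1 * y^2 * exp(c2 * t + c3 * w). All parameter values and the function name are hypothetical.

```python
import math
import random

def simulate_child(rng, a11=-0.3, m=-0.2, b=0.1, c1=0.005, c2=-0.05, c3=-1.0,
                   frailty_sd=0.3, max_months=30):
    """Return (weights, died): the monthly weight series and whether a death ended it."""
    y = rng.gauss(1.0, frailty_sd)            # fixed frailty; deliberately not returned
    mean_w = m * y                            # assumed link between frailty and mean weight
    phi = 1.0 + a11
    v1 = b * b / (1.0 - phi * phi)            # stationary variance of weight given frailty
    w = rng.gauss(mean_w, math.sqrt(v1))      # initial weight drawn conditional on frailty
    weights = [w]
    for t in range(max_months):
        hazard = c1 * y * y * math.exp(c2 * t + c3 * w)   # assumed form, quadratic in y
        if rng.random() < -math.expm1(-hazard):           # uniform draw against death probability
            return weights, True
        w = w + a11 * (w - mean_w) + b * rng.gauss(0.0, 1.0)   # next month's weight
        weights.append(w)
    return weights, False                     # survivor, censored at max_months

rng = random.Random(1985)
cohort = [simulate_child(rng) for _ in range(100)]
```

Only the weight series and the death indicator leave the function, mirroring the point that frailty would not be available in collected data.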

4.1. Estimation

A maximum likelihood procedure was then used to estimate the parameters of the model. Since we wished to compare the results of a model incorporating unobserved heterogeneity and a model which ignored such heterogeneity, two different likelihood functions were maximized using the same simulated data.

The first likelihood function (Model A) can be described as the "true" model because, with one small exception,² it is based on the model used to generate the data. Consider an individual $i$ who is observed for $T_i$ months and then dies. The likelihood of observing the particular values that were simulated for each individual conditional on $y$ is the following:

² The mean and variance of the initial distribution of $w$ were estimated separately, although the assumption of stationarity used in generating the data allows one to write these parameters in terms of other parameters in the model. The only real cost of this approach is that 2 degrees of freedom are lost. With real data, where one is unsure that the observed covariates are in fact stationary, one can compare the actual estimated values of these two parameters with the values they should have based on the estimates of the other parameters. If the two estimates are inconsistent, then it may be advisable to try an alternative transformation of the time varying covariate.


If the individual was censored at time $T_i$, then the final multiplicand is omitted. Integrating over the possible values of $y$, we obtain:³

Because each of the density functions is normal and because the mortality function is quadratic in $y$, the above equation can be written in the form:

for which we can obtain an analytical expression. Taking the log of the unconditional likelihood and allowing for the possibility of censoring we obtain:

where

$[\,\cdots + (T_i + 2)\ln(2\pi) + T_i \ln(b^2) + \ln(v_2) + \ln(v_1)] + g_i[\ln(c_1) + c_2 T_i + c_3 w_{T_i}]$

and $g_i$ takes the value zero if an individual is censored, one otherwise.
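The reason no numerical integration is needed can be illustrated with a standard Gaussian integral. As a sketch, assume the integrand over frailty reduces to a quadratic term times a Gaussian kernel, y^2 * exp(-a*y^2 + b*y) with a > 0; the symbols a and b here are generic and are not the paper's parameters. The closed form is sqrt(pi/a) * exp(b^2/(4a)) * (1/(2a) + b^2/(4a^2)), which the code compares against a brute-force trapezoid sum.

```python
import math

def closed_form(a, b):
    """Integral over the real line of y**2 * exp(-a*y**2 + b*y), valid for a > 0."""
    return (math.sqrt(math.pi / a) * math.exp(b * b / (4 * a))
            * (1 / (2 * a) + b * b / (4 * a * a)))

def trapezoid(a, b, half_width=12.0, n=20_000):
    """Trapezoid-rule check on a wide window centred on the Gaussian peak b / (2a)."""
    center = b / (2 * a)
    h = 2 * half_width / n
    total = 0.0
    for k in range(n + 1):
        y = center - half_width + k * h
        f = y * y * math.exp(-a * y * y + b * y)
        total += f if 0 < k < n else 0.5 * f   # end points get half weight
    return total * h

exact = closed_form(1.5, 0.8)
approx = trapezoid(1.5, 0.8)
```

The two values agree to many digits, which is the sense in which a likelihood built from normal densities and a quadratic hazard remains analytically tractable.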

³ Note that the probability of a death between time $T_i$ and time $T_i + 1$ is actually $1 - \exp(-\mu(T_i, y, w_{T_i}))$. The expression used here is a good approximation, however, for monthly mortality rates because the monthly probability of death is quite small. If one uses the exact expression then the likelihood becomes more complicated but remains tractable.
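A quick numerical check of this approximation, using hypothetical monthly hazard values, shows that replacing the exact death probability 1 - exp(-hazard) by the hazard itself gives a relative error of roughly half the hazard:

```python
import math

def exact_death_probability(hazard):
    """Probability of dying within one period under a constant hazard."""
    return -math.expm1(-hazard)   # equals 1 - exp(-hazard), computed accurately

def relative_error(hazard):
    """Relative error made by using the hazard in place of the exact probability."""
    exact = exact_death_probability(hazard)
    return (hazard - exact) / exact

# Hypothetical monthly hazards, for illustration only.
errors = {mu: relative_error(mu) for mu in (0.001, 0.01, 0.05)}
```

For a monthly hazard of 0.01 the relative error is about half a percent, which supports treating the simpler expression as adequate at monthly intervals.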


The second model (Model B) to be considered ignores unobserved heterogeneity but is otherwise the same as Model A.⁴ Ignoring heterogeneity amounts to assuming that the distribution of frailty has mean 1 and variance 0. The resulting likelihood function is a bit simpler:

where again $g_i$ is an indicator of whether or not the individual was censored. As Manton and Woodbury (1985) have pointed out, this type of model is separable into two parts. As a result, it is possible to estimate the weight dynamics equation independently of the mortality equation. Since we wished to compare the results of the two estimation processes, however, it seemed desirable to ignore this separability by estimating the two parts of the model simultaneously.

4.2. Basic Results

Tables 1a and 1b present the results of the Monte Carlo simulations for the two models along with the starting values that were used to generate the data. Each model was fit to 45 simulated data sets with 100 observations in each data set. Several aspects of the tables deserve comment. First, Model A gives very reasonable parameter estimates. Not only are the parameter estimates within a 95% confidence interval of their true values, but the variances estimated from the inverse information matrix seem to be reasonably good estimates of the variances of the parameter estimates observed in the sample of 45. This result is to be expected: since Model A is based on the true model, we expect the estimates to be both consistent and efficient.

4 We are in the process of experimenting with an intermediate model in which unobserved heterogeneity is incorporated but assumed to be independent of the observed covariates. Some speculation is provided in the final section about what we expect to learn from these experiments.

It is equally clear that Model B gives biased estimates of the parameters of interest. The parameter relating weight to mortality (c3) is about twice its true value. In addition, the parameter estimates in the dynamic portion of the model are biased. The estimated value of a11, for example, suggests that an individual who falls below the standard weight due to a random event will be slower to return than is the case in the true model (i.e., "catch up growth" has been underestimated).

It appears, then, that if one really can believe that the specified model captures the essence of the process operating in nature, then Model A is superior to Model B. Since in practice it is quite unlikely that the estimated model will accurately capture the process which generated the data, it is important to test the robustness of the conclusion that Model A is superior.

5. Robustness to Misspecification of Unobserved Distribution

There are reasons one might question any one of the assumptions incorporated into the model thus far. Nonetheless, one of the assumptions seems particularly heroic: that the underlying frailty is distributed normally at the time of birth. Moreover, this assumption is perhaps the most difficult to test. If one suspects that the process of weight growth is not Markovian, then one can rewrite the likelihood function to incorporate an AR(2) process. Since the AR(1) and AR(2) processes are nested, one may use a chi-square test to test the hypothesis that the AR(1) model is appropriate. If one wishes to test alternative distributional assumptions, it is rarely possible to construct nested models.⁵ It may be difficult even to fit models incorporating other distributional functions, since numerical integration of the conditional likelihood function over the possible values of frailty may be necessary.
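The nested comparison of the AR(1) and AR(2) specifications can be carried out as a likelihood-ratio test. The sketch below assumes that the AR(2) model adds exactly one free parameter (the second lag coefficient), so the statistic is referred to a chi-square distribution with one degree of freedom. The log-likelihood values are hypothetical and are not taken from the paper.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def likelihood_ratio_test(loglik_restricted, loglik_unrestricted):
    """Return the LR statistic and its p-value for one extra free parameter."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    return stat, chi2_sf_1df(max(stat, 0.0))

# Hypothetical maximized log-likelihoods, not taken from the paper.
loglik_ar1 = -1523.4   # restricted model: AR(1) weight dynamics
loglik_ar2 = -1520.9   # unrestricted model: AR(2) weight dynamics

stat, p_value = likelihood_ratio_test(loglik_ar1, loglik_ar2)
print(f"LR statistic = {stat:.2f}, p-value = {p_value:.4f}")
```

A small p-value would lead one to reject the AR(1) restriction. No comparably simple test is available for the distributional assumption, which is the difficulty the paragraph above describes.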

In order to test the robustness of these estimation procedures to misspecifications of the underlying distribution of frailty, we generated data using a two point distribution and then fitted the data assuming a normal distribution. By fixing the mean and variance of the two point distribution and changing the probability associated with the first of the two points we could at least get a sense for the robustness of parameter estimates to changing distributional assumptions.
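A two point distribution with a prescribed mean and variance is fully determined once the probability of the first point is chosen. The sketch below shows one way to compute the two support points. It assumes that the first point is the larger of the two, so that a high probability on the first point produces negative skewness. The mean of 1.0 and variance of 0.1 are the initial frailty moments shown in Table 3, and the probabilities 0.6 and 0.8 are illustrative only.

```python
import math

def two_point_support(mean, variance, p_first):
    """Support points of a two-point distribution with given mean and variance.

    p_first is the probability of the first point, taken here to be the
    larger of the two (an assumption made for this sketch).
    """
    if not 0.0 < p_first < 1.0:
        raise ValueError("p_first must lie strictly between 0 and 1")
    sd = math.sqrt(variance)
    first = mean + sd * math.sqrt((1.0 - p_first) / p_first)
    second = mean - sd * math.sqrt(p_first / (1.0 - p_first))
    return first, second

def skewness(p_first):
    """Skewness of the two-point distribution defined above."""
    return (1.0 - 2.0 * p_first) / math.sqrt(p_first * (1.0 - p_first))

for p in (0.2, 0.4, 0.6, 0.8):  # illustrative probabilities
    first, second = two_point_support(1.0, 0.1, p)
    print(f"p={p:.1f}  points=({first:.3f}, {second:.3f})  skewness={skewness(p):+.2f}")
```

Holding the first two moments fixed in this way means that any change in the fitted parameters across probabilities reflects the shape of the frailty distribution rather than its location or spread.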

5 With discrete point distributions, such as those of Heckman and Singer (1982), nested models are possible.


Table 1. Results from repeated simulations of model (100 individuals, 45 simulations).

Table 1a: Model A

Parameter    Input value    Average estimate    Sample variance    Average asymptotic variance

Table 1b: Model B

Parameter    Input value    Average estimate    Sample variance    Average asymptotic variance


It is evident from Table 2 that the conclusion that Model A gives more accurate parameter estimates is retained even when the underlying distribution is misspecified. It also seems that the parameter estimates are biased, especially for a two point distribution with a high probability associated with the first point (and thus a strong negative skewness). Nonetheless, given the estimated variances that are obtained for sample sizes of 250, it is generally not possible to reject the hypothesis that the actual value of the parameter is -0.5 even with the most extreme distribution tested. In any case, the Model B parameter estimates of c3 are still severely biased in a negative direction.

Of course the two point distribution is not representative of all possible distributions of underlying frailty. One cannot conclude on the basis of these results that the estimates will be equally robust, for example, when confronted with data generated by extreme value distributions of frailty. Nonetheless, the results obtained thus far are encouraging.

6. Prediction Equations

Up until this point it has been assumed that a model that produces biased parameter estimates is always inferior to a model that produces unbiased parameter values. In practice this does not always follow. First, although it may be the case that certain of the parameters are biased, it is also possible that some combination of those parameters will be well estimated by a simpler model. For example, the ratio a12/a11 is accurately estimated by Model B despite the fact that the estimates of a12 and a11 are both biased towards zero. Second, as was mentioned earlier, it is possible that the biased parameter estimates will provide a policy maker with more accurate information than the unbiased ones.

A simple example of this second idea can be constructed. Suppose there are two groups of people with mortality curves that are related as follows. For the first group μ(t) = F exp(y) and for the second group μ(t) = F exp(A + y). It is further assumed that at time 0 the distributions of y are the same in the two groups. However, due to the process of selection, individuals in group A will have a different distribution of y at some later time than will those not in group A. As a result a model which ignores frailty will give an estimate of A which is biased toward zero, as suggested by the Monte Carlo work of Ridder and Verbakel (1983).

Table 2. Effect of misspecified distribution of unobserved variables on estimate of c3 (250 individuals, 12 simulations per probability).

Table 2a: Model A

Probability
of point 1:         0.2                 0.4
Simulation      c3    Variance      c3    Variance      c3    Variance      c3    Variance

Mean:
Estimated
Variance:

Table 2b: Model B

Probability
of point 1:         0.2                 0.4
Simulation      c3    Variance      c3    Variance      c3    Variance      c3    Variance
     1        -0.94     0.01      -0.93     0.01      -1.11     0.02      -0.79     0.02
     2        -1.17     0.01      -1.08     0.01      -0.72     0.01      -0.69     0.01
     3        -0.64     0.01      -1.05     0.01      -1.00     0.01      -1.59     0.01
     4        -1.01     0.01      -1.25     0.01      -0.90     0.01      -0.92     0.01
     5        -0.96     0.01      -1.08     0.01      -0.73     0.01      -1.15     0.01
     6        -1.05     0.01      -0.74     0.01      -1.21     0.02      -0.97     0.02
     7        -1.07     0.01      -0.94     0.02      -0.90     0.01      -1.00     0.02
     8        -0.69     0.01      -0.82     0.01      -0.97     0.01      -0.88     0.02
     9        -0.82     0.01      -0.99     0.01      -0.91     0.01      -0.97     0.02
    10        -1.18     0.01      -0.95     0.01      -0.85     0.01      -0.89     0.01
    11        -0.94     0.01      -0.90     0.01      -0.97     0.01      -1.05     0.01
    12        -0.86     0.01      -0.81     0.01      -1.31     0.01      -0.81     0.01

Mean:         -0.94
Estimated
Variance:      0.029

Now suppose we wish to know the mortality curve that will be experienced by an individual who moves into group A from the other group at time 0. Since the initial distribution of frailty in the two populations is the same, it is evident that the best estimate of his new mortality pattern will be the observed mortality pattern of the group he has joined. Since the biased estimate is based on the observed mortality, it is likely to be a more accurate representation of the effect of moving into group A than the unbiased estimate. The unbiased parameter will give an unbiased estimate of the relative risk of members of the two populations only at time 0.
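The attenuation of the observed relative risk under selection can be illustrated with a small calculation. This is a sketch under stated assumptions, not the simulation used in the paper: frailty y takes two equally likely values, the baseline level F is constant over time, and all numerical values (F = 0.01, y in {0, 2}, A = 0.5) are hypothetical.

```python
import math

def observed_hazard(t, shift, baseline=0.01, frailties=(0.0, 2.0), probs=(0.5, 0.5)):
    """Population hazard at time t when individual hazards are baseline*exp(shift + y)."""
    hazards = [baseline * math.exp(shift + y) for y in frailties]
    # Share of each frailty class still alive at time t (unnormalized).
    survivors = [p * math.exp(-h * t) for p, h in zip(probs, hazards)]
    return sum(h * s for h, s in zip(hazards, survivors)) / sum(survivors)

A = 0.5  # hypothetical log relative risk for the second group

def observed_ratio(t):
    """Ratio of the observed hazards of the two groups at time t."""
    return observed_hazard(t, A) / observed_hazard(t, 0.0)

for month in (0, 10, 20, 30):
    print(f"month {month:2d}: observed ratio = {observed_ratio(month):.3f}"
          f"  (individual ratio = {math.exp(A):.3f})")
```

At time 0 the observed ratio equals the individual-level ratio exp(A). Over the following months it falls toward one, because the higher-risk group loses its frail members faster; this is the sense in which the frailty-ignoring estimate of A is biased toward zero yet still describes the mortality an entrant to that group would actually observe.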

With this perspective in mind it seems essential to have a way of determining what the effect of a biased parameter estimate will be on the predicted result of some policy which serves to alter the value of one or more of the parameters in the model. One way that this may be done is to simply change a parameter and then generate a new set of data using the simulation procedure. By then comparing the observed mortality rates in the two cases we will be estimating the effect of some policy. In practice, however, this approach may be impractical since a large sample size is required to obtain estimates of the observed mortality which have small variance. (Figure 1, for example, illustrates the size of the variation which arises from simulated data with a sample size of 1000.) Moreover, in most cases it is difficult to determine the variance of the estimated mortality so that an appropriately large sample size may be selected in the first place.

An alternative approach is to use the equations describing the dynamics of the moments of the underlying distribution and the observed mortality in the context of the stochastic state space model. These equations for the multivariate case are derived in Woodbury and Manton (1977). Yashin, Manton, and Vaupel (1985) generalize the equations by including an observed and an unobserved variable in the model together. Yashin (1984) provides the mathematical underpinnings of the general model and also presents an elegant derivation of the multivariate prediction equations.

The basic results can be expressed as follows: if the initial distribution of unobserved variables is multivariate normal, if the equations of motion are linear, and if the mortality function is quadratic in the state variables, e.g.

then the following equations are true:

where m(t) and V(t) are the vector of means and matrix of covariances, respectively, of the mortality determinants conditional on survival to time t.
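For orientation, a result of this kind is usually stated in continuous time in roughly the following form. The notation used here (a_0, a_1, b, Q, \mu_0, with Q symmetric and positive semidefinite) is assumed for this sketch and may differ from that of equations (7), (8), and (9).

```latex
% State dynamics and quadratic hazard (assumed notation)
dY_t = \bigl(a_0(t) + a_1(t)\,Y_t\bigr)\,dt + b(t)\,dW_t,
\qquad
\mu(t, Y_t) = \mu_0(t) + Y_t^{\top} Q(t)\, Y_t

% Moments of Y_t conditional on survival to t
\frac{dm(t)}{dt} = a_0(t) + a_1(t)\,m(t) - 2\,V(t)\,Q(t)\,m(t)

\frac{dV(t)}{dt} = a_1(t)\,V(t) + V(t)\,a_1(t)^{\top} + b(t)\,b(t)^{\top} - 2\,V(t)\,Q(t)\,V(t)

% Observed (population) mortality
\bar{\mu}(t) = \mu_0(t) + m(t)^{\top} Q(t)\, m(t) + \operatorname{tr}\bigl(Q(t)\,V(t)\bigr)
```

The terms in -2VQm and -2VQV are the selection effects: mortality removes individuals with large values of the quadratic form, which pulls the conditional mean toward lower risk and shrinks the conditional covariance, while the distribution among survivors remains normal.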

If the true model met the conditions of this theorem completely then there would be no hesitancy in applying these equations in order to predict the effect on observed mortality of certain policy experiments. As may be remembered, however, the specification of the mortality curve was exponential rather than quadratic in w. As a result it is necessary to use an approximation of the actual mortality curve so that the above equations may be applied. The effect of making this simplifying assumption on results can be estimated through the use of simulations.

A Taylor expansion of μ(t, w, y) about the means m1 and m2 of w and y at time zero can be written as follows:

If this equation is then used to generate a series of values of μ̄(t), then we will have at least an approximation of the expected observed mortality curve. In order to test the extent to which the approximation deviates from the true results we simulated 1000 data points using the "true" exponential model and the quadratic approximation of that model. In Figure 1, these two curves are plotted along with the values of l(t) obtained using the prediction equations for the quadratic model, (7), (8), and (9). The data generated from the quadratic mortality function should have as its expectation l(t). If the approximation is reasonable, then the curve generated from the exponential model should be similar to the curve generated from the quadratic model. It seems from Figure 1 that the approximation is acceptable.
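The quality of a second-order expansion in the weight variable alone can be gauged with a short calculation. The sketch below covers only the factor exp(c3*w) and ignores the frailty dimension, so it is a partial illustration rather than the full expansion. The function names are hypothetical; the values c3 = -0.5, m1 = -1.0 and a weight variance of 0.333 are the input value mentioned in the text and the month-0 moments shown in Table 3.

```python
import math

def exp_term(w, c3):
    """Exponential dependence of mortality on weight, up to a constant factor."""
    return math.exp(c3 * w)

def quadratic_approx(w, c3, m1):
    """Second-order Taylor expansion of exp(c3*w) about w = m1."""
    d = c3 * (w - m1)
    return math.exp(c3 * m1) * (1.0 + d + 0.5 * d * d)

c3, m1, var_w = -0.5, -1.0, 0.333   # values taken from the text and Table 3
sd_w = math.sqrt(var_w)

# Compare the two functions at the mean and at one and two standard deviations.
for k in (-2, -1, 0, 1, 2):
    w = m1 + k * sd_w
    exact, approx = exp_term(w, c3), quadratic_approx(w, c3, m1)
    print(f"w = {w:+.3f}: exact = {exact:.4f}, quadratic = {approx:.4f}, "
          f"rel. error = {(approx - exact) / exact:+.2%}")
```

Within one standard deviation of the mean the quadratic form tracks the exponential closely, and the error grows in the tails, which is consistent with the approximation being acceptable for a population whose weight distribution changes little over the period.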

Even if large differences were observed it would not be necessary to abandon the approach altogether, since even if the mortality curves tend to be quite different, the relative mortality arising from a change in one of the coefficients may be well estimated using the approximation. This issue needs to be investigated in greater detail by comparing simulated results with predicted results arising from the dynamic equations.


Figure 1. [Horizontal axis: Months. Curves: a-a, prediction equations; b-b, simulation, true model; c-c, simulation, quadratic model.]

Table 3 provides a printout generated by the prediction equations when the true (i.e., input) parameters are known. The observed rising mean weight and falling mean frailty are certainly expected due to selection. The relatively small changes observed in the moments of the covariates suggest that the process of selection is having relatively little effect on parameter estimation. This result is not surprising since 70% of the individuals survive until their 30th month.

Table 3. Prediction estimates based on Model A.

Month    μ̄         l        Mean(w)   Variance(w)   Mean(y)   Variance(y)   Covariance(w,y)
  0      0.0147    1.000    -1.000    0.333         1.000     0.100         -0.100
  1      0.0144    0.985    -0.995    0.349         0.997     0.100         -0.099
  2      0.0142    0.971    -0.991    0.349         0.994     0.099         -0.099
  3      0.0140    0.958    -0.988    0.348         0.992     0.099         -0.099
  4      0.0138    0.944    -0.985    0.348         0.989     0.099         -0.098
  5      0.0135    0.931    -0.982    0.347         0.986     0.099         -0.098
  6      0.0133    0.919    -0.980    0.347         0.984     0.098         -0.098
  7      0.0131    0.907    -0.977    0.347         0.981     0.098         -0.097
  8      0.0129    0.895    -0.975    0.347         0.979     0.098         -0.097
  9      0.0127    0.883    -0.972    0.346         0.976     0.097         -0.097
 10      0.0126    0.872    -0.970    0.346         0.974     0.097         -0.097
 11      0.0124    0.861    -0.967    0.346         0.971     0.097         -0.096
 12      0.0122    0.851    -0.965    0.346         0.969     0.097         -0.096
 13      0.0120    0.840    -0.963    0.345         0.966     0.096         -0.096
 14      0.0118    0.830    -0.960    0.345         0.964     0.096         -0.096
 15      0.0116    0.821    -0.958    0.345         0.962     0.096         -0.095
 16      0.0115    0.811    -0.956    0.345         0.960     0.096         -0.095
 17      0.0113    0.802    -0.954    0.344         0.957     0.095         -0.095
 18      0.0111    0.793    -0.952    0.344         0.955     0.095         -0.095
 19      0.0110    0.784    -0.950    0.344         0.953     0.095         -0.094
 20      0.0108    0.775    -0.948    0.344         0.951     0.095         -0.094
 21      0.0107    0.767    -0.945    0.344         0.949     0.094         -0.094
 22      0.0105    0.759    -0.943    0.343         0.947     0.094         -0.094
 23      0.0104    0.751    -0.941    0.343         0.945     0.094         -0.094
 24      0.0102    0.743    -0.940    0.343         0.943     0.094         -0.093
 25      0.0101    0.736    -0.938    0.343         0.941     0.094         -0.093
 26      0.0099    0.728    -0.936    0.342         0.939     0.093         -0.093
 27      0.0098    0.721    -0.934    0.342         0.937     0.093         -0.093
 28      0.0097    0.714    -0.932    0.342         0.935     0.093         -0.093
 29      0.0095    0.707    -0.930    0.342         0.933     0.093         -0.092

Nonetheless, the covariance term, which is negative and decreases in magnitude over time, provides insight into the likely result of a model which incorporates unobserved heterogeneity but assumes that the unobserved variables are uncorrelated with the observed variables (see footnote 4). Since the effect of selection is to make the two covariates less dependent on each other over time, and since the bias observed in Model B is a direct result of the covariance between them, we may postulate that a model of unobserved heterogeneity which controls only for the effect of selection would give parameter estimates with more bias than a model of infant growth and mortality which ignores unobserved variables altogether. If this hypothesis is supported by future work then there will be additional reason for giving careful consideration to possible relationships between unobserved variables and observed ones in the process of model estimation.

6.1. Prediction Equation Results

Table 4 represents, in some sense, the motivation for the entire discussion up to this point. In it we compare the effect of four policy alternatives on the estimated probability of surviving to the 30th month for the two different models. Estimates were generated by simulating a large data set (1000 children) and then fitting the two models (Table 5). Parameter estimates were then used in the dynamic equations described above in order to produce estimated survival curves. The dynamic equations were similar in the two cases, with the mean of y set to one and the variance to zero in Model B. For each policy we estimated the elasticity of the probability of dying by the 30th month by increasing the appropriate parameter by 10% and noting the percentage change in the estimated probability of dying that the prediction equations produced. Since Model A is the fitted version of the true model, the values obtained from Model A are good estimates of the true impact of a given change in a parameter. Model B, then, gives poor estimates of the effect of specific policies on survival chances when the results of Model B differ by a large amount from those of Model A.
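The elasticity calculation described above amounts to a simple finite difference, sketched below with the proportions dying reported in Table 4. The function name is hypothetical. The Model A entry for c3 is taken as 0.235, which is the value consistent with its reported elasticity of 0.49. Because the inputs are rounded to three decimals, the computed elasticities can differ from the tabulated ones in the second decimal place.

```python
def elasticity(base, perturbed, relative_change=0.10):
    """Percent change in the proportion dying per 1% change in a parameter."""
    return ((perturbed - base) / base) / relative_change

# Proportions dying by month 30, as read from Table 4.
base = {"A": 0.224, "B": 0.217}
after_10pct_rise = {
    "m1":  {"A": 0.224, "B": 0.219},
    "c3":  {"A": 0.235, "B": 0.240},
    "a12": {"A": 0.234, "B": 0.233},
    "m2":  {"A": 0.248},            # Model B has no frailty parameter
}

for name, by_model in after_10pct_rise.items():
    for model, value in by_model.items():
        print(f"{name:>3} Model {model}: elasticity = {elasticity(base[model], value):.2f}")
```

The same differences give the changes in survivors per 1000 births quoted in the discussion of each policy: a change of 0.002 in the proportion dying corresponds to 2 children per 1000.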

The first policy considered involves raising the mean birth weight and assuming that all other parameters remain fixed. A program which provides nutritional supplementation, prenatal care, or other benefits to pregnant women would produce this result and is in some ways relatively easy to administer (compared, for example, to making sure that infants are adequately fed). In any case, given a program of this sort there are two reasons that we might expect the estimates from Model B to be overly optimistic. First, since the effect of weight on mortality was overestimated in Model B, we will overestimate the effect of raising the weight at birth on mortality during the first month of life. Second, since Model B produces an estimate of a11 which is biased towards zero, suggesting a first order autoregressive process which adjusts to shocks relatively slowly, we will overestimate the mean weights that will be observed during the months immediately following birth. This effect increases the extent to which mortality declines will be overestimated in the model which ignores heterogeneity. Table 4 confirms the bias.

While Model A suggests that the policy will have little effect, Model B predicts a small fall in mortality if birth weights rise. Specifically, Model B predicts that a 10% rise in mean weight at birth will lead to a 0.9% fall in the probability of dying by age 30 months. At the predicted level of mortality, this increases the number of surviving children per 1000 births by 2.

Table 4. Predicted changes in mortality

                          Model A                     Model B
                     1 - l    Estimated          1 - l    Estimated
                              elasticity¹                 elasticity¹
Base values          0.224    -                  0.217    -
m1 := m1*1.1         0.224    0.00               0.219    0.09
c3 := c3*1.1         0.235    0.49               0.240    1.06
a12 := a12*1.1       0.234    0.46               0.233    0.74
m2 := m2*1.1         0.248    1.07               -        -

¹ Percent change in the proportion dying for a 1% change in the specified parameter. In each case mortality will rise if the parameter value rises.

Table 5. Large sample size estimates (1000 individuals).

                              Model A                     Model B
Parameter    Input value    Parameter    Variance       Parameter    Variance
                            estimate     estimate       estimate     estimate

A second policy involves a decrease in the effect of weight on mortality. A practical example of a policy that might generate this result would be a program that provides medical care to low weight children. Again we expect Model B to overestimate the mortality declines that will be observed if such a policy is implemented because it assumes that the only factor affecting mortality is weight. Since in the true model mortality is also affected by an individual's frailty, a program targeting low weight children will not necessarily be reaching the ones with the highest mortality. (They will, of course, be reaching some of the ones with high mortality because of the correlation between weight and frailty.) Again, as expected, Table 4 indicates that Model B overestimates the effect of the specified policy.

A third possible policy involves a rise in the mean weight at all ages. An example might be a feeding program that reaches children of the appropriate ages. From the perspective of the model considered here this policy amounts to changing the value of a12. Alternatively, one might incorporate a constant term into the equations of motion. As before we expect an overestimate of the effect on mortality from Model B. In this case Model A predicts that a 10% increase in mean weight at all ages will increase the number of surviving children per 1000 by 10. Model B overestimates this increase by 6.

A final policy involves changing the disease environment or frailty, perhaps through building latrines, educating women, providing running water, or undertaking effective vaccination and oral rehydration campaigns. Since Model B ignores frailty, one cannot give a prediction based on Model B. It is clear from the results in Table 4 that this sort of policy is likely to have a large effect, at least if one accepts the parameter values used to generate the data as realistic. Estimation of parameters from real data will prove helpful in determining the relative mortality reductions that will be obtained under various policies.

One final note is necessary. In practice, no proposed policy can be described in terms of a ceteris paribus change in a single parameter. For example, a policy designed to reduce the level of frailty in the environment may also reduce the variation of frailty observed in the population. While careful thought can be helpful in predicting what the results of a given program will be, it is certainly no substitute for estimation. If the model is repeatedly applied in different communities experiencing different sorts of intervention programs then some reasonable generalizations are likely to emerge about the effect on parameters of the model of certain kinds of programs.


7. Conclusion

In conclusion, it seems most appropriate to point to the additional work that needs to be done before the model discussed here can be adequately understood. First, the model begs to be applied; although the simulation results are certainly encouraging, the real worth of the model can only be tested by fitting the model to available data. Second, more complicated specifications may be considered. The variance of the error term in the equations of motion in weight might, for example, be allowed to depend on the level of frailty. Third, the robustness of the results to types of distributions other than the two point distribution needs to be tested. Robustness to other assumptions, such as the linear and Markovian assumptions in the equations of motion, must also be explored. Fourth, the robustness of the prediction equations to misspecification needs to be considered more carefully. If these equations prove to be reasonably robust then they can be powerful tools for exploring policy alternatives.

Despite the substantial amount of additional work that remains to be done, the significance of the results obtained so far deserves to be underscored. It seems clear from the above results that it is possible to construct a model of child mortality which explicitly incorporates unobserved environmental frailty. Moreover, at least under certain assumptions, a model of this sort leads to parameter estimates which provide more accurate information to policy makers than do models which ignore unobserved heterogeneity.