
ENVIRONMENTAL MODELING UNDER UNCERTAINTY:

MONTE CARLO SIMULATION

K. Fedra

International Institute for Applied Systems Analysis, Laxenburg, Austria

RR-83-28 November 1983

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS Laxenburg, Austria


International Standard Book Number 3-7045-0061-5

Research Reports, which record research conducted at IIASA, are independently reviewed before publication. However, the views and opinions they express are not necessarily those of the Institute or the National Member Organizations that support it.

Copyright © 1983

International Institute for Applied Systems Analysis

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage or retrieval system, without permission in writing from the publisher.

Cover design by Anka James

Printed by Novographic, Vienna, Austria


FOREWORD

In recent years, there has been considerable interest in developing models for environmental systems, and for aquatic systems in particular. Much of this effort has been directed toward large and complex simulation models. However, this trend has given rise to a number of concerns, notably those of accounting for the effects of uncertainty. Testing model structures, calibrating complex simulation models under uncertainty, and propagating this uncertainty in the predictions of models are essential steps in establishing model validity and credibility for practical applications.

The International Institute for Applied Systems Analysis (IIASA) is addressing such concerns in its work on environmental quality control and management, one of the principal themes being to develop a framework for modeling poorly defined environmental systems.

This report, based on a series of earlier papers on the subject, discusses the use of Monte Carlo methods when the available field data are sparse and uncertain. It examines the problem of constructing, calibrating, evaluating, and applying a model for prediction - and ultimately for management (K. Fedra (1980) Mathematical modelling - a management tool for aquatic ecosystems? Helgoländer Meeresuntersuchungen 34:221-235, also reprinted as IIASA Research Report RR-81-2). In particular, it emphasizes the importance of model testability (K. Fedra (1981) Hypothesis testing by simulation: an environmental example. IIASA Working Paper WP-81-74) and the close relationship between the processes of model calibration and the predictions obtained subsequently (K. Fedra, G. van Straten, and M.B. Beck (1981) Uncertainty and arbitrariness in ecosystems modelling: a lake modelling example. Ecological Modelling 13:87-110, also reprinted as IIASA Research Report RR-81-26).

Thus, uncertainty and the reliability of models and forecasts based on Monte Carlo simulation are the key concerns of this report.

Janusz Kindler
Chairman of the former Resources and Environment Area


CONTENTS

SUMMARY

1 INTRODUCTION
1.1 Monte Carlo Methods: Computerized Trial and Error
1.2 The Theoretical Framework: Models, Knowns, and Unknowns
1.3 Model Structure, Parameters, Inputs, and Observations: Some Implications of Uncertainty

2 THE METHOD
2.1 The Concepts of Behavior Space and Model Response Set: Defining a Problem-oriented Objective Function
2.2 The Concept of Model Parameter Space
2.3 A Formal Presentation of the Method
2.4 A Very Simple Illustrative Example: Estimating Regression Coefficients
2.5 Some Technical Details

3 APPLICATION EXAMPLES
3.1 Hypothesis Testing: A Marine Pelagic Food-web Example
3.2 Estimation and Prediction with Parameter Ensembles: A Lake Modeling Example
3.3 The Limits of Estimation: A Simple Rain-Runoff Model

4 DISCUSSION
4.1 A Philosophical View: Pragmatic Instrumentalism
4.2 Uncertainty Analysis: Alternative Approaches
4.3 Consequences and Implications: Uncertainty and Forecasting

REFERENCES

BIBLIOGRAPHY


Research Report RR-83-28, November 1983

ENVIRONMENTAL MODELING UNDER UNCERTAINTY:

MONTE CARLO SIMULATION

K. Fedra

International Institute for Applied Systems Analysis, Laxenburg, Austria

SUMMARY

The study of environmental systems as ecological and physicochemical as well as socioeconomic entities requires a high degree of simplifying formalism. However, a detailed understanding of a system's function and response to various changes for the explicit purpose of systems management and planning still requires fairly complex hypotheses, or models. Such models can hardly be subjected to rigorous tests without the aid of computers. Systems simulation is a powerful tool when subjecting complex hypotheses to critical tests of their logical structure and their performance over the range of plausible input conditions.

Based on a formalized trial-and-error approach using Monte Carlo methods, this report presents and discusses an approach to simulation modeling under uncertainty. An introduction to the causes and implications of the problem, namely uncertainty, and a short formal presentation of the methodology proposed are followed by some more technical remarks on Monte Carlo simulation. Using three different application examples, the author discusses the role of uncertainty in the formal testing of model structures, in parameter estimation, and in prediction. In the last example, the limits of estimation and, with it, prediction are demonstrated. In a comparison of Monte Carlo simulation with alternative approaches to including and evaluating uncertainty in simulation modeling, the discussion section examines the implications of uncertainty for model application in a broader framework.


1 INTRODUCTION

Environmental modeling may conveniently be understood as a tool - a tool for the study of systems that are large, complex, difficult to observe, and experimentally more or less inaccessible. It is a formal way of organizing knowledge (or the lack thereof) at the intersections of ecology and the life sciences, geography and the earth sciences, the social and political sciences, economy and engineering, and usually a few more of the classical disciplines.

Environmental modeling and simulation is also a tool for developing and testing the hypotheses on which any organization of knowledge is based, and is therefore just one instrument of scientific research. This tool may be used for making "predictions," for experiments with possible futures, and for exploring alternative courses of action. It thus has potential to aid management and decision making and to help design and explore policies.

In the core of any comprehensive environmental system, there is usually an ecological system or an ecosystem in the more classical sense (Haeckel 1870, E.P. Odum 1971); and a close look at the kinds of data that are available on ecosystems shows mainly uncertainties, variability, and sampling errors (more often than not of undetermined magnitude). In addition, ecological theory (and whatever part of it may be relevant within the more comprehensive framework of environmental science) is full of contradictory hypotheses, and it is mostly impossible to rule out any of those because of lack of reliable and sufficient data. Consequently, the coexistence of competing and eventually contradictory model formulations (contradictory in the sense that they will produce significantly different predictions from the same set of inputs) is notorious. A nice illustration is given by Simons and Lam (1980), when they observe in their critique of models used in the Great Lakes studies that "these results illustrate quite clearly that one can accommodate a wide range of primary production formulations in a model as long as there are additional degrees of freedom to 'play with,' in this case the uncertainty associated with respiration and other forms of nutrient regeneration." This phenomenon, by the way, can also be observed in the social or political sciences as well as in economics, which, unfortunately but significantly, are also basic components of applied environmental research.

Experimental evidence, as a rule, stems from microscale physiological approaches, contradictory in their very design to the richness and variety of ecosystems, and deliberately neglecting a main feature of any even moderately complex ecosystem, which is the simultaneous interaction of large numbers of variables. Traditional concepts and approaches are merely extrapolations of ideas that proved to be successful in physics and chemistry. However, ecosystems are quite different from electrical networks, the frictionless pendulum, and controlled chemical reactions of some compounds. All these incompatibilities can seemingly be overcome only with numerous more or less arbitrary assumptions, often enough implicitly hidden in a hypothesis, or model formulation. The information available is of a jigsaw-puzzle structure, and at best we can deduce fuzzy patterns, semiquantitative relationships, ranges, and constraining conditions, unless we blindly believe in numbers once they are printed, preferably by the computer.


Chance, or random variability, plays an important and sometimes dominant role in environmental systems. This is true not only for the microscopic, elementary level (Monod 1970), but also for living, evolving, dissipative systems and structures in general (e.g. Eigen and Winkler 1975). All these features, including the consequences of haphazard human interference, contribute to one prominent aspect of environmental systems and thus modeling: uncertainty. Clearly, under these circumstances the applicability of traditional, fully deterministic techniques, with all their implicit and explicit assumptions on the distributions and functional properties of the variables observed (or rather sampled), and a firm belief in numbers have to be questioned. Forcing environmental systems into a mathematical framework developed for vastly different systems, for the sake of the ease and elegance of the analysis, seems to me not only a futile but also a dangerous line of work. And as a consequence, many model-based predictions on environmental systems are either trivial or false or, at best, computerized intuition of the analyst.

Alternative approaches are needed if environmental modeling is to improve its so far meager record of impact on environmental decision making and public reasoning. One possibility is a formal and computer-based application of probably the simplest and most straightforward approach, but maybe also the only possible approach to scientific research: trial and error.

1.1 Monte Carlo Methods: Computerized Trial and Error

"Our whole problem is t o make t h e mistakes fast enough

..."

(Wheeler 1956) Monte Carlo methods, as used a n d discussed in this report, a r e nothing more t h a n computerized trial a n d error. It is a technique, however, t o make extremely high numbers of errors, and to make t h e m very fast - and, i t is hoped, t o learn from t h e s e errors. As indicated by the n a m e , i t is a form of gambling - picking random n u m b e r s from appropriate distributions and using t h e m for numerous trials (and errors). A system of filters is then used to separate t h e solutions - if t h e r e a r e any winning numbers

-

from t h e failures.

The method is characterized by a very appealing simplicity. This may be best exemplified by the fact that this report is written by an ecologist, not a mathematician. No implicit, abstruse statistical assumptions have to be made, either on the available data describing the system to be modeled, or on the concept of agreement or "goodness of fit" between model output and the observations modeled, which is the deviation or error to be minimized in "classical" approaches (Section 4.2). Arbitrary assumptions have to be made, like in all other approaches, but the simplicity of the method allows for an explicit statement and treatment of all the assumptions. None of the assumptions are hidden within the method; they can all be made "externally." A high degree of flexibility in constructing an appropriate estimation scheme for a given application problem allows one to structure the tool according to the problem - and not force the problem into the constraints of the method.

Any simulation model can, with a minimum amount of programming skills, be easily incorporated into an appropriate framework for the Monte Carlo estimation, including the generation of trial runs, their monitoring, and the most crucial part, the evaluation of the trials. The model can be as complex and nonlinear as deemed necessary by its builder, and there is no limit, in principle, to the number of parameters for simultaneous estimation.

The price for all these advantages has to be paid in terms of computer time: excessive trial and error, when done simply (blindly and "unintelligently," i.e. without learning from the errors within a series of trials), requires a comparatively large amount. In addition, the time requirements grow exponentially with the dimensionality of the problem, that is, the number of parameters estimated simultaneously. Computer time, however, is becoming cheaper and cheaper, and in many cases is no real constraint for the analysis, as compared with, for example, the much more demanding and expensive collection of field or laboratory data.

1.2 The Theoretical Framework: Models, Knowns, and Unknowns

Some conceptual clarifications seem to be unavoidable in order to introduce the terminology used in the following sections. Calibration, in a nontechnical definition, is the tuning of a model in order to improve the agreement of model-generated output with the observations from the system to be modeled. Tuned or adjusted are coefficients describing the relationships between the model elements, i.e. state variables, inputs, and outputs (the boxes and cycles in flow diagrams), and auxiliary values such as thresholds, carrying capacities, stoichiometric constants, or any other "adjustable" values. If a model deals with "simple" systems and well established laws of nature, no tuning should be necessary, since all the parameters required are well known constants. If we want to model the fall of a pebble, we certainly would not attempt to calibrate the constant of gravity, but would take it from the literature.

In epistemological terms, the modeling process involves:

a. a theory or universal statement (the model structure), together with
b. a set of initial conditions (the initial conditions sensu stricto, i.e. the state of the elements of the system at time t = 0; the parameters, i.e. measures quantitatively describing the relationships of these systems elements and any auxiliary coefficients; and, in the case of dynamic models, inputs into the system, or forcings or driving variables, which can be viewed as a time series extension of a certain subset of the initial conditions), to derive
c. a set of singular statements (the model output), which then has to be compared with appropriate observations.

In a pragmatic (ab)use of the usual terminology, I will split the union set of parameters, initial conditions, and inputs (forcings) into two complementary subsets, namely the "knowns" (e.g. site constants, such as the volume of a lake or the length of a river reach, or any number in which we can place enough confidence to consider it "known") and the "unknowns." The latter have to be estimated, and will, for simplicity, be referred to as parameters; the "knowns" I will call constants.

1.3 Model Structure, Parameters, Inputs, and Observations: Some Implications of Uncertainty

If a system or process to be modeled is well known, as, for example, in classical mechanics, if the initial conditions can be manipulated or observed without error, and if the elements of the system and thus the outcome of an experiment can be observed directly and without (or with very small) error, calibration would, if at all necessary, be a simple undertaking. One could, to exploit a simple example given by Popper (1959), calibrate a material constant for a thread. However, one would rather call this process a direct experimental determination of the magnitude in question, as it can usually directly and analytically be inferred from the experiment. If, however, the required value for the material constant had to be found by iteration, one might call this calibration.

In environmental modeling, however, the problems are much more muddled and diffuse, and we have neither a well established theoretical framework (allowing us to set up an indisputable model structure a priori) nor known constants. Even the observations available in situ or from experiments are difficult to use, since they are generally made on a different level of complexity and on a different scale than used in our models. There are several generic problems associated with ecological modeling, or any large-scale modeling of systems and processes that are complex, difficult to observe, and almost impossible to manipulate.

The first and probably most important problem is in the discrepancy between the scale of model conceptualization and the scales of measurement, observation, and experimentation. Our knowledge of large and heterogeneous systems is always derived from "samples," and even these samples, generally associated with a certain error, are always ranges. Observations and experiments are usually made on a micro-scale, involving individual cells, monospecific cultures, or extremely small samples from the system (just consider the proportion of the volume of a sampling bottle to that of a lake). There exists, of course, a well established theory of sampling, and statistics will tell the observer or experimenter how many and which size of samples should be drawn to reach a certain level of confidence for the resulting estimates. However, for reasons that can only partly be attributed to logistic problems and resource limitations, sampling statistics seem to be one of the most neglected fields in ecological research.

A somewhat different interpretation of the discrepancy between theory and observations - anathema to the pure empiricist - could be a claim that the relevant observational and experimental techniques are just insufficient or unreliable (e.g. Feyerabend 1975, Lakatos 1978, and Section 4.1). Empirical evidence and theory can eventually be even incommensurable.

The units dealt with in formal conceptualizations of environmental systems, i.e. the models, on the other hand, are usually large, lumped, and inaccessible to direct experimentation. They are idealized functional entities, whereas experiment and observation usually concentrate on entities that are systematic (in the biological or chemical sense). The units in the models are lumped and heterogeneous, such as "primary producers," "zooplankton," or "available nutrients." Therefore, their functional characteristics, described by the "parameters," can only crudely be estimated from the eventually measurable characteristics of their elements, e.g. an individual species (ignoring the additional complications of age groups, sexes, physiological states, etc.). As these functional attributes cannot be measured directly, and there is no way of reliably deriving them from the properties of the microscale components, they have to be calibrated, i.e. adjusted to values that result in an acceptable performance of the model. Such heterogeneous assemblages tend to exhibit a fairly, and sometimes surprisingly, simple behavior. This phenomenon, often referred to as the "linear" response of highly nonlinear systems (in terms of their microelements), allows one to treat such heterogeneous elements as functional units.

It is important to recognize that neither model structures, nor initial conditions, inputs, and parameters, nor the observations used as the testing ground for a model are without error. They are all uncertain, usually to an uncertain degree, and all ought to be formulated in terms of ranges or probability distributions. Parameter estimation, as a consequence, is mostly an art. Seemingly exact approaches that reduce the problem to the minimization of an objective function are based on numerous simplifying and often implicit arbitrary assumptions. Since almost everything, including the reference values (the observations) used for calibration, is somewhat fuzzy and error-corrupted, derived from subjective interpretation of information rather than indisputable measurements and experimental design, an exact and "best" solution to the parameter estimation problem is only obtained when at least parts of the uncertainty are ignored, thereby reducing the number of unknowns, although in a disputable and arbitrary fashion.

Both parameters and model structure are uncertain, and intimately depend on each other. Their estimation should therefore be made concurrently. This will be demonstrated in the first application example (Section 3.1), based on a marine pelagic food-web simulation for the German Bight in the southern North Sea. This example illustrates the close dependency of parameter estimates on the model structure chosen and, vice versa, attempts to show how parameter space characteristics can be utilized to modify a model structure.

In the next step, the simple application of Monte Carlo methods for parameter estimation can be extended for predictions. Obviously, predictions and especially prediction uncertainty will depend on model and parameter uncertainty. The second example of application (Section 3.2), based on a lake water quality model, demonstrates how the uncertainty in the parameter estimates obtained by Monte Carlo estimation can be preserved, and included in the predictions, in order to estimate the reliability of predictions.


Finally, in a third application example, the interdependence between parameter estimates and the performance criteria, or objective function (which is derived from the available observations), used in the estimation procedure will be shown (Section 3.3). By use of a simple example based on a rain-runoff model, two alternative parameter vectors, both minimizing plausible objective functions but resulting in quite different model behavior, can be generated. These obvious limits to calibration can only be resolved with additional information from the system, that is to say, with an additional set of (specific) observations.

2 THE METHOD

The basic principle of Monte Carlo methods, as used and discussed here, is a trial-and-error procedure for the solution of the inverse problem, i.e. estimating the "unknowns" in the input of the model (the parameters) from the required output. Since complex dynamic simulation models cannot be solved analytically, the solution of the inverse problem demands a more complicated procedure.

The basic steps of this estimation procedure are as follows (Figure 1): for a given model structure, performance criteria describing the expected, satisfactory behavior of the model, based on the available data, are formulated. For all the unknowns to be estimated, allowable ranges or probability density functions are defined. From these ranges or distributions a sample vector is drawn randomly and substituted in the model for one trial run. The performance criteria of this trial run are then compared with, or classified according to, the predefined target values or ranges of the performance criteria. The process is then repeated for a sufficient number of trials. After some initial trials and their analysis, the ranges to be sampled may be redefined, criteria may be added or deleted, or the model structure changed. This whole process is repeated iteratively until the model performance is satisfactory, in light of the original problem to be solved, or until the user's computer account is exhausted.

2.1 The Concepts of Behavior Space and Model Response Set: Defining a Problem-oriented Objective Function

From a model run, a simulation, one obtains a vector of output values, a singular statement, or prediction, which has to be testable, i.e. comparable (and compared) with corresponding observations from the system in order to determine whether or not the model (and its parameter set) is acceptable under the constraints of the predefined performance criteria.

If one recognizes that the entities used in a simulation model and those measured in the field or in a laboratory experiment are quite different, it is obvious that they cannot be compared directly, and then used to estimate one from the other. One has to take into account the differences in scale and aggregation, and the resulting uncertainties. Models, because of their high degree of abstraction, simulate average patterns or general features of a system (as conceptualized in the model). These patterns have to be derived from the available information at an appropriate level of abstraction and aggregation. Only such derived measures can then be compared with the magnitudes generated with the model, in order to test and improve model performance.

FIGURE 1 Flow diagram of the approach.

The original set of observations of the system to be reproduced by the model output can conveniently be thought of as a region in an n-dimensional behavior vector space. Clearly, each observable property of the system can form one dimension. Time, in the case of dynamic systems and models, can be thought of as just one attribute of an observation; that is, algal biomass at a certain time, say spring turnover of a lake, might form one dimension, and algal biomass at another time, say summer solstice, could be another. Also, observable properties could be independent of time, such as the algal biomass maximum, whenever it was observed during the year. Another class of observable properties comprises integrated properties, such as total yearly primary production, or relational properties, such as the ratio of maximum to minimum algal biomass. Each of these properties - and many more, certainly depending on the kind of system in question - can be used in defining the behavior space of the system, and with it, as a subset, the set of desired, "realistic" model responses.

Obviously, the great flexibility in these constraint conditions allows for tailoring a very detailed and problem-specific set of constraint conditions. Violating none of them can be understood as analogous to minimizing an objective function. Besides criteria that can be easily and directly derived from the available set of specific observations on a given system, one might want to constrain more and other elements of model response, such as flows and relationships between integrated flows, or efficiencies in ecological jargon. Since such magnitudes are usually not observed, one would have to resort to the ecological or environmental literature for appropriate ranges. However, such additional constraints can only help to rule out ecologically or physically implausible behavior of a model, but not to identify the parameters for a given, specific system as such.

The concepts of system behavior space and model response set are quite versatile and, in fact, can even accommodate measures such as the sum of squares of deviations of model output from corresponding observations. A traditional squared-error criterion can be understood as a measure of distance, in the response vector space, between any singular model response and the required target, the behavior region of the system. The latter, however, is represented by a singular point (Section 4.1 includes a discussion of the different concepts and their relationships).

Along each of the relevant dimensions of the behavior space, the set of available observations can now be used to define a range, or a probability density distribution, within which the (observed) state of the system was found and, consequently, within which the simulated state ought to be. Each of the ranges in the model response space therefore constitutes a constraint condition imposed on an allowable model output. The defined allowable model response set can be understood as a filter that will separate the class of all model responses into allowable ones - contained in the allowable model response set - and its complementary "unrealistic" subset (Section 2.3). Figure 2 gives an example of projections of model response on to planes of two constraining response variables, with each allowable range forming a darkened rectangle in the projection plane.

FIGURE 2 Model response space projection on to a plane of two response variables, indicating the position of the constraint conditions for (a) a pair of uncritical conditions, (b) a pair of critical conditions.
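To make the idea concrete, such a response-set filter can be written down directly. The sketch below is hypothetical - the output structure, property names, and ranges are invented for illustration - but it shows how time-specific, time-independent, integrated, and relational properties can all serve as dimensions of the behavior space:

    # Each entry pairs a property derived from the simulated trajectory with
    # the range within which the observed system was found (all names and
    # numbers here are invented for illustration).
    response_set = {
        "spring_algal_biomass": (lambda out: out["algae"][120], (0.5, 2.0)),
        "peak_algal_biomass":   (lambda out: max(out["algae"]), (2.0, 6.0)),
        "yearly_production":    (lambda out: sum(out["production"]), (150.0, 400.0)),
        "max_min_ratio":        (lambda out: max(out["algae"]) / min(out["algae"]),
                                 (3.0, 30.0)),
    }

    def allowable(out):
        # True if the model response violates none of the constraint conditions
        return all(lo <= f(out) <= hi for f, (lo, hi) in response_set.values())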

2.2 The Concept of Model Parameter Space

Similar to the behavior vector space and response set associated with the output side of the model, one might conceive an input or parameter vector space on the input side. Each of the unknowns to be fed into the model for a simulation run again defines one dimension in this vector space. The allowable values of this unknown define a range or probability density function on each of the coordinate axes. To define such ranges requires that each of the unknowns is physically meaningful, or measurable in principle, so that such a finite range will exist. Only if all the unknowns (and the classical parameters in particular) have a physical function that can be interpreted well, can they be reasonably constrained. The ranges within which a certain parameter has to be are - in the worst case - given by physical limits, e.g. a lower limit of zero for most rate constants, or an upper limit of one for a limiting factor. In many cases, however, one will be able to find parameter values in the appropriate literature (e.g. Jørgensen et al. 1978), or even information from specific experimentation or observations from the specific system modeled. They can all be utilized to define allowable ranges for the unknowns.

The rationale for defining these ranges as narrowly as can be justified, without too much arbitrariness, is twofold. On one hand, narrow ranges increase the sampling density and reduce the number of trials necessary to explore the parameter space sufficiently. On the other hand, if no satisfactory solution can be found within the ranges deemed plausible a priori, this will indicate that the model does not function as the analyst thought it would. Obviously, the parameters do not influence the model behavior as assumed; they function differently from the analyst's perception of their function; in other words, there is something wrong with the model structure (Section 3.1).

Besides such straightforward information to be derived from the relationships between parameter set and response set, the subregions of the parameter space corresponding to certain subregions in the response space can give valuable insight into the model behavior. Parameter correlations, or any structural properties of parameter space regions for a certain class of response, can be interpreted in terms of a sensitivity analysis (Section 3.2). Figure 3 gives examples of projections of parameter space regions with certain response characteristics on to planes of two parameters.
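One simple way to extract this kind of information is to compute correlations between parameter columns of the behavior-giving set; a strong correlation points to mutually compensating parameters. A minimal sketch (plain Pearson correlation; nothing here is specific to the report's examples):

    import statistics

    def correlation(xs, ys):
        # Pearson correlation of two parameter columns taken from the
        # behavior-giving set of parameter vectors
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
        return cov / (statistics.stdev(xs) * statistics.stdev(ys))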

2.3 A Formal Presentation of the Method

Let us suppose that a given model structure is assumed. The model can be represented by a vector function f with domain D(f) and range R(f). If RD is a subset of R(f), then the inverse image of RD under f is the subset of D(f) given by

    f^-1(RD) = {x : f(x) ∈ RD}

This subset will be called PM, and represents the set of all parameter vectors resulting in the defined, acceptable model responses RD.

To identify PM, we have to define RD by a series of constraint conditions, which can include more classical objective functions, e.g. a least-squares criterion (Section 3.1). From the plausible ranges for each of the parameters to be estimated, the set of allowable parameter vectors, PD, is formed as the direct or Cartesian product. Random samples are then drawn from PD, forming a set of trial parameter vectors. Each of these vectors is used for one trial run of the model, and the resulting model response is classified according to the set of constraint conditions into those giving the defined behavior:

    RS' = {RSi : RSi ∈ RD},   n(RS') = M

and those violating at least one of the constraint conditions, thus not giving the defined behavior:

    RS'' = {RSi : RSi ∉ RD},   n(RS'') = N - M

The N parameter vectors used for the trials are thus split into the complementary subsets PS' and PS'' with M and N - M elements, respectively. The set of parameter vectors PS', resulting in the defined model behavior, is then the solution to the estimation problem. It is a subset sampled from the parameter space region PM. These relationships are summarized in Figure 4.

FIGURE 3 Parameter vector space projection for a behavior-giving set of parameter vectors. Projection from the 22-dimensional parameter vector space on to a plane of two model parameters; extension of the individual axes indicates the range used for sampling.

FIGURE 4 Set diagram of the relationships between parameter space and response space. D(f) is the set of all possible parameter vectors (domain of f); R(f) is the set of all possible model responses (range of f); f is the model (vector function); PD is the defined set of plausible parameter vectors; RD is the defined realistic response region; PM is the inverse image of RD; PS represents the parameter vectors sampled in the Monte Carlo procedure; RS is the direct image of PS; PS' is the subset of PS that generates the plausible realistic response RS'; PS'' is the subset of PS resulting in an unrealistic response RS''; PS* is the modified PS' used for prediction, resulting in RS*.

2.4 A Very Simple Illustrative Example: Estimating Regression Coefficients

To illustrate the method very simply, let us consider a data set (Figure 5(a)), with only one dependent state variable (y) plotted as a function of an independent one, which could, for example, be time. Let us also assume that a priori information about the system represented allows us to construct a model for it. To make the example as simple as possible, I will propose a model of the form

    y(t) = at

(the reader might try to find a meaningful ecological example for this) with only one parameter (a), to be estimated from the data. Let me assume that, for reasons of "ecological plausibility," a can be constrained to the range

    0.5 ≤ a ≤ 1.5

which, in fact, defines a region in the one-dimensional parameter space, or a set of plausible, allowable a's. On the output side, we can formulate a number of constraint conditions or performance criteria based on the data in Figure 5(a), which explicitly include the uncertainty around the observations (thought of as samples from a real-world system), indicated by the bars extending from two of the points. (The points without bars represent the typical singular observations or measurements without replica, which are somewhat difficult to interpret.)

FIGURE 5 (a) Data set indicating the positions of the constraint conditions c(1) and c(2); thin lines represent envelopes over the responses of models 1 and 2; broken lines show allowable model response for model 2 (note the divergence outside the constraining bars). (b) Projection of model 1 response space, defined by the two constraint conditions; the box delimits the allowable response. (c) Projection of model 2 response space, indicating the positions of successful trials in the parameter space; the parameter box represents the ranges sampled.

The constraint conditions or performance criteria c(i) to be met are summarized as:

    c(1): 2.5 < y(2) < 5.0
    c(2): 7.0 < y(8) < 9.0

According to the terminology introduced above, the two criteria define a region in a two-dimensional behavior space (Figure 5(b)), or a set of allowable model responses.

To estimate values for a, we simply draw random samples from the interval defined around a, substitute these values in the model, "simulate" for the desired range of the independent variable, and determine the values of the two performance criteria, namely the values of y at t = 2 and t = 8. To no surprise of the reader, none of the values of a that can be sampled from the predefined interval will satisfy both of the constraint conditions. Consequently, the model will be rejected. Similarly, other one-parameter alternatives to the proposed model, namely

    y(t) = t^a   and   y(t) = a^t

also fail to meet the constraint conditions imposed on the model output.

Modifying the model by introducing one more parameter will lead to an alternative two-parameter model, which is then subjected to the same test and estimation procedure. The simplest form of the model would be

    y(t) = at + b

with a simple additive second parameter (b). This could be thought of as, for example, the initial state of y at t = 0, which, in the first models, was implicitly forced to take values of 0 and 1, respectively. Constraining b to the range

    0 ≤ b ≤ 2

we repeat the sampling and simulation procedure. This time, some of the simulation runs will meet both constraint conditions (Figure 5(c)). As can also be seen from Figure 5, the corresponding parameter vectors are found clustered in the parameter space region sampled randomly. The two parameters are clearly correlated (Table 1), indicating their mutual dependency or, in other words, the fact that a change in one of them can, within a certain range, be balanced by a corresponding change in the other. Table 1 summarizes some basic statistics of the parameter subset PS' (Figure 4), i.e. the subset resulting in acceptable model behavior.

TABLE 1 Parameter statistics and prediction (example); 5000 runs evaluated.

                        Range sampled    Mean     Min.     Max.     Standard deviation
Parameter a             0.500 - 1.500     0.84    0.630    1.080    0.097
Parameter b             0     - 2.000     1.33    0.340    2.000    0.407
Prediction y(t = 12)                     11.45    9.54    13.30     0.924

Correlation matrix (1 = a, 2 = b, 3 = prediction):
                 1                                             2
2                significant negative correlation (p < 0.05)
3                significant positive correlation (p < 0.05)   significant negative correlation (p < 0.05)

In a final step, the set of allowable parameter vectors can now be used for predictions of y, for example, for t = 12. A set of estimates results. If enough vectors are used, a frequency distribution or a probability density function can be constructed for the prediction, allowing for a probabilistic interpretation (Figure 5(a)). The variability of the parameters results directly from the uncertainty in the observations, and is again reflected in the output variability.
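The whole example is small enough to reproduce in a few lines of Python. The sketch below follows the procedure described above; the exact acceptance count and statistics will of course vary from sample to sample.

    import random

    def y(a, b, t):          # model 2: y(t) = at + b
        return a * t + b

    accepted = []            # the behavior-giving parameter vectors PS'
    for _ in range(5000):    # 5000 trial runs, as in Table 1
        a = random.uniform(0.5, 1.5)   # allowable range for parameter a
        b = random.uniform(0.0, 2.0)   # allowable range for parameter b
        # constraint conditions c(1) and c(2)
        if 2.5 < y(a, b, 2) < 5.0 and 7.0 < y(a, b, 8) < 9.0:
            accepted.append((a, b))

    # use the accepted vectors for prediction at t = 12
    predictions = [y(a, b, 12) for a, b in accepted]
    print(len(accepted), "acceptable vectors; mean y(12) =",
          sum(predictions) / len(predictions))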

2.5 Some Technical Details

One of the major drawbacks of Monte Carlo methods is their insatiable demand for computer time. Although they are very efficient in terms of the time required by the analyst or modeler to set up an appropriate scheme for estimation and evaluation, this efficiency is traded against computer time and, eventually, storage capacity.

There are a few basic rules that can help to make Monte Carlo techniques more efficient in terms of computer use.

a. Minimize the number of trials

A reduction of the number of trial runs necessary to identify a set of parameter vectors for a certain class of model response can be achieved in several ways. First, a given estimation problem can be split into several cycles of trial runs in an iterative way. Each cycle is analyzed before the next one is started. This eventually allows corrections to be made, the ranges that are to be sampled to be redefined, constraint conditions to be modified, etc. After a relatively small number of trial runs (which certainly will depend on the number of unknowns estimated simultaneously) one might, for example, find a clear clustering of the "good" vectors in the parameter space already. If, consequently, certain regions in the parameter space seem "empty" (in terms of solutions), they can be discarded (by redefining the ranges sampled) to improve the efficiency of the sampling. Another example would be constraint conditions that are always violated. This should lead to the reconsideration of these conditions and the parameter ranges sampled (here they might have to be extended), or a modification of the whole model structure itself. Clearly, if after a first screening of the parameter space all model responses are off their target in a systematic way (as in the example above), an increase in the number of trials will probably not be worthwhile.

Some intelligent check on the number of runs can be made by defining complex stop rules for a cycle instead of simply using a fixed number of trials. Such stop rules, for example, can monitor the means, standard deviations, and ranges of parameters of a certain response class, and stop the estimation if new samples no longer change these values, i.e. when the estimates converge. Table 2 refers to the example described above.

TABLE 2 Convergence of parameter estimates with increasing number of samples (independent cycles).

Number of        a                                 b
samples          Mean    Minimum    Maximum        Mean    Minimum    Maximum
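Such a stop rule might be sketched as follows (a hypothetical convergence test, recomputed after every batch of accepted parameter vectors; the tolerance is arbitrary):

    import statistics

    def summary(vectors):
        # per-parameter means and standard deviations of the accepted set
        cols = list(zip(*vectors))
        return ([statistics.mean(c) for c in cols] +
                [statistics.stdev(c) for c in cols])

    def converged(previous, current, tol=0.01):
        # stop the cycle when no summary statistic has moved by more than
        # tol (relative) since the previous batch of samples
        return all(abs(p - c) <= tol * max(abs(p), 1e-12)
                   for p, c in zip(previous, current))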

b. Speed up the trial runs

Since a simulation program may run several thousand times in a Monte Carlo framework, streamlining the code will pay off. This includes, for example, the inactivation of all statements that are not essential for the determination of performance criteria. Examples might be auxiliary output variables that are not used in the testing procedure. Also, parts of the model that are unchanged within a cycle of trial runs (for instance, setting up the geometry of the lake in the second application example, Section 3.2) should not be executed more than once in such a cycle. This, of course, requires more programming effort than simply calling the entire model as a subroutine of the Monte Carlo program - a compromise between programming effort and computer resource utilization has to be found.


A somewhat simpler possibility is to abandon a run as soon as it is obvious (even during run-time) that a given constraint condition will be violated. Since this may happen within the first few time steps, savings in computer time can be considerable.
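In code, early abandonment amounts to checking the constraints inside the integration loop rather than after the run. A minimal sketch, with a placeholder step function and bounds:

    def trial_run(step, state, n_steps, bounds):
        # advance a dynamic model step by step, abandoning the trial as soon
        # as the state leaves the region any acceptable run must stay in
        lo, hi = bounds
        trajectory = [state]
        for _ in range(n_steps):
            state = step(state)
            if not (lo <= state <= hi):   # constraint already violated
                return None               # abandon: the remaining steps are saved
            trajectory.append(state)
        return trajectory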

c. Reduce input/output

As even a small simulation program, when run several hundred or thousand times, can produce an absolutely incomprehensible mountain of output, the reduction of output is essential for more than one reason. First, there will rarely be enough space to store it all; second, nobody is going to look at it all anyway; and third, I/O is time-consuming. Therefore, it is essential to reduce output to a minimum and do whatever processing has to be done with the output (e.g. classification, and calculation of certain statistics) within the Monte Carlo program. Again, there is a trade-off between the size a program can have on a certain machine, setting an upper limit to what can be done simultaneously, on-line, and storage capacity. Designing "intelligent" programs for the automatic analysis of Monte Carlo runs is probably the most demanding - and most challenging - part of the technique.

Similarly, input should clearly also be reduced to the absolute minimum. The most obvious examples are time-variable inputs or forcings to a dynamic simulation model, which should not be read at each time step of each trial, but only once for a cycle of trials, and then stored in an appropriate form within the program. Again, this calls for a compromise between time and core requirements.
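One way to do that processing within the Monte Carlo program, rather than storing raw output, is to accumulate the statistics of interest in a single pass. A sketch using a running (Welford-type) mean and variance; nothing in it is specific to the report's examples:

    class RunningStats:
        # one-pass accumulator for a response variable: keeps count, mean,
        # and variance without storing any raw model output
        def __init__(self):
            self.n, self.mean, self.m2 = 0, 0.0, 0.0

        def add(self, x):
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

        def variance(self):
            return self.m2 / (self.n - 1) if self.n > 1 else 0.0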

d. Think first

As trivial as this last "rule" might seem, it is probably the most important one. It is most tempting to just let the program run (specifically when computer time is a free commodity) - and then to discover a little bug, somewhere, that makes thousands of runs worthless. Time spent in carefully considering the estimation scheme will certainly pay off in the long run. For example, if the parameter ranges sampled are fairly large, most complex models are bound to "crash" sooner or later - unless care is taken of zero divides, overflows, and underflows. Also, since operating systems tend to fail sometimes, provisions should be made that, in case of the unavoidable crash, only a minimum amount of information is lost, and an estimation cycle can be restarted. The Monte Carlo approach is very forgiving and helpful in this respect, as sample runs can always be pooled.


3 APPLICATION EXAMPLES

3.1 Hypothesis Testing: A Marine Pelagic Food-web Example*

*This section is largely based on Fedra (1981a, b).

The study of environmental systems as ecological and physicochemical as well as socioeconomic entities requires a high degree of simplifying formalism. However, a detailed understanding of a system's function and response to various changes for the explicit purpose of systems management and planning still requires complex hypotheses, or models, which can hardly be subjected to rigorous tests without the aid of computers. Systems simulation is a powerful tool for subjecting complex hypotheses to rigorous tests of their logical structure, as well as a possible means for rejecting or corroborating the underlying hypotheses.

The complexity and variability of environmental systems, the scarcity of appropriate observations and experiments, problems in the interpretation of empirical data, and the lack of a well established, comprehensive theoretical background make it difficult to test any possible conceptualization, or hypothesis, describing a given system. A formal approach to hypothesis testing, based on numerical simulation and Monte Carlo methods, which explicitly considers the above constraints, is proposed in this section.

Based on a data set from the North Sea, a series of hypotheses on the structural relations and the dynamic function of the pelagic food web is formulated in terms of numerical models. Hypotheses of various degrees of aggregation and abstraction are tested by comparing singular statements (predictions) deduced from the proposed hypotheses (the models) with the observations. The basic processes of primary production, consumption, and remineralization, driven by light, heat, and advection/diffusion, are described in systems models ranging in complexity from two compartments to many compartments and species groups. Yearly cycles of systems behavior are simulated with each of the proposed models. A comparative analysis of the response of each of the models allows conclusions to be drawn on the adequacy of the alternative hypotheses, including their "unknowns" or initial conditions (i.e. the parameters). This analysis also allows one to reject inadequate constructs, and provides some guidance on how to improve a certain hypothesis, even in the presence of a high degree of uncertainty.

Universal statements, describing those properties of a system that are invariant in space and time, may be called models, whether they are of an informal (e.g. verbal or mental) or a formalized mathematical structure. Such models, viewed as scientific theories, have to be testable. When one feeds or substitutes a set of specific singular statements into the model (the initial conditions, which, in the case of a mathematical model, also include the model parameters in a general sense, as discussed in Section 2.2), it must be possible to deduce or predict testable singular statements (i.e. possible observations or the outcome of possible experiments). Disagreement between the prediction deduced from the hypothesis or model and the available observations would then require rejection of the hypothesis, modification and improvement, or the search for alternative hypotheses, which would then have to be subjected to the same procedure. This method, which would basically represent the strategy of scientific research proposed by Popper (e.g. 1959), labeled falsificationism by critics such as Feyerabend (1975) and Lakatos (1978), however, has a major drawback when applied to complex simulation models or dynamic hypotheses describing ecological systems, in that the so-called initial conditions to be used with the basic structure of the theory to deduce the testable predictions are not exactly known. In one simple example given by Popper (1959), where he refers to a mechanical experiment (breaking a piece of thread), the initial conditions to be specified are simple enough: a weight and the characteristics of the thread (e.g. material, diameter, etc.), which are measurable without considerable error (it is significant that many examples used in epistemological analyses refer to relatively simple physical systems). Measurements "without" error, however, are not usually possible when we are dealing with the complex aggregates conceptualized as "units" in large-scale systems thinking and models. This can certainly be seen as the result of two basic shortcomings, one in the measurement techniques available, another in the formulation of the models themselves: if the models require unknowns as inputs, they are not well formulated. The latter is certainly a generic shortcoming of environmental models and the underlying theoretical understanding.

The same line of argument can be followed with regard to the observation used for comparison with model output in hypothesis testing. The breaking of a thread, the singular prediction in Popper's example, is readily observable. It either happens, or does not. In most environmental applications, however, we have to compare predictions with measurements (as a rule, samples) of the system, which always include some measurement error, that is to say, these are ranges. Also, in environmental systems the degree of abstraction and aggregation is quite different for measurements and for model conceptualization. Therefore, the observations and measurements can serve only as samples of the properties or the state of the units conceptualized. As these units are generally heterogeneous (in terms of their measurable properties) and are generally characterized by a high degree of variability, further uncertainty has to be dealt with in the hypothesis-testing procedure.

Retaining the logical structure of testing a proposed hypothesis, but including at the same time the appropriate (or rather unavoidable) way of describing uncertain "initial conditions" as well as the expected outcome of the experiment, involves the following procedure. It is possible to describe the initial conditions or inputs by several numbers (forming a vector, determining a point in the n-dimensional input vector space) and to do the same for the expected result of the experiment (the observed behavior of the system), resulting again in a point in an n-dimensional output or behavior space. In the presence of uncertainty, the two points will have to be extended to regions in their respective spaces. Instead of the two vectors, we have to deal with sets of vectors with certain statistical properties and probability structures.


To test any specific hypothesis, we now examine whether, for a set of admissible initial conditions (i.e. the parameters), predictions (members of the set of allowable outcomes) can be made. The rejection of a hypothesis, whenever no allowable outcome can be generated, is based on a statistical argument, as the number of possible initial conditions forming the admissible set is infinite, and only samples can be examined. Also, the set of admissible initial conditions will rarely be well defined on the basis of a priori knowledge (a priori in relation to the specific experiment to be carried out). Generally, it will be possible to specify allowable ranges for the individual initial conditions. The admissible set, however, is also characterized by the correlation structure, which determines the "shape" of the admissible region in the parameter vector space.

This method of testing a given hypothesis does not indicate how such a hypothesis can be arrived at in the first place - by "conjecture." Popper's rejection of inductive reasoning does not provide much help, but in practice hypotheses (and simulation models) are rarely generated randomly but are always based on empirical knowledge. However, the process of testing and rejecting a given hypothesis can also provide some diagnostic information about the causes of failure and about possible ways to improve the hypothesis.

One possibility is strict parsimony: to start with the simplest possible conceptualization, or the least complex model one can formulate bona fide, which still may capture the relevant features of the system in view of the problem studied. Certainly, each hypothesis tested should be an honest candidate for success: "What then is the point of setting up a [Poisson] model like a skittle, just to knock it down again?" (Finch 1981). If this simple version fails to give an acceptable behavior over the allowable parameter ranges, the model structure is modified. Complexity is increased by adding elements and more complex process descriptions to the model (Figure 6), until a satisfactory behavior can be achieved. However, there is in any case more than one way to increase the complexity of a model. A general formalization of this "adding of complexity" seems to be most difficult, if not impossible. Some guidance for this process can be expected from the analysis of a series of errors, as will be shown below. Also, as I am only considering "conceptual" models (as opposed to purely statistical models, they are based on physical processes and only include terms directly interpretable in the "real world"), additional observations can be exploited in many cases. Knowledge accumulated from the study of similar systems may also be helpful in changing a given model structure.

Building up complexity and iteratively subjecting each version or level of the model to extensive tests should allow one to learn about the way structural changes influence model response. At the same time, the intricate connection between structure and the parameters has to be emphasized, since model behavior is certainly responsive to both. As changes in the model structure will, in almost every case, also necessitate changes in the parameters (their numbers, admissible ranges, and interpretation), comparisons of different versions are quite difficult. Although the approach described below is clearly far from being ideal, any attempt at a formalization of the modeling process seems preferable to a purely arbitrary and subjective procedure.

FIGURE 6 Flow diagrams for the models compared (Model 1, Model 2, Model 3): P, phosphate; A, phytoplankton; D, detritus; Z, zooplankton; Z1, herbivores; Z2, carnivores.

3.1.1 The Empirical Background: Describing the Environmental System

Considering the above constraints, the direct use of the raw data available on any ecosystem seems to be rather inappropriate and difficult for the testing of complex and highly aggregated dynamic hypotheses. Consequently, we have to derive from the available data a description of the system and the processes we want to study at a more appropriate level of abstraction and aggregation. This description, which already has to be formulated in terms of the hypothesis to be tested, should take advantage of all the available information, and at the same time provide an estimate of the reliability of this information at the required level of abstraction.

To illustrate the approach, a data set from the southern North Sea was used. Most of the information utilized stems from the yearly reports of the Biological Station Helgoland, and describes physicochemical as well as biological variables at the sampling station "Helgoland-Reede" for the period 1964-79.

Figure 7 summarizes the data used. The driving environmental variables, water temperature and radiation, were found sufficiently smooth and well
