A Statistical Model of Background Air Pollution Frequency Distributions


WORKING PAPER

A STATISTICAL MODEL OF BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS

M.Ya. Antonovski
V.M. Bukhshtaber
E.A. Zeleniuk

November 1988 WP-88-102

IIASA

International Institute for Applied Systems Analysis


Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.

INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS
A-2361 Laxenburg, Austria


PREFACE

The authors of this paper describe an approach for identifying statistically stable central tendencies in the frequency distributions of time series of observations of background atmospheric pollutants. The data were collected as daily mean values of concentrations of sulfur dioxide and suspended particulate matter at five monitoring stations: three in the USSR, one in Norway, and one in Sweden.

In their approach, the authors use well-developed statistical techniques and the usual method of constructing multimodal distributions. The problem is subdivided into two parts: first, a decomposition of the observations in order to obtain a description of each season separately and, second, an investigation of this description in order to derive statistically stable characteristics of the entire data set.

The main hypothesis of the investigation is that dispersion processes interact in such a way that in the zone of influence of one process (near its mode) the "tails" of the other process are not observed. This permits illumination of interrelations between the physics and the chemistry of the atmosphere.

During the last 15-20 years, a wide range of monitoring programs has been initiated at national and international levels including, for example, the European Monitoring and Evaluation Program (EMEP) under the auspices of the ECE, and the Background Air Pollution Monitoring Network (BAPMoN) under the auspices of the WMO.

The flow of data from the system of monitoring stations has led to national and international projects for the development of extensive environmental data bases such as NOAANET (NOAA), GRID/GEMS/UNEP/NASA, etc. The degree of information obtained should be sufficient for the goals of the analysis but often there is an overabundance of such data. The methods discussed in this paper therefore help in air pollution assessments, particularly with respect to distinguishing the baseline components and their trends over decades.

R.E. Munn

Leader, Environment Program


CONTENTS

1. INTRODUCTION
1.1. Problems of Background Air Pollution Monitoring
1.2. Probabilistic Approach to Investigations of Background Air Pollution

2. PRESENT STATUS OF STATISTICAL ANALYSIS OF AIR POLLUTION
2.1. Review of the Application of Probabilistic Methods to Descriptions of Air Pollution
2.2. Descriptive Air Pollution Models
2.3. The Use of Descriptive Models for Studies of Background Air Pollution Monitoring Data

3. CONSTRUCTION OF A STATISTICAL MODEL SIMULATING BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS
3.1. Estimates of Background Concentration Levels
3.2. Statistical Analysis of Background Monitoring Data
3.3. Construction of a Statistical Model for Background Air Pollution Monitoring Data

4. ASSESSMENT OF BACKGROUND AIR-POLLUTION MODEL PARAMETERS
4.1. Discussion of the Possibilities of a Credible Interpretation of the Model Parameters
4.2. Theoretical Principles Underlying the Statistical Model of Background Air Pollution
4.3. Assessment of Model Parameters in Terms of the Simulation Data

5. DISCRIMINATION OF THE COMPONENTS OF BACKGROUND AIR POLLUTION
5.1. Estimation of Central Tendencies of Multi-modal Frequency Distributions of Seasonal Data Series
5.2. Estimates of Selective Grouping Intervals
5.3. Methods of Construction of Statistically Stable Estimates of Pollution Components
5.4. Analysis of Components of Background Air Pollution

REFERENCES
APPENDIX
ACKNOWLEDGEMENT


A STATISTICAL MODEL OF BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS

M.Ya. Antonovski, V.M. Bukhshtaber* and E.A. Zeleniuk**

1. INTRODUCTION

1.1. Problems of Background Air Pollution Monitoring

The natural environment experiences ever-increasing anthropogenic effects.

In order to estimate the magnitude of these effects and to prevent dire consequences, it is necessary, first of all, to have unbiased information concerning the actual state of the natural environment.

The results of investigations on the distribution of pollutants from various sources are presented in Izrael and Novikov (1985). According to outer space exploration data, air pollution occurs in the form of "aerosol fields", and entire zones of anthropogenic effects can be distinguished at distances of hundreds to thousands of kilometers from the source of pollution. The degree of anthropogenic pollution in such areas can be determined only by using estimates of normal, or background, pollutant concentrations in the atmosphere of the respective regions.

The National System for monitoring background values adopted in the USSR (Izrael, 1984) calls for investigations and observations on the composition, transformation and migration of pollutants. Among the pollutants of priority are ozone, dust, sulfur and nitrogen compounds, lead, mercury, and some other substances. At present special attention is centered on the system of observations on background air pollution. In the USSR the first phase of establishing a network of monitoring stations has been completed (Rovinskii and Buyanova, 1982). A special global system of stations for background air pollution monitoring is being realized within the framework of the program of the World Meteorological Organization (Izrael, 1984).

The projects and recommendations in regard to national and global background monitoring systems are widely discussed in the literature (see, for example, Wiersma, 1985; Lynn, 1976).

The major problems to be solved by global background monitoring have been formulated in Wiersma (1985) as follows:

1. Establishment of the relative concentration levels for pollutants, capable of estimating global distributions.

2. Early warning of trends in global pollutant distributions.

3. Establishment of normal concentration levels for parameters of ecosystems and their comparison with concentration levels of impact zones.

The problems bearing on the detection of the continental and global behavior of pollutants are treated in Rovinskii and Buyanova (1982), where it is pointed out that it is necessary to analyze regional background air pollution processes. At present it is considered that the most satisfactory sites for the location of background monitoring stations are biosphere reserves and other natural reserves. Among the other criteria recognized in Rovinskii and Cherkhanov (1982) for the selection of station sites are geographical zonality, distance from the source of pollution, and the degree of representativeness of the derived data. Representativeness has a special meaning: the absence of any obvious anthropogenic effects on the measured normal pollutant concentrations, and the comparability of the aerometric data with data derived from other stations. Such a comparison in certain cases is fraught with difficulties, on account of the high variability of the aerometric data, resulting from measurement errors as well as from the influence of physical and geographical factors. The latter involve, first of all, the location of stations in regions that differ according to the degree and nature of anthropogenic effects, and according to the processes determining the pollutant concentration variations. An idea of the degree of variability of the estimate of background concentration levels can be gained from the data presented in Izrael (1984) on the lead concentrations in the lower atmosphere: for the lowland areas of Western Europe, over 100 ng/m³; for normal regions of the USSR, 2-40 ng/m³; for mountainous areas of the USSR, 2-6 ng/m³; for mountainous areas of North America, 4.6-21 ng/m³. In order to reduce the variability of the data and of the derived estimates of background concentration levels, it is commonly suggested (see, for example, Rovinskii and Buyanova, 1982) that observational data, averaged in time and space, should be used. In terms of spatial averaging, the global, semi-global, continental and regional types of backgrounds can be distinguished.

* All-Union Research Institute of Physicotechnical and Radiotechnical Measurements, USSR.

** Natural Environment and Climate Monitoring Laboratory, USSR.

The concept of regional background allows one to attribute part of the variability of the background level estimates to the specific features of the region and of the locality of the observing station. In this case the background value is determined (Rovinskii and Buyanova, 1982) as the mean of the minimal content values of the given substance during a definite time-interval. Such an approach makes it possible to attribute the effects of high "abnormal" concentrations to local sources, or to associate them with anomalous meteorological conditions.
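The "mean of the minimal values" definition can be illustrated with a minimal Python sketch. The function name `regional_background`, the window length, and the sample series are hypothetical; the source specifies only "a definite time-interval", so the choice of non-overlapping windows is an assumption made for illustration.

```python
from statistics import fmean


def regional_background(daily_values, window=30):
    """Mean of the minima taken over consecutive, non-overlapping windows.

    `window` (number of observations per interval) is an illustrative
    assumption; the source does not fix the interval length.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not daily_values:
        raise ValueError("daily_values must not be empty")
    minima = [
        min(daily_values[start:start + window])
        for start in range(0, len(daily_values), window)
    ]
    return fmean(minima)


# Hypothetical daily concentrations with two spikes from local sources.
series = [2.0, 2.5, 1.5, 9.0, 2.2, 2.1, 2.0, 2.4, 12.0, 2.3]
print(regional_background(series, window=5))
```

Because only the window minima enter the average, the two spikes (9.0 and 12.0) have no influence on the estimate, which is the property the regional-background concept relies on.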

Hence, the problem of the extraction of "background" information from a series of monitoring data is associated with the development and application of statistical assessment of aerometric data, and can be formulated as the problem of determining statistically stable characteristics of the derived data.

1.2. Probabilistic Approach to Investigations of Background Air Pollution

The present study is devoted to the statistical analysis of background air pollution monitoring data, having as its objective the design of a statistical model of background air pollution and its application for the determination of statistical characteristics describing the probability laws governing the behavior of impurities in the atmosphere.

Statistical models of air pollution distribution have been widely discussed in the literature (see, for example, Augustinyak and Sventz, 1982; Berlyand, 1975, 1984; Benarie, 1982; Mage, 1981). However, background monitoring data possess certain specific features, creating difficulties in the use of traditional models such as, for example, the two-parameter lognormal distribution LN2 (Harris and Tabor, 1956; Larsen, 1961). Measurements of background air pollution levels are conducted in areas where the direct effects of strong pollution sources are practically excluded. This implies that the observed data variability is to a considerable degree due to the effects of large-scale atmospheric processes, which determine the mode of occurrence of different concentration levels in the area, rather than to the effects resulting from point sources of pollution. Most of the air pollution models employed are designed for use under the assumption of the existence of point sources. Studies of the probability distribution laws of concentrations for the atmosphere of normal regions allow one to get an idea of the qualitative mechanisms governing the formation of different concentration levels. Statistics describing these laws reflect certain regularities in the formation mechanisms and can be used for assessment of background air pollution. Such an approach enables one to validate statistically the intuitively derived concepts of the normal (background) level as the mean of the minimal measurements for a given time-interval (Rovinskii and Buyanova, 1982), or as the minimal but most distinctly expressed concentration level, typical of the region (Izrael, 1984). The derived statistics represent an informative description of the time series of background air pollution monitoring data and, in turn, can be used to obtain explicit inferences bearing on the nature of the measurements and their behavior.

The major stages in the design, analysis and application of the statistical model of background air pollution are as follows:

1. Statistical analysis of background air pollution monitoring data. Studies of the logarithmic concentration distribution functions for data series of different time-intervals.

2. Investigation of the possibilities of describing the logarithmic concentration series by multimodal distributions, and the physical prerequisites for the origin of multimodality.

3. Simulation of data series in terms of composite distributions of a specific type, and development of graphical methods for estimation of performance parameters.

4. Description of seasonal observational data series by central tendencies of multimodal frequency distributions. Development of techniques for identification of statistically stable grouping intervals.

5. Analysis of statistically stable grouping intervals and their manifestations in seasonal and multiyear data series.

6. Analysis of the air pollution components described by statistically stable grouping intervals; comparative analysis of the components and their manifestations at different background monitoring stations; development of recommendations for the assessment of background concentration levels.

2. PRESENT STATUS OF STATISTICAL ANALYSIS OF AIR POLLUTION

2.1. Review of the Application of Probabilistic Methods to Descriptions of Air Pollution.

Probabilistic models are often used for the description of aerometric data, and provide the basis for obtaining estimates and approximate descriptions of the distribution of air pollutants. The models are used in many aspects of air quality planning, when prediction is one of the main aims of the study. The application of probability-theoretical and statistical methods to solving such problems has been discussed in a number of publications. For instance, one review (Hunter, 1981) treats many characteristic aspects of the application of statistical methods to problems of environmental control. The major problem on which the review centers its attention concerns the gap between the demands for a good descriptive model, connecting observed processes with the environmental parameters introduced into the model, and the real possibilities for assessment and measurement of such parameters.

A typical systematization of the applied models can be found in Benarie (1982). Dividing air pollution models into descriptive, computational and predictive, Benarie assigns time-series analysis to the first method. In the second and third cases, including regression and simulation, the methods demand information on the state of independent parameters that have a direct bearing on pollution dispersion. Examining the possibility of application of the first method, the author has presented characteristic examples illustrating its limited applicability. For instance, experiments with the Box-Jenkins model (analysis of time-series) for the extrapolation of air pollution data from 100 days of observations showed that estimates of model concentrations for the 101st, 102nd, etc., days were no better than random predictions. To improve the forecast, it is necessary to introduce into the model some assumptions concerning meteorological or other conditions affecting the pollutant distribution and to assume continuity of these conditions, both for initial and extrapolated data. A similar, if not even greater, appeal to the development of precise concepts on processes occurring in the atmosphere is found by Benarie (1982) in various computational models. In citing examples of the parameters employed in these models, Benarie (1982) expresses his doubts as to the possibility of predetermining many of the matching parameters in the context of a logical description of natural pollution conditions.

In order to define the bounds within which these methods can be applied, Benarie (1982) proceeds on the basis of two considerations. The first concerns the objective of the study. If statistical methods are to be useful for deriving mean estimates of certain pollution characteristics, or for a general description of pollutant distributions, they should reflect certain general or typical characteristics of atmospheric processes. The second consideration directly concerns those characteristics of atmospheric processes that render impossible the application of certain detailed analytical schemes, and the prediction of the behavior of impurities in the atmosphere. The design of such models commonly proceeds under the assumption of monotypic behavior of the parameters and mode of pollution distribution within an area that should be large enough to preserve certain common properties in the course of a period long enough for investigations, but should be small enough that it might be attributed to some common properties reflected in the parameters introduced into the model. In Benarie (1982) certain characteristic time-periods are presented for the existence of such areas, within which adequate functioning of most of the proposed models is ensured. For an area with a side of 300 km, this time-period is, according to different estimates, between 12 and 75 hours with a pronounced mode in the histogram at about 45 hours.

Quite a large number of examples can be offered of the successful application of mathematical models to the description of pollution in different environments (see, for example, Anokhin and Ostromogil'skii, 1978; Ostromogil'skii, 1982). Berlyand (1975, 1984) demonstrates the possibility of using mathematical models, based on solving the equation of turbulent diffusion in the atmosphere, for the prediction of a possible sharp increase in concentrations during a period lasting from several hours up to several days, under unfavorable climatic conditions. Examples illustrating the successful application of regression models for the prediction of air-pollutant concentrations are presented in Singpurwalla (1972); the model parameters are chosen not on the basis of physical considerations, but so as to derive the best forecasts, and Singpurwalla (1972) finds it necessary to produce evidence justifying the use of such "non-physical" models. An example of a model designed in terms of probability considerations on the behavior of pollutants over long time intervals, and that serves to estimate the dynamics of pollutant distributions in space and time, is presented in Augustinyak and Sventz (1982) based on data derived from several stations.

Thus we see that, notwithstanding the serious difficulties encountered in the application of air-quality models due to the complicated nature and rapid occurrence of atmospheric processes, interesting results can nevertheless be derived. In each case this can be achieved by clearly defining the class of problems that should be solved by the model, and by the choice of adequate mathematical or statistical methods for their realization, taking into account the difficulties cited at the beginning of this chapter. Concentrating their attention on analytical treatment comprising data derived from several background monitoring stations, the present authors recommend the use of a body of statistics reflecting certain general characteristics in the behavior of atmospheric pollutants, imposing a minimum of assumptions on the usage of the data, and not offering any direct meaningful conclusions; such a design philosophy is feasible for the description of the data. Such a description in itself often furnishes the basis for the development of new hypotheses concerning the data and leads to important results from testing these hypotheses. The use of applied statistical techniques and analytical treatment of the data for solving problems bearing on the assessment and description of air pollution is exemplified in the construction of statistical models describing the behavior of pollutants in the atmosphere according to the frequency distributions of their concentrations.

2.2. Descriptive Air-Pollution Models

Descriptive air-quality models have been employed in routine investigations since the 1950s. A review of existing models can be found, for example, in Mage (1981).

One of the earliest models to be used is the two-parameter lognormal distribution LN2, with a density function:
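In standard notation the LN2 density for a concentration $x > 0$ can be written as below. The symbols $\mu$ and $\sigma$ (the mean and standard deviation of $\ln x$) are assumed here for illustration and may differ from the authors' own notation.

```latex
\[
f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}
       \exp\!\left[-\frac{(\ln x - \mu)^{2}}{2\sigma^{2}}\right],
\qquad x > 0 .
\]
```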

Particle sizes formed during crushing, and also dust particles, are well described by the LN2, and it was assumed that use of this function could be extended to describe particulate matter in the atmosphere, not only according to size, but also according to concentration distributions (Zimmer and Tabor, 1959). The major conclusion drawn in Zimmer and Tabor (1959) on the basis of the results of suspended particulate measurements performed over cities and beyond urban areas is that, notwithstanding certain deviations, the concentrations reveal a lognormal distribution. It should be mentioned that the widespread applicability of lognormal distributions was illuminated by Aitchison and Brown (1957). In 1961 the LN2 distribution was also used for the description of gaseous air-pollutant concentrations (R.J. Larsen, 1969a). It was established that the CO concentrations in the area of Los Angeles "... reveal a tendency towards a lognormal distribution" (Larsen, 1969a).

Later the LN2 model was widely used by the same author to describe all types of air pollution (Larsen, 1969b). The wide applicability of the LN2 model, which renders possible its use for estimation of air quality under practically any measurement conditions, furnishes the basis for designing techniques for the assessment of air pollution characteristics as statistical parameters of the proposed model. The use of these parameters in setting national standards is described in Larsen (1969b).

Let the concentrations of pollutants measured during successive time-intervals be denoted as $C_0, C_1, C_2, \ldots, C_n$. These values result from many meteorological, geophysical and other factors, and Khan (1973) suggests using the assumption that changes in concentrations from one time interval to another can be described in the form:

$$C_j - C_{j-1} = p_j \, C_{j-1} \tag{2.1}$$


where $C_j$ and $C_{j-1}$ are the concentrations measured during the time intervals $j$ and $j-1$. The random variable $p_j$ represents the impact from many effects that form the random realization of the concentration value during the time $j-1$. Equation (2.1) is commonly known as the law of proportional effect. For concentrations to obey this law, it is postulated that the change in the concentration during any time interval is proportional to the concentration that has been attained up to this moment.

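Equation (2.1) can be rewritten as

$$\frac{C_j - C_{j-1}}{C_{j-1}} = p_j$$

Then

$$\sum_{j=1}^{n} \frac{C_j - C_{j-1}}{C_{j-1}} = \sum_{j=1}^{n} p_j$$

Assuming that the changes at any time instant are small, we get

$$\sum_{j=1}^{n} \frac{C_j - C_{j-1}}{C_{j-1}} \approx \int_{C_0}^{C_n} \frac{dC}{C} = \ln C_n - \ln C_0$$

from which it follows that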

$$\ln C_n = \ln C_0 + p_1 + p_2 + \ldots + p_n .$$

The central limit theorem permits one to state that $\ln C_n$ is asymptotically normally distributed, regardless of the distribution of the $p_j$ and, consequently, the random value $C_n$ is lognormally distributed. This result is also given by Aitchison and Brown (1957).
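The argument can be checked numerically with a short simulation of the law of proportional effect. This is only a sketch under stated assumptions: the $p_j$ are taken as independent and uniformly distributed on a small symmetric interval, and the names `simulate_final_concentration`, `skewness`, `p_scale` and the numerical settings are hypothetical choices, not part of the source.

```python
import math
import random
from statistics import fmean, median


def simulate_final_concentration(c0, n_steps, rng, p_scale=0.02):
    """Iterate C_j = C_{j-1} * (1 + p_j), i.e. equation (2.1), n_steps times."""
    c = c0
    for _ in range(n_steps):
        # Assumed for illustration: p_j independent, uniform on [-p_scale, p_scale].
        c *= 1.0 + rng.uniform(-p_scale, p_scale)
    return c


def skewness(xs):
    """Sample skewness (third standardized moment); near 0 for normal data."""
    m = fmean(xs)
    var = fmean([(x - m) ** 2 for x in xs])
    return fmean([(x - m) ** 3 for x in xs]) / var ** 1.5


rng = random.Random(0)   # fixed seed so that the run is repeatable
c0 = 10.0                # hypothetical starting concentration
finals = [simulate_final_concentration(c0, 400, rng) for _ in range(2000)]
logs = [math.log(c) for c in finals]

print("skewness of C_n    :", round(skewness(finals), 3))
print("skewness of ln C_n :", round(skewness(logs), 3))
print("median(C_n) / C_0  :", round(median(finals) / c0, 3))
```

In such a run the skewness of $C_n$ should come out clearly positive while that of $\ln C_n$ stays near zero, as expected for a lognormal variable. The median of $C_n$ stays close to $C_0$, though slightly below it, because $\ln(1 + p) \le p$ and the small-change approximation is not exact.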

As is indicated in Aivazyan et al. (1983), $C_0$ can be regarded as a "true" value $C$ in an idealized scheme, when the effects of all random factors have been eliminated, and the quantities $p_1, p_2, \ldots, p_n$ are the numerical expression of the effects of the above-mentioned random factors. In this connection, it is noted in Aivazyan et al. (1983) that although the values of the lognormally distributed random quantity are formed as random errors of a certain "true" value $C$, the latter emerges, in the long run, not in the role of a mean value, but as the median. This serves to define the role of the median as the best estimate of the central tendency of air-pollutant concentrations.
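The distinction between mean and median can be made explicit with the standard lognormal moments. The sketch below assumes $\ln C_n \sim N(\mu, \sigma^{2})$; the symbols $\mu$ and $\sigma$ are introduced here for illustration and are not the authors' notation.

```latex
\[
\mathrm{median}(C_n) = e^{\mu}, \qquad
\mathrm{E}[C_n] = e^{\mu + \sigma^{2}/2}
\quad\Longrightarrow\quad
\frac{\mathrm{E}[C_n]}{\mathrm{median}(C_n)} = e^{\sigma^{2}/2} \ge 1 .
\]
```

For example, with $\sigma = 0.5$ the mean exceeds the median by a factor of $e^{0.125} \approx 1.13$. If the accumulated random effects have zero mean, so that $\mu = \ln C_0$, the median equals $C_0$ while the mean overshoots it.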

It is interesting to note also that the direct application of the central limit theorem presumes independence of the random variables $p_j$. Khan (1973) does not claim that such independence can be proved, although he considers that indirect proof can be found in the results of empirical investigations.

Real observational data seldom show precise correspondence to the LN2 law, even in cases when its application can be strictly proved. Data on pollutant concentrations in the atmosphere also include deviations from the "pure" LN2. This has served and still serves as the basis for the critical analysis of the LN2 model and for the use of alternative models. Examples of this can be found in Khan (1973), Mage (1980, 1981) and Mage and Ott (1975); related problems are discussed in Horowitz and Baracat (1979), Roberts (1979) and Soeda and Sawaragi (1979).


Lynn (1976) was among the first to study the applicability of several probabilistic models to air pollution data. The analysis involved the normal law, LN2, the three-parameter lognormal distribution LN3, types I and IV of the Pearson distribution and the Gamma distribution. The conclusion was drawn that the LN2 was the best of all the above-cited distributions. Here a situation occurring frequently in statistical analysis was observed. Namely, in many cases a distribution can be selected (even among those cited above) that most closely approximates the distribution of the sampled data. However, not one of these distributions can be applied to the description of all types of samples of aerometric data. For their description, several distributions should be employed. However, the LN2 distribution is of greatest value.

During the 1960s-1970s many case studies were accumulated concerning the application of LN2 to the description of air pollution data. The observed deviations from the LN2 and the regularities perceived in them were used by several authors to design models that could ensure a degree of applicability for the description of the available data as good as that of the LN2 model. Such an approach is exemplified in Mage (1980, 1981) and Mage and Ott (1975), where several types of distributions suitable for this purpose are proposed, and in de Nevers et al. (1979), where the possibility of describing the data by employing combined distributions is discussed. This method of describing the data is characterized by the search for the best statistical design for the description of event-data, secured at the expense of general model applicability.

For instance, in Mage and Ott (1975) the authors conclude that all air pollution data studied by them reveal a common behavior in their deviations from the LN2: their distribution functions plotted on lognormal probability paper demonstrate characteristic "curving". In order to take account of this effect, they suggest using the LN3 model, a three-parameter lognormal distribution with a density

$$f(x) = \frac{1}{(x - A)\,\sigma\sqrt{2\pi}} \exp\left[-\frac{\left(\ln(x - A) - \mu\right)^{2}}{2\sigma^{2}}\right]$$

The fact that this is not the only means for describing such deviations from LN2 is apparent from Mage (1981). In this work concerning the best description of the data, it is proposed to use the limited distribution models, with the introduction of nonstatistical prerequisites concerning the probable origin of such distributions in the problems under study.

In de Nevers et al. (1979), after analytical treatment of a large number of event-data on atmospheric particulate matter, the authors distinguished not one (as in the former example) but four types of deviations from the straight line typical of distribution functions plotted on lognormal probability paper. These four types are depicted in Figure 2.1. The authors analyzed in detail the reasons for such deviations and proposed to describe them by a combination of two LN2 distributions. In the same work, an example is given illustrating how in reality a meteorological situation leading to a "composite" distribution can arise, and an analytical treatment is presented of real data corresponding to such a situation. It is obvious that, from the point of view of increasing model applicability, the last line of attack on the problem is best. By retaining the well-studied and convenient LN2 distribution as the base distribution, one may perform a uniform description of practically all observed deviations from LN2 by postulating that several different types of meteorological processes affect the concentrations.
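Why a combination of two LN2 distributions departs from the straight line on lognormal probability paper can be seen in a small Python sketch. It is not the procedure of de Nevers et al.; the mixture weight, the component parameters and the function names (`ln2_cdf`, `mixture_cdf`, `probability_paper_slopes`) are hypothetical and chosen only for illustration. Probability paper is emulated by plotting the standard normal quantile of the cumulative probability against $\ln x$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def ln2_cdf(x, mu, sigma):
    """CDF of a two-parameter lognormal with log-mean mu and log-sd sigma."""
    return NormalDist(mu, sigma).cdf(math.log(x))


def mixture_cdf(x, weight, comp_a, comp_b):
    """CDF of weight * LN2(comp_a) + (1 - weight) * LN2(comp_b)."""
    return weight * ln2_cdf(x, *comp_a) + (1.0 - weight) * ln2_cdf(x, *comp_b)


def probability_paper_slopes(cdf, xs):
    """Local slopes of the normal quantile of F(x) against ln x."""
    pts = [(math.log(x), STD_NORMAL.inv_cdf(cdf(x))) for x in xs]
    return [(z1 - z0) / (u1 - u0) for (u0, z0), (u1, z1) in zip(pts, pts[1:])]


xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]   # hypothetical concentration grid
single = probability_paper_slopes(lambda x: ln2_cdf(x, 1.5, 0.8), xs)
mixed = probability_paper_slopes(
    lambda x: mixture_cdf(x, 0.6, (1.0, 0.4), (3.0, 0.5)), xs
)
print("single LN2 slopes:", [round(s, 3) for s in single])
print("mixture slopes   :", [round(s, 3) for s in mixed])
```

For a single LN2 the slope is constant and equal to $1/\sigma$, which is the straight line on probability paper. For the mixture the slope changes from segment to segment, which is the kind of curvature that a composite distribution is meant to capture.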


[Figure 2.1: four panels plotted against percentile on lognormal probability paper; the panel labels legible in this copy include Livermore, San Diego, CA and Fresno, CA.]

Figure 2.1. Four types of deviations from the LN2 distribution for data on concentrations of aerosol dust (de Nevers et al., 1979).


2.3. The Use of Descriptive Models in Studies of Background Air Pollution Monitoring Data

Studies of different air pollution probability models give ever more convincing evidence that analytical treatment of air pollution data should proceed by thorough analysis of the candidate statistical models, using the model that, by statistical criteria, provides the best description of the event-data. For this purpose, Bencala and Seinfeld (1976) propose a set of automatic facilities that choose the best distribution according to the maximum likelihood principle. Such an approach, which is probably applicable to the analysis of air pollution data at the impact level, can hardly be used for the description of background monitoring data. The construction of such statistical air pollution models leads to a loss of generality in the physical representation of pollutant concentrations since, on the basis of statistics proposed by different models, it becomes impossible to establish any common factors controlling the formation of pollution concentration frequency distributions.

In the majority of problems using statistical air pollution models, the authors are interested, first of all, in applying the model to obtain extreme value statistics. For instance, most of the studies mentioned in the present chapter relate to the air quality standards adopted in the USA, which are formulated in terms of the frequency of exceeding maximum permissible concentrations during a given period of time (week, month, year) and for a given averaging time. For example, the CO concentration averaged over one hour might be permitted to exceed 35 ppm only once a year, which is equivalent to the statement that the 1-hour time-averaged concentrations of CO may exceed 35 ppm in no more than 0.011% of the observations. If the hourly concentration distribution of CO is known, and a probability model of this distribution exists, then from a number of observations with specified occurrence it is possible to define the distribution parameters and to evaluate whether they conform to the standard distribution under the specific conditions.
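The arithmetic behind the 0.011% figure, and the way a fitted model could be compared with such a standard, can be sketched as follows. The LN2 parameters used in the two checks are hypothetical and serve only to exercise the functions; the function names are likewise assumptions of this sketch.

```python
import math
from statistics import NormalDist

HOURS_PER_YEAR = 365 * 24  # 8760 one-hour averages in a non-leap year

def allowed_exceedance_fraction(times_per_year, obs_per_year=HOURS_PER_YEAR):
    """Fraction of observations allowed to exceed the limit."""
    return times_per_year / obs_per_year

def ln2_exceedance_probability(limit, mu, sigma):
    """P(X > limit) when ln X is normal with mean mu and std sigma."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(limit))

def meets_standard(limit, mu, sigma, times_per_year=1):
    """True if the modelled exceedance probability is within the allowance."""
    allowed = allowed_exceedance_fraction(times_per_year)
    return ln2_exceedance_probability(limit, mu, sigma) <= allowed

# One permitted exceedance per year of hourly values.
fraction = allowed_exceedance_fraction(1)
percent = 100.0 * fraction  # about 0.0114 %

# Hypothetical LN2 fits for hourly CO, in ppm (median 5 and median 10).
clean_site_ok = meets_standard(35.0, math.log(5.0), 0.4)
dirty_site_ok = meets_standard(35.0, math.log(10.0), 0.6)
```

The comparison is made on the tail probability rather than on the mean, which is why the spread parameter matters as much as the median in the second case.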

The formulation of air pollution background monitoring problems in such a context has not been encountered (Rovinskii and Buyanova, 1982). Our attention is centered mainly on determining the mechanisms governing the formation of different pollution concentration distributions, and the statistical design employed should be sufficiently general that it can be applied to different air pollution background monitoring time series. The choice of model from among the numerous statistical models should be prompted by the problems to be solved, and the degree of generality of the model should correspond to the degree of generality of the results.

We have chosen the two-parameter lognormal law of pollutant concentration distributions, the LN2. Of all the laws studied, this is the most widely used, owing to the fact that it performs well for all pollutants within any observational area and for various averaging times and, most likely, reflects certain general conditions in the formation of different air pollutant concentration levels. Since we are often confronted with the necessity of studying distributions that deviate from LN2, we adopt here the hypothesis that model applicability is increased by the use of combined LN2 distributions.


3. CONSTRUCTION OF A STATISTICAL MODEL SIMULATING BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS

3.1. Estimates of Background Concentration Levels

Measurements of background atmospheric pollution concentrations were for a long time obtained merely for the general evaluation of air quality, and were episodic in nature. The areas chosen for such measurements were usually located far from industrial pollution sources, outside urbanized districts. At times they included mountainous areas located at great heights above sea level. In Burtseva et al. (1982) some data are given on lead concentrations in Western Europe and North America: in non-urbanized areas of Norway in 1971-1972, at heights of 3600 m above sea level in Switzerland, and during a four-year (1968-1971) observational cycle in California at heights of 3800 and 1860 m above sea level. These data provided the basis for determining the mean concentration values of lead for the USA and Central Europe. For the USA, the mean has been taken equal to 8 ng/m3; for Central Europe, 4 ng/m3. At the same time, certain specific features were noted in the behavior of lead in different physical-geographical areas; for instance, the Californian data revealed a seasonal trend in concentration, with a summer maximum and a winter minimum, and the absence of a correlation between lead concentrations and suspended particulate matter, while in England a maximum was apparent in the winter concentrations of lead and other heavy metals.

In Rovinskii, Burtseva et al. (1982) and Rovinskii, Egorov et al. (1982), an attempt is made to analyze and summarize the data available in the world literature on the distribution of the major pollutants in nonindustrial areas. Owing to the geographical position of the areas under study, it is assumed that these are background data. It is pointed out that the existing data relate to episodic observations performed at different time intervals and in different localities. The reason for this is that background air pollutant concentrations began to attract attention only at the end of the 1960s and the beginning of the 1970s, when it was realized that anthropogenic effects are of large-scale global importance. Because of this, answers to many questions concerning the long-term state of the atmosphere cannot be obtained. For instance, it is impossible to answer the question raised in Rovinskii, Burtseva et al. (1982) as to whether there are upward trends in the concentrations of air pollutants. In general, it can be stated that the world data studied in Rovinskii, Burtseva et al. (1982) and Rovinskii, Egorov et al. (1982) reveal very great variability in space and time. In Rovinskii and Buyanova (1982) it is suggested that different types of background should be distinguished: global, hemispheric, continental and regional. It is also suggested that the minimal mean values for different time intervals should be used as estimates of normal (background) concentrations. This idea implicitly conveys a concept of the nature of background air pollution. As a matter of fact, the proposed types of data averaging, which smooth the data over given time periods and given spatial areas and eliminate the effects of concentration increases in local zones and during short time intervals, enable one to derive an integrated picture of the background setting. The construction of such a picture requires local measurements conducted continuously over a long time. In Rovinskii and Buyanova (1982) stress is laid on the particular importance of regional background investigations for different regions taken together. The regional background regularities are, seemingly, the only predictors of regional, continental and global long-term behavior of pollution concentrations.


Air-pollution background monitoring stations have been established in the USSR and in many other countries within biosphere reserves, and also in localities not subjected to the influence of any apparent sources of pollution. These programs involve measurements of air-pollutant concentrations. Such aerometric data have been accumulated in the USSR since 1976, which makes it possible to estimate background concentration levels for particular regions, to analyse the data for different regions and for the world as a whole, to study the principles governing the formation of different concentration levels, and to obtain estimates of normal air pollution concentrations over continents (Burtseva, Lapenko et al. 1982; Burtseva, Volosneva et al. 1982; Pastukhov et al. 1982). Annual data publications have begun (see, for example, Bulletin of background pollution of the natural environment in the region of East-European Member-Countries of CMEA, 1982, 1983).

The data on heavy metal concentrations in the area of the "Borovoe" station are discussed in Burtseva, Lapenko et al. (1982). In the case of lead, the lower limit of measurement error was found to be 0.5 ng/m3, the coefficient of variation not exceeding 20%. According to the data presented in Burtseva, Volosneva et al. (1982), lead concentration measurements at background monitoring stations are performed with an accuracy of about 10%. The data represent daily mean concentrations in the lower atmosphere. Analysis of the histograms of daily mean values for lead concentrations measured over a four-year period, 1977-1980, shows a strong asymmetry in the frequency distribution, with a pronounced concentration maximum in the lower (left) quartile and a long "tail" in the upper (right) quartile. Burtseva, Lapenko et al. (1982) used the histograms for simple statistical inferences on the possibility of obtaining relatively stable estimates of lead concentration levels, with the major maxima in the frequency distribution being chosen.

For the samples in Burtseva, Lapenko et al. (1982), such an interval included 65-85% of the observations. The upper limit of the interval was taken as the upper estimate of the background concentration level; thus, according to the authors' estimates, the background concentration level of lead in the atmosphere in the area of the "Borovoe" station is between 0.5 and 30 ng/m3. For the four years studied, no clearly evident time changes in the concentration distributions occurred; during 230-310 days per year, the concentrations varied within the limits typical of normally pure continental areas.

The proposed method for estimating the background concentration level has a number of shortcomings. One of these is that the method does not explain the behavior of the concentrations in the frequency distribution. For instance, in Burtseva, Lapenko et al. (1982), the authors could not offer a plausible explanation for the increase in the frequency of lead concentrations in the interval of 30-60 ng/m3 in 1979, or for the presence of arsenic concentrations in the interval of 3-6 ng/m3 for 30% of the observations in 1980 (the arsenic background level being defined at 1-3 ng/m3). Analysis of the possible types of effects of meteorological and other conditions on concentration variations fails to explain the observed events (Burtseva, Lapenko et al. 1982). Analysis of background monitoring data for sulfur dioxide was performed in Pastukhov et al. (1982), the average monthly concentrations varying between 0.3 and 18.9 μg/m3 during the period of investigation, from 1977 to 1981. The highest values were recorded during the winter and the lowest during the summer, a general result found also in data from the Repetek and Berezin B.Z. background monitoring stations. The annual cycle is associated with two factors: the considerable increase in anthropogenic emissions from fuel burning during the cold periods of the year, on the one hand, and the drop in the rate of oxidation of sulfur dioxide, on the other. Analysis of the monthly concentrations of sulfur dioxide, performed separately for the warm and cold seasons, made it possible for the authors (Pastukhov et al. 1982) to estimate the sulfur dioxide concentration level in the area of the "Borovoe" station at 0.5-1.0 μg/m3 for the warm period and at 3.2-13.7 μg/m3 for the cold period.

Similar analysis of the average monthly values at the "Berezin B.Z." and "Repetek B.Z." background monitoring stations gives the values 1.0-2.4, 10 μg/m3 for the first and 0.3, 1.0 μg/m3 for the second. Analysis of meteorological conditions and trajectories indicated that the extreme concentration values cannot be unambiguously correlated with the vector wind directions in the "Borovoe" station area. The derived estimates for different observational areas are incommensurate, and doubt arises concerning their possible use in estimating characteristics of continental and global background concentration levels.

In Szepesi (1962) estimates are presented of air pollution characteristics on different scales. It is suggested that the horizontal extent of the districts should be determined from two meteorological considerations: the lower measure is specified by the distance within which the background level is determined by the mixing processes in the atmospheric boundary layer, whereas the upper one is specified by the relative extent of the fetch over which the meteorological parameters remain constant. By such estimates, boundaries were defined (Szepesi, 1982) that delimit the area of action of the estimates of the regional air pollution level; under the assumption of regional uniformity, they should operate over a radius of between 20 and 300 km. Notwithstanding the rough nature of these estimates, it is possible to formulate the problem of determining the magnitudes of background concentration levels by comparing data from several stations. In Szepesi and Fakete (1987) it is assumed that continental and global air pollution background concentration levels are subject to the influence of processes occurring over thousands and tens of thousands of kilometers. The mutual influence of such processes evidently leads to an intricate picture of the formation of pollutant concentration levels. If the station network covering the continent is sufficiently dense, it might be possible to define differences in background air pollution levels, to distinguish zones where the factors affecting the formation of different concentration levels are uniform, and to determine a certain integral characteristic describing the mean background level of pollutants for the entire continent. It might be interesting to compare such mean levels, derived daily at many stations for different time periods, in order to check the hypothesis that lengthy periods occur when the background level does not change across the continent as a whole, although daily variations are registered in the aerometric data for each station. This hypothesis underlies the assumption of the existence of a continental and a global background value. In a somewhat different formulation, this hypothesis can be found in Izrael (1984).

In Augustinyak and Sventz (1982) approximate estimates are given for the number of observing stations that permit one to plot the area describing the behavior of the pollutants through time within a certain territory. When the linear law is used, the minimal number of measurement points is 9; for the square law, 18; for the cubic law, 30; etc. At present, the density of background monitoring is not sufficient to apply such models.
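A small arithmetic check, which is not taken from Augustinyak and Sventz (1982): if the linear, square and cubic laws are read as full polynomial surfaces in two horizontal coordinates (an assumption of this sketch), the quoted minimal numbers of measurement points are exactly three times the number of coefficients to be fitted. The function name below is hypothetical.

```python
def surface_coefficients(degree):
    """Number of coefficients of a full polynomial of the given degree
    in two horizontal coordinates (assumed reading of the 'law')."""
    return (degree + 1) * (degree + 2) // 2

# Minimal numbers of measurement points quoted in the text.
quoted_points = {1: 9, 2: 18, 3: 30}

# Ratio of quoted points to fitted coefficients for each degree.
ratios = {d: quoted_points[d] / surface_coefficients(d) for d in quoted_points}
```

Whether this ratio is the rule actually used in that work cannot be established from the text; the check only shows that the three quoted numbers are mutually consistent under this reading.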

3.2. Statistical Analysis of Background Monitoring Data

The data to be used are from three background monitoring stations in the USSR: Borovoe, Berezin biosphere reserve, and Repetek biosphere reserve. Descriptions of the data are given in bulletins (Bulletin of background pollution of the natural environment in the region of East-European Member-Countries of CMEA, 1982 and 1983). The techniques used to derive the data and a discussion of their reliability can be found in Burtseva, Lapenko et al. (1982), Burtseva, Volosneva et al. (1982) and Pastukhov et al. (1982).


In the present study, three pollutants have been selected, sulfur dioxide, lead, and total suspended particulates, for which daily observations were available during 1976-83 at the Borovoe station and 1980-83 at the Berezin and Repetek stations. The three pollutants differ in their physical-chemical behavior, and the stations are located in different physical-geographical areas. A joint analysis of the sampled data with a view to finding common statistical characteristics can enable one to define some common principles governing the behavior of air pollutants, and can provide a basis for designing techniques for evaluating background pollutant concentration levels on a wide scale, both in space and in time.

The first stage of statistical data analysis should be the construction of the statistical data model. Then the statistical characteristics describing the data series can be investigated, and their applicability for obtaining non-statistical conclusions can be explored. Techniques for designing statistical models and the use of statistical information in hydrometeorological and geophysical applications are described in Aivazyan et al. (1983), Gruza and Reitenbakh (1982) and Kleiner and Gradel (1980). In Aivazyan et al. (1983), some general techniques used in designing statistical models are presented. In practice two different methods of analysis are used: mathematical, relying on theoretical-probabilistic considerations, and computational, by way of direct reproduction of the model function on a PC. The first method calls for hypotheses and a priori assumptions concerning the data that should serve to validate the choice of model; the second requires some preliminary formalized knowledge of the data that could be expressed in algorithmic form and could be used to develop or refine the theoretical-probabilistic method. In the present study, both of these mutually complementary methods are employed; the first stage, presumably, should involve the development of certain general theoretical-probabilistic concepts of the model.

In Burtseva, Lapenko et al. (1982) several histograms were examined that describe the heavy metal frequency distribution at the Borovoe background monitoring station. These histograms exhibit a lognormal distribution, with the mode shifted to the left and a long "tail" at the right. Histograms of this type can be perceived in the distribution of all three pollutants sampled for statistical analysis, at all stations and for any period. It is therefore already possible to use the logarithmic form in the analysis and for checking the hypothesis of a lognormal distribution.

In Figures 3.1, 3.2 and 3.3, plots are shown that characterize the lead concentration distributions at the "Borovoe" station during the four-year period of observations. Because much of the subsequent analysis is based on studies of these plots, we shall dwell upon them. The plots portray graphically the empirical density and cumulative distribution functions (3.1 and 3.2), and depict the deviation of the empirical density function from the theoretical one (3.3). Methods for graphical assessment of the distribution parameters are discussed in Mmzewski and Sowa (1978-1979), and problems bearing on graphical estimates are treated also in Rubin (1976), Aivazyan et al. (1983) and other publications. Kleiner and Gradel (1980) note that the use of graphical methods is generally typical of statistical analysis of geographical data. Those authors consider that the reason is that geophysical data usually involve daily, seasonal, annual and inter-annual variations, apart from other more pronounced effects characteristic of short time intervals, and, inasmuch as the major objective of these methods is to illuminate these relationships and structures, representation of the data in the most recognizable form becomes particularly important.

For evaluation of the degree of agreement of the data with the chosen LN2 distribution, various methods can be used. Methods of evaluation, in particular for the lognormal distribution, are discussed in Rovinskii and Cherkhanov (1982), while in Selvin (1976) and Gnanadesican and Kettering (1972) several methods are examined for numerical estimation of the model distribution under conditions of different types of data errors. Many of the methods discussed in these works are designed to derive numerical statistics that best describe the empirical distributions. Graphical qualitative evaluations of the distribution pattern are also used. In order to determine how much the observed distribution differs from a given theoretical distribution, various criteria of goodness of fit can be used. However, according to Kleiner and Gradel (1980), the numerical result derived from their use does not indicate in what places and for what reasons the observed distribution deviates from the model one. In the case of a normal distribution, there would be an exactly symmetrical bell-shaped curve in Figure 3.1, a straight line in Figure 3.2, and a very narrow spread in Figure 3.3.

[Figure 3.2: line-printer normal probability plot; plot points not reproduced.]

Figure 3.2 Normal plot of cumulative logarithmic concentrations of lead, Borovoe station, 1978-81.

[Figure 3.3: line-printer plot of deviations from normal; plot points not reproduced.]

Figure 3.3 Deviations from normal plot of logarithmic concentrations of lead, Borovoe station, 1978-1981.

Histograms are often constructed when the number of observations becomes large. The length of the interval is taken as a function of $z_{\max}$, $z_{\min}$ and $N$, where $z_{\max}$ and $z_{\min}$ are the maximal and minimal points on the logarithmic concentration scale for the given sample, and $N$ is the number of observations in the sample.
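The interval-length formula itself does not survive in the text above. As a stand-in, and purely as an assumption, the sketch below uses Sturges' rule, a common rule that depends on the same three quantities (the extreme points of the sample and the number of observations); it should not be read as the formula used by the authors. The function names are hypothetical.

```python
import math

def sturges_interval_length(log_conc):
    """Interval length by Sturges' rule (an assumed stand-in):
    sample range divided by 1 + 3.322 * log10(N). Requires N > 1
    and at least two distinct values."""
    n = len(log_conc)
    k = 1.0 + 3.322 * math.log10(n)
    return (max(log_conc) - min(log_conc)) / k

def histogram(log_conc):
    """Counts of logarithmic concentrations in intervals of equal length."""
    h = sturges_interval_length(log_conc)
    z_min = min(log_conc)
    k = math.ceil((max(log_conc) - z_min) / h)
    counts = [0] * k
    for z in log_conc:
        # The maximal point is assigned to the last interval.
        i = min(int((z - z_min) / h), k - 1)
        counts[i] += 1
    return h, counts

# Hypothetical logarithmic concentrations, evenly spread for illustration.
sample = [0.01 * i for i in range(1000)]
length, counts = histogram(sample)
```

For 1000 observations the rule gives about eleven intervals; other rules would give a different number, so the histogram shape should be checked for sensitivity to this choice.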

The distribution function is plotted on normal probability paper as distribution quantiles $F(z_n)$ against the observed variable $z_n$, where $n$ is the number of the variable $z_n$ in the variational series, arranged in ascending order. The value of the function $F(z_n)$ corresponds to the probability $(3n - 1)/(3N + 1)$ of the centered and normalized normal distribution

$$ \Phi(t) = \int_{-\infty}^{t} N(z; 0, 1)\, dz, $$

where $N(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}$ is the density of the normal distribution with zero mean and unit variance.

Equation 3.4 represents Equation 3.2 with the linear trend removed, the trend being determined by the sample average and variance. It shows the deviation from the straight line specified by the estimates of these parameters, and thereby gives a qualitative display of the degree of agreement between the event-data and an LN2 distribution, graphically revealing the nature of inconsistencies with the theoretical distribution.
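The construction of the normal plot and of the plot of deviations can be sketched as follows. The plotting probability (3n - 1)/(3N + 1) is taken from the text; the detrending step is an assumption of this sketch, namely that the deviation is the normal quantile minus the observation standardized by the sample mean and standard deviation. Function names and the synthetic sample are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

STD = NormalDist()

def normal_plot_points(concentrations):
    """Sorted logarithmic concentrations z_n and the standard normal
    quantiles at probabilities (3n - 1) / (3N + 1), n = 1..N."""
    z = sorted(math.log(c) for c in concentrations)
    big_n = len(z)
    quantiles = [STD.inv_cdf((3 * n - 1) / (3 * big_n + 1))
                 for n in range(1, big_n + 1)]
    return z, quantiles

def deviations_from_normal(z, quantiles):
    """Normal plot with the linear trend removed (assumed form):
    quantile minus the standardized observation."""
    m, s = mean(z), stdev(z)
    return [q - (zi - m) / s for zi, q in zip(z, quantiles)]

# Synthetic, exactly lognormal sample with hypothetical parameters.
N = 200
probs = [(3 * n - 1) / (3 * N + 1) for n in range(1, N + 1)]
sample = [math.exp(2.0 + 0.5 * STD.inv_cdf(p)) for p in probs]

z, q = normal_plot_points(sample)
dev = deviations_from_normal(z, q)
largest_deviation = max(abs(d) for d in dev)
```

For a sample that follows LN2 the deviations stay in a narrow band around zero; systematic curvature in the deviations is the graphical sign of the inconsistencies discussed in the text.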

Discussions of the problems concerned with plotting and evaluating distributions by means of graphs of this type can be found in Aivazyan et al. (1983) and Kleiner and Gradel (1980).

As can be seen from Figures 3.1, 3.2 and 3.3, the empirical density and distribution functions, as expected, differ from the theoretical ones. The question as to how to proceed in the case of such deviations is discussed at length in Aivazyan