WORKING PAPER
A STATISTICAL MODEL OF BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS
M. Ya. Antonovski, V.M. Bukhshtaber, E.A. Zeleniuk
November 1988
WP-88-102
IIASA
International Institute for Applied Systems Analysis
Working Papers are interim reports on work of the International Institute for Applied Systems Analysis and have received only limited review. Views or opinions expressed herein do not necessarily represent those of the Institute or of its National Member Organizations.
INTERNATIONAL INSTITUTE FOR APPLIED SYSTEMS ANALYSIS
A-2361 Laxenburg, Austria
PREFACE
The authors of this paper describe an approach for identifying statistically stable central tendencies in the frequency distributions of time series of observations of background atmospheric pollutants. The data were collected as daily mean values of concentrations of sulfur dioxide and suspended particulate matter at five monitoring stations: three in the USSR, one in Norway, and one in Sweden.
In their approach, the authors use well-developed statistical techniques and the usual method of constructing multimodal distributions. The problem is subdivided into two parts: first, a decomposition of the observations in order to obtain a description of each season separately and second, an investigation of this description in order to derive statistically stable characteristics of the entire data set.
The main hypothesis of the investigation is that dispersion processes interact in such a way that in the zone of influence of one process (near its mode) the "tails" of the other process are not observed. This permits illumination of interrelations between the physics and the chemistry of the atmosphere.
During the last 15-20 years, a wide range of monitoring programs has been initiated at national and international levels including, for example, the European Monitoring and Evaluation Program (EMEP) under the auspices of the ECE, and the Background Air Pollution Monitoring Network (BAPMoN) under the auspices of the WMO.
The flow of data from the system of monitoring stations has led to national and international projects for the development of extensive environmental data bases such as NOAANET (NOAA), GRID/GEMS/UNEP/NASA, etc. The degree of information obtained should be sufficient for the goals of the analysis but often there is an overabundance of such data. The methods discussed in this paper therefore help in air pollution assessments, particularly with respect to distinguishing the baseline components, and their trends over decades.
R.E. Munn
Leader, Environment Program
CONTENTS
1. INTRODUCTION
1.1. Problems of Background Air Pollution Monitoring
1.2. Probabilistic Approach to Investigations of Background Air Pollution
2. PRESENT STATUS OF STATISTICAL ANALYSIS OF AIR POLLUTION
2.1. Review of the Application of Probabilistic Methods to Descriptions of Air Pollution
2.2. Descriptive Air Pollution Models
2.3. The Use of Descriptive Models for Studies of Background Air Pollution Monitoring Data
3. CONSTRUCTION OF A STATISTICAL MODEL SIMULATING BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS
3.1. Estimates of Background Concentration Levels
3.2. Statistical Analysis of Background Monitoring Data
3.3. Construction of a Statistical Model for Background Air Pollution Monitoring Data
4. ASSESSMENT OF BACKGROUND AIR-POLLUTION MODEL PARAMETERS
4.1. Discussion of the Possibilities of a Credible Interpretation of the Model Parameters
4.2. Theoretical Principles Underlying the Statistical Model of Background Air Pollution
4.3. Assessment of Model Parameters in Terms of the Simulation Data
5. DISCRIMINATION OF THE COMPONENTS OF BACKGROUND AIR POLLUTION
5.1. Estimation of Central Tendencies of Multi-modal Frequency Distributions of Seasonal Data Series
5.2. Estimates of Selective Grouping Intervals
5.3. Methods of Construction of Statistically Stable Estimates of Pollution Components
5.4. Analysis of Components of Background Air Pollution Components
REFERENCES
APPENDIX
ACKNOWLEDGEMENT
A STATISTICAL MODEL OF BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS
M. Ya. Antonovski, V.M. Bukhshtaber* and E.A. Zeleniuk**
1. INTRODUCTION
1.1. Problems of Background Air Pollution Monitoring
The natural environment experiences ever-increasing anthropogenic effects. In order to estimate the magnitude of these effects and to prevent dire consequences, it is necessary, first of all, to have unbiased information concerning the actual state of the natural environment.
The results of investigations on the distribution of pollutants from various sources are presented in Izrael and Novikov (1985). According to outer space exploration data, air pollution occurs in the form of "aerosol fields" and entire zones of anthropogenic effects can be distinguished at distances of hundreds to thousands of kilometers from the source of pollution. The degree of anthropogenic pollution in such areas can be determined only by using estimates of normal, or background, pollutant concentrations in the atmosphere of the respective regions.
The National System for monitoring background values, adopted in the USSR (Izrael, 1984), calls for investigations and observations on the composition, transformation and migration of pollutants. Among the pollutants of priority are ozone, dust, sulfur and nitrogen compounds, lead, mercury, and some other substances. At present special attention is centered on the system of observations on background air pollution. In the USSR the first phase of establishing a network of monitoring stations has been completed (Rovinskii and Buyanova, 1982). A special global system of stations for background air pollution monitoring is being realized within the framework of the program of the World Meteorological Organization (Izrael, 1984).
The projects and recommendations in regard to national and global background monitoring systems are widely discussed in the literature (see, for example, Wiersma, 1985; Lynn, 1976).
The major problems to be solved by global background monitoring have been formulated in Wiersma (1985) as follows:
1. Establishment of the relative concentration levels for pollutants, capable of estimating global distributions.
2. Early warning of trends in global pollutant distributions.
3. Establishment of normal concentration levels for parameters of ecosystems and their comparison with concentration levels of impact zones.
* All-Union Research Institute of Physicotechnical and Radiotechnical Measurements, USSR.
** Natural Environment and Climate Monitoring Laboratory, USSR.
The problems bearing on the detection of the continental and global behavior of pollutants are treated in Rovinskii and Buyanova (1982), where it is pointed out that it is necessary to analyze regional background air pollution processes. At present it is considered that the most satisfactory sites for the location of background monitoring stations are biosphere reserves and other natural reserves. Among the other criteria recognized in Rovinskii and Cherkhanov (1982) for the selection of station sites are geographical zonality, distance from the source of pollution, and the degree of representativeness of the derived data. Representativeness has a special meaning: the absence of any obvious anthropogenic effects on the measured normal pollutant concentrations, and the comparability of the aerometric data with data derived from other stations. Such a comparison in certain cases is fraught with difficulties, on account of the high variability of the aerometric data, resulting from measurement errors as well as from the influence of physical and geographical factors. The latter involve, first of all, the location of stations in regions that differ according to the degree and nature of anthropogenic effects, and according to the processes determining the pollutant concentration variations. An idea of the degree of variability of the estimate of background concentration levels can be conceived from the data presented in Izrael (1984) on the lead concentrations in the lower atmosphere: for the lowland areas of Western Europe, over 100 ng/m3; for normal regions of the USSR, 2-40 ng/m3; for mountainous areas of the USSR, 2-6 ng/m3; for mountainous areas of North America, 4.6-21 ng/m3. In order to reduce the variability of the data and of the derived estimates of background concentration levels, it is commonly suggested (see, for example, Rovinskii and Buyanova, 1982) that observational data, averaged in time and space, should be used. In terms of spatial averaging, the global, semi-global, continental and regional types of backgrounds can be distinguished.
The concept of regional background allows one to attribute part of the variability of the background level estimates to the specific features of the region and of the locality of the observing station. In this case the background value is determined (Rovinskii and Buyanova, 1982) as the mean of the minimal content values of the given substance during a definite time-interval. Such an approach makes it possible to attribute the effects of high "abnormal" concentrations to local sources, or to associate them with anomalous meteorological conditions.
Hence, the problem of the extraction of "background" information from a series of monitoring data is associated with the development and application of statistical assessment of aerometric data, and can be formulated as the problem of determining statistically stable characteristics of the derived data.
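The regional-background definition quoted above (the mean of the minimal content values during a definite time-interval) can be illustrated by a minimal sketch. The window length, the number of lowest readings averaged per window, the function name `background_level` and the numerical series are all assumptions introduced for illustration; the cited works are not quoted here as fixing any of them.

```python
from statistics import mean

def background_level(daily_values, window=30, n_lowest=3):
    """Mean of the lowest readings found in consecutive time windows.

    Assumption: "minimal content values" is read as the n_lowest smallest
    readings in each window of `window` days; both numbers are hypothetical.
    """
    minima = []
    for start in range(0, len(daily_values), window):
        chunk = sorted(daily_values[start:start + window])
        minima.extend(chunk[:n_lowest])
    return mean(minima)

# Made-up daily concentrations containing a few "abnormal" spikes.
series = [2.0, 2.5, 3.0, 40.0, 2.2, 2.8] * 5 + [2.1, 2.4, 55.0, 2.6, 2.3, 2.9] * 5
level = background_level(series, window=30, n_lowest=3)
```

Under this reading the spikes do not affect the estimate at all, which is the property the text attributes to the approach: high values are left to be explained by local sources or anomalous meteorological conditions.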
1.2. Probabilistic Approach to Investigations of Background Air Pollution
The present study is devoted to the statistical analysis of background air pollution monitoring data, having as its objective the design of a statistical model of background air pollution and its application for the determination of statistical characteristics describing the probability laws governing the behavior of impurities in the atmosphere.
Statistical models of air pollution distribution have been widely discussed in the literature (see, for example, Augustinyak and Sventz, 1982; Berlyand, 1975; Berlyand, 1984; Benarie, 1982; Mage, 1981). However, background monitoring data possess certain specific features, creating difficulties in the use of traditional models (such as, for example, the two-parameter lognormal distribution LN2; Harris and Tabor, 1956; Larsen, 1961). Measurements of background air pollution levels are conducted in areas where the direct effects of strong pollution sources are practically excluded. This implies that the observed data variability is to a considerable degree due to the effects of large-scale atmospheric processes, which determine the mode of occurrence of different concentration levels in the area, rather than to the effects resulting from point sources of pollution. Most of the air pollution models employed are designed for use under the assumption of the existence of point sources. Studies of the probability concentration distribution laws for the atmosphere of normal regions allow one to get an idea of the qualitative mechanisms governing the formation of different concentration levels. Statistics describing these laws reflect certain regularities in the formation mechanisms and can be used for assessment of background air pollution. Such an approach enables one to validate statistically the intuitively derived concepts of the normal (background) level as the mean of the minimal measurements for a given time-interval (Rovinskii and Buyanova, 1982), or as the minimal but most distinctly expressed concentration level, typical of the region (Izrael, 1984). The derived statistics represent an informative description of the time series of background air pollution monitoring data and, in turn, can be used to obtain explicit inferences bearing on the nature of the measurements and their behavior.
The major stages in the design, analysis and application of the statistical model of background air pollution are as follows:
1. Statistical analysis of background air pollution monitoring data. Studies of the logarithmic concentration distribution functions for data series of different time-intervals.
2. Investigation of the possibilities of describing the logarithmic concentration series by multimodal distributions, and the physical prerequisites for the origin of multimodality.
3. Simulation of data series in terms of composite distributions of a specific type, and development of graphical methods for estimation of performance parameters.
4. Description of seasonal observational data series by central tendencies of multimodal frequency distributions. Development of techniques for identification of statistically stable grouping intervals.
5. Analysis of statistically stable grouping intervals and their manifestations in seasonal and multiyear data series.
6. Analysis of the air pollution components described by statistically stable grouping intervals; comparative analysis of the components and their manifestations at different background monitoring stations; development of recommendations for the assessment of background concentration levels.
2. PRESENT STATUS OF STATISTICAL ANALYSIS OF AIR POLLUTION
2.1. Review of the Application of Probabilistic Methods to Descriptions of Air Pollution
Probabilistic models are often used for the description of aerometric data, and provide the basis for obtaining estimates and approximate descriptions of the distribution of air pollutants. The models are used in many aspects of air quality planning, when prediction is one of the main aims of the study. The application of theoretical-probability and statistical methods to solving such problems has been discussed in a number of publications. For instance, one review (Hunter, 1981) treats many characteristic aspects of the application of statistical methods to problems of environmental control. The major problem on which these authors center their attention concerns the existence of the gap between the demands for a good descriptive model, connecting observed processes with the environmental parameters introduced into the model, and the real possibilities for assessment and measurement of such parameters. A typical systematization of the applied models can be found in Benarie (1982). Dividing air pollution models into descriptive, computational and predictive, Benarie assigns time-series analysis to the first method. In the second and third cases, including regression and simulation, the methods demand information on the state of independent parameters that have a direct bearing on pollution dispersion. Examining the possibility of application of the first method, the author has presented characteristic examples illustrating their limited applicability. For instance, experiments with the Box-Jenkins model (analysis of time-series) for the extrapolation of air pollution data from 100 days of observations showed that estimates of model concentrations for the 101st, 102nd, etc., days were no better than any random predictions. To improve the forecast, it is necessary to introduce into the model some assumptions concerning meteorological or other conditions affecting the pollutant distribution and to assume continuity of these conditions, both for initial and extrapolated data. A similar, if not even greater, appeal to the development of precise concepts on processes occurring in the atmosphere is to be found by Benarie (1982) in various computational models. In citing examples of the parameters employed in these models, Benarie (1982) expresses his doubts as to the possibility of predetermining many of the matching parameters in the context of a logical description of natural pollution conditions. In order to define the bounds within which these methods can be applied, Benarie (1982) proceeds on the basis of two considerations. The first concerns the objective of the study. If for deriving mean estimates of certain pollution characteristics, or for a general description of pollutant distributions, statistical methods are to be useful, they should reflect certain general or typical characteristics of atmospheric processes. The second consideration directly concerns those characteristics of atmospheric processes that render impossible the application of certain detailed analytical schemes, and the prediction of the behavior of impurities in the atmosphere. The design of such models commonly proceeds under the assumption of monotypic behavior of the parameters and mode of pollution distribution within an area that should be large enough to preserve certain common properties in the course of a period long enough for investigations, but should be small enough that it might be attributed to some common properties reflected in the parameters introduced into the model. In Benarie (1982) certain characteristic time-periods are presented for the existence of such areas, within which adequate functioning of most of the proposed models is ensured. For an area with a side of 300 km, this time-period is, according to different estimates, between 12 and 75 hours with a pronounced mode in the histogram at about 45 hours.
Quite a large number of examples can be offered of the successful application of mathematical models to the description of pollution in different environments (see, for example, Anokhin and Ostromogil'skii, 1978; Ostromogil'skii, 1982). Berlyand (1975, 1984) demonstrates the possibility of using mathematical models, based on solving the equations of turbulent diffusion in the atmosphere, for the prediction of a possible sharp increase in concentrations during a period lasting from several hours up to several days, under unfavorable climatic conditions. Examples illustrating the successful application of regression models for the prediction of air-pollutant concentrations are presented in Singpurwalla (1972); the model parameters are chosen not on the basis of physical considerations, but so as to derive the best forecasts, and Singpurwalla (1972) finds it necessary to produce evidence justifying the use of such "non-physical" models. An example of a model designed in terms of probability considerations on the behavior of pollutants over long time intervals, and that serves to estimate the dynamics of pollutant distributions in space and time, is presented in Augustinyak and Sventz (1982) based on data derived from several stations.
Thus we see that, notwithstanding the serious difficulties encountered in the application of air-quality models due to the complicated nature and rapid occurrence of atmospheric processes, interesting results can nevertheless be derived. In each case this can be achieved by clearly defining the class of problems that should be solved by the model, and the choice of adequate mathematical or statistical methods for their realization, taking into account the difficulties cited at the beginning of this chapter. Concentrating their attention on analytical treatment and comprising data derived from several background monitoring stations, the present authors recommend the use of a body of statistics, reflecting certain general characteristics in the behavior of atmospheric pollutants, imposing a minimum of assumptions on the usage of the data, and not offering any direct meaningful conclusions; such a design philosophy is feasible for the description of the data. Such a description in itself often furnishes the basis for the development of new hypotheses concerning the data and leads to important results from testing these hypotheses. The use of applied statistical techniques and analytical treatment of the data for solving problems bearing on the assessment and description of air pollution is exemplified in the construction of statistical models describing the behavior of pollutants in the atmosphere according to their frequency distributions of concentrations.
2.2. Descriptive Air-Pollution Models
Descriptive air-quality models have been employed in routine investigations since the 1950s. A review of existing models can be found, for example, in Mage (1981).
One of the earliest models to be used is the two-parameter lognormal distribution LN2, with a density function:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma\,x}\exp\left[-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^{2}\right], \qquad x > 0,$$
where $\mu$ and $\sigma$ are the mean and the standard deviation of $\ln x$.
Particle sizes formed during crushing, and also dust particles, are well described by the LN2, and it was assumed that use of this function could be expanded to describe particulate matter in the atmosphere, not only according to size, but also according to concentration distributions (Zimmer and Tabor, 1959). The major conclusion drawn in Zimmer and Tabor (1959) on the basis of the results of suspended particulate measurements performed over cities and beyond urban areas is that, notwithstanding certain deviations, the concentrations reveal a lognormal distribution. It should be mentioned that the widespread applicability of lognormal distributions was illuminated by Aitchison and Brown (1957). In 1961 the LN2 distribution was also used for the description of gaseous air-pollutant concentrations (R.J. Larsen, 1969a). It was established that the CO concentrations in the area of Los Angeles "... reveal a tendency towards a lognormal distribution" (Larsen, 1969a).
Later the LN2 model was widely used by the same author to describe all types of air pollution (Larsen, 1969b). The wide application of the LN2 model, which renders possible its use for estimation of the air quality under practically any measurement conditions, furnishes the basis for designing techniques for the assessment of air pollution characteristics as statistical parameters of the proposed model. The use of these parameters in setting national standards is described in Larsen (1969b).
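As a minimal sketch of how LN2 parameters can serve as air pollution characteristics, the fragment below estimates the two parameters from the logarithms of a sample and converts them into the probability of exceeding a threshold. The function names, the sample values and the choice of the sample standard deviation as the estimator are assumptions made for illustration, not the procedure of Larsen (1969b).

```python
import math
from statistics import NormalDist, mean, stdev

def fit_ln2(concentrations):
    """Estimate LN2 parameters as the mean and sample sd of ln(x)."""
    logs = [math.log(x) for x in concentrations]
    return mean(logs), stdev(logs)

def exceedance_probability(mu, sigma, threshold):
    """P(X > threshold) for X lognormal with log-mean mu and log-sd sigma."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(threshold))

# Hypothetical sample in arbitrary units, not data from the paper.
sample = [1.0, 2.0, 4.0, 8.0, 16.0]
mu, sigma = fit_ln2(sample)
geometric_mean = math.exp(mu)   # coincides with the median of the fitted LN2
p_exceed = exceedance_probability(mu, sigma, threshold=geometric_mean)
```

Evaluating the exceedance probability at the geometric mean returns one half, which is a quick consistency check on the fit; in a standards context the threshold would instead be the permissible concentration.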
Let the concentrations of pollutants measured during successive time intervals be denoted as $C_0, C_1, C_2, \ldots, C_n$. These values result from many meteorological, geophysical and other factors, and Khan (1973) suggests using the assumption that changes in concentrations from one time interval to another can be described in the form:
$$C_j - C_{j-1} = p_j\,C_{j-1} \qquad (2.1)$$
where $C_j$ and $C_{j-1}$ are the concentrations measured during the time intervals $j$ and $j-1$. The random variable $p_j$ represents the impact from many effects that form the random realization of the concentration value during the time $j-1$. Equation (2.1) is commonly known as the law of proportional effect. For concentrations to obey this law, it is postulated that the change in the concentration during any time interval is proportional to the concentration that has been attained up to this moment.
Equation (2.1) can be rewritten as
$$\frac{C_j - C_{j-1}}{C_{j-1}} = p_j .$$
Then
$$\sum_{j=1}^{n}\frac{C_j - C_{j-1}}{C_{j-1}} = \sum_{j=1}^{n} p_j .$$
Assuming that the changes at any time instant are small, we get
$$\sum_{j=1}^{n}\frac{C_j - C_{j-1}}{C_{j-1}} \approx \int_{C_0}^{C_n}\frac{dC}{C} = \ln C_n - \ln C_0 ,$$
from which it follows that
$$\ln C_n = \ln C_0 + p_1 + p_2 + \ldots + p_n .$$
The central limit theorem permits one to state that $\ln C_n$ is of asymptotic normal distribution, regardless of the $p_j$ distribution and, consequently, the random value $C_n$ is of lognormal distribution. This result is also given by Aitchison and Brown (1957).
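The law of proportional effect can be checked numerically with a short simulation, offered here only as a sketch. The uniform distribution of the $p_j$, their spread, the number of steps and the starting value are assumptions chosen for illustration; the argument above requires only that the $p_j$ be small and independent.

```python
import math
import random
from statistics import mean, median

def simulate_final_concentration(c0, n_steps, rng, spread=0.05):
    """Apply C_j = C_{j-1} * (1 + p_j) with small independent p_j."""
    c = c0
    for _ in range(n_steps):
        p = rng.uniform(-spread, spread)   # assumed distribution of p_j
        c *= 1.0 + p
    return c

rng = random.Random(0)   # fixed seed so that the run is reproducible
finals = [simulate_final_concentration(10.0, 200, rng) for _ in range(2000)]
log_finals = [math.log(c) for c in finals]

# ln C_n is a sum of many small independent terms, hence close to normal;
# C_n itself is right-skewed, so its arithmetic mean exceeds its median.
sample_median = median(finals)
sample_mean = mean(finals)
median_from_logs = math.exp(median(log_finals))
```

A histogram of `log_finals` should look roughly symmetric and bell-shaped, whereas a histogram of `finals` has a longer right tail.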
As is indicated in Aivazyan et al. (1983), $C_0$ can be regarded as a "true" value $C$ in an idealized scheme, when the effects of all random factors have been eliminated, and the $p_1, p_2, \ldots, p_n$ quantities are the numerical expression of the effects of the above-mentioned random factors. In this connection, it is noted in Aivazyan et al. (1983) that although the values of the lognormally distributed random quantity are formed as random errors of a certain "true" value $C$, the latter emerges, in the long run, not in the role of a mean value, but as the median. This serves to define the role of the median as the best estimate of the central tendency of air-pollutant concentrations.
It is interesting to note also that the direct application of the central limit theorem presumes independence of the random variables $p_j$. Khan (1973) does not claim that such independence can be proved, although he considers that indirect proof can be found in the results of empirical investigations.
Real observational data seldom show precise correspondence to the LN2 law, even in cases when its application can be strictly proved. Data on pollutant concentrations in the atmosphere also include deviations from the "pure" LN2. This has served and still serves as the basis for the critical analysis of the LN2 model and for the use of alternative models. Examples of this can be found in Khan (1973), Mage (1980, 1981) and Mage and Ott (1975); related problems are discussed in Horowitz and Baracat (1979), Roberts (1979) and Soeda and Sawaragi (1979).
Lynn (1976) was among the first to study the applicability of several probabilistic models to air pollution data. The analysis involved the normal law, LN2, the three-parameter lognormal distribution LN3, the I and IV types of the Pearson distribution and the Gamma distribution. The conclusion was drawn that the LN2 was the best of all the above-cited distributions. Here a situation occurring frequently in statistical analysis was observed. Namely, in many cases a distribution can be selected (even among those cited above) that most closely approximates the distribution of the sampled data. However, not one of these distributions can be applied to the description of all types of samples of aerometric data. For their description, several distributions should be employed. However, the LN2 distribution is of greatest value.
During the 1960s-1970s many case studies were accumulated concerning the application of LN2 to the description of air pollution data. The observed deviations from the LN2 and the regularities perceived in them were used by several authors to design models that could ensure a high degree of applicability for the description of the available data, as good as that of the LN2 model. Such an approach is exemplified in Mage (1980, 1981) and Mage and Ott (1975), where several types of distributions suitable for this purpose are proposed, and in de Nevers et al. (1979), where the possibility of describing the data by employing combined distributions is discussed. This method of describing the data is characterized by the search for the best statistical design for the description of event-data, secured at the expense of general model applicability.
For instance, in Mage and Ott (1975) the authors conclude that all air pollution data studied by them reveal a common behavior in their deviations from the LN2: their distribution functions plotted on lognormal probability paper demonstrate characteristic "curving". In order to take account of this effect, they suggest using the LN3 model, a three-parameter lognormal distribution with a density
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma\,(x-A)}\exp\left[-\frac{1}{2}\left(\frac{\ln(x-A)-\mu}{\sigma}\right)^{2}\right], \qquad x > A,$$
where $A$ is the third (shift) parameter and $\mu$ and $\sigma$ are the mean and the standard deviation of $\ln(x-A)$.
The fact that this is not the only means for describing such deviations from LN2 is apparent from Mage (1981). In this work concerning the best description of the data, it is proposed to use the limited distribution models, with the introduction of nonstatistical prerequisites concerning the probable origin of such distributions in the problems under study.
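The "curving" on lognormal probability paper mentioned above can be reproduced with a small sketch. It builds the plot coordinates (normal score against the logarithm of the ordered value) and measures how far the points depart from the straight chord joining the two extreme points. The plotting position $(i - 0.5)/n$, the parameter values, the helper names and the use of exact quantiles in place of observations are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

def probability_plot_points(values):
    """Return (z, ln x) pairs, the coordinates of lognormal probability paper."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        z = std_normal.inv_cdf((i - 0.5) / n)   # assumed plotting position
        points.append((z, math.log(x)))
    return points

def max_deviation_from_line(points):
    """Largest vertical distance of the points from the chord joining the ends."""
    (z0, y0), (z1, y1) = points[0], points[-1]
    slope = (y1 - y0) / (z1 - z0)
    return max(abs(y - (y0 + slope * (z - z0))) for z, y in points)

# Idealized "samples" built from exact quantiles; all parameters hypothetical.
n, mu, sigma, shift = 99, 1.0, 0.8, 5.0
zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
ln2_values = [math.exp(mu + sigma * z) for z in zs]
ln3_values = [shift + math.exp(mu + sigma * z) for z in zs]

ln2_bend = max_deviation_from_line(probability_plot_points(ln2_values))
ln3_bend = max_deviation_from_line(probability_plot_points(ln3_values))
```

For the LN2 quantiles the points lie on a straight line up to rounding error, while the shifted (LN3) quantiles bend away from it, which is the kind of deviation the third parameter is introduced to absorb.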
In de Nevers et al. (1979), after analytical treatment of a large number of event-data on atmospheric particulate matter, the authors distinguished not one (as in the former example) but four types of deviations from the straight line, typical of distribution functions plotted on lognormal probability paper. These four types are depicted in Figure 2.1. The authors analyzed in detail the reasons for such deviations and proposed to describe them by a combination of two LN2 distributions. In the same work, an example is given illustrating how in reality such a meteorological situation leading to a "composite" distribution can arise, and an analytical treatment is presented of real data corresponding to such a situation. It is obvious that, from the point of view of increasing model applicability, the last line of attack on the problem is best. By retaining the well-studied and convenient LN2 distribution as the base distribution, one may perform a uniform description of practically all observed deviations from LN2 by postulating that several different types of meteorological processes affect the concentrations.
[Figure 2.1: plots not reproduced. Legible panel labels: Livermore; San Diego, CA; Fresno, CA. Horizontal axis: percentile.]
Figure 2.1 Four types of deviations from the LN2 distribution for data on concentrations of aerosol dust (de Nevers et al. 1979).
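The idea of describing deviations from LN2 by a combination of two LN2 distributions can be sketched as follows. The mixing weight, the component parameters, the bin layout and the function names are hypothetical choices made for illustration; they are not taken from de Nevers et al. (1979).

```python
import math
import random

def sample_two_ln2_mixture(n, weight, params_a, params_b, rng):
    """Draw n values from weight*LN2(a) + (1 - weight)*LN2(b)."""
    out = []
    for _ in range(n):
        mu, sigma = params_a if rng.random() < weight else params_b
        out.append(rng.lognormvariate(mu, sigma))
    return out

def log_histogram(values, n_bins, lo, hi):
    """Counts of ln(x) in n_bins equal-width bins covering [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in values:
        k = math.floor((math.log(x) - lo) / width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

rng = random.Random(1)   # fixed seed so that the run is reproducible
# Hypothetical components: a "clean" regime and a "polluted" regime.
data = sample_two_ln2_mixture(4000, 0.6, (0.0, 0.3), (2.0, 0.3), rng)
counts = log_histogram(data, n_bins=12, lo=-1.0, hi=3.0)
```

With components this well separated, the histogram of log-concentrations shows two modes divided by a sparsely populated valley; when the components overlap more strongly, the same construction yields a single distorted mode, which on probability paper appears as a departure from the straight line.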
2.3. The Use of D e s c r i p t i v e Models pr S t u d i e s o f B a c k g r o u n d A i r P o l l u t i o n Monitoring Data.
Studies of different air pollution probability models give ever more convincing evidence that analytical treatment of air pollution data should be performed by means of thorough analysis of the chosen statistical model, and by the use of the model that, from the point of view of the statistical criteria, provides the best description of the event-data. For this purpose a set of automatic facilities is proposed in Bencala and Seinfeld (1976) which chooses the best distribution from the point of view of the maximum likelihood principle. Such an approach, which is probably applicable to the analysis of air pollution data at the impact level, can hardly be used for the description of background monitoring data. The construction of such statistical air pollution models leads to a loss of generality in the physical presentation of pollutant concentrations since, on the basis of statistics proposed by different models, it becomes impossible to establish any common factors controlling the formation of pollution concentration frequency distributions.

In the majority of problems using statistical air pollution models, the authors are interested, first of all, in the possibility of applying the model to obtain extreme value statistics. For instance,
most of the studies mentioned in the present chapter relate to the air quality standards adopted in the USA, and the formulation is in terms of the frequency of exceeding maximum permissible concentrations during a given period of time (week, month, year) and for a given averaging time. For example, the CO concentrations averaged for one hour might be permitted to exceed 35 ppm only once a year, which is equivalent to the statement that the 1-hour time-averaged concentrations of CO may exceed 35 ppm in no more than 0.011% of the observations (1 hour out of the 8760 in a year). If the hourly concentration distributions of CO are known, and a probability model of this distribution exists, then from a number of observations with specified occurrence it is possible to define the distribution parameters and to evaluate whether they conform to the standard distribution under the specific conditions.

The formulation of air pollution background monitoring problems in such a context has not been encountered (Rovinskii and Buyanova, 1982). Our attention is centered mainly on the determination of the mechanisms governing the formation of different pollution concentration distributions; and the statistical design employed should be sufficiently general that it could be applied to different air pollution background monitoring time-series. The choice of the model from among numerous statistical models should be prompted by the problems
to be solved; and the degree of generality of the model should correspond to the degree of generality of the results.

We have chosen the two-parameter lognormal law of pollutant concentration distributions, the LN2. Of all the laws studied, this is the most widely used, owing to the fact that it performs well for all pollutants within any observational area and for various averaging times, and most likely reflects certain general conditions in the formation of different air pollutant concentration levels. Taking into account the fact that we are often confronted with the necessity of studying distributions that deviate from LN2, we adopt here the hypothesis postulating an increase of model applicability by the use of combined LN2 distributions.

3. CONSTRUCTION OF A STATISTICAL MODEL SIMULATING BACKGROUND AIR POLLUTION FREQUENCY DISTRIBUTIONS
3.1. Estimates of Background Concentration Levels
Measurements of background atmospheric pollution concentrations have been obtained over a long period of time merely for the general evaluation of air quality, and have been episodic in nature. The areas chosen for such measurements were usually located far from industrial pollution sources, outside urbanized districts. They sometimes included mountainous areas located at great heights above sea level. In Burtseva et al. (1982) some data are given on lead concentrations in Western Europe and North America: in non-urbanized areas of Norway in 1971-1972, at a height of 3600 m above sea level in Switzerland, and during a four-year (1968-1971) observational cycle in California at heights of 3800 and 1860 m above sea level. These data provided the basis for determining the mean concentration values of lead for the USA and Central Europe. For the USA, the mean has been taken equal to 8 ng/m3; for Central Europe, 4 ng/m3. At the same time certain specific features were noted in the behavior of lead in different physical-geographical areas; for instance, the Californian data revealed a seasonal trend in concentration, with a summer maximum and a winter minimum, and the absence of a correlation between lead concentrations and suspended particulate matter, while in England a maximum was apparent in the winter concentrations of lead and other heavy metals.

In Rovinskii, Burtseva et al. (1982) and Rovinskii, Egorov et al. (1982), an attempt is made to analyze and summarize the data available in the world literature on the distribution of the major pollutants in nonindustrial areas. Due
to the geographical position of the areas under study, it is assumed that these are background data. It is pointed out that the existing data relate to episodic observations performed at different time-intervals and in different localities. The reason for this is that only at the end of the 1960s and the beginning of the 1970s did background air pollutant concentrations begin to attract attention, when it was realized that anthropogenic effects are of large-scale global importance. Because of this, answers to many questions concerning the long-term state of the atmosphere cannot be obtained. For instance, it is impossible to answer the question raised in Rovinskii, Burtseva et al. (1982) as to whether there are upward trends in the concentrations of air pollutants. In general, it can be stated that the world data studied in Rovinskii, Burtseva et al. (1982) and Rovinskii, Egorov et al. (1982) reveal very great variability in space and time. In Rovinskii and Buyanova (1982) it is suggested that different types of background should be distinguished: global, hemispheric, continental and regional. It is suggested that the minimal mean values for different time intervals should be used for estimates of normal (background) concentrations. This idea conveys implicitly the concept of the nature of background air pollution. As a matter of fact, the proposed types of data-averaging, allowing for data smoothing over given time-periods and given spatial areas, and eliminating the effects of concentration increases in local zones and during short time intervals, enable one to derive an integrated picture of the background setting. The construction of such a picture requires local measurements continuously conducted over a long time. In Rovinskii and Buyanova (1982) stress is laid on the particular importance of regional background investigations for different regions taken together. The regional background regularities are, seemingly, the only predictors of regional, continental and global long-term behavior of pollution concentrations.

Air-pollution background monitoring stations have been established in the USSR and in many other countries within biosphere reserves, and also in localities not subjected to the influence of any apparent sources of pollution. These programs involve measurements of air-pollutant concentrations. Since 1976 such aerometric data have been accumulated in the USSR, which makes it possible
to estimate background concentration levels for particular regions, to analyse the data for different regions and for the world as a whole, to study the principles governing the formation of different concentration levels, and to obtain estimates of normal air pollution concentrations over continents (Burtseva, Lapenko et al. 1982; Burtseva, Volosneva et al. 1982; Pastukhov et al. 1982). Annual data publications have begun (see for example, Bulletin of background pollution of the natural environment in the region of East-European Members-Countries of CMEA, 1982, 1983).

The data on heavy metal concentrations in the area of the "Borovoe" station are discussed in Burtseva, Lapenko
et al. (1982). In the case of lead, the lower limit of measurement error was found to be 0.5 ng/m3, the coefficient of variation not exceeding 20%. According to the data presented in Burtseva, Volosneva et al. (1982), lead concentration measurements at background monitoring stations are performed within an accuracy of about 10%. The data represent daily mean concentrations in the lower atmosphere. Analysis of the histograms of daily mean values for lead concentrations measured over a four-year period, 1977-1980, shows a strong asymmetry in the frequency distribution, with a pronounced concentration maximum in the lower (left) quartile and a long "tail" in the upper (right) quartile. Burtseva, Lapenko et al. (1982) used the histograms for simple statistical inferences on the possibility of obtaining relatively stable estimates of lead concentration levels, the interval containing the major maxima in the frequency distribution being chosen. For the samples in Burtseva, Lapenko et al. (1982), such an interval included 65-85% of the observations. The upper limit of the interval was taken as the upper estimate of the background concentration level; thus, according to the authors' estimates, the background concentration level in the atmosphere for lead in the area of the "Borovoe" station is between 0.5 and 30 ng/m3. For the four years studied, no clearly evident time changes in the concentration distributions occurred; during 230-310 days per year, the concentrations varied within the limits typical of normally pure continental areas.
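The interval-based estimate described above can be sketched as follows. The source does not state how the interval around the major maximum was selected; this sketch assumes, purely for illustration, the shortest interval containing a fixed fraction of the daily values (here 75%, within the reported 65-85%), and it uses synthetic lognormal data rather than station measurements. The function name `shortest_interval` is hypothetical.

```python
import numpy as np


def shortest_interval(values, fraction=0.75):
    """Shortest interval (lo, hi) containing at least `fraction` of the observations."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    k = int(np.ceil(fraction * n))        # number of points the interval must hold
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k consecutive points
    i = int(np.argmin(widths))            # narrowest window
    return x[i], x[i + k - 1]


rng = np.random.default_rng(1)
# Synthetic daily means (ng/m3), right-skewed like the histograms described above.
daily = rng.lognormal(mean=2.3, sigma=0.8, size=365)

lo, hi = shortest_interval(daily, fraction=0.75)
inside = float(np.mean((daily >= lo) & (daily <= hi)))
print(f"interval: {lo:.1f} to {hi:.1f} ng/m3, holding {inside:.0%} of the days")
```

For a right-skewed sample such an interval sits at the low-concentration end of the range, so its upper limit plays the role of the upper estimate of the background level; the long "tail" of high values is left outside.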
The proposed method for estimation of the background concentration level has a number of shortcomings. One of these is that the method does not explain the behavior of the concentrations in the frequency distribution. For instance, in Burtseva, Lapenko et al. (1982), the authors could not offer a plausible explanation for the increase in the frequency of lead concentrations in the interval of 30-60 ng/m3 in 1979, or the presence of arsenic concentrations in the interval of 3-6 ng/m3 for 30% of the observations in 1980 (the arsenic background level being defined at 1-3 ng/m3). Analysis of the various possible effects of meteorological and other conditions on concentration variations fails to explain the observed events (Burtseva, Lapenko et al. 1982). Analysis of background monitoring data for sulfur dioxide was performed in Pastukhov et al. (1982), the average monthly concentrations varying between 0.3 and 18.9 µg/m3 during the period of investigations, from 1977 to 1981. The highest values were recorded during the winter and the lowest during the summer, which is a general result found also in data from the Repetek and Berezin B.Z. background monitoring stations. The annual cycle is associated with two factors: the considerable increase in anthropogenic emissions from fuel-burning during the cold periods of the year, on the one hand, and the drop in the rate of oxidation of sulfur dioxide, on the other hand. Analysis of the monthly concentrations of sulfur dioxide, separately performed for the warm and cold seasons, made it possible for the authors (Pastukhov et al., 1982) to
estimate the sulfur dioxide concentration level in the area of the "Borovoe" station at 0.5-1.0 µg/m3 for the warm period and at 3.2-13.7 µg/m3 for the cold period.

Similar analysis of the average monthly values at the "Berezin B.Z." and "Repetek B.Z." background monitoring stations gives the values 1.0-2.4. 10 µg/m3 for the first and 0.3, 1.0 for the second. Analysis of meteorological conditions and trajectories indicated that the extreme concentration values cannot be unambiguously correlated with the vector wind directions in the "Borovoe" station area. The derived estimates for different observational areas are incommensurate, and doubt arises concerning their possible use in estimating characteristics of continental and global background concentration levels.

In Szepesi (1982) estimates are presented of air pollution characteristics, plotted on different scales. It is suggested that the horizontal extent of the districts should be determined from two meteorological considerations: the lower measure is specified by the distance within which the background level is determined by the mixing processes in
the atmospheric boundary layer, whereas the upper one is specified by the relative extent of the fetch over which the meteorological parameters remain constant. By such estimates, boundaries were defined (Szepesi, 1982) that delimit the area of action of the estimates of the regional air pollution level; under the assumption of regional uniformity, they should operate over a radius of between 20 and 300 km. Notwithstanding the rough nature of these estimates, it is possible to formulate the problem of determining the magnitudes of background concentration levels by means of comparison of data from several stations. In Szepesi and Fakete (1987) it is assumed that continental and global air pollution background concentration levels are subject to the influence of processes occurring over thousands and tens of thousands of kilometers. The mutual influence of such processes leads to an intricate picture of formation of pollutant concentration levels. If the station network covering the continent is sufficiently dense, it might be possible to define differences in background air pollution levels, to distinguish zones where the factors affecting the formation of different concentration levels are uniform, and to determine a certain integral characteristic describing the mean background level of pollutants for the entire continent. It might be interesting to compare such mean levels, derived daily at many stations for different time-periods, in order to check the hypothesis postulating that lengthy periods occur when the background level does not undergo changes across the continent as a whole, although daily variations are registered in the aerometric data for each station. This hypothesis underlies the assumption of the existence of a continental and a global background value. In a somewhat different formulation, this hypothesis can be found in Izrael (1984).

In Augustinyak and Sventz (1982) approximate estimates are given for the number of observing stations that permit one
to plot the area describing the behavior of the pollutants through time within a certain territory. When the linear law is used, the minimal number of measurement points is 9; for the square law, 18; for the cubic law, 30; etc. At present, the density of background monitoring is not sufficient to apply such models.

3.2. Statistical Analysis of Background Monitoring Data
The data to be used are from three background monitoring stations: Borovoe, Berezin biosphere reserve, and Repetek biosphere reserve in the USSR. Descriptions of the data are given in bulletins (Bulletin of background pollution of the natural environment in the region of East-European Members-Countries of CMEA, 1982 and 1983). The techniques used to derive the data and a discussion of their reliability can be found in Burtseva, Lapenko et al. (1982), Burtseva, Volosneva et al. (1982) and Pastukhov et al. (1982).

In the present study, three pollutants have been selected: sulfur dioxide, lead, and total suspended particulates, for which daily observations were available during 1976-83 at the Borovoe station and 1980-83 at the Berezin and Repetek stations. The three pollutants differ in their physical-chemical behavior, and the stations are located in different physical-geographical areas. A joint analysis of the sampled data with a view to finding common statistical characteristics can enable one to define some common principles governing the behavior of air pollutants, and can provide a basis for designing techniques for evaluation of background pollutant concentration levels on a wide scale, both in space and time.

The first stage of statistical data analysis should be the construction of the statistical data model. Then, the statistical characteristics describing the data series can be investigated, and their applicability for obtaining non-statistical conclusions can be explored. Techniques for designing statistical models and the
use of the statistical information in hydrometeorological and geophysical applications are described in Aivazyan et al. (1983), Gruza and Reitenbakh (1982) and Kleiner and Gradel (1980). In Aivazyan et al. (1983), some general techniques used in designing statistical models are presented. In practice two different methods of analysis are used: mathematical, relying on theoretical-probabilistic considerations, and computational, by way of direct reproduction of the model function on a PC. The first method calls for hypotheses and a priori assumptions concerning the data that should serve to validate the choice of model; the second requires some preliminary formalized knowledge of the data that could be reflected in algorithmic form and could be used to develop or refine the theoretical-probabilistic method. In the present study, both of these mutually complementary methods are employed: the first stage, presumably, should involve the development of certain general theoretical-probabilistic concepts of the model.

In Burtseva, Lapenko et al. (1982) several histograms were examined that describe the heavy metal frequency distribution at the Borovoe background monitoring station. These histograms exhibit a lognormal distribution, with the mode shifted to the left and a long "tail" at the right. Histograms of this type can be perceived in the distribution of all three pollutants, sampled for statistical analysis at all stations and for any period. It is therefore possible already to utilize the logarithmic form in the analysis and for checking the hypothesis of a lognormal distribution.

In Figures 3.1, 3.2, 3.3, plots are shown that characterize the lead concentration distributions
utilize t h e logarithmic form in t h e analysis and f o r checking t h e hypothesis of a lognormal distribution.In Figures 3.1, 3.2, 3.3, plots are shown t h a t c h a r a c t e r i z e t h e lead concentra- tion distributions
at
t h e "Borovoe" station during t h e four-year period of observa- tions. Because much of t h e subsequent analysia i s based on studies of t h e s e plots, w e shall dwell upon them. These plots p o r t r a y graphically t h e empirical density and cumulative distribution functions (3.1 and 3.2), and depict the deviation of t h e empirical density function from t h e theoretical one (3.3). Methods f o r graphical assessment of t h e distribution parametersare
discussed in Mmzewski and Sowa (1978-1979) and problems bearing on graphical estimatesare
treated also in Rnbin (1976), Aivazyanet
al. (1983) and o t h e r publications. Kleiner and Grade1 (1980), note t h a t t h euse
of graphical methods is generally t y p i d of !&atistical an?4sis of geographical data. Those a u t h o r s consider t h a t t h e reason is t h a t geophysical data usually involve daily, seasonal. annual and i n t e r - ~ ~ o a l variations, apart from o t h e r more pronounced effects, c h a r a c t e r i s t i c of short-time intervals, and, inasmuch as t h e major objective of t h e s e methods fsto
illuminate these relation- ships and s t r u c t u r e s , representation of t h e d a t a in t h em o s t
recognizable form becomes particularly important. For evaluation of the d e g r e e of agreement of t h e d a b with t h e chosen LN2 distribution. various methods can b e used. Methods of evaluation, in particular f o r t h e lognormal distribution are discussed in Rovinskii and Cherkhanov (1982) while in Selvin (1976) and Gnanadesican and Kettering (1972), s e v e r a l methods are examined f o r numerical estimation of the model distri- bution under conditions of different types of d a t aerrors.
Many of t h e methodsN O S M ~ L P L O ~ O F - V A R I A B L E c r a
SYMBOL COUNT M E L N S T I D E V
1 1 7 8 7 9 8 0 8 1 B 1 0 3 9 2 , 8 5 4 0 t h 7 8
, * , . . , + . ~ , ~ * , . ~ . * . , , , + ~ . , , + , l l , * l , l . * l , ~ , + , , . l + , , , , * l ~ . * * ~ ~ ~ , + l ~ ~ l * 1 l t ~ + ~ f 1 ~ * ~ ~ 1 ~ * @
3 , 7 5
I 8 I
I B
, ,
b BB
I 0 0 I
2 , 1 5 8 8
:
I 8 8 I
n 8 8
9 8 0 i
8 8 8 B
1 , 5 0 0 b ;
I B B
I I B 0 1
I PBBB
.
8 8
, 7 5 0
:
~ i a +? 0 0 P
8 B I
I 8 0 I
0 8 8 I
0 , o o
:
OBBB II Y 8
I 8 0 b I
I 0 0 0 I
0 B B I
- , 7 5 0 ; B B B
I b b +
6 0 e
I 8 8 8 I
E B B 0
- 1 ,SO
:
BB 8 ;I R
I BBBBB I
, B BE B
- Z , L 5
:
R ;a B
8 , a
1 d
- s , o o + a
1 0
*
1
I
, I I
- 3 , ? >
:
1,*....*..,.+..!.*.,,,*,,.. + . . C , + . . . . * . . * . + , . , . + . . l a +
' ' v i : h J , , + , . I * b . f . ~ * * ! , * * . . l . ' ,
, 3 5 0 1 1 0 5 1 1 7 5 2 1 L I . 1 ' 1 5 ~ , S S s t 2 5
9 , C O , 7 0 0 1 9 C O 2 , l O 2 , 8 0 3.50 4 , Z O & , P O 5 , b O
Figure 3.2 Normal plot of cumulative logarithmic concentnttions of lead. Boro- voe station, 1978-81.
[Figure 3.3: printer plot titled "DEVIATIONS FROM NORMAL".]
Figure 3.3 Deviations from normal plot of logarithmic concentrations of lead, Borovoe station, 1978-1981.
discussed in these works are designed to derive numerical statistics that best describe the empirical distributions. Graphical qualitative evaluations of the distribution pattern are also used. In order to determine how much the observed distribution differs from a given theoretical distribution, various criteria of goodness of fit can be used. However, according to Kleiner and Gradel (1980), the numerical result derived from their use does not indicate in what places and for what reasons the observed distribution deviates from the model one. In the case of a normal distribution, there would be an exactly symmetrical bell-shaped curve in Figure 3.1, a straight line in Figure 3.2, and a very narrow spread in Figure 3.3.

Histograms are often constructed when the number of observations becomes large. The length of the interval is taken equal to

[Equation 3.1: expression not recoverable from the source]

where $z_{\max}$ and $z_{\min}$ are the maximal and minimal points on the logarithmic concentration scale for the given sample, and $N$ is the number of observations in the sample.

The distribution function is plotted on normal probability paper as distribution quantiles against the observed variable:

$$F(z_n) = \Phi^{-1}\!\left(\frac{3n - 1}{3N + 1}\right), \qquad (3.2)$$

where $n$ is the number of the variable $z_n$ in the variational series, arranged in ascending order. The value of the function $F(z_n)$ corresponds to the probability $(3n - 1)/(3N + 1)$ of the centered and normalized normal distribution

$$\Phi(t) = \int_{-\infty}^{t} N(z; 0, 1)\, dz,$$

where

$$N(z; 0, 1) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right).$$

Equation 3.4 represents Equation 3.2 with the linear trend removed:

[Equation 3.4: expression not recoverable from the source]

where the two parameters appearing in it denote the sample average and variance, respectively. This equation shows the deviation from the straight line specified by the estimates of these two parameters, and thereby gives a qualitative display of the degree of agreement between the event-data and a LN2 distribution, graphically revealing the nature of inconsistencies with the theoretical distribution.

Discussions of the problems concerned with plotting and evaluation of the distributions by employing graphs of this type can be found in Aivazyan et al. (1983) and Kleiner and Gradel (1980).
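The construction of the normal plot and of its detrended version can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program used by the authors: the plotting probabilities (3n - 1)/(3N + 1) are taken from the text, whereas the exact form of the detrended quantity (expected normal value minus the standardized observation) is an assumption, since Equation 3.4 is not legible in the source. The data are synthetic and the function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def normal_plot(log_conc):
    """Sorted values z_n and expected normal values for probabilities (3n - 1) / (3N + 1)."""
    z = np.sort(np.asarray(log_conc, dtype=float))
    N = z.size
    n = np.arange(1, N + 1)                  # rank in the ascending variational series
    p = (3 * n - 1) / (3 * N + 1)            # probability assigned to z_n (from the text)
    expected = np.array([NormalDist().inv_cdf(pi) for pi in p])
    return z, expected


def detrended(z, expected):
    """Assumed detrended form: expected normal value minus standardized observation."""
    mean, std = z.mean(), z.std(ddof=1)      # sample average and standard deviation
    return expected - (z - mean) / std


rng = np.random.default_rng(2)
# Synthetic logarithmic concentrations following a single normal law (i.e. LN2 data).
log_conc = rng.normal(loc=2.85, scale=0.7, size=1000)

z, expected = normal_plot(log_conc)
dev = detrended(z, expected)
print(f"mean absolute deviation from the straight line: {np.mean(np.abs(dev)):.3f}")
```

For data that really follow LN2 the deviations scatter narrowly around zero; systematic curvature or steps in `dev` would indicate where the empirical distribution departs from the model.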
As can be seen from Figures 3.1, 3.2, 3.3, the empirical density and distribution functions, as expected, differ from the theoretical ones. The question as to how