On the Distribution of the Hurst Range of Independent Normal Summands


ON THE DISTRIBUTION OF THE HURST RANGE OF INDEPENDENT NORMAL SUMMANDS

A.A. Anis* and E.H. Lloyd**

RR-77-16 July 1977

This work was carried out at the University of Lancaster under the auspices of the Science Research Council, London, and at IIASA.

*Ain Shams University, Cairo, Egypt.

**University of Lancaster, UK.

Research Reports provide the formal record of research conducted by the International Institute for Applied Systems Analysis. They are carefully reviewed before publication and represent, in the Institute's best judgment, competent scientific work. Views or opinions expressed therein, however, do not necessarily reflect those of the National Member Organizations supporting the Institute or of the Institute itself.

International Institute for Applied Systems Analysis

A-2361 Laxenburg, Austria

PREFACE

As new techniques evolve for the use of water and as human needs increase, the management of water resources becomes a task of growing importance.

From 1974 to 1976, stochastic reservoir theory was one of the major research fields of the IIASA Water Project (now the Water Group of the Resources and Environment Area). A number of research reports were published on different aspects of this problem. The present report by two leading authorities on stochastic reservoir theory deals with the Hurst range, which in a hydrological context is of special relevance to reservoir design and the simulation of river flows. It is also relevant to the analysis of the structure of stochastic processes and time series.

SUMMARY

The well-known empirical findings of H.E. Hurst have inspired a vast quantity of research on the ability of various mathematical models of geophysical time series to reproduce the Hurst effect. Anis and Lloyd have earlier obtained an exact result for the expected value of the Hurst range when the inflow process is a sequence of independent and identically distributed normal variables. The research described in the present report is an attempt to extend this so as to obtain an explicit formula for the probability distribution of the Hurst range for this process.

A general method is developed, and explicit results are obtained for the cases $n \le 4$ (n denoting the number of summands considered). These distributions are of an unusual form, the probability density functions showing discontinuities and corners. Graphs are provided showing the quantitative behavior of these functions.

It is surmised that the problem of deriving exact explicit formulae for the distribution of Hurst's range for general values of n is of unmanageable complexity.

On the Distribution of the Hurst Range of Independent Normal Summands

INTRODUCTION

Hurst's range is that of accumulated sums of deviations from the mean of a set of identically distributed random variables, expressed in units of the sample standard deviation of the variables in question [1]. The concept of this range first arose in a hydrological context where it has special relevance to reservoir design and to the simulation of river flows. It is also relevant to the analysis of the structure of stochastic processes and time series.

While the Hurst range has been intensively studied using simulatory methods, the analytical theory has so far proved somewhat intractable. Progress toward an understanding of the distribution of the Hurst range of autocorrelated and not necessarily normal random variables has so far advanced no further than the expected value for a set of mutually independent normal variables (or, further, that of symmetrically correlated normal variables [2]; but since symmetrical correlation is not a physically possible form of geophysical autocorrelation, this generalization does not signify much progress). The expected values are discussed in this report.

A general solution to the distribution problem would have practical and theoretical value, for example with regard to the design of tests of significance of some river flow models.

The present work is an attempt to move toward a solution to this distribution problem. Restricting ourselves to mutually independent normal variables, we provide an appropriate formulation of the Hurst range for a set of n such variables, and obtain some general results relative to the distribution of this range. We summarize our earlier work on the expected value of the Hurst range, and develop a method for obtaining the required probability distribution. We have applied this method to obtain explicit formulae for the probability distribution for the cases where $n \le 4$. It turns out that the density function is of a complicated and interesting type, exhibiting continuity intervals, corners, and discontinuities.

DEFINITION OF THE HURST RANGE

Let $X_1, X_2, \ldots, X_n$ denote a set of mutually independent random variables, with a common normal distribution, which we may without loss of generality take as standardized (expectation = 0, variance = 1). Let

$$\bar{X} = \sum_{r=1}^{n} X_r / n \tag{1}$$

denote the sample mean, and

$$D_n = \Big\{ \sum_{r=1}^{n} (X_r - \bar{X})^2 / n \Big\}^{1/2} \tag{2}$$

the sample standard deviation.

Hurst's range relates to the cumulative sums of the "adjusted variables" $X_r - \bar{X}$, $r = 1, 2, \ldots, n$, scaled by the divisor $D_n$. Thus we require the "partial sums" of the scaled adjusted variables

$$Y_r = (X_r - \bar{X})/D_n, \quad r = 1, 2, \ldots, n, \tag{3}$$

namely

$$S_r = Y_1 + Y_2 + \cdots + Y_r, \quad r = 1, 2, \ldots, n-1, \tag{4}$$

and

$$S_n = 0.$$

Note: an unambiguous notation would require the replacement of our $\bar{X}$ by $\bar{X}_n$, and our $S_r$ by ${}_nS_r$. It is hoped that no confusion will arise from our simplification. Let

$$M_n = \max(S_1, S_2, \ldots, S_{n-1}, 0) \tag{5}$$

denote the largest of the nonnegative partial sums, and

$$L_n = \min(S_1, S_2, \ldots, S_{n-1}, 0) \tag{6}$$

the numerically largest of the nonpositive values, so that $M_n \ge 0$ and $L_n \le 0$.

The Hurst range is then

$$W_n = M_n - L_n, \quad n = 2, 3, \ldots \tag{7}$$

($W_1$ being undefined).
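The definition above can be sketched in a few lines of Python. This is only an illustration under the stated conventions (sample standard deviation with divisor n, and the value 0 included among the partial sums); the function name `hurst_range` is a hypothetical label, not notation from the report.

```python
import math

def hurst_range(xs):
    """Return W_n = M_n - L_n for a sample xs, using the divisor-n standard deviation."""
    n = len(xs)
    if n < 2:
        raise ValueError("W_n is undefined for n < 2")
    mean = sum(xs) / n
    d_n = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if d_n == 0.0:
        raise ValueError("D_n = 0: all observations are equal")
    partial_sums = [0.0]            # S_n = 0 is always part of the set
    running = 0.0
    for x in xs[:-1]:               # S_1, ..., S_{n-1}
        running += (x - mean) / d_n
        partial_sums.append(running)
    return max(partial_sums) - min(partial_sums)
```

For two observations the function returns 1 whatever the data, which is the constant value of $W_2$ discussed later in the report.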

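We note at this point that, in addition to the relation

$$\sum_{r=1}^{n} Y_r = 0 \tag{8}$$

the $Y_r$ also satisfy the quadratic relation

$$\sum_{r=1}^{n} Y_r^2 = n. \tag{9}$$

We may eliminate $Y_n$ between these two, resulting in

$$\sum_{r=1}^{n-1} Y_r^2 + \Big( \sum_{r=1}^{n-1} Y_r \Big)^2 = n. \tag{10}$$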

The $Y_r$ are not mutually independent, but they are identically distributed and their joint distribution is such that

$$\text{distr}\,(Y_1, Y_2, \ldots, Y_n) = \text{distr}\,(Y_\alpha, Y_\beta, \ldots, Y_\nu) \tag{11}$$

for every permutation $(\alpha, \beta, \ldots, \nu)$ of the ordered set $(1, 2, \ldots, n)$.

Such variables are called exchangeable. This exchangeability is a consequence of the fact that the $Y_r$ are defined in terms of the $X_s$, which are mutually independent and identically distributed and therefore exchangeable, and of $\bar{X}$ and $D_n$, which are symmetric functions of the $X_j$ whose values are unaffected by any rearrangement of the order of the $X_j$.

These considerations have simplifying consequences in studying the joint distribution of the $S_r$.
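As a quick numerical illustration of the linear and quadratic relations among the scaled adjusted variables (sum equal to zero, sum of squares equal to n, and the form obtained after eliminating the last variable), the following sketch evaluates them for an arbitrary sample. The sample values and the helper name `scaled_adjusted` are assumptions made only for this example.

```python
import math

def scaled_adjusted(xs):
    """Return Y_r = (X_r - mean) / D_n with the divisor-n standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    d_n = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / d_n for x in xs]

ys = scaled_adjusted([0.4, -1.3, 2.2, 0.1, -0.6])   # arbitrary illustrative sample
n = len(ys)
linear = sum(ys)                                     # should vanish
quadratic = sum(y * y for y in ys)                   # should equal n
head = ys[:-1]                                       # Y_1, ..., Y_{n-1}
eliminated = sum(y * y for y in head) + sum(head) ** 2   # should also equal n
```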

Further, since $Y_1 + \cdots + Y_n = 0$, it follows that

$$S_{n-1} = -Y_n$$

but

$$\text{distr}\,(-Y_n) = \text{distr}\,(Y_1)$$

whence

$$\text{distr}\,(S_1) = \text{distr}\,(S_{n-1}).$$

Similarly

$$\text{distr}\,(S_r) = \text{distr}\,(S_{n-r}), \quad r = 1, 2, \ldots, n-1.$$

THE EXPECTED VALUE OF THE HURST RANGE

The expected value of the Hurst range $E(W_n)$ has been shown by Anis and Lloyd [2] to be

$$E(W_n) = \frac{\Gamma\{\tfrac12 (n-1)\}}{\sqrt{\pi}\,\Gamma(\tfrac12 n)} \sum_{r=1}^{n-1} \sqrt{\frac{n-r}{r}}. \tag{12}$$
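A minimal sketch for evaluating the expected value numerically, assuming the closed form usually quoted from reference [2], $E(W_n) = \Gamma\{\tfrac12(n-1)\} \big/ \{\sqrt{\pi}\,\Gamma(\tfrac12 n)\} \cdot \sum_{r=1}^{n-1} \sqrt{(n-r)/r}$. The function name is hypothetical.

```python
import math

def expected_hurst_range(n):
    """E(W_n) for independent normal summands, under the assumed closed form from [2]."""
    if n < 2:
        raise ValueError("W_n is undefined for n < 2")
    coeff = math.gamma((n - 1) / 2) / (math.sqrt(math.pi) * math.gamma(n / 2))
    return coeff * sum(math.sqrt((n - r) / r) for r in range(1, n))
```

Under this assumption the function gives 1 for n = 2, $3\sqrt2/\pi$ for n = 3 and $\tfrac12(1 + 4/\sqrt3)$ for n = 4, the values against which the explicit distributions in the later sections are checked.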

THE DISTRIBUTION OF THE HURST RANGE

The most promising direct procedure would seem to be to obtain the joint distribution of the scaled partial sums $S_r$, and then use the usual theory of the order statistics of correlated variables to obtain the range $W_n$. It is more convenient to work in terms of the $Y_r$, since these are exchangeable, where

$$Y_r = (X_r - \bar{X})/D_n, \quad r = 1, 2, \ldots, n, \tag{13}$$

and $D_n$ is defined in (2).

All the $Y_r$ can be expressed in terms of any subset of size n-2 of them, since

$$\sum_{r=1}^{n} Y_r = 0 \tag{14}$$

and

$$\sum_{r=1}^{n} Y_r^2 = n.$$

We shall in fact retain n-1 of the Y's, say $Y_1, Y_2, \ldots, Y_{n-1}$, eliminating $Y_n$ by using (14) from which

$$Y_n = -(Y_1 + Y_2 + \cdots + Y_{n-1});$$

the n-1 retained variables are then subject to the restraint

$$\sum_{r=1}^{n-1} Y_r^2 + \Big( \sum_{r=1}^{n-1} Y_r \Big)^2 = n.$$

This may be expressed concisely in terms of the vector

$$\eta = (Y_1, Y_2, \ldots, Y_{n-1})' \quad \text{(of order n-1)},$$

in the form

$$\eta'\eta + (\mathbf{1}'\eta)^2 = \eta' A \eta = n \tag{16}$$

where

$$A = I_{n-1} + \mathbf{1}\mathbf{1}' \tag{17}$$

is a symmetric matrix of order (n-1), $\mathbf{1}$ denoting the column vector of n-1 units. The vector $\eta$ may then be regarded as the position vector, in (n-1)-space, of an appropriately distributed random point which lies on the surface of the ellipsoid (16).

This attractive geometrical picture can be improved by a further transformation, which will send the ellipsoid into a sphere. Accordingly, we introduce new random variables $Z_1, Z_2, \ldots, Z_{n-1}$, in the form of a vector

$$\zeta = (Z_1, Z_2, \ldots, Z_{n-1})'$$

by means of a transformation

$$\eta = B\zeta \tag{18}$$

such that (16) becomes

$$\zeta'\zeta = 1. \tag{19}$$

Now, by (16) and (18), we have

$$\eta' A \eta = \zeta' B' A B \zeta = n$$

whence the requirement of (19), namely that $\zeta'\zeta = 1$, can be satisfied by choosing the matrix B such that

$$B'AB = nI, \quad \text{that is,} \quad BB' = nA^{-1}.$$

Since A (of order n-1) has the simple form (17), its inverse $A^{-1}$ is

$$A^{-1} = I_{n-1} - \frac{1}{n}\,\mathbf{1}\mathbf{1}'$$

and we may take B to be the unique lower triangular matrix with positive diagonal elements such that

$$BB' = nA^{-1}.$$

Then

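This "triangular resolution" is a standard matrix procedure, and the explicit form for B is readily found. For example, when n = 3, we have

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

and

$$3A^{-1} = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$

whence

$$B = \begin{pmatrix} \sqrt2 & 0 \\ -1/\sqrt2 & \sqrt{3/2} \end{pmatrix}. \tag{20}$$

Similarly when n = 4 we have

$$A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$$

and

$$4A^{-1} = \begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -1 & -1 & 3 \end{pmatrix}$$

so that

$$B = \begin{pmatrix} \sqrt3 & 0 & 0 \\ -1/\sqrt3 & \sqrt{8/3} & 0 \\ -1/\sqrt3 & -\sqrt{2/3} & \sqrt2 \end{pmatrix}.$$

For a general value of n we find

$$B = (b_{ij}), \quad b_{ij} = 0 \ \text{for} \ i < j, \tag{22}$$

where

$$b_{jj} = \sqrt{\frac{n(n-j)}{n-j+1}}, \quad j = 1, 2, \ldots, n-1,$$

and

$$b_{ij} = -\sqrt{\frac{n}{(n-j)(n-j+1)}}, \quad i > j.$$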
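The triangular resolution can be reproduced numerically with an ordinary Cholesky factorization. The sketch below assumes that the matrix to be factorized is $nA^{-1} = nI_{n-1} - \mathbf{1}\mathbf{1}'$ (which follows if $A = I_{n-1} + \mathbf{1}\mathbf{1}'$); the function name is hypothetical and the routine is a plain textbook Cholesky, not a procedure taken from the report.

```python
import math

def triangular_resolution(n):
    """Lower triangular B with positive diagonal such that B B' = n I - 1 1' (order n-1)."""
    m = n - 1
    target = [[(n - 1.0) if i == j else -1.0 for j in range(m)] for i in range(m)]
    b = [[0.0] * m for _ in range(m)]
    for j in range(m):
        # Diagonal element: what remains of target[j][j] after earlier columns.
        diag = target[j][j] - sum(b[j][k] ** 2 for k in range(j))
        b[j][j] = math.sqrt(diag)
        for i in range(j + 1, m):
            # Sub-diagonal elements of column j.
            off = target[i][j] - sum(b[i][k] * b[j][k] for k in range(j))
            b[i][j] = off / b[j][j]
    return b
```

For n = 3 this returns the rows $(\sqrt2, 0)$ and $(-1/\sqrt2, \sqrt{3/2})$, and for n = 4 the last row is $(-1/\sqrt3, -\sqrt{2/3}, \sqrt2)$.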

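Since the n-1 variables $Z_1, Z_2, \ldots, Z_{n-1}$ are constrained to lie on the (n-1)-sphere

$$\zeta'\zeta = 1$$

a sensible way of expressing their joint pdf (probability density function) is to give the pdf of any subset of them of size n-2, say the subset $Z_1, Z_2, \ldots, Z_{n-2}$. It turns out that this is

$$\phi(z_1, \ldots, z_{n-2}) = K_n \Big( 1 - \sum_{j=1}^{n-2} z_j^2 \Big)^{-1/2}, \quad \sum_{j=1}^{n-2} z_j^2 < 1, \tag{25}$$

where

$$K_n = \Gamma\{\tfrac12 (n-1)\} \big/ \pi^{(n-1)/2}.$$

The proof will be deferred (see Appendix).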
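To make the spherical transformation concrete, the sketch below maps a sample to the vector of spherical variables by solving $\eta = B\zeta$ with forward substitution, and lets one confirm that the result has unit length. It assumes the closed-form lower triangular B with $b_{jj} = \sqrt{n(n-j)/(n-j+1)}$ and $b_{ij} = -\sqrt{n/\{(n-j)(n-j+1)\}}$ for $i > j$; both function names are hypothetical.

```python
import math

def transform_matrix(n):
    """Assumed closed-form lower triangular B with B B' = n I - 1 1' (order n-1)."""
    m = n - 1
    b = [[0.0] * m for _ in range(m)]
    for j in range(1, m + 1):                       # 1-based column index
        b[j - 1][j - 1] = math.sqrt(n * (n - j) / (n - j + 1))
        for i in range(j + 1, m + 1):
            b[i - 1][j - 1] = -math.sqrt(n / ((n - j) * (n - j + 1)))
    return b

def spherical_variables(xs):
    """Return zeta = (Z_1, ..., Z_{n-1}) solving eta = B zeta by forward substitution."""
    n = len(xs)
    mean = sum(xs) / n
    d_n = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    eta = [(x - mean) / d_n for x in xs[:-1]]       # Y_1, ..., Y_{n-1}
    b = transform_matrix(n)
    zeta = []
    for i, y in enumerate(eta):
        acc = y - sum(b[i][k] * zeta[k] for k in range(i))
        zeta.append(acc / b[i][i])
    return zeta
```

Whatever the (non-degenerate) sample, the squared components of the returned vector should sum to 1.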

Returning now to the main problem, we have

$$W_n = (\max - \min)\,(S_1, S_2, \ldots, S_{n-1}, 0)$$

$$= (\max - \min)\,(Y_1,\ Y_1 + Y_2,\ \ldots,\ Y_1 + \cdots + Y_{n-1},\ 0)$$

$$= (\max - \min)\,(b_1'\zeta,\ b_2'\zeta,\ \ldots,\ b_{n-1}'\zeta,\ 0)$$

where, for $r = 1, 2, \ldots, n-1$, we have expressed $Y_1 + \cdots + Y_r$ as a linear function $b_r'\zeta$ of the $Z_i$ by means of the transformation (18), with B as in (21), where the point $\zeta$ lies on the sphere $\zeta'\zeta = 1$, and the distribution of $\zeta$ is given by (25).

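Then the pdf $g_n(w)$ of $W_n$ is obtained by using the following argument:

$$g_n(w) = \sum_{i \ne j} \pi_{ij}(w), \quad i, j = 1, 2, \ldots, n, \tag{27}$$

where

$$\pi_{ij}(w)\,dw = P\{\max \text{ is } S_i,\ \min \text{ is } S_j \text{ and } S_i - S_j \in dw\}.$$

Here "max" means "$\max(S_1, S_2, \ldots, S_{n-1}, 0)$", and similarly for "min". The abbreviation "$S_i - S_j \in dw$" means "$w \le S_i - S_j < w + dw$". Thus

$$\pi_{ij}(w) = P\{\max \text{ is } S_i,\ \min \text{ is } S_j \mid S_i - S_j = w\}\, f_{ij}(w) \tag{28}$$

where $f_{ij}(w)$ is the pdf of $S_i - S_j$ at w.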

In evaluating (28) we express the $S_r$ in terms of $\zeta$ and use the distribution (25) of that vector. There are n(n-1) terms in the sum (27) but not all are distinct. In the first place we have the obvious symmetry relations

$$\pi_{ij}(w) = \pi_{ji}(w)$$

which we abbreviate to

$$\pi_{ij} = \pi_{ji}. \tag{29}$$

In addition we have

$$\pi_{12} = \pi_{23} = \pi_{34} = \cdots$$

and so on, or, in general

$$\pi_{i,j} = \pi_{i+k,\,j+k} \tag{30}$$

where addition in the subscripts is modulo n. This follows from the symmetry of the $S_r$ and the exchangeability of the $Y_r$.

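EXPLICIT EVALUATION OF Distr (Wn): n = 3

We now apply the foregoing to evaluate distr $(W_n)$ in the cases n = 2, 3, 4. The case n = 2 is trivial since, in view of the restraints (8) and (9), namely

$$Y_1 + Y_2 = 0, \quad Y_1^2 + Y_2^2 = 2,$$

the Hurst range must be a constant. In fact, it is easy to see from first principles that $W_2$ is identically equal to unity; for

$$\bar{X} = \tfrac12 (X_1 + X_2), \quad D_2 = \tfrac12 |X_1 - X_2|.$$

Then

$$M_2 = \max(X_1 - \bar{X},\, 0)/D_2 = \begin{cases} 1, & \text{if } X_1 > X_2 \\ 0, & \text{otherwise.} \end{cases}$$

Similarly

$$L_2 = \min(X_1 - \bar{X},\, 0)/D_2 = \begin{cases} 0, & \text{if } X_1 > X_2 \\ -1, & \text{otherwise.} \end{cases}$$

Then

$$W_2 = M_2 - L_2 = \begin{cases} 1 - 0, & \text{if } X_1 > X_2 \\ 0 - (-1), & \text{otherwise} \end{cases} \;=\; 1, \ \text{in either case.}$$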

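The case of n = 3 is less trivial but still fairly simple.

From (27) we have

$$g_3(w) = \pi_{12}(w) + \pi_{21}(w) + \pi_{13}(w) + \pi_{31}(w) + \pi_{23}(w) + \pi_{32}(w).$$

On using the symmetry relations (29) and (30) it will be seen that all six terms have a common value, which it is convenient to take as $\pi_{13}$. Hence

$$g_3(w) = 6\,\pi_{13}(w). \tag{31}$$

For n = 3, the transforming matrix B is given by (20), whence

$$Y_1 = Z_1\sqrt2, \quad Y_2 = -Z_1/\sqrt2 + Z_2\sqrt{3/2},$$

with

$$Z_1^2 + Z_2^2 = 1.$$

The pdf of $Z_1$ at z is given by (25) as

$$g(z) = 1 \big/ \{\pi\sqrt{1 - z^2}\}, \quad z^2 < 1. \tag{32}$$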

To evaluate $\pi_{13}(w)$ we note that

$$\pi_{13}(w) = P\{\max \text{ is } S_1,\ \min \text{ is } S_3\,(= 0) \mid S_1 = w\}\, f_{13}(w)$$

$$= P\{\max \text{ is } Y_1,\ \min \text{ is } 0 \mid Y_1 = w\}\, f_{13}(w). \tag{33}$$

The factor $f_{13}(w)$ is the pdf at w of $S_1 - S_3 = S_1 = Y_1 = Z_1\sqrt2$. The pdf of $Z_1$ at z is given by (32); thus the pdf of $Z_1\sqrt2$ at w is

$$f_{13}(w) = 1 \big/ \{\pi\sqrt{2 - w^2}\}, \quad w^2 < 2. \tag{34}$$

The other factor in (33) is

$$P\{0 < Y_1 + Y_2 < Y_1 \mid Y_1 = w\} \tag{35}$$

since the proposition "max is $S_1$, min is $S_3$ (= 0)" is equivalent to

$$0 < S_2 < S_1$$

that is, to

$$0 < Y_1 + Y_2 < Y_1.$$

The expression (35) may be developed as

$$P\{0 < Z_1/\sqrt2 + Z_2\sqrt{3/2} < Z_1\sqrt2 \mid Z_1 = w/\sqrt2\} = P\{-\tfrac13\sqrt{3/2}\,w < Z_2 < \tfrac13\sqrt{3/2}\,w \mid Z_1 = w/\sqrt2\} \tag{36}$$

where $Z_1^2 + Z_2^2 = 1$. In geometrical terms, we have to consider the unit circle $Z_1^2 + Z_2^2 = 1$ and, for various values of w, the probability that for a point $(Z_1, Z_2)$ on this circle, the coordinate $Z_2$ will lie between $\tfrac13\sqrt{3/2}\,w$ and $-\tfrac13\sqrt{3/2}\,w$ when the other coordinate $Z_1$ is defined by $Z_1 = w/\sqrt2$ (see Figure 1).

Clearly the value $Z_1 = w/\sqrt2$ defines two values of $Z_2$, say $Z_2 = a(w)$ and $Z_2 = -a(w)$, where $\{a(w), w/\sqrt2\}$ is a point on the unit circle, so that $a(w) = +\sqrt{1 - \tfrac12 w^2}$.

When

$$a(w) < \tfrac13\sqrt{3/2}\,w \tag{37}$$

the required conditional probability (36) is equal to 1, and in all other cases it is equal to zero. Now (37) is satisfied if and only if

$$1 - \tfrac12 w^2 < \tfrac16 w^2$$

that is

$$w > \sqrt{3/2}. \tag{38}$$

(It is because of this geometrical interpretation that we refer to (38), and its analogues for general values of n, as the "geometrical factor".)

Combining these results we finally obtain from (31) and (33) the following expression for the pdf $g_3(w)$ of the Hurst range for n = 3:

$$g_3(w) = \begin{cases} 6 \big/ \{\pi\sqrt{2 - w^2}\}, & \sqrt{3/2} < w < \sqrt2 \\ 0, & \text{otherwise.} \end{cases}$$

The shape of this is illustrated by Figure 2. Note the corner at $w = \sqrt{3/2}$.

The expected value of $W_3$ is

$$E(W_3) = \int_{\sqrt{3/2}}^{\sqrt2} w\, g_3(w)\, dw = 3\sqrt2/\pi$$

in agreement with the value given by (12), for n = 3.

Figure 2.

Similarly one finds

$$\text{var}\,(W_3) = 1 + 3\sqrt3/2\pi - 18/\pi^2.$$
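The n = 3 result lends itself to a simple simulation check. The sketch below assumes the density $g_3(w) = 6/\{\pi\sqrt{2 - w^2}\}$ on $\sqrt{3/2} < w < \sqrt2$, whose distribution function is then $(6/\pi)\{\arcsin(w/\sqrt2) - \pi/3\}$; the function names and the use of Python's `random.Random.gauss` to generate the normal summands are choices made for this illustration only.

```python
import math
import random

LOW, HIGH = math.sqrt(1.5), math.sqrt(2.0)

def g3(w):
    """Assumed pdf of W_3."""
    return 6.0 / (math.pi * math.sqrt(2.0 - w * w)) if LOW < w < HIGH else 0.0

def cdf3(w):
    """Distribution function obtained by integrating g3."""
    if w <= LOW:
        return 0.0
    if w >= HIGH:
        return 1.0
    return (6.0 / math.pi) * (math.asin(w / HIGH) - math.pi / 3.0)

def sample_w3(rng):
    """Draw one value of W_3 from three independent standard normal summands."""
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]
    mean = sum(x) / 3.0
    d = math.sqrt(sum((v - mean) ** 2 for v in x) / 3.0)
    s1 = (x[0] - mean) / d
    s2 = s1 + (x[1] - mean) / d
    return max(s1, s2, 0.0) - min(s1, s2, 0.0)
```

Simulated values should all fall between $\sqrt{3/2}$ and $\sqrt2$, with a sample mean close to $3\sqrt2/\pi$ and an empirical distribution function close to `cdf3`.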

EXPLICIT EVALUATION OF Distr (Wn): n = 4

While in principle the general method outlined and illustrated for n = 3 may be applied for any value of n, the practical complexities increase rapidly with n. We have completed the calculations for n = 4, which proceed as follows.

Among the twelve terms $\pi_{ij}$ we have the relations (29):

$$\pi_{12} = \pi_{21} = \pi_{23} = \pi_{32} = \pi_{34} = \pi_{43} = \pi_{41} = \pi_{14} \quad (= \pi_a, \text{ say})$$

and

$$\pi_{13} = \pi_{31} = \pi_{24} = \pi_{42} \quad (= \pi_b, \text{ say}).$$

The expression (41) thus reduces to

$$g_4(w) = 8\,\pi_a(w) + 4\,\pi_b(w). \tag{41a}$$

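We consider separately the evaluation of $\pi_a$ and $\pi_b$.

We shall need the transforming matrix B of (22), which for n = 4 gives

$$Y_1 = Z_1\sqrt3, \quad Y_2 = -Z_1/\sqrt3 + Z_2\sqrt{8/3}, \quad Y_3 = -Z_1/\sqrt3 - Z_2\sqrt{2/3} + Z_3\sqrt2 \tag{42}$$

where

$$Z_1^2 + Z_2^2 + Z_3^2 = 1.$$

In terms of the $Z_i$ we find that

$$S_1 = Z_1\sqrt3$$

$$S_2 = 2Z_1/\sqrt3 + 2Z_2\sqrt{2/3} \tag{43}$$

$$S_3 = Z_1/\sqrt3 + Z_2\sqrt{2/3} + Z_3\sqrt2$$

where the joint pdf of any two of the Z's, say $Z_1$ and $Z_2$, is

$$1 \big/ \{2\pi\sqrt{1 - z_1^2 - z_2^2}\}, \quad z_1^2 + z_2^2 < 1;$$

consequently the common distribution of $Z_1$ and of $Z_2$ is uniform (-1, 1).

It is a simple matter to verify from (43) that $S_2$ is also uniformly distributed, on (-2, 2).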

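Evaluation of $\pi_a(w)$

We have

$$\pi_a(w) = \pi_{21}(w) = P\{S_2 \text{ is max},\ S_1 \text{ is min} \mid S_2 - S_1 = w\}\, f_{21}(w) \tag{44}$$

where $f_{21}(w)$ is the pdf at w of $S_2 - S_1 = Y_2$, which by exchangeability has the same distribution as $Y_1 = Z_1\sqrt3$. Since $Z_1$ is uniformly distributed on (-1, 1), it follows that $Z_1\sqrt3$ is uniform, on $(-\sqrt3, \sqrt3)$, with pdf

$$f_{21}(w) = \begin{cases} 1/2\sqrt3, & -\sqrt3 < w < \sqrt3 \\ 0, & \text{otherwise.} \end{cases} \tag{45}$$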

As regards the "geometrical" factor in (44), that is the conditional probability term, we note that the proposition

"S2 is max, S1 is min"

is equivalent to the following:

that is, to

The conditional probability term in (44) therefore reduces to

which, on using the relations (42), and replacing $Z_1^2$ by $1 - Z_2^2 - Z_3^2$ in accordance with (41a), reduces to

where $L_1(Z_2, Z_3)$ and $L_2(Z_2, Z_3)$ are linear functions of $Z_2$ and $Z_3$. To be precise:

In geometrical terms, (46) represents the probability content of that portion (if any) of the circumference of the circle

$$Z_2^2 + Z_3^2 = 1 - \tfrac13 w^2$$

that lies between the lines

and also between the lines

Figure 3 gives a representation of a typical situation, the relevant parts of the circumference (shown hatched as arcs (ab) and (cd)) being those inside the parallelogram represented by (48).

Since by (45) $f_{21}(w)$ is a uniform distribution, it follows that p(w) will be proportional to the total length of the enclosed arcs (ab), (cd), and therefore, on normalizing, numerically equal to the ratio of the total arc length to the circumference of the circle.

Figure 3.

For sufficiently small values of w, all vertices A, B, C, D of the parallelogram lie inside the circle; as w is allowed to increase, the vertices move outwards. On consideration, it will be seen that four cases have to be considered, namely

$0 < w \le 1$: all vertices inside; no arcs.

$1 < w \le \sqrt2$: vertex B outside, others inside; arc (ab) exists; no arc (cd).

$\sqrt2 < w \le \sqrt{8/3}$: all vertices outside, but circle not entirely included in the parallelogram; arcs (ab), (cd) both exist.

$\sqrt{8/3} < w \le \sqrt3$: all vertices outside; circle completely contained.

The corresponding values of p(w) are found to be:

$$p_3(w) = \frac13 + \frac{2}{\pi} \arcsin\left\{ \sqrt{\frac{3}{32}}\; \frac{w - \sqrt{8 - 3w^2}}{\sqrt{3 - w^2}} \right\}$$

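Taking into account the factor $f_{21}(w)$ of (45), we finally have the expression for $\pi_a(w)$:

$$\pi_a(w) = \begin{cases} 0, & 0 < w \le 1 \\ p_2(w)/2\sqrt3, & 1 < w \le \sqrt2 \\ p_3(w)/2\sqrt3, & \sqrt2 < w \le \sqrt{8/3} \\ 1/2\sqrt3, & \sqrt{8/3} < w \le \sqrt3 \\ 0, & \text{otherwise.} \end{cases}$$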
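The arc-length argument can be checked numerically. The sketch below rests on two assumptions that are not taken from the text: that, after relabelling by exchangeability, the event reduces to $-w \le Y_2 \le 0$ and $-w \le Y_3 \le 0$ given $Y_1 = w$ (so that $Z_1 = w/\sqrt3$), and that on $1 < w \le \sqrt2$ the arc fraction is $\pi^{-1}\arcsin\{w/\sqrt{6 - 2w^2}\} - \tfrac16$. The third branch uses the expression for $p_3(w)$ given above. Function names are hypothetical.

```python
import math

def arc_fraction(w, steps=20000):
    """Fraction of the circle Z2^2 + Z3^2 = 1 - w^2/3 with -w <= Y2 <= 0 and -w <= Y3 <= 0."""
    rho = math.sqrt(max(0.0, 1.0 - w * w / 3.0))
    hits = 0
    for k in range(steps):
        theta = 2.0 * math.pi * (k + 0.5) / steps
        z2, z3 = rho * math.cos(theta), rho * math.sin(theta)
        y2 = -w / 3.0 + math.sqrt(8.0 / 3.0) * z2          # Y2 with Z1 = w / sqrt(3)
        y3 = -w / 3.0 - math.sqrt(2.0 / 3.0) * z2 + math.sqrt(2.0) * z3
        if -w <= y2 <= 0.0 and -w <= y3 <= 0.0:
            hits += 1
    return hits / steps

def p_closed(w):
    """Piecewise closed form for 0 < w <= sqrt(3); the second branch is an assumed form."""
    if w <= 1.0:
        return 0.0
    if w <= math.sqrt(2.0):
        arg = min(1.0, w / math.sqrt(6.0 - 2.0 * w * w))
        return math.asin(arg) / math.pi - 1.0 / 6.0
    if w <= math.sqrt(8.0 / 3.0):
        root = math.sqrt(max(0.0, 8.0 - 3.0 * w * w))
        arg = math.sqrt(3.0 / 32.0) * (w - root) / math.sqrt(3.0 - w * w)
        return 1.0 / 3.0 + (2.0 / math.pi) * math.asin(arg)
    return 1.0
```

Comparing `arc_fraction(w)` with `p_closed(w)` at one value of w in each of the four ranges gives a direct check of the case analysis.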

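Evaluation of $\pi_b(w)$

Taking $\pi_b(w)$ as $\pi_{24}(w)$, arguments of the above kind lead to the following:

$$\pi_b(w) = P\{S_2 \text{ is max},\ S_4\,(= 0) \text{ is min} \mid S_2 = w\}\, f_{24}(w). \tag{50}$$

Here $f_{24}$ is the pdf at w of $S_2 - S_4 = S_2$, which in accordance with (43) is uniformly distributed on (-2, 2), so that

$$f_{24}(w) = \begin{cases} \tfrac14, & -2 < w < 2 \\ 0, & \text{otherwise.} \end{cases}$$

The geometrical factor in (50) turns out to be

$$P\{0 < Z_1 < w/\sqrt3,\ \ 1 - \tfrac18 w^2 < Z_1^2 + Z_2^2 < 1 \mid Z_1 + Z_2\sqrt2 = w\sqrt3/2\}$$

where (since $Z_1^2 + Z_2^2 + Z_3^2 = 1$) we have non-zero probabilities only when $Z_1^2 + Z_2^2 < 1$. We are therefore concerned with the probability content of that region of $(Z_1, Z_2)$ space which lies in the intersection of the slab $0 < Z_1 < w/\sqrt3$, the oblique line $Z_1 + Z_2\sqrt2 = w\sqrt3/2$, and the annulus

$$1 - \tfrac18 w^2 < Z_1^2 + Z_2^2 < 1.$$

Figure 4 illustrates a typical situation, the hatched region being the intersection in question.

Figure 4.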

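By methods similar to those used for $\pi_a(w)$ we find

$$\pi_b(w) = \begin{cases} \dfrac{1}{2\pi} \left\{ 2 \arcsin \dfrac{w}{\sqrt{8 - 2w^2}} - \dfrac{\pi}{2} \right\}, & \sqrt2 < w \le \sqrt{8/3} \\[2ex] \tfrac14, & \sqrt{8/3} < w < 2 \\[1ex] 0, & \text{otherwise.} \end{cases}$$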
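The geometrical factor for $\pi_b$ can be checked in the same spirit. The sketch below assumes that, given $S_2 = w$, the point $(Z_1, Z_2, Z_3)$ is uniformly distributed on the circle in which the plane $S_2 = w$ cuts the unit sphere, and counts the fraction of that circle on which $0 \le S_1 \le w$ and $0 \le S_3 \le w$, with $S_1 = Z_1\sqrt3$ and $S_3 = Z_1/\sqrt3 + Z_2\sqrt{2/3} + Z_3\sqrt2$. The parametrization of the circle and the function names are choices made for this example.

```python
import math

def slab_fraction(w, steps=20000):
    """Fraction of the circle {S2 = w} on the unit sphere with 0 <= S1 <= w and 0 <= S3 <= w."""
    rho = math.sqrt(max(0.0, 1.0 - w * w / 4.0))    # radius of the circle of intersection
    hits = 0
    for k in range(steps):
        theta = 2.0 * math.pi * (k + 0.5) / steps
        c, s = rho * math.cos(theta), rho * math.sin(theta)
        z1 = w / (2.0 * math.sqrt(3.0)) + math.sqrt(2.0 / 3.0) * c
        z2 = w / math.sqrt(6.0) - c / math.sqrt(3.0)
        z3 = s
        s1 = math.sqrt(3.0) * z1
        s3 = z1 / math.sqrt(3.0) + math.sqrt(2.0 / 3.0) * z2 + math.sqrt(2.0) * z3
        if 0.0 <= s1 <= w and 0.0 <= s3 <= w:
            hits += 1
    return hits / steps

def geometric_factor_b(w):
    """Closed form implied by the expression for pi_b, i.e. 4 * pi_b(w), for 0 < w <= 2."""
    if w <= math.sqrt(2.0):
        return 0.0
    if w <= math.sqrt(8.0 / 3.0):
        arg = min(1.0, w / math.sqrt(8.0 - 2.0 * w * w))
        return (4.0 / math.pi) * math.asin(arg) - 1.0
    return 1.0
```

Multiplying the factor by $f_{24}(w) = \tfrac14$ recovers $\pi_b(w)$.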

The pdf of the Range W4

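Combining these results in accordance with (41a) we finally obtain the following expressions for the pdf of the Hurst range for n = 4:

$$g_4(w) = \begin{cases} \dfrac{4}{\pi\sqrt3} \arcsin\left\{ \sqrt{\dfrac38}\; \dfrac{w - \sqrt{2 - w^2}}{\sqrt{3 - w^2}} \right\}, & 1 < w \le \sqrt2 \\[2.5ex] \dfrac{4}{3\sqrt3} - 1 + \dfrac{8}{\pi\sqrt3} \arcsin\left\{ \sqrt{\dfrac{3}{32}}\; \dfrac{w - \sqrt{8 - 3w^2}}{\sqrt{3 - w^2}} \right\} + \dfrac{4}{\pi} \arcsin \dfrac{w}{\sqrt{8 - 2w^2}}, & \sqrt2 < w \le \sqrt{8/3} \\[2.5ex] 1 + \dfrac{4}{\sqrt3}, & \sqrt{8/3} < w \le \sqrt3 \\[1.5ex] 1, & \sqrt3 < w \le 2 \\[1ex] 0, & \text{otherwise.} \end{cases} \tag{53}$$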

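This expression is a correctly normalized pdf $\big( \int_{-\infty}^{\infty} g_4(w)\, dw = 1 \big)$, and the expectation $E(W_4) = \tfrac12 (1 + 4/\sqrt3)$ agrees with the value given by (12) for n = 4. The variance is

$$\text{var}\,(W_4) = \frac13 + \frac{8}{3\pi} \arctan\sqrt2 + \frac{4}{3\pi}(1 + 2\sqrt2) - \frac14 \left( \frac{8}{\sqrt3} + \frac{19}{3} \right).$$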
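The normalization, mean and variance can be confirmed by straightforward numerical integration. The sketch below assumes the piecewise form of $g_4(w)$ on $1 < w \le 2$ with break points at $\sqrt2$, $\sqrt{8/3}$ and $\sqrt3$, as assembled from $8\pi_a + 4\pi_b$; the midpoint rule, the step count and the function names are choices made for this illustration.

```python
import math

SQRT2, SQRT3, CUT = math.sqrt(2.0), math.sqrt(3.0), math.sqrt(8.0 / 3.0)

def _asin(x):
    # Clamp to [-1, 1] to guard against rounding at the break points.
    return math.asin(max(-1.0, min(1.0, x)))

def g4(w):
    """Assumed piecewise pdf of W_4."""
    if w <= 1.0 or w > 2.0:
        return 0.0
    if w <= SQRT2:
        root = math.sqrt(max(0.0, 2.0 - w * w))
        arg = math.sqrt(3.0 / 8.0) * (w - root) / math.sqrt(3.0 - w * w)
        return 4.0 / (math.pi * SQRT3) * _asin(arg)
    if w <= CUT:
        root = math.sqrt(max(0.0, 8.0 - 3.0 * w * w))
        arg_a = math.sqrt(3.0 / 32.0) * (w - root) / math.sqrt(3.0 - w * w)
        arg_b = w / math.sqrt(8.0 - 2.0 * w * w)
        return (4.0 / (3.0 * SQRT3) - 1.0
                + 8.0 / (math.pi * SQRT3) * _asin(arg_a)
                + 4.0 / math.pi * _asin(arg_b))
    if w <= SQRT3:
        return 1.0 + 4.0 / SQRT3
    return 1.0

def moment(k, steps=100000):
    """Midpoint-rule approximation to the k-th moment of g4 over (1, 2)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        w = 1.0 + (i + 0.5) * h
        total += (w ** k) * g4(w)
    return total * h
```

With these definitions `moment(0)` should be close to 1, `moment(1)` close to $\tfrac12(1 + 4/\sqrt3)$, and `moment(2) - moment(1)**2` close to the variance expression.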

Figure 5 shows a graph of g4(w).

Figure 5.

Appendix

Derivation of the Joint Distribution (25)

The transformation to the "spherical" variables $Z_j$ described in (18) is not only a geometrically attractive device: it leads to a simple derivation of the distributions in which we are interested.

We have

$$\zeta = B^{-1}\eta$$

where B is a lower triangular matrix of order n-1 satisfying $BB' = nA^{-1}$ (A being defined in (17)), and

$$\eta = T/D_n, \quad T = (T_1, T_2, \ldots, T_{n-1})', \quad T_j = X_j - \bar{X}.$$

Since the $T_j$ are linear functions of the $X_i$, with

$$T = CX, \quad X = (X_1, X_2, \ldots, X_n)',$$

say, the vector T is multivariate normal in n-1 variables. The matrix C is of order (n-1) x n. It is the matrix defined by the first n-1 rows of the (n x n) matrix $I_n - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n'$ (= N, say).

The dispersion matrix of T is CC', which is the leading principal submatrix, of order (n-1) x (n-1), of NN'. But N is symmetric and idempotent, whence $NN' = N^2 = N$, and its leading principal submatrix, of order (n-1) x (n-1), is the matrix $I_{n-1} - \frac{1}{n}\mathbf{1}_{n-1}\mathbf{1}_{n-1}'$, namely the matrix $A^{-1}$.

Hence

$$CC' = A^{-1}.$$

It follows that T is multivariate normal with

$$E(T) = 0 \quad \text{and} \quad D(T) = CC' = A^{-1}.$$

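We have expressed $\eta$ as $\eta = T/D_n$.

The quantity $D_n$ itself may be expressed in the form

$$nD_n^2 = \sum_{j=1}^{n-1} T_j^2 + \Big( \sum_{j=1}^{n-1} T_j \Big)^2$$

since

$$\sum_{j=1}^{n} T_j = 0, \quad T_n = X_n - \bar{X}.$$

Thus

$$nD_n^2 = T'AT$$

and we may now express the vector $\zeta$ (the elements of which are the spherical variables $Z_1, \ldots, Z_{n-1}$) in terms of T:

$$\zeta = B^{-1}\eta = B^{-1}T/D_n = \omega \big/ (\omega'\omega)^{1/2}$$

say, where

$$\omega = B^{-1}T.$$

Since T is multivariate normal, with dispersion matrix $D(T) = A^{-1} = \frac{1}{n}(BB')$, it follows that $\omega$ is multivariate normal with dispersion matrix

$$D(\omega) = B^{-1} D(T)\, (B')^{-1} = \frac{1}{n} I_{n-1}.$$

Hence the elements $W_1, \ldots, W_{n-1}$ of $\omega$ are mutually independent normal variables with common variance 1/n, and

$$Z_j^2 = \tfrac12 n W_j^2 \Big/ \sum_{k=1}^{n-1} \tfrac12 n W_k^2, \quad j = 1, 2, \ldots, n-1,$$

where the variables $\tfrac12 n W_j^2$ are mutually independent gamma variables, with common exponent $\tfrac12$.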

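By a standard theorem [3] the joint distribution of $Z_1^2, Z_2^2, \ldots, Z_{n-2}^2$ is a Dirichlet distribution. If we denote $Z_j^2$ by $R_j$, $j = 1, 2, \ldots, n-2$, the joint pdf of $R_1, R_2, \ldots, R_{n-2}$ at $(r_1, r_2, \ldots, r_{n-2})$ is given by the theorem as

$$f(r) = \frac{\Gamma\{\tfrac12 (n-1)\}}{\{\Gamma(\tfrac12)\}^{n-1}} \prod_{j=1}^{n-2} r_j^{-1/2} \Big( 1 - \sum_{j=1}^{n-2} r_j \Big)^{-1/2}$$

and hence the joint pdf of $Z_1, Z_2, \ldots, Z_{n-2}$ at $(z_1, z_2, \ldots, z_{n-2})$ is

$$\phi(z) = 2^{-(n-2)} f(r) \left| \frac{\partial(r)}{\partial(z)} \right| = \frac{\Gamma\{\tfrac12 (n-1)\}}{\pi^{(n-1)/2}} \Big( 1 - \sum_{j=1}^{n-2} z_j^2 \Big)^{-1/2}.$$

This is the result quoted in (25).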

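In fact the theorem referred to goes further: the joint distribution of every subset of size k of the set $(Z_1^2, Z_2^2, \ldots, Z_{n-2}^2)$, $k \le n-2$, is also of the Dirichlet form. For example, the joint pdf of $R_1 = Z_1^2, R_2 = Z_2^2, \ldots, R_k = Z_k^2$ is

$$\frac{\Gamma\{\tfrac12 (n-1)\}}{\{\Gamma(\tfrac12)\}^{k}\, \Gamma\{\tfrac12 (n-1-k)\}} \prod_{j=1}^{k} r_j^{-1/2} \Big( 1 - \sum_{j=1}^{k} r_j \Big)^{(n-k-3)/2}.$$

In particular, the pdf of $R_1 = Z_1^2$ is

$$\frac{\Gamma\{\tfrac12 (n-1)\}}{\Gamma(\tfrac12)\, \Gamma\{\tfrac12 (n-2)\}}\; r^{-1/2} (1 - r)^{(n-4)/2}, \quad 0 < r < 1.$$

Making the appropriate transformations we obtain the joint pdf of $(Z_1, Z_2, \ldots, Z_k)$, $k \le n-2$, as

$$\frac{\Gamma\{\tfrac12 (n-1)\}}{\pi^{k/2}\, \Gamma\{\tfrac12 (n-1-k)\}} \Big( 1 - \sum_{j=1}^{k} z_j^2 \Big)^{(n-k-3)/2}$$

and, in particular, the pdf of $Z_1$ is

$$g(z) = \frac{\Gamma\{\tfrac12 (n-1)\}}{\sqrt{\pi}\, \Gamma\{\tfrac12 (n-2)\}}\, (1 - z^2)^{(n-4)/2}, \quad z^2 < 1.$$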
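The marginal density of a single spherical variable is easy to evaluate numerically. The sketch below assumes the form $g(z) = \Gamma\{\tfrac12(n-1)\} \big/ [\sqrt{\pi}\,\Gamma\{\tfrac12(n-2)\}] \cdot (1 - z^2)^{(n-4)/2}$ on $z^2 < 1$, which is the usual marginal of one coordinate of a point uniformly distributed on the unit sphere in (n-1)-space; the function name is hypothetical.

```python
import math

def pdf_z1(z, n):
    """Assumed marginal pdf of Z_1 for a sample of size n >= 3."""
    if n < 3:
        raise ValueError("at least three summands are needed")
    if z * z >= 1.0:
        return 0.0
    const = math.gamma((n - 1) / 2) / (math.sqrt(math.pi) * math.gamma((n - 2) / 2))
    return const * (1.0 - z * z) ** ((n - 4) / 2)
```

For n = 3 this reduces to $1/\{\pi\sqrt{1 - z^2}\}$, and for n = 4 to the uniform density $\tfrac12$ on (-1, 1), the two special cases used in the explicit evaluations.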

REFERENCES

[1] Hurst, H.E., Methods of Using Long-Term Storage in Reservoirs, Proc. Inst. Civil Engrs., 5 (1956), 519-590.

[2] Anis, A.A., and E.H. Lloyd, The Expected Value of the Adjusted Rescaled Hurst Range of Independent Normal Summands, Biometrika, 63 (1976), 111-116.

[3] Wilks, S.S., Mathematical Statistics, Wiley, New York, 1962.
