6.2 Maximum likelihood type estimation

6.2.4 Simulations

We illustrate theorem 6.5 by calculating $\hat\theta_n^{(h,\beta)}$ for simulated series of a LARCH process with standard normal $\varepsilon_t$ and a parametrization such that (B) holds. The model parameter vector $\theta_0$ and the constants h and β are chosen as follows.

• Case 1: $d_0 = 0.1$, $a_0 = 1$, $c_0 = 0.2$; $h = 0.01$, $\beta = 0.799$;

• Case 2: $d_0 = 0.2$, $a_0 = 1$, $c_0 = 0.2$; $h = 0.01$, $\beta = 0.599$.

To simulate the process $X_t$ via (5.1) and (5.2), a pre-sample of length 10000 is used for initialization. Moreover, the infinite series in (5.2) is truncated at order 2000.
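As an illustration, here is a minimal simulation sketch in Python. Since (5.1) and (5.2) are not reproduced in this section, we assume the standard LARCH form $X_t = \sigma_t\varepsilon_t$, $\sigma_t = a + \sum_{j\ge 1} b_j X_{t-j}$ with long-memory weights $b_j = c\,j^{d-1}$; the function name `simulate_larch` is ours.

```python
import numpy as np

def simulate_larch(n, d, a, c, trunc=2000, presample=10000, seed=None):
    """Simulate X_t = sigma_t * eps_t, sigma_t = a + sum_j b_j X_{t-j},
    with b_j = c * j**(d-1), j = 1..trunc (series truncated at order trunc).
    A pre-sample of length `presample` is generated first and discarded."""
    rng = np.random.default_rng(seed)
    b = c * np.arange(1, trunc + 1) ** (d - 1.0)   # long-memory weights b_j
    total = presample + n
    x = np.zeros(trunc + total)                    # zeros as starting values
    eps = rng.standard_normal(trunc + total)
    for t in range(trunc, trunc + total):
        # x[t-trunc:t][::-1] = [X_{t-1}, ..., X_{t-trunc}], matching b_1..b_trunc
        sigma = a + b @ x[t - trunc:t][::-1]
        x[t] = sigma * eps[t]
    return x[-n:]                                  # drop pre-sample and padding

# Case 2 of the text: d0 = 0.2, a0 = 1, c0 = 0.2
x = simulate_larch(2000, d=0.2, a=1.0, c=0.2, seed=1)
```

This direct form mirrors the truncation described above; for large-scale Monte Carlo work a recursion- or FFT-based update would be faster.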


Figure 6.2: Two simulated sample paths of a long-memory LARCH process $X_t$ and the corresponding sample autocorrelation functions of $X_t^2$. The long-memory parameter $d_0$ is equal to 0.1 in figures a and c, and to 0.2 in figures b and d, respectively.

Figures 6.2a and b show typical sample paths of $X_t$ for the two cases. The corresponding sample autocorrelation functions of $X_t^2$ are given in figures 6.2c and d, respectively.

For simplicity, we first focus on estimation of the parameter a alone, under the assumption that the other parameters d and c are known, i.e. we only use the third component of $\hat\theta_n^{(h,\beta)}$, denoted by $\hat a_n$. To compare asymptotic with finite sample results, a small simulation study is carried out as follows. For sample sizes n = 1000, 2500, 5000 and 10000, N = 600 independent samples of the LARCH process are drawn and the estimator $\hat\theta_n^{(h,\beta)}$, respectively $\hat a_n$, is calculated. For case 2 ($d_0 = 0.2$), summary statistics of the results are given in table 6.1. Moreover, normal probability plots can be found in figure 6.3. One can see that the simulated standard deviations are close to the asymptotic value 0.628 (which is calculated by means of simulation of the matrix $H_h^{-1} G_h H_h^{-1}$, see theorem 6.5b).

Further, the normal probability plots indicate that the true distribution of the estimator is approximated quite well by the normal distribution.
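A sketch of this Monte Carlo experiment, reusing `simulate_larch` from above. The objective below only loosely imitates the modified likelihood $\tilde L_{n,h}$ of section 6.2 (summing $m(n) = \lfloor n^\beta\rfloor$ smoothed terms is our guess at its structure; the exact definition is given there), so `estimate_a` is a hypothetical stand-in, not the estimator itself:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def estimate_a(x, d, c, h=0.01, beta=0.599, trunc=2000):
    """Hypothetical stand-in for the third component of theta_n^(h,beta):
    minimize a smoothed quasi-likelihood over a, with d and c known."""
    n, m = len(x), int(len(x) ** beta)             # m(n) = floor(n^beta) terms
    b = c * np.arange(1, trunc + 1) ** (d - 1.0)

    def objective(a):
        ll = 0.0
        for t in range(n - m, n):
            lags = min(t, trunc)
            sigma = a + b[:lags] @ x[t - lags:t][::-1]
            s2 = sigma ** 2 + h                    # h bounds the variance away from 0
            ll += np.log(s2) + x[t] ** 2 / s2
        return ll

    return minimize_scalar(objective, bounds=(0.1, 5.0), method="bounded").x

def mc_study(N=600, n=1000, d=0.2, a=1.0, c=0.2, h=0.01, beta=0.599):
    """Summary statistics as in table 6.1: mean, bias, sd, sd * n^(beta/2)."""
    est = np.array([estimate_a(simulate_larch(n, d, a, c), d, c, h, beta)
                    for _ in range(N)])
    sd = est.std(ddof=1)
    return {"mean": est.mean(), "bias": est.mean() - a,
            "sd": sd, "sd_scaled": sd * n ** (beta / 2)}
```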

Next, consider single estimation of the long-memory parameter d (again under the assumption that the other parameters a and c are known).

a = 1.0

n              1000     2500     5000     10000
mean           0.996    0.991    1.003    0.995
bias          −0.004   −0.009    0.003   −0.005
sd             0.0879   0.065    0.051    0.040
sd · n^{β/2}   0.698    0.678    0.659    0.634
asymp. sd      0.628

Table 6.1: For case 2 and n = 1000, 2500, 5000 and 10000, N = 600 LARCH series were simulated and a was estimated under the assumption that d and c are known, with β = 0.599 and h = 0.01. The table shows summary statistics of the simulated distribution of $\hat a_n$.

For sample sizes n = 1000, 2500, 5000 and 10000, N = 1000 independent samples of the LARCH process are drawn and the estimator $\hat\theta_n^{(h,\beta)}$, respectively its first component, denoted by $\hat d_n$, is calculated. Summary statistics of the results are given in tables 6.2 (case 1) and 6.3 (case 2). Normal probability plots based on all 1000 simulations are given in figures 6.4a through h. The asymptotic standard deviation of $\hat d_n$ given in theorem 6.5b is equal to 1.68 in case 1 and 1.14 in case 2 (calculated by simulation). Comparing the results, one can see a strong discrepancy between robust and non-robust estimates of the expected value, standard deviation and skewness of $\hat d_n$. The robust estimates are close to the asymptotic values obtained from theorem 6.5b, already for n = 1000. This is not the case for the non-robust estimates. Most extreme are the values of the (non-robust) skewness measure, which should converge to zero but instead seems to be increasing in absolute value. This can be explained as follows. Out of N = 1000 simulations, there are a few cases where the algorithm terminated at a solution equal, or very close, to the lower end of the parameter range used in the numerical minimization (the reason is that the objective function can be very complicated even though a positive value of h is chosen, see figure 6.1). As expected from theorem 6.5a (and b), the number of cases where this happens decreases with increasing n.

However, since the variance of estimates in the interior of Θ tends to zero with increasing n, those few estimates that are equal to the fixed lower limit of the parameter space become increasingly extreme outliers compared to the bulk of the simulated data. Indeed, even if N tends to infinity and only one out of N simulations is equal to the lower bound, the empirical skewness will not converge to zero.


Figure 6.3: Normal probability plots of N = 600 simulated estimates $\hat a_n$ for case 2 ($d_0 = 0.2$).

For this reason, the (non-robust) empirical standard deviation, skewness and normal probability plot are grossly contaminated by the small (and asymptotically negligible) number of simulations where the algorithm did not converge properly. In addition to the robust estimates, we therefore also computed the same non-robust quantities leaving out the 10 (out of N = 1000) smallest values of $\hat d_n$. The non-robust estimates are then indeed much closer to the theoretical values, and the normal probability plots (see figure 6.5) indicate convergence (though rather slow for $d_0 = 0.2$) to the normal distribution.
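The robust quantities reported in tables 6.2 and 6.3, and the trimming just described, are straightforward to compute. A small sketch using the definitions from the table captions:

```python
import numpy as np
from scipy.stats import norm

def robust_summary(est, trim=0):
    """Median, MAD-based scale and quartile skewness of simulated estimates;
    with trim > 0, the `trim` smallest values (boundary solutions of the
    numerical minimization) are excluded first."""
    est = np.sort(np.asarray(est))[trim:]
    q1, med, q3 = np.percentile(est, [25, 50, 75])
    # MAD divided by the 75% quantile of the standard normal (~0.6745),
    # consistent for the standard deviation under normality
    s_tilde = np.median(np.abs(est - med)) / norm.ppf(0.75)
    q_skew = (q3 + q1 - 2 * med) / (q3 - q1)   # empirical quartile skewness
    return med, s_tilde, q_skew
```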

An additional observation is that convergence to the asymptotic distribution is slower for stronger long memory ($d_0 = 0.2$). The reason is that for $d_0 = 0.2$ the number of terms used in $\tilde L_{n,h}(\theta)$ is much smaller, namely $O(n^{0.599})$ as compared to $O(n^{0.799})$ for $d_0 = 0.1$. More specifically, for n = 1000, 2500, 5000 and 10000 we have m(n) = 62, 108, 164 and 248 for $d_0 = 0.2$, whereas for $d_0 = 0.1$ we have m(n) = 249, 518, 902 and 1570.
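These values are consistent with $m(n) = \lfloor n^{\beta}\rfloor$; a quick check (the floor rounding is our assumption, inferred from the numbers):

```python
# m(n) = floor(n^beta): number of terms entering the objective function
for n in (1000, 2500, 5000, 10000):
    print(n, int(n ** 0.599), int(n ** 0.799))
# n       beta = 0.599   beta = 0.799
# 1000     62             249
# 2500     108            518
# 5000     164            902
# 10000    248            1570
```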


n              1000     2500     5000     10000

$d_0 = 0.1$: all 1000 simulations
mean           0.047    0.069    0.085    0.088
median         0.094    0.099    0.104    0.101
s              0.353    0.290    0.216    0.198
s̃              0.121    0.082    0.054    0.041
n^{β/2} · s    5.570    6.605    6.490    7.864
n^{β/2} · s̃    1.909    1.859    1.621    1.629
skewness     −10.620  −13.161  −16.320  −20.464
q-skewness    −0.118   −0.038   −0.093   −0.089

$d_0 = 0.1$: 10 smallest values of $\hat d_n$ excluded
mean           0.072    0.091    0.098    0.098
median         0.094    0.101    0.104    0.101
s              0.150    0.090    0.057    0.043
s̃              0.119    0.080    0.053    0.040
n^{β/2} · s    2.384    2.063    1.720    1.722
n^{β/2} · s̃    1.882    1.822    1.595    1.604
skewness      −1.199   −0.747   −0.684   −0.422
q-skewness    −0.103   −0.042   −0.075   −0.079

Table 6.2: Mean, standard deviation and skewness of $\hat d_n$ with β = 0.599, based on N = 1000 simulated LARCH processes with true long-memory parameter $d_0 = 0.1$ (case 1). The asymptotic standard deviation from theorem 6.5(b) is equal to 1.681. Here, s is the empirical standard deviation, s̃ is the MAD divided by the 75% quantile of the standard normal distribution, and q-skewness is the empirical quartile skewness. In the upper part of the table, all N = 1000 simulated values are used; in the lower part, the 10 smallest values of $\hat d_n$ are excluded.


n              1000     2500     5000     10000

$d_0 = 0.2$: all 1000 simulations
mean          −0.292    0.059    0.110    0.168
median         0.181    0.201    0.198    0.199
s              1.395    0.719    0.552    0.255
s̃              0.215    0.133    0.102    0.082
n^{β/2} · s   11.041    7.489    7.079    4.030
n^{β/2} · s̃    1.703    1.385    1.310    1.291
skewness      −2.761   −5.800   −7.899  −11.752
q-skewness    −0.292   −0.134   −0.117   −0.093

$d_0 = 0.2$: 10 smallest values of $\hat d_n$ excluded
mean          −0.245    0.110    0.161    0.186
median         0.184    0.202    0.199    0.200
s              1.319    0.511    0.219    0.114
s̃              0.213    0.131    0.101    0.080
n^{β/2} · s   10.437    5.319    2.810    1.800
n^{β/2} · s̃    1.688    1.362    1.290    1.262
skewness      −2.949   −6.831   −4.829   −1.336
q-skewness    −0.285   −0.112   −0.098   −0.081

Table 6.3: Mean, standard deviation and skewness of $\hat d_n$ with β = 0.599, based on N = 1000 simulated LARCH processes with long-memory parameter $d_0 = 0.2$ (case 2). The asymptotic standard deviation from theorem 6.5(b) is equal to 1.14. Here, s is the empirical standard deviation, s̃ is the MAD divided by the 75% quantile of the standard normal distribution, and q-skewness is the empirical quartile skewness. In the upper part of the table, all N = 1000 simulated values are used; in the lower part, the 10 smallest values of $\hat d_n$ are excluded.



Figure 6.4: Normal probability plots of N = 1000 simulated estimates $\hat d_n$ for case 1 (figures a through d) and case 2 (figures e through h).



Figure 6.5: Normal probability plots of simulated estimates $\hat d_n$ for case 1 (figures a through d) and case 2 (figures e through h), with the 10 smallest values (out of N = 1000) excluded.


Chapter 7

Aggregation and Estimation

7.1 Introduction

In section 3.2 we already mentioned that contemporaneous aggregation, in the sense of summing or averaging across different units of micro-level time series, plays an important role in the theory of long memory processes. Several papers deal with the question of statistical inference for the distribution of the underlying random coefficients of the micro-level processes, also called the mixture distribution. Recently, Leipus et al. (2006) considered the aggregation of independent AR(1) processes and proposed an estimator for the mixture distribution that is based only on the aggregated long memory process. For a similar aggregation scheme, Robinson (1978) investigates an estimator for the moments of the mixture distribution based on the sample autocovariances of the first N underlying AR(1) processes and derives the asymptotic distribution as N tends to infinity, whereas the length of the processes remains fixed. In contrast, in this chapter we examine the asymptotic behavior of a similar estimator when both the number and the length of the AR(1) processes tend to infinity simultaneously.

To be specific, we consider a panel of N independent AR(1) processes, each of length n, where the mixture distribution of the AR(1) coefficients is, as in Granger (1980), a kind of beta distribution. In our estimation procedure the serial correlation coefficient of lag 1 is used as an approximation of the true coefficient of each AR(1) process. In a second step the parameters of the mixture distribution are estimated by the maximum likelihood estimator (MLE) for the beta distribution, where the needed, but unobservable, AR(1) coefficients are replaced by the approximated values.


We show that asymptotically, and if n converges sufficiently fast to infinity, the new method is equivalent to the MLE of the beta distribution.
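A minimal sketch of this two-step procedure, assuming a Beta(p, q) mixture distribution on (0, 1); the helper names are ours, and `scipy.stats.beta.fit` (with location and scale fixed) carries out the numerical MLE of step two:

```python
import numpy as np
from scipy.stats import beta

def lag1_corr(x):
    """Serial correlation coefficient of lag 1, the proxy for the AR(1) coefficient."""
    x = x - x.mean()
    return (x[1:] @ x[:-1]) / (x @ x)

def estimate_mixture(panel):
    """Step 1: approximate each unobservable AR(1) coefficient by the lag-1
    serial correlation of the corresponding series.
    Step 2: fit Beta(p, q) to these proxies by maximum likelihood.

    `panel`: array of shape (N, n), N independent AR(1) series of length n."""
    a_hat = np.array([lag1_corr(x) for x in panel])
    a_hat = np.clip(a_hat, 1e-6, 1 - 1e-6)      # keep proxies inside (0, 1)
    p, q, _, _ = beta.fit(a_hat, floc=0, fscale=1)
    return p, q

# Example: panel of AR(1) series with Beta(2, 3) coefficients
rng = np.random.default_rng(0)
N, n = 500, 200
coeffs = rng.beta(2.0, 3.0, size=N)
panel = np.zeros((N, n))
for i, a in enumerate(coeffs):
    for t in range(1, n):
        panel[i, t] = a * panel[i, t - 1] + rng.standard_normal()
print(estimate_mixture(panel))
```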

In doing so, infinitely many AR(1) processes have to be handled simultaneously, while the corresponding coefficients can be arbitrarily close to one. Thus, for the asymptotic analysis, we have to find uniform bounds for the order of convergence of the mean squared error (MSE) of the serial correlation coefficient.

We will prove an appropriate proposition by means of an integral representation of the MSE based on results by White (1961) and Shenton and Johnson (1965).

However, this approach assumes that the initial values of the AR(1) processes are uniformly bounded, which excludes the case of stationarity. Nevertheless, the initial values can be chosen arbitrarily close to the stationary distribution and, more importantly, the processes are still asymptotically stationary, a notion that will be defined in the next section.

The new estimator can be used for two purposes. On the one hand, it provides a tool for analyzing panel AR(1) data where each single AR(1) process admits an individual coefficient, a situation that has not been considered in the literature yet. On the other hand, it can be used as an estimator for the long memory parameter of the aggregated process, since this parameter only depends on the mixture distribution. In that context, we should mention that the results of this chapter can be seen as a first step towards an estimation procedure for more general situations such as aggregation of volatility models. In chapter 6 we saw that in the case of long memory LARCH processes the proposed parameter estimator has a very slow rate of convergence, whereas in the short memory case the usual $\sqrt{n}$-rate is attained. Therefore a procedure similar to the one described in the present chapter, taking into account estimates for underlying short memory LARCH processes, could lead to more efficient estimators for the long memory parameter of an aggregated process. Compare the result by Giraitis et al. (2009) mentioned in section 3.2, and see also the concluding remarks in chapter 8.

This chapter is structured as follows. In section 7.2 we show that contemporaneous aggregation of asymptotically stationary AR(1) processes can lead to long memory in the aggregated series, just as in the stationary case. In section 7.3 we introduce the estimator of the parameters of the mixture distribution and present the asymptotic result, while in section 7.4 the finite sample properties are illustrated by a small simulation study. The more extensive study of the asymptotic properties of the serial correlation coefficient is given in section 7.5.


7.2 Aggregating asymptotically stationary AR(1) processes