
In this section we study a procedure for estimating the parameters α and β of the distribution of the AR(1) coefficients ϕ_i, based on the basic processes X_{i,t}. First note that, due to (A7.2), the squared coefficients ϕ_i² are beta distributed with parameters α and β. This can be seen by calculating the density of ϕ_i²:

f_{ϕ²}(x) = x^{α−1}(1 − x)^{β−1} / B(α, β),   0 < x < 1.

If the coefficients ϕ_i were observable, α and β could therefore be estimated by maximum likelihood, or equivalently by the roots of the equations

ψ(α̂) − ψ(α̂ + β̂) = (1/N) Σ_{i=1}^N ln(ϕ_i²),   (7.5)

ψ(β̂) − ψ(α̂ + β̂) = (1/N) Σ_{i=1}^N ln(1 − ϕ_i²),   (7.6)

where ψ(x) = (d/dx) ln(Γ(x)) denotes the so-called digamma function.

CHAPTER 7. AGGREGATION AND ESTIMATION 145

The idea is to replace the unobservable ϕ_i's by estimates ϕ̂_{i,n} based on observations X_{i,t}, t = 0, . . . , n, and therefore it is a rather difficult task to ensure that the terms ln(ϕ̂²_{i,n}) respectively ln(1 − ϕ̂²_{i,n}) have finite expected value, even though E|ln(ϕ_i²)| and E|ln(1 − ϕ_i²)| are finite. To guarantee integrability of all involved random variables we introduce a truncation parameter h = h(N, n) > 0 with h < 1/2, h → 0 as N, n → ∞, and define the truncated estimator ϕ̂_{i,n,h} = min{max{ϕ̂_{i,n}, h}, 1 − h} and θ̂_{N,n,h} = (α̂, β̂)^T by the equations

ψ(α̂) − ψ(α̂ + β̂) = (1/N) Σ_{i=1}^N ln(ϕ̂²_{i,n,h}),   (7.7)

ψ(β̂) − ψ(α̂ + β̂) = (1/N) Σ_{i=1}^N ln(1 − ϕ̂²_{i,n,h}).   (7.8)

We will show that asymptotically, that is, by letting N, n tend to infinity and h to zero, the estimator θ̂_{N,n,h} has the same distribution as the MLE of θ defined by (7.5) and (7.6). In this investigation, the asymptotic properties of ϕ̂_{i,n}

will play an important role. It is a well known result that, for fixed ϕ_i, the mean squared error (MSE) of the serial correlation coefficient is of order 1/n. However, since we deal with infinitely many ϕ_i, which may be arbitrarily close to one, it is not obvious that a uniform bound of order 1/n holds for the MSE of ϕ̂_{i,n} − ϕ_i. This result is formulated in the next proposition, whereas the rather extensive proof is given in section 7.5.
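The moment equations (7.7) and (7.8) have no closed-form solution, but they can be solved by a two-dimensional Newton iteration, since their Jacobian is exactly the matrix of trigamma differences that appears later in theorem 7.2. The following sketch is an illustration only, not code from the text: the digamma and trigamma functions are approximated by finite differences of `math.lgamma` (an implementation convenience, assuming no special-function library is available), and the solver is checked on simulated coefficients with ϕ_i² ~ Beta(2, 1.4).

```python
import math
import numpy as np

def digamma(x, eps=1e-6):
    # psi(x) = d/dx ln Gamma(x), via a central difference of math.lgamma
    return (math.lgamma(x + eps) - math.lgamma(x - eps)) / (2.0 * eps)

def trigamma(x, eps=1e-4):
    # psi_1(x) = d^2/dx^2 ln Gamma(x), via a second central difference
    return (math.lgamma(x + eps) - 2.0 * math.lgamma(x) + math.lgamma(x - eps)) / eps**2

def solve_moment_equations(phi_hat, h, a0=1.5, b0=1.5, steps=50):
    """Solve (7.7)-(7.8) for (alpha, beta) by Newton's method.

    phi_hat : array of (estimated) AR(1) coefficients
    h       : truncation parameter, 0 < h < 1/2
    """
    phi_t = np.clip(phi_hat, h, 1.0 - h)        # truncated estimator phi_hat_{i,n,h}
    m1 = np.mean(np.log(phi_t ** 2))            # right-hand side of (7.7)
    m2 = np.mean(np.log(1.0 - phi_t ** 2))      # right-hand side of (7.8)
    a, b = a0, b0
    for _ in range(steps):
        f1 = digamma(a) - digamma(a + b) - m1
        f2 = digamma(b) - digamma(a + b) - m2
        # Jacobian built from trigamma values (the matrix A(theta) of theorem 7.2)
        j11 = trigamma(a) - trigamma(a + b)
        j12 = -trigamma(a + b)
        j22 = trigamma(b) - trigamma(a + b)
        det = j11 * j22 - j12 * j12
        a = max(a - (j22 * f1 - j12 * f2) / det, 0.05)   # keep iterates positive
        b = max(b - (j11 * f2 - j12 * f1) / det, 0.05)
    return a, b

rng = np.random.default_rng(1)
phi = np.sqrt(rng.beta(2.0, 1.4, size=20000))   # phi_i^2 ~ Beta(2, 1.4), as in (A7.2)
a_hat, b_hat = solve_moment_equations(phi, h=0.005)
print(a_hat, b_hat)                             # should land near (2, 1.4)
```

With the true ϕ_i plugged in, this is the MLE of (7.5) and (7.6); with estimated and truncated coefficients it is the estimator θ̂_{N,n,h}.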

Proposition 7.2 Let (A7.1)-(A7.3) hold. Then E[(ϕ̂_{1,n} − ϕ₁)² | ϕ₁] ≤ C/n for every n ≥ 1, where C only depends on C₀ of assumption (A7.2), and thus E|ϕ̂_{1,n} − ϕ₁|² ≤ C/n for every n ≥ 1.
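The 1/n rate of proposition 7.2 is easy to see numerically for a fixed ϕ. The sketch below (illustrative only; ϕ̂_{1,n} is taken to be the usual lag-one serial correlation coefficient, and the sample sizes and replication count are arbitrary choices) estimates the MSE at two sample sizes and shows that quadrupling n roughly quarters the MSE.

```python
import numpy as np

def mse_of_phi_hat(phi, n, reps, rng):
    # simulate `reps` AR(1) paths with X_0 = 0 in parallel and compute the
    # MSE of the lag-one serial correlation coefficient
    x = np.zeros(reps)
    num = np.zeros(reps)
    den = np.zeros(reps)
    for _ in range(n):
        x_new = phi * x + rng.standard_normal(reps)
        num += x_new * x          # accumulates sum of X_t * X_{t-1}
        den += x * x              # accumulates sum of X_{t-1}^2
        x = x_new
    phi_hat = num / den
    return float(np.mean((phi_hat - phi) ** 2))

rng = np.random.default_rng(0)
m200 = mse_of_phi_hat(0.7, 200, 2000, rng)
m800 = mse_of_phi_hat(0.7, 800, 2000, rng)
print(m200, m800, m200 / m800)   # ratio roughly 4, consistent with MSE = O(1/n)
```

The point of the proposition, however, is that the constant C can be chosen uniformly in ϕ, which this fixed-ϕ experiment does not show.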

This result can now be used to derive the asymptotic distribution of the right hand side of equations (7.7) and (7.8):

Theorem 7.1 Let (A7.1)-(A7.3) hold and N, n → ∞, h → 0. Denote α∧β = min{α, β} and define

S_{N,n,h} = Σ_{i=1}^N ( ln(ϕ̂²_{i,n,h}) − E[ln(ϕ₁²)],  ln(1 − ϕ̂²_{i,n,h}) − E[ln(1 − ϕ₁²)] )^T.

(a) If 1/(h²n) → 0 and ln(h)²/N → 0, then N^{−1} S_{N,n,h} → 0 in L².

(b) If ln(h)²/N → 0, √N h^{α∧β} → 0 and √N/(h²n) → 0, then N^{−1/2} S_{N,n,h} →d N(0, Σ), where Σ denotes the covariance matrix of the vector (ln(ϕ₁²), ln(1 − ϕ₁²))^T.

Proof: We will concentrate on the asymptotic result for the second component of the two-dimensional vector S_{N,n,h}, which we denote by S^{(2)}_{N,n,h}. The bivariate limit theorem follows analogously by the Cramér-Wold device.

(a) Define [...]

(b) We first show that the central limit theorem holds for (1/N) Σ_{i=1}^N X_{i,n,h}. Indeed, the necessary condition for triangular arrays is fulfilled: from (a), we get σ²_{n,h} := var(X_{1,n,h}) → var(ln(1 − ϕ₁²)) > 0. Thus, for δ > 0, [...] It follows that Σ_{i=1}^N [...]

To complete the proof we have to show that b_{N,n,h} := √N [...]; a Taylor expansion of ln(1 − x²) up to second order yields

|b_{N,n,h}| ≤ √N [...]

from proposition 7.2 together with

Theorem 7.1 implies the following asymptotic result for θ̂_{N,n,h} defined by (7.7) and (7.8). Denote by Θ = [1, ᾱ] × [1, β̄] the parameter space, with 1 < ᾱ, β̄ < ∞.

Theorem 7.2 Let (A7.1)-(A7.3) hold and N, n → ∞, h → 0. Further, let the true parameter vector θ_o be an inner point of Θ.

(a) If 1/(h²n) → 0 and ln(h)²/N → 0, then there exists a sequence of solutions of (7.7) and (7.8) with θ̂_{N,n,h} → θ_o in probability.

(b) If ln(h)²/N → 0, √N h^{α∧β} → 0 and √N/(h²n) → 0, then

N^{1/2}(θ̂_{N,n,h} − θ_o) →d N(0, A^{−1}(θ_o)), where

A(θ) = (∂/∂θ) ( ψ(α) − ψ(α+β)
                ψ(β) − ψ(α+β) )

     = ( ψ₁(α) − ψ₁(α+β)    −ψ₁(α+β)
         −ψ₁(α+β)           ψ₁(β) − ψ₁(α+β) ).

Here, ψ₁(x) = (d²/dx²) ln(Γ(x)) denotes the trigamma function.
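The asymptotic covariance matrix A^{−1}(θ_o) can be evaluated directly from trigamma values. The sketch below (illustrative; the trigamma function is approximated by a second difference of `math.lgamma` purely for self-containedness) computes the asymptotic standard deviation of β̂, i.e. the square root of the (2,2) entry of A^{−1}(θ), for the two parameter settings used in the simulations of section 7.4.

```python
import math

def trigamma(x, eps=1e-4):
    # psi_1(x) = d^2/dx^2 ln Gamma(x), via a second central difference of lgamma
    return (math.lgamma(x + eps) - 2.0 * math.lgamma(x) + math.lgamma(x - eps)) / eps**2

def asymp_sd_beta(a, b):
    # A(theta) as in theorem 7.2; the asymptotic sd of beta-hat is the
    # square root of the (2,2) entry of its inverse
    j11 = trigamma(a) - trigamma(a + b)
    j12 = -trigamma(a + b)
    j22 = trigamma(b) - trigamma(a + b)
    det = j11 * j22 - j12 * j12
    return math.sqrt(j11 / det)

sd_14 = asymp_sd_beta(2.0, 1.4)
sd_18 = asymp_sd_beta(2.0, 1.8)
print(round(sd_14, 3), round(sd_18, 3))   # 1.828 and 2.402, matching table 7.1
```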

Proof: The proof follows almost exactly as in the case of maximum likelihood estimation for the beta distribution (see e.g. Lehmann and Casella 1998). Define

Φ_{N,n,h}(θ) = ( ψ(α) − ψ(α+β) − (1/N) Σ_{i=1}^N ln(ϕ̂²_{i,n,h})
                ψ(β) − ψ(α+β) − (1/N) Σ_{i=1}^N ln(1 − ϕ̂²_{i,n,h}) )

and

Φ(θ) = ( ψ(α) − ψ(α+β) − E[ln(ϕ₁²)]
         ψ(β) − ψ(α+β) − E[ln(1 − ϕ₁²)] ).

For (a), we have to show that Φ_{N,n,h}(θ) → Φ(θ) in probability uniformly in θ ∈ Θ, i.e. sup_{θ∈Θ} ‖Φ_{N,n,h}(θ) − Φ(θ)‖ → 0, and that Φ(θ) has a unique zero at θ = θ_o. The latter follows from standard properties of the beta distribution. Uniform convergence follows from the observation that

sup_{θ∈Θ} ‖Φ_{N,n,h}(θ) − Φ(θ)‖ = ‖N^{−1} S_{N,n,h}‖

and theorem 7.1(a). For (b), an application of the bivariate mean value theorem results in

0 = Φ_{N,n,h}(θ̂_{N,n,h}) = Φ_{N,n,h}(θ_o) + A(θ̃_{N,n,h})(θ̂_{N,n,h} − θ_o),

where ‖θ̃_{N,n,h} − θ_o‖ ≤ ‖θ̂_{N,n,h} − θ_o‖. Since θ̂_{N,n,h} is consistent and A(·) is a deterministic, continuous function, we get A(θ̃_{N,n,h}) → A(θ_o) in probability and, with probability tending to one, A(θ̃_{N,n,h}) is positive definite. This, together with Slutsky's theorem and theorem 7.1(b) applied to Φ_{N,n,h}(θ_o) = −N^{−1} S_{N,n,h}, implies that N^{1/2}(θ̂_{N,n,h} − θ_o) →d N(0, A^{−1}(θ_o) Σ A^{−1}(θ_o)). Finally, since A(θ) is the Fisher information matrix for the beta distribution, we have A(θ_o) = Σ.
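The final identity A(θ_o) = Σ says that the trigamma matrix equals the covariance matrix of the vector (ln(ϕ₁²), ln(1 − ϕ₁²))^T. This can be checked by simulation; the sketch below is an illustration only, with trigamma again approximated through `math.lgamma` and the sample size chosen arbitrarily.

```python
import math
import numpy as np

def trigamma(x, eps=1e-4):
    # psi_1(x) approximated by a second central difference of lgamma
    return (math.lgamma(x + eps) - 2.0 * math.lgamma(x) + math.lgamma(x - eps)) / eps**2

a, b = 2.0, 1.4
rng = np.random.default_rng(7)
u = rng.beta(a, b, size=500_000)            # u plays the role of phi_1^2
sample = np.vstack([np.log(u), np.log(1.0 - u)])
sigma = np.cov(sample)                      # empirical covariance matrix Sigma

# A(theta) as in theorem 7.2
A = np.array([[trigamma(a) - trigamma(a + b), -trigamma(a + b)],
              [-trigamma(a + b), trigamma(b) - trigamma(a + b)]])
print(np.max(np.abs(sigma - A)))            # small: A(theta) = Sigma
```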

Remark 1 A possible combination of the length of series n and the truncation parameter h with respect to the number of basic processes N would be

n = N^{1/2 + 1/(α∧β) + ǫ}  and  h = const · N^{−1/(2(α∧β))},

with a constant ǫ > 0. This means that n has to tend to infinity at a faster rate than N. In particular, if α ≥ β and β is close to 1 (and thus the long memory parameter d = 1 − β/2 is close to 1/2), then 1/2 + 1/β + ǫ is close to 3/2. On the other hand, if β is close to 2 (d ≈ 0), 1/2 + 1/β + ǫ is close to one.

Remark 2 The need for the conditions √N h^{α∧β} → 0 and √N/(h²n) → 0 in theorem 7.1(b) can be explained as follows. In the central limit theorem for S_{N,n,h}, the asymptotic bias lim B_{N,n,h}, with

B_{N,n,h} := (B^{(1)}, B^{(2)})^T := N^{−1/2} E[S_{N,n,h}],

arises as a result of the replacement of ϕ_i by ϕ̂_{i,n,h} in (7.7) and (7.8). This bias can be decomposed into a first component due to the mean squared error of the serial correlation coefficient ϕ̂_{i,n} and a second one owing to the truncation of ϕ̂_{i,n}. For instance, consider B^{(2)}: in the proof of theorem 7.1(b) it was shown that

|B^{(2)}| ≤ √N |E[ln(1 − ϕ̂²_{1,n,h})] − E[ln(1 − ϕ²_{1,h})]| + √N |E[ln(1 − ϕ²_{1,h})] − E[ln(1 − ϕ₁²)]|

        ≤ C √N E|ϕ̂_{1,n} − ϕ₁| + C′ (√N/h²) E[(ϕ̂_{1,n} − ϕ₁)²] + √N E[ (2h/(1 − ϕ₁²)) (1{ϕ₁ > 1 − h} + 1{ϕ₁ < h}) ]

        ≤ C √N/√n + C′ √N/(h²n) + C′′ √N h (h^{β−1} + h^{α}).

Here, ϕ_{1,h} := min{max{ϕ₁, h}, 1 − h} denotes the truncated value of ϕ₁, and C, C′, C′′ are finite constants. Analogously, one gets that

|B^{(1)}| ≤ C √N/√n + C′ √N/(h²n) + C′′ √N h (h^{α−1} + h^{β}),

and thus the mentioned conditions ensure that the bias is asymptotically negligible.


                      α = 2, β = 1.4                α = 2, β = 1.8
N           250     500     1000    2000     250     500     1000    2000
n           817     1894    4394    10196    340     707     1468    3051
mean        1.449   1.423   1.411   1.405    1.879   1.834   1.816   1.807
bias        0.049   0.023   0.011   0.005    0.079   0.034   0.016   0.007
bias·√N     0.783   0.506   0.353   0.219    1.263   0.753   0.512   0.315
sd          0.112   0.075   0.062   0.041    0.169   0.116   0.076   0.055
sd·√N       1.771   1.674   1.948   1.840    2.686   2.605   2.413   2.484
asymp. sd           1.828                            2.402

Table 7.1: For each panel of N independent random AR(1) processes with sample size n, 400 realizations have been simulated. The table gives the sample means, biases and standard deviations of the resulting 400 values of β̂. In the left block of the table the parameters of the random AR(1) coefficient are α = 2, β = 1.4, in the right block α = 2, β = 1.8. The asymptotic standard deviations in the last row are calculated by theorem 7.2.

7.4 Simulations

A small computer experiment has been performed to illustrate the finite sample properties of the estimator θ̂_{N,n,h}. For two different values of θ = (α, β)^T, we simulate a panel of random AR(1) processes as given in (7.1) such that assumptions (A7.1)-(A7.3) are fulfilled. In particular, we set X_{i,0} = 0 for i = 1, . . . , N, and the parameters of the distribution of ϕ_i are chosen to be α_o = 2 and β_o = 1.4 in the first case, and α_o = 2 and β_o = 1.8 in the second case. Thus, the long memory parameter is d_o = 1 − β_o/2 = 0.3, respectively d_o = 0.1. In both cases, the panel is simulated for different numbers of processes N = 250, 500, 1000 and 2000, with corresponding lengths of processes n = 817, 1894, 4394 and 10196 in the first case, respectively n = 340, 707, 1468 and 3051 in the second case.

Moreover, following remark 1, the respective values of the truncation parameter are h = 0.01 · N^{−1/(2β_o)}; also note that the values for n satisfy n > N^{1/2 + 1/β_o}. For the eight configurations (β_o = 1.4 respectively β_o = 1.8, and in each case four combinations of N, n and h), 400 realizations of the panel are simulated. For each simulated panel, the estimator θ̂_{N,n,h} = (α̂, β̂)^T is calculated. We concentrate on the results of the more interesting component β̂, since the long memory parameter d only depends on β_o and since the results for α̂ look similar.
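A much smaller version of this experiment can be sketched in a few lines. The configuration below (N = 400, n = 600, h = 0.02) is a hypothetical choice made only to keep the run fast, and the digamma/trigamma approximations and the Newton solver are implementation conveniences rather than code from the text.

```python
import math
import numpy as np

def digamma(x, eps=1e-6):
    return (math.lgamma(x + eps) - math.lgamma(x - eps)) / (2.0 * eps)

def trigamma(x, eps=1e-4):
    return (math.lgamma(x + eps) - 2.0 * math.lgamma(x) + math.lgamma(x - eps)) / eps**2

rng = np.random.default_rng(3)
N, n, h = 400, 600, 0.02
alpha_o, beta_o = 2.0, 1.8

phi = np.sqrt(rng.beta(alpha_o, beta_o, size=N))   # (A7.2): phi_i^2 ~ Beta(alpha, beta)

# simulate the panel with X_{i,0} = 0 and estimate each phi_i by the
# lag-one serial correlation coefficient, then truncate at [h, 1-h]
x = np.zeros(N)
num = np.zeros(N)
den = np.zeros(N)
for _ in range(n):
    x_new = phi * x + rng.standard_normal(N)
    num += x_new * x
    den += x * x
    x = x_new
phi_hat = np.clip(num / den, h, 1.0 - h)           # phi_hat_{i,n,h}

m1 = np.mean(np.log(phi_hat ** 2))                 # right-hand side of (7.7)
m2 = np.mean(np.log(1.0 - phi_hat ** 2))           # right-hand side of (7.8)

# Newton iteration for (7.7)-(7.8); the Jacobian is A(theta) from theorem 7.2
a, b = 1.5, 1.5
for _ in range(50):
    f1 = digamma(a) - digamma(a + b) - m1
    f2 = digamma(b) - digamma(a + b) - m2
    j11 = trigamma(a) - trigamma(a + b)
    j12 = -trigamma(a + b)
    j22 = trigamma(b) - trigamma(a + b)
    det = j11 * j22 - j12 * j12
    a = max(a - (j22 * f1 - j12 * f2) / det, 0.05)
    b = max(b - (j11 * f2 - j12 * f1) / det, 0.05)
print(a, b)   # rough recovery of (alpha_o, beta_o)
```

At this small scale the estimates are noticeably noisier than the figures in table 7.1, in line with the √N rates above.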

The sample means, biases and standard deviations of the simulated values of β̂ are summarized in table 7.1. One can see that the sample standard deviations,

[Figure 7.1: six log-log plots. (a) var of β̂, β_o = 1.4, fitted slope −0.923; (b) bias, β_o = 1.4, slope −1.103; (c) MSE, β_o = 1.4, slope −1.002; (d) var, β_o = 1.8, slope −1.090; (e) bias, β_o = 1.8, slope −1.157; (f) MSE, β_o = 1.8, slope −1.174.]

Figure 7.1: The biases, variances and MSE's corresponding to the values of table 7.1 are plotted against N in log-log scale. In the first row the parameters of the random AR(1) coefficient are α = 2, β = 1.4, in the second row α = 2, β = 1.8. In each plot, the least squares line is fitted and the respective slope is given.

normalized by √N, are close to the asymptotic standard deviations of 1.828 in the first case, respectively 2.402 in the second case. The latter values have been calculated by means of theorem 7.2 and the trigamma function. Moreover, the sample biases, again normalized by √N, decrease for increasing N, in accordance with the result of theorem 7.2 that, asymptotically, the bias is negligible. In figure 7.1, the sample bias, variance and MSE of β̂ are plotted against N in log-log scale (figures 7.1(a)-(c) for β_o = 1.4 and 7.1(d)-(f) for β_o = 1.8). In each plot the least squares line is fitted and the corresponding slopes are given, all of which are close to −1, as expected from theorem 7.2. Finally, figures 7.2(a)-(d) show the normal probability plots for β̂ in the case β_o = 1.4 and figures 7.2(e)-(h) for the case β_o = 1.8. Even for the small values of N, it seems that the distribution of β̂ can be approximated well by the normal distribution.


Figure 7.2: Normal probability plots of the simulated values of β̂ corresponding to table 7.1. The gray line indicates the true parameter value of 1.4 in figures 7.2(a)-(d) and 1.8 in figures 7.2(e)-(h).

7.5 Bias and MSE of the serial correlation