Yule-Walker type estimators in GARCH(1,1) models: Asymptotic normality and bootstrap
Gisela Maercker, Martin Moser
Abstract
We investigate GARCH(1,1) processes and first prove their stability. Using the representation of the squared GARCH model as an ARMA model, we then consider Yule-Walker type estimators for the parameters of the GARCH(1,1) model and derive their asymptotic normality. We use a residual bootstrap to define bootstrap estimators for the Yule-Walker estimates and prove the consistency of this bootstrap method. Simulation results demonstrate the small-sample behaviour of the bootstrap procedure.
1 Introduction
Many time series exhibit non-constant conditional variance (conditional heteroskedasticity). Nonlinear processes capable of modelling such volatility have attracted particular interest in time series analysis, especially in econometrics.
Conditional heteroskedasticity can be modelled by processes of the form
    $X_t = \sigma_t \varepsilon_t, \qquad t = 0, \pm 1, \pm 2, \dots,$   (1.1)

where the innovations $\{\varepsilon_t\}$ are independent identically distributed (i.i.d.) random variables with mean zero and unit variance, and the volatility $\sigma_t$ describes the change of the (conditional) variance.
In financial time series such as stock returns or foreign exchange rates, volatility clustering has been observed for a long time, i.e. periods of large price changes are followed by periods of small price changes. This phenomenon can be modelled by the autoregressive conditional heteroskedastic (ARCH) models introduced by Engle (1982), where the conditional variance $\sigma_t^2$ is a linear function of the squared past observations.
Bollerslev (1986) proposed the generalized ARCH or GARCH model by also including lagged values of $\sigma_t^2$ in the conditional variance equation. The GARCH($p$,$q$) model is defined by (1.1) with

    $\sigma_t^2 = \alpha_0 + \sum_{j=1}^{p} \alpha_j X_{t-j}^2 + \sum_{i=1}^{q} \beta_i \sigma_{t-i}^2, \qquad \alpha_0 > 0,\ \alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q \ge 0,\ p > 0,\ q > 0.$

GARCH modelling allows a more flexible lag structure than ARCH models and often permits a more parsimonious parametrization. The GARCH(1,1) model in particular has served as an appropriate model in many applications.
Over the past years, many semiparametric and nonparametric approaches to the ARCH model have been studied. For a variety of other extensions and applications of the ARCH model, we refer to the survey article by Bollerslev, Chou and Kroner (1992) and to Engle (1995).

In the present work we investigate parameter estimation and the bootstrap in GARCH(1,1) models. Rewriting the GARCH(1,1) model, it becomes obvious that the squared process can be represented as an ARMA(1,1) (autoregressive moving-average) process. Accordingly, Yule-Walker type estimators for the parameters of GARCH(1,1) processes can be defined (Bollerslev (1986)). We prove stability of the GARCH(1,1) model and then derive asymptotic normality of the Yule-Walker estimators.
The bootstrap is a method for estimating or approximating the distribution of a statistic and its characteristics based on a resampling of the observed data. Since its introduction by Efron for models with i.i.d. observations, there have been many applications and extensions of the bootstrap principle, also in the case of dependent data. Bootstrap methods for, e.g., nonparametric ARCH models have been studied by Franke, Kreiss and Mammen (1997), but they do not cover the GARCH case. A wild bootstrap method for quasi-maximum likelihood estimators of GARCH(1,1) models is proposed in Maercker (1998).

We use a residual bootstrap to define bootstrap estimators for the Yule-Walker estimates and prove the consistency of the bootstrap procedure. Some simulation results demonstrate the small-sample behaviour of the bootstrap method.
The paper is organized as follows. In Section 2 we state sufficient conditions for the stability of GARCH(1,1) models. In Section 3 we define Yule-Walker type estimators for the parameters of GARCH(1,1) processes and derive asymptotic normality of these estimators. In Section 4 we construct bootstrap estimators for the Yule-Walker estimators. We show that this bootstrap method works in the sense that it is consistent. In Section 5 some simulation results for the bootstrap method are presented. All proofs are deferred to the Appendix. Some additional tools and notation concerning Markov chain theory and mixing which are used in the proofs are also provided in the Appendix.
2 Stability of the GARCH(1,1) model
Assume that we are given observations $X_1, \dots, X_n$ from the heteroskedastic model

    $X_t = \sigma_t \varepsilon_t, \qquad t \in \mathbb{Z},$   (2.2)

with innovations

    $\{\varepsilon_t\}$ i.i.d., $\quad E\varepsilon_0 = 0, \quad E\varepsilon_0^2 = 1,$   (2.3)

and conditional variance

    $\sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2, \qquad t \in \mathbb{Z}, \quad$ where $\omega > 0$ and $\alpha, \beta \ge 0$.   (2.4)

The process defined by (2.2) and (2.4) is called a GARCH(1,1) (generalized autoregressive conditional heteroskedastic) process (Bollerslev (1986)).
Combining (2.4) and (2.2), the conditional variance may be written as

    $\sigma_t^2 = \omega + \sigma_{t-1}^2(\beta + \alpha \varepsilon_{t-1}^2).$   (2.5)

Iterating equations (2.4) and (2.5), respectively, yields for $h \ge 1$

    $\sigma_t^2 = \omega \sum_{k=0}^{h-1} \beta^k + \alpha \sum_{k=0}^{h-1} \beta^k X_{t-1-k}^2 + \beta^h \sigma_{t-h}^2,$   (2.6)

    $\sigma_t^2 = \omega \sum_{k=0}^{h-1} \prod_{i=1}^{k} (\beta + \alpha \varepsilon_{t-i}^2) + \sigma_{t-h}^2 \prod_{i=1}^{h} (\beta + \alpha \varepsilon_{t-i}^2),$   (2.7)

where, as usual, empty products are set equal to one.
We make the following stability assumption on the model.
Assumption S
(S1) $\alpha + \beta < 1$.
(S2) The distribution $G$ of $\varepsilon_0$ has a Lebesgue density $g$ which is positive and continuous.
By Jensen's inequality, (S1) implies

    $E\big[\ln(\beta + \alpha \varepsilon_0^2)\big] < 0.$   (2.8)

As is shown in Nelson (1990), condition (2.8) is necessary and sufficient for the existence of a unique stationary solution of (2.5), which is then given by the infinite series

    $\sigma_t^2 = \omega \sum_{k=0}^{\infty} \prod_{i=1}^{k} (\beta + \alpha \varepsilon_{t-i}^2), \qquad t \in \mathbb{Z}.$   (2.9)

In the following we will assume that $\{\sigma_t^2\}$ is the stationary solution of (2.5) and hence may be represented by (2.9). In particular, as $E\varepsilon_t^2 = 1$, we have

    $\mu := E X_t^2 = E \sigma_t^2 = \dfrac{\omega}{1 - (\alpha+\beta)} < \infty.$   (2.10)
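A minimal simulation can illustrate the stationarity statement: for a long path the sample second moment of $X_t$ settles near $\mu = \omega/(1-(\alpha+\beta))$ from (2.10). The sketch below is our own illustration, assuming standard normal innovations and the parameter values used in Section 5; the burn-in length is an arbitrary choice.

```python
import numpy as np

# Simulate a GARCH(1,1) path (2.2)-(2.4) with standard normal innovations and
# compare the sample mean of X_t^2 with mu = omega/(1-(alpha+beta)) from (2.10).
rng = np.random.default_rng(0)
omega, alpha, beta = 0.1, 0.1, 0.4   # (alpha, beta, omega) = (0.1, 0.4, 0.1) as in Section 5

def simulate_garch11(n, omega, alpha, beta, rng, burn=500):
    """Return (X_t, sigma_t^2) after discarding a burn-in stretch."""
    eps = rng.standard_normal(n + burn)
    sig2 = np.empty(n + burn)
    x = np.empty(n + burn)
    s2 = omega / (1.0 - alpha - beta)     # start at the stationary variance
    for t in range(n + burn):
        sig2[t] = s2
        x[t] = np.sqrt(s2) * eps[t]
        s2 = omega + alpha * x[t] ** 2 + beta * s2
    return x[burn:], sig2[burn:]

x, sig2 = simulate_garch11(100_000, omega, alpha, beta, rng)
mu = omega / (1.0 - (alpha + beta))
print(np.mean(x ** 2), mu)   # the two should be close for a long path
```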
Furthermore, $E \sum_{h=0}^{\infty} \beta^h \sigma_{t-h}^2 = \mu/(1-\beta) < \infty$ implies $\beta^h \sigma_{t-h}^2 \to 0$ as $h \to \infty$ almost surely; thus (2.6) may be extended to

    $\sigma_t^2 = \dfrac{\omega}{1-\beta} + \alpha \sum_{k=0}^{\infty} \beta^k X_{t-1-k}^2, \qquad t \in \mathbb{Z}.$

Set $\varphi(x,s) = (\omega + \alpha x^2 + \beta s^2)^{1/2}$. Then

    $Y_t := (X_t, \sigma_t)' = \varphi(Y_{t-1})\,(\varepsilon_t, 1)', \qquad t \in \mathbb{Z},$

is a bivariate Markov process with state space $\mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$. With the help of a drift criterion we will establish geometric ergodicity and absolute regularity for this process.
Whereas stability of ARCH processes was investigated before by, e.g., Guegan and Diebolt (1992), Doukhan (1994), and Borkovec and Klüppelberg (1998), those results do not cover the GARCH case.

For a definition and discussion of geometric ergodicity and absolute regularity we refer to the Appendix.
Lemma 2.1
Let Assumption S hold. Then the process $\{Y_t\}$ is geometrically ergodic and absolutely regular. Furthermore, there exist constants $c > 0$ and $\rho > 1$ such that the $\beta$-mixing coefficients of $\{Y_t\}$ satisfy $\beta_k \le c\,\rho^{-k}$, $k \in \mathbb{N}$.

We conclude this section with an ARMA representation of the squared process $\{X_t^2\}$ (cf. Bollerslev (1986)) which will be used frequently in the sequel. Set

    $\eta_t = X_t^2 - \sigma_t^2 = \sigma_t^2(\varepsilon_t^2 - 1).$

Then, by (2.4), we have

    $X_t^2 = \omega + (\alpha+\beta) X_{t-1}^2 - \beta \eta_{t-1} + \eta_t.$   (2.11)

Therefore $\{X_t^2\}$ is an ARMA(1,1) process with parameters $\alpha+\beta$ and $-\beta$ and innovations $\{\eta_t\}$. Defining $\mathcal{F}_t$ as the $\sigma$-field generated by $\{\varepsilon_s : s \le t\}$, we note $E[\eta_t \mid \mathcal{F}_{t-1}] = \sigma_t^2\, E[\varepsilon_t^2 - 1] = 0$; thus the innovations $\{\eta_t\}$ form a martingale difference sequence.
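The ARMA(1,1) structure can be checked numerically. The sketch below, with our own simulator and the Section 5 parameter values, verifies on a long path that $X_t$ itself is serially uncorrelated while the autocovariances of $X_t^2$ decay at rate $\alpha+\beta$ from lag one onwards, cf. (3.15) below.

```python
import numpy as np

# Check the ARMA(1,1) structure (2.11): X_t is white noise, while the
# autocovariances of X_t^2 satisfy gamma_h = (alpha+beta) * gamma_{h-1}, h >= 2.
rng = np.random.default_rng(1)
omega, alpha, beta = 0.1, 0.1, 0.4
n = 500_000
eps = rng.standard_normal(n)
x = np.empty(n)
s2 = omega / (1 - alpha - beta)          # start in the stationary regime
for t in range(n):
    x[t] = np.sqrt(s2) * eps[t]
    s2 = omega + alpha * x[t] ** 2 + beta * s2

def acov(z, h):
    """Sample autocovariance of z at lag h."""
    zc = z - z.mean()
    return np.mean(zc[: len(z) - h] * zc[h:])

corr_x = acov(x, 1) / acov(x, 0)           # ~ 0: X_t itself is uncorrelated
ratio = acov(x ** 2, 2) / acov(x ** 2, 1)  # ~ alpha + beta = 0.5
print(corr_x, ratio)
```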
3 Yule-Walker type estimators: definition and asymptotic normality
Consider the centered squared GARCH process $\{X_t^2 - \mu\}$ in the ARMA representation

    $X_t^2 - \mu - (\alpha+\beta)(X_{t-1}^2 - \mu) = \eta_t - \beta \eta_{t-1}, \qquad \eta_t = \sigma_t^2(\varepsilon_t^2 - 1),$   (3.12)

derived from (2.11). As is well established for ARMA models, the empirical autocovariances of the process can be used to obtain Yule-Walker (YW) type estimators for the parameters $\alpha+\beta$, $\beta$, and $\omega$, respectively. Observe that the squared process $\{X_t^2\}$ exhibits autocorrelation whereas the process itself is not correlated over time.

Let us assume that $EX_0^4 < \infty$. Conditions for the existence of moments are given in Bollerslev (1986) and Nelson (1990). As we will need even more stringent conditions for the proof of asymptotic normality of the YW type estimators, the discussion of those conditions is postponed; see Remark 3.2 below.

Set $\sigma_\eta^2 = E\eta_t^2$. Recalling $E[\eta_t \mid \mathcal{F}_{t-1}] = 0$, we note that $E\eta_t X_t^2 = \sigma_\eta^2$, $E\eta_t X_{t+1}^2 = \alpha \sigma_\eta^2$, and $E\eta_t X_s^2 = 0$ for $s < t$. For the derivation of suitable identities involving the covariances $\gamma_h = \mathrm{Cov}(X_0^2, X_h^2)$, $h \ge 0$, we now proceed in the usual way, see also Bollerslev (1986, 1988), by multiplying both sides of (3.12) by $X_{t-h}^2$, $h = 0, 1, \dots$, and computing expectations. This yields the following identities:

    $\gamma_0 - (\alpha+\beta)\gamma_1 = (1 - \alpha\beta)\,\sigma_\eta^2,$   (3.13)
    $\gamma_1 - (\alpha+\beta)\gamma_0 = -\beta\,\sigma_\eta^2,$   (3.14)
    $\gamma_h - (\alpha+\beta)\gamma_{h-1} = 0, \qquad h \ge 2.$   (3.15)
Elimination of $\sigma_\eta^2$ gives the system

    $\alpha + \beta = \dfrac{\gamma_2}{\gamma_1},$   (3.16)

    $\beta^{-1}(1 - \alpha\beta) = \dfrac{(\alpha+\beta)\gamma_1 - \gamma_0}{\gamma_1 - (\alpha+\beta)\gamma_0} = \dfrac{\gamma_2 - \gamma_0}{\gamma_1 - (\alpha+\beta)\gamma_0}.$   (3.17)

Using the empirical moments $\hat\mu = \frac{1}{n}\sum_{t=1}^{n} X_t^2$ and $\hat\gamma_h = \frac{1}{n}\sum_{t=1}^{n-h} (X_t^2 - \hat\mu)(X_{t+h}^2 - \hat\mu)$, we get the following YW type estimators:

    $\widehat{(\alpha+\beta)}_n = \dfrac{\hat\gamma_2}{\hat\gamma_1},$   (3.18)

    $\widehat{(\beta^{-1}(1-\alpha\beta))}_n = \dfrac{\hat\gamma_2 - \hat\gamma_0}{\hat\gamma_1 - \widehat{(\alpha+\beta)}_n\,\hat\gamma_0},$   (3.19)

    $\hat\omega_n = \hat\mu\,\big(1 - \widehat{(\alpha+\beta)}_n\big).$   (3.20)
In order to derive estimators $\hat\alpha_n$ and $\hat\beta_n$ of $\alpha$ and $\beta$, we set

    $\hat\beta_n^{-1} + \hat\beta_n = \widehat{(\alpha+\beta)}_n + \widehat{(\beta^{-1}(1-\alpha\beta))}_n.$   (3.21)

Denoting the right-hand side of (3.21) by $\hat c_n$, we obtain $\hat\beta_n^2 - \hat c_n \hat\beta_n + 1 = 0$. Hence, if $\hat c_n \ge 2$, we set

    $\hat\beta_n = \hat c_n/2 - \sqrt{\hat c_n^2/4 - 1},$

so that $0 < \hat\beta_n \le 1$, and $\hat\beta_n < 1$ if $\hat c_n > 2$. In practice it might happen that $\hat c_n < 2$; in that case set $\hat\beta_n = 0$. But by construction and the ergodic theorem $\hat c_n$ is a consistent estimate of $\beta^{-1} + \beta$, and therefore, almost surely, $\hat c_n > 2$ for sufficiently large $n$. Finally we define

    $\hat\alpha_n = \widehat{(\alpha+\beta)}_n - \hat\beta_n.$

Again, by the ergodic theorem, $\hat\theta_n = (\hat\alpha_n, \hat\beta_n, \hat\omega_n)'$ is a consistent estimator of $\theta = (\alpha, \beta, \omega)'$.
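The estimation steps (3.18)-(3.21) can be sketched compactly in code. Function and variable names below are our own illustration, not the authors' implementation; the usage part simulates a path under the Section 5 parameter values with standard normal innovations.

```python
import numpy as np

# Sketch of the Yule-Walker type estimators: alpha+beta from gamma_2/gamma_1
# (3.18), the auxiliary quantity (1-alpha*beta)/beta (3.19), beta from the
# quadratic in (3.21), then alpha and omega (3.20).
def yule_walker_garch11(x):
    x2 = x ** 2
    mu_hat = x2.mean()
    c = x2 - mu_hat
    n = len(x)
    g0 = np.mean(c * c)                        # gamma-hat_0
    g1 = np.sum(c[:-1] * c[1:]) / n            # gamma-hat_1
    g2 = np.sum(c[:-2] * c[2:]) / n            # gamma-hat_2
    ab = g2 / g1                               # (alpha+beta)-hat, (3.18)
    bb = (g2 - g0) / (g1 - ab * g0)            # estimates (1-alpha*beta)/beta, (3.19)
    c_n = ab + bb                              # estimates beta + 1/beta, cf. (3.21)
    beta = 0.0 if c_n < 2 else c_n / 2 - np.sqrt(c_n ** 2 / 4 - 1)
    return ab - beta, beta, mu_hat * (1 - ab)  # (alpha-hat, beta-hat, omega-hat)

# usage on a simulated GARCH(1,1) path with standard normal innovations
rng = np.random.default_rng(2)
omega0, alpha0, beta0 = 0.1, 0.1, 0.4
n = 200_000
eps = rng.standard_normal(n)
x = np.empty(n)
s2 = omega0 / (1 - alpha0 - beta0)
for t in range(n):
    x[t] = np.sqrt(s2) * eps[t]
    s2 = omega0 + alpha0 * x[t] ** 2 + beta0 * s2
print(yule_walker_garch11(x))   # approaches (alpha0, beta0, omega0) as n grows
```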
Next, under additional moment assumptions, the asymptotic normality of $\hat\theta_n$ will be shown. We may use this to construct confidence sets for $\theta$. However, the normal distribution is only an approximation to the exact distribution of $\hat\theta_n$. An alternative approach, which often yields a better approximation, is the bootstrap. Bootstrap confidence intervals are obtained by replacing the unknown distribution with its bootstrap estimator. We shall introduce a bootstrap method and study the consistency of this procedure.

Before doing so it should be mentioned that, in GARCH as well as in ARMA models, the YW estimators are less efficient than maximum likelihood (ML) estimators or, depending on the error distribution, quasi-maximum likelihood (QML) estimators, which are commonly used in practice. Indeed, simulation experiments underline this fact. Under normality of the innovations, ML estimators outperform the YW estimators. This is not necessarily true for QML estimators in the case of non-normal innovation distributions.

QML estimates are found by an iterative procedure in which the YW estimates may be used as initial estimates. In contrast, in the YW type estimation procedure the observed data are used in a more direct way. This estimation procedure will be imitated by an appropriate bootstrap technique. A bootstrap method for QML estimators is discussed in Maercker (1997, 1998).
For the derivation of the asymptotic normality of $\hat\theta_n$ we shall make use of a central limit theorem (CLT) for strongly mixing processes.
Theorem 3.1
Let Assumption S hold and suppose $E|X_0|^{8+\delta} < \infty$ for some $\delta > 0$. Then

    $\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{\;D\;} N\big((0,0,0)',\, \Sigma\big)$   (3.22)

if $\Sigma$ is positive definite, where $\Sigma = D_2 D_1 \tilde\Sigma D_1' D_2'$,

    $D_1 = \begin{pmatrix} 1 & 0 & 0 \\ -\dfrac{(1-\alpha\beta)\gamma_0}{\beta^2\sigma_\eta^2} & 1 & 0 \\ -\mu & 0 & 1 \end{pmatrix}, \qquad D_2 = \begin{pmatrix} 1-\kappa & -\kappa & 0 \\ \kappa & \kappa & 0 \\ 0 & 0 & 1 \end{pmatrix},$

    $\kappa = \dfrac{1}{2} - \dfrac{\beta^{-1}+\beta}{4\sqrt{(\beta^{-1}+\beta)^2/4 - 1}},$

and $\tilde\Sigma = E[\eta_0^2 Z_0 Z_0']$ with

    $Z_{t1} = \gamma_1^{-1}\big[-\beta X_{t-1}^2 + X_{t-2}^2 - (1-\beta)\mu\big],$

    $Z_{t2} = \big(\gamma_1 - (\alpha+\beta)\gamma_0\big)^{-1}\big[\beta(1-\beta^2)\eta_{t-1} + \big(\beta^2(\alpha+\beta) + \alpha - \beta - \beta^{-1}\big)X_{t-1}^2 + X_{t-2}^2 + \beta^{-1}(1-\alpha\beta)(1-\beta)\mu + \beta\omega(1+\beta)\big],$

    $Z_{t3} = 1 - \beta.$
Remark 3.2
As $\hat\theta_n$ is based on the empirical autocovariances $\hat\gamma_h$, i.e. on lagged empirical fourth moments of $X_t$, the need for the rather stringent moment condition $E|X_0|^{8+\delta} < \infty$ in Theorem 3.1 becomes obvious. Under Assumption S expansion (2.9) holds and, by Minkowski's inequality, for any $p > 0$ a sufficient condition for $E|X_0|^{2p} < \infty$ is given by $E(\beta + \alpha\varepsilon_0^2)^p < 1$. This condition is also necessary, see Nelson (1990). For the case of normally distributed innovations $\varepsilon_t$, the restrictions on the parameter space implied by $EX_0^8 < \infty$ and $EX_0^{10} < \infty$ are, among others, illustrated in Figure 3.1 of Bollerslev (1986).

4 The bootstrap procedure
We now discuss a bootstrap method for estimating the distribution of $\sqrt{n}(\hat\theta_n - \theta)$. We use a residual bootstrap to construct bootstrap estimators. It will be shown that the (conditional) distribution of these bootstrap estimators converges in probability to the same asymptotic distribution as given in Theorem 3.1 for the original estimators; that is, the bootstrap procedure is (weakly) consistent.
Given a sample $X_1, \dots, X_n$, the bootstrap process $\{X_t^*\}$ will be of the form

    $X_t^* = \sigma_t^* \varepsilon_t^*, \qquad \sigma_t^{*2} = \hat\omega + \hat\alpha X_{t-1}^{*2} + \hat\beta \sigma_{t-1}^{*2}, \qquad t \in \mathbb{Z},$   (4.23)

    $\{\varepsilon_t^*\}$ i.i.d. $\sim G^*$, $\quad E^*\varepsilon_0^* = 0, \quad E^*\varepsilon_0^{*2} = 1,$   (4.24)

where the distribution $G^*$ of $\varepsilon_0^*$ is an estimate of the distribution $G$ of $\varepsilon_0$ and $E^*$ denotes the conditional expectation $E[\,\cdot \mid X_1, \dots, X_n]$. The distribution $\mathcal{L}(\sqrt{n}(\hat\theta - \theta))$ will then be approximated by the (conditional) distribution $\mathcal{L}^*(\sqrt{n}(\hat\theta^* - \hat\theta))$, where $\hat\theta^* = (\hat\alpha^*, \hat\beta^*, \hat\omega^*)'$ is calculated in the same way as $\hat\theta$, with $X_1, \dots, X_n$ replaced by $X_1^*, \dots, X_n^*$. For notational simplicity, here and later the index $n$ indicating the dependence of the estimators and the bootstrap process on the number of observations will be omitted.
In detail, the construction of $\{X_t^*\}$ consists of the following steps. Compute the Yule-Walker estimate $\hat\theta$ as described in Section 3. Set $\hat\sigma_0^2 = \hat\mu$ and define

    $\hat\sigma_t^2 = \hat\omega + \hat\alpha X_{t-1}^2 + \hat\beta \hat\sigma_{t-1}^2, \qquad t = 1, \dots, n,$

or equivalently,

    $\hat\sigma_t^2 = \hat\omega \sum_{k=0}^{t-1} \hat\beta^k + \hat\alpha \sum_{k=0}^{t-1} \hat\beta^k X_{t-1-k}^2 + \hat\beta^t \hat\mu, \qquad t = 1, \dots, n.$   (4.25)

Calculate empirical residuals

    $\hat\varepsilon_t = X_t/\hat\sigma_t, \qquad t = 1, \dots, n,$

and let $\hat G(x) = \frac{1}{n}\sum_{t=1}^{n} \mathbf{1}\{\hat\varepsilon_t \le x\}$ denote their empirical distribution. In view of Assumption S, smooth $\hat G$ by convolution and set $\tilde G = \hat G * N(0, h^2)$, where $h = n^{-1/5}$. Define the distribution $G^*$ of $\varepsilon_0^*$ as the standardized form of $\tilde G$, i.e. $G^*(x) = \tilde G(\sigma_{\tilde G}\, x + \mu_{\tilde G})$, where $\mu_{\tilde G} = \frac{1}{n}\sum_{t=1}^{n} \hat\varepsilon_t$ and $\sigma_{\tilde G}^2 = \frac{1}{n}\sum_{t=1}^{n} (\hat\varepsilon_t - \mu_{\tilde G})^2 + h^2$ are the mean and variance of $\tilde G$, respectively. As $\hat\theta \to \theta$ almost surely, we may assume

    $\hat\omega, \hat\alpha, \hat\beta > 0 \quad$ and $\quad \hat\alpha + \hat\beta < 1.$   (4.26)

Finally, define the bootstrap GARCH process $\{X_t^*\}$ as the stationary solution of (4.23).
In particular, we have, as in (2.9),

    $\sigma_t^{*2} = \hat\omega \sum_{k=0}^{\infty} \prod_{i=1}^{k} \big(\hat\beta + \hat\alpha\, \varepsilon_{t-i}^{*2}\big), \qquad t \in \mathbb{Z}.$   (4.27)

Furthermore, by construction and (4.26), $\{X_t^*\}$ fulfills Assumption S. Hence, the conclusions of Lemma 2.1 apply and $Y_t^* = (X_t^*, \sigma_t^*)'$ is a geometrically ergodic and absolutely regular process with exponential decay of the mixing coefficients.
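The construction above can be sketched in code. The following is a minimal illustration under the assumptions that the supplied fitted parameters already satisfy (4.26) and that $\hat\theta = (\hat\alpha, \hat\beta, \hat\omega)$ comes from some estimator of Section 3; sampling from the convolution $\tilde G = \hat G * N(0, h^2)$ is implemented by adding $h \cdot N(0,1)$ noise to resampled residuals, which is equivalent in distribution. All names are ours, not the paper's.

```python
import numpy as np

# Sketch of the residual bootstrap: fitted volatilities (4.25), standardized
# residuals, resampling from the smoothed law G-tilde, standardization to G*,
# and regeneration of GARCH(1,1) paths from the fitted model (4.23).
def bootstrap_paths(x, theta_hat, n_boot, rng):
    alpha, beta, omega = theta_hat            # assumed to satisfy (4.26)
    n = len(x)
    sig2 = np.empty(n)
    sig2[0] = np.mean(x ** 2)                 # sigma-hat^2_0 = mu-hat
    for t in range(1, n):
        sig2[t] = omega + alpha * x[t - 1] ** 2 + beta * sig2[t - 1]
    resid = x / np.sqrt(sig2)                 # empirical residuals
    h = n ** (-1 / 5)                         # bandwidth
    m = resid.mean()
    s = np.sqrt(np.mean((resid - m) ** 2) + h ** 2)  # mean/std of G-tilde
    paths = np.empty((n_boot, n))
    for b in range(n_boot):
        draw = rng.choice(resid, size=n) + h * rng.standard_normal(n)
        eps_star = (draw - m) / s             # E* eps* = 0, E* eps*^2 = 1
        s2 = omega / (1 - alpha - beta)       # start at fitted stationary variance
        for t in range(n):
            paths[b, t] = np.sqrt(s2) * eps_star[t]
            s2 = omega + alpha * paths[b, t] ** 2 + beta * s2
    return paths

# usage: bootstrap five paths from a simulated sample; for brevity the true
# parameters stand in for a fitted theta-hat
rng = np.random.default_rng(4)
n = 2000
omega0, alpha0, beta0 = 0.1, 0.1, 0.4
eps = rng.standard_normal(n)
x = np.empty(n)
s2 = omega0 / (1 - alpha0 - beta0)
for t in range(n):
    x[t] = np.sqrt(s2) * eps[t]
    s2 = omega0 + alpha0 * x[t] ** 2 + beta0 * s2
paths = bootstrap_paths(x, (alpha0, beta0, omega0), n_boot=5, rng=rng)
print(paths.shape)   # (5, 2000)
```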
The density $g^*$ of the bootstrap innovations is given by

    $g^*(x) = \dfrac{\sigma_{\tilde G}}{n h} \sum_{t=1}^{n} \varphi\!\left(\dfrac{\sigma_{\tilde G}\, x + \mu_{\tilde G} - \hat\varepsilon_t}{h}\right),$

where $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. Hence, $g^*$ may be understood as a standardized kernel estimate of $g$ with kernel $\varphi$ and bandwidth $h$. We have chosen $h = n^{-1/5}$ as the rate common in kernel smoothing.
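As a quick numerical sanity check of this formula, one can evaluate $g^*$ on a grid and verify that it integrates to one and is centred, as the standardization requires. The residuals below are simulated stand-ins for the empirical residuals $\hat\varepsilon_t$ of this section.

```python
import numpy as np

# Evaluate the standardized kernel density g* on a grid and check that it has
# total mass ~1 and mean ~0 (the standardization enforces E* eps* = 0).
rng = np.random.default_rng(6)
n = 500
resid = rng.standard_normal(n)            # placeholder for eps-hat_1, ..., eps-hat_n
h = n ** (-1 / 5)                         # bandwidth as chosen above
m_g = resid.mean()                        # mean of G-tilde
s_g = np.sqrt(np.mean((resid - m_g) ** 2) + h ** 2)   # std of G-tilde

def g_star(x):
    """g*(x) = sigma/(n h) * sum_t phi((sigma*x + mu - resid_t)/h)."""
    u = (s_g * x[None, :] + m_g - resid[:, None]) / h
    return s_g / (n * h) * (np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)).sum(axis=0)

grid = np.linspace(-8.0, 8.0, 4001)
dx = grid[1] - grid[0]
mass = g_star(grid).sum() * dx
mean = (grid * g_star(grid)).sum() * dx
print(mass, mean)   # close to 1 and 0
```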
For the consistency proof of the bootstrap proposal we need the fact that, at least on average, $\hat\sigma_t^2$ is a good estimate of $\sigma_t^2$, as well as the consistency of $g^*$ and of the moments of the bootstrap process. Let $\|f\|_\infty = \sup_{x \in \mathbb{R}} |f(x)|$ for $f : \mathbb{R} \to \mathbb{R}$.
Lemma 4.1
Let Assumption S hold and suppose $E|X_0|^{2p} < \infty$ for some $p > 4$.

(a) For any $r \ge 1$,

    $\dfrac{1}{n} \sum_{t=1}^{n} \big|\hat\sigma_t^2 - \sigma_t^2\big|^r = O_P\big(n^{-(p \wedge r)/2}\big).$

(b) $\mu_{\tilde G} = O_P(n^{-1/2})$, $\quad \sigma_{\tilde G}^2 - 1 = O_P(n^{-1/2})$.

(c) If $g$ is uniformly continuous, then $\|g^* - g\|_\infty = o_P(1)$.

(d) For $q \in (0, 2p)$, $k_i \in \{0, 2\}$ and $t_i \in \mathbb{Z}$, $i = 1, \dots, 4$,

    $E^*|\varepsilon_0^*|^q \to E|\varepsilon_0|^q, \qquad E^*\prod_{i=1}^{4} \sigma_{t_i}^{*k_i} \to E\prod_{i=1}^{4} \sigma_{t_i}^{k_i}, \qquad E^*\prod_{i=1}^{4} X_{t_i}^{*k_i} \to E\prod_{i=1}^{4} X_{t_i}^{k_i}$

in probability. Furthermore, there is a constant $c > 0$ such that for any subsequence $(k) \subset \mathbb{N}$ there exists a subsequence $(k_\ell) \subset (k)$ such that almost surely

    $\limsup_{\ell \to \infty} E^*\sigma_0^{*q} \le c, \qquad \limsup_{\ell \to \infty} E^*|X_0^*|^q \le c.$

In particular,

    $E^*\sigma_0^{*q} = O_P(1), \qquad E^*|X_0^*|^q = O_P(1).$
As already observed, the Markov process $\{Y_t^*\}$ is $\beta$-mixing with exponential decay of the mixing coefficients. The parameters $\hat\theta$ and $g^*$ determining the process $\{Y_t^*\}$ converge in probability to $\theta$ and $g$, respectively, and so even more can be said about the $\beta$-mixing coefficients $\beta_n^*(j) = \beta^*(j)$, $j \in \mathbb{N}$, of $\{Y_t^*\}$. This is done in the following theorem, where we find it convenient to phrase arguments concerning convergence in probability in terms of almost sure convergence along subsequences.
Theorem 4.2
Let Assumption S hold. Suppose that $EX_0^4 < \infty$ and that $g$ is uniformly continuous. Then for any subsequence $(k) \subset \mathbb{N}$ there exist a subsequence $(k_\ell) \subset (k)$ and constants $C_b > 0$ and $\rho_b > 1$ such that almost surely

    $\beta_{k_\ell}^*(j) \le C_b\, \rho_b^{-j} \qquad$ for all $\ell, j \in \mathbb{N}$.
As a corollary we obtain the consistency of the bootstrap estimators in the sense of, for example, $P^*(|\hat\mu^* - \mu| > \epsilon) = o_P(1)$ for all $\epsilon > 0$. This is not immediate from the ergodic theorem applied to the bootstrap process $\{Y_t^*\}$, as for each sample $X_1, \dots, X_n$ there is a different process $\{Y_t^*\}$ under consideration.
Corollary 4.3
Let Assumption S hold and suppose $E|X_0|^{8+\delta} < \infty$ for some $\delta > 0$. Then $\hat\mu^*$, $\hat\gamma_h^*$, $h \ge 0$, and $\hat\theta^*$ are consistent for $\mu$, $\gamma_h$, $h \ge 0$, and $\theta$ in the sense mentioned above.

Now we are ready for the main result of this section, which states the consistency of the proposed bootstrap procedure.
Theorem 4.4
Let Assumption S hold. Suppose that $E|X_0|^{8+\delta} < \infty$ for some $\delta > 0$ and that $g$ is uniformly continuous. Then

    $\mathcal{L}^*\big(\sqrt{n}\,(\hat\theta^* - \hat\theta)\big) \xrightarrow{\;D\;} N\big((0,0,0)',\, \Sigma\big) \quad$ in probability,   (4.28)

if $\Sigma$, as defined in Theorem 3.1, is positive definite.
5 Simulations
In order to illustrate the performance of the bootstrap procedure described in the preceding section, we show some results of simulation experiments. We simulate GARCH(1,1) processes of length $n = 1000$ with standard normal error distribution and with parameter $\theta = (\alpha, \beta, \omega)'$. The parameter is estimated by the Yule-Walker type estimator. We repeat this procedure to estimate the distribution of the YW estimator. More specifically, the distribution of the standardized estimator is approximated by the estimated density calculated from 2500 Monte Carlo replications. Then the bootstrap approximation of this distribution is calculated. To this aim we calculate the YW estimator $\hat\theta_n$ from one simulated GARCH(1,1) process of length $n = 1000$. Based on $\hat\theta_n$ we generate 2500 bootstrap processes and calculate the bootstrap YW estimator for each bootstrap sample of length $n = 1000$. The estimated density for the standardized bootstrap estimator calculated from the 2500 bootstrap replications is plotted against the distribution of the original YW estimator.

It should be remarked that the YW estimation procedure is not very stable if $(\alpha, \beta)$ are chosen near the boundary of the admissible parameter space with respect to the moment condition.

We show the results for $\theta = (0.1, 0.4, 0.1)$. Figure 1 compares the distribution of $\sqrt{n}(\hat\alpha_n - \alpha)$ with the bootstrap distribution of $\sqrt{n}(\hat\alpha_n^* - \hat\alpha_n)$. Note that the bootstrap procedure is based on only one (randomly chosen) sample of the underlying GARCH process. Figure 2 and Figure 3 show the results for the parameters $\beta$ and $\omega$, respectively.
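The experiment can be sketched at a much smaller scale than the paper's $n = 1000$ and 2500 replications. The sketch below is our own compact variant: it omits the kernel smoothing step of Section 4 (plain resampling of the standardized residuals) and re-draws the reference sample if the fit leaves the stability region, reflecting the instability noted above; all names are ours.

```python
import numpy as np

# Compare the Monte Carlo spread of the Yule-Walker estimate of alpha+beta
# with the spread of its bootstrap analogue, both on the sqrt(n) scale.
rng = np.random.default_rng(5)
omega, alpha, beta = 0.1, 0.1, 0.4
n, reps = 5000, 60

def sim_path(n, om, a, b, eps):
    x = np.empty(n)
    s2 = om / (1 - a - b)
    for t in range(n):
        x[t] = np.sqrt(s2) * eps[t]
        s2 = om + a * x[t] ** 2 + b * s2
    return x

def yw(x):
    """Yule-Walker estimates (alpha, beta, omega) via (3.18)-(3.21)."""
    c = x ** 2 - np.mean(x ** 2)
    n = len(x)
    g0 = np.mean(c * c)
    g1 = np.sum(c[:-1] * c[1:]) / n
    g2 = np.sum(c[:-2] * c[2:]) / n
    ab = g2 / g1
    cn = ab + (g2 - g0) / (g1 - ab * g0)
    b = 0.0 if cn < 2 else cn / 2 - np.sqrt(cn ** 2 / 4 - 1)
    return ab - b, b, np.mean(x ** 2) * (1 - ab)

# Monte Carlo distribution of sqrt(n) * ((alpha+beta)-hat - (alpha+beta))
mc = []
for _ in range(reps):
    a_, b_, _ = yw(sim_path(n, omega, alpha, beta, rng.standard_normal(n)))
    mc.append(np.sqrt(n) * (a_ + b_ - (alpha + beta)))

# bootstrap distribution based on one sample (re-drawn if the fit is unstable)
while True:
    x0 = sim_path(n, omega, alpha, beta, rng.standard_normal(n))
    a0, b0, w0 = yw(x0)
    if min(a0, b0, w0) > 0 and a0 + b0 < 0.95:
        break
s2 = np.empty(n)
s2[0] = np.mean(x0 ** 2)
for t in range(1, n):
    s2[t] = w0 + a0 * x0[t - 1] ** 2 + b0 * s2[t - 1]
res = x0 / np.sqrt(s2)
res = (res - res.mean()) / res.std()      # standardized residuals
boot = []
for _ in range(reps):
    xs = sim_path(n, w0, a0, b0, rng.choice(res, size=n))
    a_, b_, _ = yw(xs)
    boot.append(np.sqrt(n) * (a_ + b_ - (a0 + b0)))
print(np.std(mc), np.std(boot))   # comparable spreads
```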
A Appendix
A.1 Proofs
Proof of Lemma 2.1. We will show that $\{Y_t\}$ is $\varphi$-irreducible, with $\varphi$ being the Lebesgue measure restricted to $\mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$, and aperiodic, and that compact sets are small. Then we shall apply the drift criterion given in Theorem A.4 to obtain geometric ergodicity.

In order to avoid a parameter-dependent state space, and with an eye to situations where $\{Y_t\}$ may be started non-stationarily at time $t = 0$, we will formulate the proof for the state space $\mathbb{R} \times \mathbb{R}_+$.

Let $A \subset \mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$ be measurable with $\lambda(A) > 0$, $\lambda$ denoting the Lebesgue measure. Without loss of generality we may assume $A \subset \mathbb{R} \times [\sqrt{\omega/(1-\beta) + \delta}, \infty)$ for some $\delta > 0$.

We will show that, given $y \in \mathbb{R} \times \mathbb{R}_+$, there exists an $m \in \mathbb{N}$ such that for all measurable sets $A' \subset [\sqrt{\omega/(1-\beta) + \delta}, \infty)$ with positive Lebesgue measure we have $P(\sigma_m \in A' \mid Y_0 = y) > 0$. This implies $P(Y_m \in A \mid Y_0 = y) > 0$, since $Y_m = \sigma_m (\varepsilon_m, 1)'$, where $\varepsilon_m$ is independent of $\sigma_m$ and has a positive Lebesgue density.

Let $\psi_m(x,s) = \omega \sum_{k=0}^{m-1} \beta^k + \beta^{m-1}(\beta s^2 + \alpha x^2)$. Then, given $Y_0 = y \in \mathbb{R} \times \mathbb{R}_+$, we can find positive functions $h_i(\varepsilon_{i+1}, \dots, \varepsilon_{m-1})$, $i = 1, \dots, m-1$, such that (2.7) takes the form

    $\sigma_m^2 = \psi_m(y) + \sum_{i=1}^{m-1} \varepsilon_i^2\, h_i(\varepsilon_{i+1}, \dots, \varepsilon_{m-1}).$

Let $A'' = \{r^2 \mid r \in A'\}$. Choosing $m$ large enough, we have $\psi_m(y) - \omega/(1-\beta) < \delta$ and hence

    $P(\sigma_m \in A' \mid Y_0 = y) = P(\sigma_m^2 \in A'' \mid Y_0 = y)$
Fig. 1: Distribution of $\sqrt{n}(\hat\alpha_n - \alpha)$ with bootstrap approximation, for simulated GARCH(1,1) processes with $(\alpha, \beta, \omega) = (0.1, 0.4, 0.1)$ and sample size $n = 1000$. All plots are based on 2500 Monte Carlo replications.
Fig. 2: Distribution of $\sqrt{n}(\hat\beta_n - \beta)$ with bootstrap approximation. Parameters as in Figure 1.
Fig. 3: Distribution of $\sqrt{n}(\hat\omega_n - \omega)$ with bootstrap approximation. Parameters as in Figure 1.
    $= \displaystyle\int \cdots \int \mathbf{1}\left\{\sum_{i=1}^{m-1} z_i^2\, h_i(z_{i+1}, \dots, z_{m-1}) \in A'' - \psi_m(y)\right\} \prod_{i=1}^{m-1} g(z_i)\, dz_1 \cdots dz_{m-1} > 0,$

as $g > 0$ and $A'' - \psi_m(y) \subset \mathbb{R}_+$. Thus $\{Y_t\}$ is $\varphi$-irreducible, with $\varphi$ being the Lebesgue measure restricted to $\mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$.
Using Lemma A.3, the aperiodicity follows similarly. Choose the compact set $A = [0,1] \times [\sqrt{\omega/(1-\beta)}, \sqrt{\omega/(1-\beta)} + 1]$, say, and $m_1 \in \mathbb{N}$ such that

    $\max_{y \in A}\left(\psi_m(y) - \dfrac{\omega}{1-\beta}\right) < \delta \qquad$ for all $m \ge m_1$.

Then, with the same arguments as above, for all $B \subset A$ with $\lambda(B) > 0$ we obtain

    $P(Y_m \in B \mid Y_0 = y) > 0 \quad$ and $\quad P(Y_{m+1} \in B \mid Y_0 = y) > 0 \qquad$ for all $y \in B$.
In order to prove that compact sets are small, consider the 2-step transition probability and let $C = [-M, M] \times (0, M]$ for some $M > 0$. We obtain for any Borel set $A \subset \mathbb{R} \times \mathbb{R}_+$ and any $y \in C$

    $P(Y_2 \in A \mid Y_0 = y) = \displaystyle\int\!\!\int \mathbf{1}\left\{\sqrt{\omega + (\alpha z_1^2 + \beta)\,\varphi^2(y)}\,(z_2, 1)' \in A\right\} g(z_1) g(z_2)\, dz_1\, dz_2$

    $\ge \displaystyle\int\!\!\int \mathbf{1}\{u\,(z_2, 1)' \in A\}\, g_C(u)\, g(z_2)\, du\, dz_2,$   (1.29)

where we substituted $u = \sqrt{\omega + (\alpha z_1^2 + \beta)\,\varphi^2(y)}$, used $\omega \le \varphi^2(y) \le (\alpha+\beta)M^2 + \omega$ and, if $z_1 = 0$, $u \le u_0 := \big(\beta(\alpha+\beta)M^2 + (1+\beta)\omega\big)^{1/2}$, and put

    $g_C(u) = \mathbf{1}_{[u_0, \infty)}(u)\; \dfrac{u}{(\alpha+\beta)M^2 + \omega}\; \sqrt{\dfrac{\omega}{\alpha(u^2 - \omega)}}\; \inf_{y \in C}\, g\!\left(\sqrt{\dfrac{u^2 - \omega - \beta\varphi^2(y)}{\alpha\varphi^2(y)}}\right).$

As $g$ is positive and continuous, $g_C$ is positive on $[u_0, \infty)$, and

    $\nu_C(A) = \displaystyle\int\!\!\int \mathbf{1}\{u\,(z, 1)' \in A\}\, g_C(u)\, g(z)\, du\, dz$

defines a non-trivial measure on $\mathbb{R} \times \mathbb{R}_+$ with

    $P(Y_2 \in A \mid Y_0 = y) \ge \nu_C(A) \qquad$ for all $y \in C$ and all Borel sets $A \subset \mathbb{R} \times \mathbb{R}_+$.

Thus we have shown that $C$ is small.
Now we are ready to apply the drift criterion given in Theorem A.4. Set

    $d = \dfrac{1}{2}\left(\dfrac{1}{\beta} + \dfrac{1}{1-\alpha}\right) - 1$

and define the test function $V(y) = 1 + d x^2 + s^2$ for $y = (x, s)' \in \mathbb{R} \times \mathbb{R}_+$. Then we have $d > 0$ and, as $E\varepsilon_1^2 = 1$,

    $\Delta V(y) = E\big[1 + d X_1^2 + \sigma_1^2 \mid Y_0 = y\big] - (1 + d x^2 + s^2)$
    $\qquad = (1+d)(\omega + \alpha x^2 + \beta s^2) - (d x^2 + s^2)$
    $\qquad = (1-\alpha)\,x^2\left(\dfrac{\alpha}{1-\alpha} - d\right) + s^2\big((1+d)\beta - 1\big) + (1+d)\omega$
    $\qquad = -\big[(1-\alpha)\,x^2 + \beta s^2\big]\, \dfrac{1}{2}\left(\dfrac{1}{\beta} - \dfrac{1}{1-\alpha}\right) + (1+d)\omega.$

Now $\alpha + \beta < 1$ implies $\beta < 1-\alpha$ and $\dfrac{1}{\beta} - \dfrac{1}{1-\alpha} > 0$. Defining the constants

    $\lambda = \dfrac{\beta}{4\max(1,d)}\left(\dfrac{1}{\beta} - \dfrac{1}{1-\alpha}\right) = \dfrac{1-(\alpha+\beta)}{4\max(1,d)(1-\alpha)}$

and $b = (1+d)\omega + 2\lambda$, and the compact set $C = [-\sqrt{b/\lambda}, \sqrt{b/\lambda}\,] \times (0, \sqrt{b/\lambda}\,]$, we thus have $\lambda \in (0, \tfrac{1}{2})$ and

    $\Delta V(y) \le -2\lambda V(y) + (1+d)\omega + 2\lambda \le -\lambda V(y) + b\,\mathbf{1}_C(y) \qquad$ for all $y \in \mathbb{R} \times \mathbb{R}_+$.   (1.30)

Thus the drift criterion holds. As $EV(Y_0) = 1 + (d+1)\mu < \infty$, this concludes the proof of the lemma.
Proof of Theorem 3.1. In order to avoid cumbersome notation, we will not distinguish between, say, $\sum_{t=1}^{n} x_t$ and $\sum_{t=h+1}^{n} x_t$. So, for instance, we will write $\hat\gamma_h = \frac{1}{n}\sum X_t^2 X_{t-h}^2 - \hat\mu^2$, neglecting terms of order $O_P(n^{-1})$.

In a first step we will prove joint asymptotic normality of

    $\widehat{(\alpha+\beta)}_n = \dfrac{\hat\gamma_2}{\hat\gamma_1}, \qquad \widetilde{(\beta^{-1}(1-\alpha\beta))}_n = \dfrac{\hat\gamma_2 - \hat\gamma_0}{\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0}, \qquad \tilde\omega_n = \hat\mu\,\big(1 - (\alpha+\beta)\big).$

By (3.12) we have

    $\sqrt{n}\,(\tilde\omega_n - \omega) = \sqrt{n}\,(\hat\mu - \mu)\big(1 - (\alpha+\beta)\big) = (1-\beta)\, n^{-1/2} \sum \eta_t.$   (1.31)

Furthermore,

    $\sqrt{n}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta)\right) = \hat\gamma_1^{-1}\,\sqrt{n}\left(\hat\gamma_2 - (\alpha+\beta)\hat\gamma_1\right)$
    $\qquad = \hat\gamma_1^{-1}\,\dfrac{1}{\sqrt{n}} \sum X_{t-2}^2 \big[X_t^2 - \hat\mu - (\alpha+\beta)(X_{t-1}^2 - \hat\mu)\big]$
    $\qquad = \hat\gamma_1^{-1}\,\dfrac{1}{\sqrt{n}} \sum X_{t-2}^2 \big(\eta_t - \beta\eta_{t-1} + (1-(\alpha+\beta))(\mu - \hat\mu)\big)$
    $\qquad = \hat\gamma_1^{-1}\,\dfrac{1}{\sqrt{n}} \sum \eta_t \big({-\beta} X_{t-1}^2 + X_{t-2}^2 - (1-\beta)\hat\mu\big),$

using (1.31). In a similar way we get

    $\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0 = \dfrac{1}{n} \sum \eta_t \big({-\beta} X_t^2 + X_{t-1}^2 - (1-\beta)\hat\mu\big),$

    $\hat\gamma_2 - \hat\gamma_0 = \dfrac{1}{n} \sum \big[(\alpha+\beta)X_{t-1}^2 - \beta\eta_{t-1} + \eta_t\big]\big(X_{t-2}^2 - X_t^2\big) = \dfrac{1}{n} \sum \eta_t \big(\beta X_{t+1}^2 - X_t^2 - \beta X_{t-1}^2 + X_{t-2}^2\big),$

and therefore

    $\hat\gamma_2 - \hat\gamma_0 - \beta^{-1}(1-\alpha\beta)\big(\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0\big) = \dfrac{1}{n} \sum \eta_t \Big(\beta X_{t+1}^2 - \alpha\beta X_t^2 - \big(\beta - \alpha + \beta^{-1}\big) X_{t-1}^2 + X_{t-2}^2 + \beta^{-1}(1-\alpha\beta)(1-\beta)\hat\mu\Big).$

Observing that

    $X_{t+1}^2 - \alpha X_t^2 = \omega + \beta X_t^2 + \eta_{t+1} - \beta\eta_t = \omega(1+\beta) + \beta(\alpha+\beta)X_{t-1}^2 - \beta^2 \eta_{t-1} + \eta_{t+1},$

we conclude that

    $\sqrt{n}\left(\widetilde{(\beta^{-1}(1-\alpha\beta))}_n - \beta^{-1}(1-\alpha\beta)\right) = \big(\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0\big)^{-1} \dfrac{1}{\sqrt{n}} \sum \eta_t \Big(\beta(1-\beta^2)\eta_{t-1} + \big(\beta^2(\alpha+\beta) + \alpha - \beta - \beta^{-1}\big) X_{t-1}^2 + X_{t-2}^2 + \beta^{-1}(1-\alpha\beta)(1-\beta)\hat\mu + \beta\omega(1+\beta)\Big).$

By the ergodic theorem we have $\hat\mu \to \mu$ and $\hat\gamma_h \to \gamma_h$ almost surely. In order to prove

    $\sqrt{n}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta),\ \widetilde{(\beta^{-1}(1-\alpha\beta))}_n - \beta^{-1}(1-\alpha\beta),\ \tilde\omega_n - \omega\right)' \xrightarrow{\;D\;} N\big(0,\, \tilde\Sigma\big)$   (1.32)

it is therefore sufficient to show

    $\dfrac{1}{\sqrt{n}} \sum \eta_t\, c'Z_t \xrightarrow{\;D\;} N\big(0,\, c'\tilde\Sigma c\big), \qquad c \in \mathbb{R}^3,$   (1.33)

using the Cramer-Wold device. As $Z_t$ is $\mathcal{F}_{t-1}$-measurable and $E[\eta_t \mid \mathcal{F}_{t-1}] = 0$, we observe that $\{\eta_t\, c'Z_t\}$ is a martingale difference sequence and hence

    $\sum_{t=-\infty}^{\infty} \mathrm{Cov}\big(\eta_t\, c'Z_t,\ \eta_0\, c'Z_0\big) = c' E\big[\eta_0^2 Z_0 Z_0'\big] c.$

An application of the CLT for strongly mixing sequences, see Appendix A.2, now gives (1.33).

For the next step we note

    $\widehat{(\beta^{-1}(1-\alpha\beta))}_n - \widetilde{(\beta^{-1}(1-\alpha\beta))}_n = \dfrac{\widehat{(\beta^{-1}(1-\alpha\beta))}_n\, \hat\gamma_0}{\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta)\right),$

    $\hat\omega_n - \tilde\omega_n = -\hat\mu\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta)\right).$

Hence (1.32), the relationship $\gamma_1 - (\alpha+\beta)\gamma_0 = -\beta\sigma_\eta^2$, and the ergodic theorem imply

    $\sqrt{n}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta),\ \widehat{(\beta^{-1}(1-\alpha\beta))}_n - \beta^{-1}(1-\alpha\beta),\ \hat\omega_n - \omega\right)' \xrightarrow{\;D\;} N\big(0,\, D_1 \tilde\Sigma D_1'\big).$

An application of the delta method, using the function

    $T(x,y,z) = \begin{pmatrix} x - (x+y)/2 + \sqrt{(x+y)^2/4 - 1} \\ (x+y)/2 - \sqrt{(x+y)^2/4 - 1} \\ z \end{pmatrix}$   (1.34)

with derivative $\nabla T\big(\alpha+\beta,\ \beta^{-1}(1-\alpha\beta),\ \omega\big) = D_2$, concludes the proof of the theorem.
Proof of Lemma 4.1. (a) From (2.6) and (4.25) we have

    $\hat\sigma_t^2 - \sigma_t^2 = \sum_{k=0}^{t-1} \Big[\hat\omega\hat\beta^k - \omega\beta^k + \big(\hat\alpha\hat\beta^k - \alpha\beta^k\big) X_{t-1-k}^2\Big] + \hat\beta^t \hat\mu - \beta^t \sigma_0^2.$

Choose $b \in (\beta, 1)$ and set $B_n = \{\hat\beta \le b\}$. If $1 < r \le p$, an application of the Hölder inequality with $r$ and $q = (1 - 1/r)^{-1}$ gives

    $\left(\sum_{k=0}^{t-1} b^k X_{t-k-1}^2\right)^r \le \left(\sum_{k=0}^{t-1} b^{kq/2}\right)^{r/q} \left(\sum_{k=0}^{t-1} b^{kr/2} X_{t-k-1}^{2r}\right)$

and hence

    $E\left[\dfrac{1}{n} \sum_{t=1}^{n} \left(\sum_{k=0}^{t-1} b^k X_{t-k-1}^2\right)^r\right] \le \big(1 - b^{q/2}\big)^{-r/q} \big(1 - b^{r/2}\big)^{-1} E X_0^{2r}.$