Yule-Walker type estimators in GARCH(1,1) models: Asymptotic normality and bootstrap
Gisela Maercker, Martin Moser
Abstract
We investigate GARCH(1,1) processes and first prove their stability. Using the representation of the squared GARCH model as an ARMA model, we then consider Yule-Walker type estimators for the parameters of the GARCH(1,1) model and derive their asymptotic normality. We use a residual bootstrap to define bootstrap estimators for the Yule-Walker estimates and prove the consistency of this bootstrap method. Simulation results demonstrate the small-sample behaviour of the bootstrap procedure.
1 Introduction
Many time series exhibit non-constant conditional variance (conditional heteroskedasticity). Nonlinear processes capable of modelling such volatility have attracted particular interest in time series analysis, especially in econometrics.
Conditional heteroskedasticity can be modelled by processes of the form
    $X_t = \sigma_t \varepsilon_t, \qquad t = 0, \pm 1, \pm 2, \dots,$   (1.1)

where the innovations $\{\varepsilon_t\}$ are independent identically distributed (i.i.d.) random variables with mean zero and unit variance, and the volatility $\sigma_t$ describes the change of the (conditional) variance.
In financial time series such as stock returns or foreign exchange rates, volatility clustering has been observed for a long time, i.e. periods of large price changes are followed by periods of small price changes. This phenomenon can be modelled by the autoregressive conditional heteroskedastic (ARCH) models introduced by Engle (1982), where the conditional variance $\sigma_t^2$ is a linear function of the squared past observations.
Bollerslev (1986) proposed the generalized ARCH or GARCH model by also including lagged values of $\sigma_t^2$ in the conditional variance equation. The GARCH($p$,$q$) model is defined by (1.1) with

    $\sigma_t^2 = \alpha_0 + \sum_{j=1}^{p} \alpha_j X_{t-j}^2 + \sum_{i=1}^{q} \beta_i \sigma_{t-i}^2, \qquad \alpha_0 > 0,\ \alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q \ge 0,\ p > 0,\ q > 0.$

GARCH modelling allows a more flexible lag structure than ARCH models and often permits a more parsimonious parametrization. The GARCH(1,1) model in particular has served as an appropriate model in many applications.
Over the past years, many semiparametric and nonparametric approaches to the ARCH model have been studied. For a variety of other extensions and applications of the ARCH model, we refer to the survey article by Bollerslev, Chou and Kroner (1992) and to Engle (1995).

In the present work we investigate parameter estimation and the bootstrap in GARCH(1,1) models. Rewriting the GARCH(1,1) model, it becomes obvious that the squared process can be represented as an ARMA(1,1) (autoregressive moving-average) process. Accordingly, Yule-Walker type estimators for the parameters of GARCH(1,1) processes can be defined (Bollerslev (1986)). We prove stability of the GARCH(1,1) model and then derive asymptotic normality of the Yule-Walker estimators.
The bootstrap is a method for estimating or approximating the distribution of a statistic and its characteristics based on a resampling of the observed data. Since its introduction by Efron for models with i.i.d. observations, there have been many applications and extensions of the bootstrap principle, also in the case of dependent data. Bootstrap methods for, e.g., nonparametric ARCH models have been studied by Franke, Kreiss and Mammen (1997), but they do not cover the GARCH case. A wild bootstrap method for quasi-maximum likelihood estimators of GARCH(1,1) models is proposed in Maercker (1998).

We use a residual bootstrap to define bootstrap estimators for the Yule-Walker estimates and prove the consistency of the bootstrap procedure. Some simulation results demonstrate the small-sample behaviour of the bootstrap method.
The paper is organized as follows. In Section 2 we state sufficient conditions for the stability of GARCH(1,1) models. In Section 3 we define Yule-Walker type estimators for the parameters of GARCH(1,1) processes and derive asymptotic normality of these estimators. In Section 4 we construct bootstrap estimators for the Yule-Walker estimators. We show that this bootstrap method works in the sense that it is consistent. In Section 5 some simulation results for the bootstrap method are presented. All proofs are deferred to the Appendix. Some additional tools and notation concerning Markov chain theory and mixing which are used in the proofs are also provided in the Appendix.
2 Stability of the GARCH(1,1) model
Assume that we are given observations $X_1, \dots, X_n$ from the heteroskedastic model

    $X_t = \sigma_t \varepsilon_t, \qquad t \in \mathbb{Z},$   (2.2)

with innovations

    $\{\varepsilon_t\}$ i.i.d., $\quad E\varepsilon_0 = 0, \quad E\varepsilon_0^2 = 1,$   (2.3)

and conditional variance

    $\sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2, \qquad t \in \mathbb{Z}, \quad$ where $\omega > 0$ and $\alpha, \beta \ge 0$.   (2.4)

The process defined by (2.2) and (2.4) is called a GARCH(1,1) (generalized autoregressive conditional heteroskedastic) process (Bollerslev (1986)).
Combining (2.4) and (2.2), the conditional variance may be written as

    $\sigma_t^2 = \omega + \sigma_{t-1}^2(\beta + \alpha \varepsilon_{t-1}^2).$   (2.5)

Iterating equations (2.4) and (2.5), respectively, yields for $h \ge 1$

    $\sigma_t^2 = \omega \sum_{k=0}^{h-1} \beta^k + \alpha \sum_{k=0}^{h-1} \beta^k X_{t-1-k}^2 + \beta^h \sigma_{t-h}^2,$   (2.6)

    $\sigma_t^2 = \omega \sum_{k=0}^{h-1} \prod_{i=1}^{k} (\beta + \alpha \varepsilon_{t-i}^2) + \sigma_{t-h}^2 \prod_{i=1}^{h} (\beta + \alpha \varepsilon_{t-i}^2),$   (2.7)

where, as usual, empty products are set equal to one.
We make the following stability assumption on the model.
Assumption S
(S1) $\alpha + \beta < 1$.
(S2) The distribution $G$ of $\varepsilon_0$ has a Lebesgue density $g$ which is positive and continuous.
By Jensen's inequality, (S1) implies

    $E\big[\ln(\beta + \alpha \varepsilon_0^2)\big] < 0.$   (2.8)

As is shown in Nelson (1990), condition (2.8) is necessary and sufficient for the existence of a unique stationary solution of (2.5), which is then given by the infinite series

    $\sigma_t^2 = \omega \sum_{k=0}^{\infty} \prod_{i=1}^{k} (\beta + \alpha \varepsilon_{t-i}^2), \qquad t \in \mathbb{Z}.$   (2.9)

In the following we will assume that $\{\sigma_t^2\}$ is the stationary solution of (2.5) and hence may be represented by (2.9). In particular, as $E\varepsilon_t^2 = 1$, we have

    $\mu := E X_t^2 = E \sigma_t^2 = \dfrac{\omega}{1 - (\alpha+\beta)} < \infty.$   (2.10)
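A minimal simulation can illustrate the stationarity statement: for a long path the sample second moment of $X_t$ settles near $\mu = \omega/(1-(\alpha+\beta))$ from (2.10). The sketch below is our own illustration, assuming standard normal innovations and the parameter values used in Section 5; the burn-in length is an arbitrary choice.

```python
import numpy as np

# Simulate a GARCH(1,1) path (2.2)-(2.4) with standard normal innovations and
# compare the sample mean of X_t^2 with mu = omega/(1-(alpha+beta)) from (2.10).
rng = np.random.default_rng(0)
omega, alpha, beta = 0.1, 0.1, 0.4   # (alpha, beta, omega) = (0.1, 0.4, 0.1) as in Section 5

def simulate_garch11(n, omega, alpha, beta, rng, burn=500):
    """Return (X_t, sigma_t^2) after discarding a burn-in stretch."""
    eps = rng.standard_normal(n + burn)
    sig2 = np.empty(n + burn)
    x = np.empty(n + burn)
    s2 = omega / (1.0 - alpha - beta)     # start at the stationary variance
    for t in range(n + burn):
        sig2[t] = s2
        x[t] = np.sqrt(s2) * eps[t]
        s2 = omega + alpha * x[t] ** 2 + beta * s2
    return x[burn:], sig2[burn:]

x, sig2 = simulate_garch11(100_000, omega, alpha, beta, rng)
mu = omega / (1.0 - (alpha + beta))
print(np.mean(x ** 2), mu)   # the two should be close for a long path
```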
Furthermore, $E \sum_{h=0}^{\infty} \beta^h \sigma_{t-h}^2 = \mu/(1-\beta) < \infty$ implies $\beta^h \sigma_{t-h}^2 \to 0$ as $h \to \infty$ almost surely; thus (2.6) may be extended to

    $\sigma_t^2 = \dfrac{\omega}{1-\beta} + \alpha \sum_{k=0}^{\infty} \beta^k X_{t-1-k}^2, \qquad t \in \mathbb{Z}.$

Set $\varphi(x,s) = (\omega + \alpha x^2 + \beta s^2)^{1/2}$. Then

    $Y_t := (X_t, \sigma_t)' = \varphi(Y_{t-1})\,(\varepsilon_t, 1)', \qquad t \in \mathbb{Z},$

is a bivariate Markov process with state space $\mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$. With the help of a drift criterion we will establish geometric ergodicity and absolute regularity for this process.
Whereas stability of ARCH processes was investigated before by, e.g., Guegan and Diebolt (1992), Doukhan (1994), and Borkovec and Klüppelberg (1998), those results do not cover the GARCH case.

For a definition and discussion of geometric ergodicity and absolute regularity we refer to the Appendix.
Lemma 2.1
Let Assumption S hold. Then the process $\{Y_t\}$ is geometrically ergodic and absolutely regular. Furthermore, there exist constants $c > 0$ and $\rho > 1$ such that the $\beta$-mixing coefficients of $\{Y_t\}$ satisfy $\beta_k \le c\,\rho^{-k}$, $k \in \mathbb{N}$.

We conclude this section with an ARMA representation of the squared process $\{X_t^2\}$ (cf. Bollerslev (1986)) which will be used frequently in the sequel. Set

    $\eta_t = X_t^2 - \sigma_t^2 = \sigma_t^2(\varepsilon_t^2 - 1).$

Then, by (2.4), we have

    $X_t^2 = \omega + (\alpha+\beta) X_{t-1}^2 - \beta \eta_{t-1} + \eta_t.$   (2.11)

Therefore $\{X_t^2\}$ is an ARMA(1,1) process with parameters $\alpha+\beta$ and $-\beta$ and innovations $\{\eta_t\}$. Defining $\mathcal{F}_t$ as the $\sigma$-field generated by $\{\varepsilon_s : s \le t\}$, we note $E[\eta_t \mid \mathcal{F}_{t-1}] = \sigma_t^2\, E[\varepsilon_t^2 - 1] = 0$; thus the innovations $\{\eta_t\}$ form a martingale difference sequence.
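The ARMA(1,1) structure can be checked numerically. The sketch below, with our own simulator and the Section 5 parameter values, verifies on a long path that $X_t$ itself is serially uncorrelated while the autocovariances of $X_t^2$ decay at rate $\alpha+\beta$ from lag one onwards, cf. (3.15) below.

```python
import numpy as np

# Check the ARMA(1,1) structure (2.11): X_t is white noise, while the
# autocovariances of X_t^2 satisfy gamma_h = (alpha+beta) * gamma_{h-1}, h >= 2.
rng = np.random.default_rng(1)
omega, alpha, beta = 0.1, 0.1, 0.4
n = 500_000
eps = rng.standard_normal(n)
x = np.empty(n)
s2 = omega / (1 - alpha - beta)          # start in the stationary regime
for t in range(n):
    x[t] = np.sqrt(s2) * eps[t]
    s2 = omega + alpha * x[t] ** 2 + beta * s2

def acov(z, h):
    """Sample autocovariance of z at lag h."""
    zc = z - z.mean()
    return np.mean(zc[: len(z) - h] * zc[h:])

corr_x = acov(x, 1) / acov(x, 0)           # ~ 0: X_t itself is uncorrelated
ratio = acov(x ** 2, 2) / acov(x ** 2, 1)  # ~ alpha + beta = 0.5
print(corr_x, ratio)
```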
3 Yule-Walker type estimators: definition and asymptotic normality
Consider the centered squared GARCH process $\{X_t^2 - \mu\}$ in the ARMA representation

    $X_t^2 - \mu - (\alpha+\beta)(X_{t-1}^2 - \mu) = \eta_t - \beta \eta_{t-1}, \qquad \eta_t = \sigma_t^2(\varepsilon_t^2 - 1),$   (3.12)

derived from (2.11). As is well established for ARMA models, the empirical autocovariances of the process can be used to obtain Yule-Walker (YW) type estimators for the parameters $\alpha+\beta$, $\beta$, and $\omega$, respectively. Observe that the squared process $\{X_t^2\}$ exhibits autocorrelation whereas the process itself is not correlated over time.

Let us assume that $EX_0^4 < \infty$. Conditions for the existence of moments are given in Bollerslev (1986) and Nelson (1990). As we will need even more stringent conditions for the proof of asymptotic normality of the YW type estimators, the discussion of those conditions is postponed; see Remark 3.2 below.

Set $\sigma_\eta^2 = E\eta_t^2$. Recalling $E[\eta_t \mid \mathcal{F}_{t-1}] = 0$, we note that $E\eta_t X_t^2 = \sigma_\eta^2$, $E\eta_t X_{t+1}^2 = \alpha \sigma_\eta^2$, and $E\eta_t X_s^2 = 0$ for $s < t$. For the derivation of suitable identities involving the covariances $\gamma_h = \mathrm{Cov}(X_0^2, X_h^2)$, $h \ge 0$, we now proceed in the usual way, see also Bollerslev (1986, 1988), by multiplying both sides of (3.12) by $X_{t-h}^2$, $h = 0, 1, \dots$, and computing expectations. This yields the following identities:

    $\gamma_0 - (\alpha+\beta)\gamma_1 = (1 - \alpha\beta)\,\sigma_\eta^2,$   (3.13)
    $\gamma_1 - (\alpha+\beta)\gamma_0 = -\beta\,\sigma_\eta^2,$   (3.14)
    $\gamma_h - (\alpha+\beta)\gamma_{h-1} = 0, \qquad h \ge 2.$   (3.15)
Elimination of $\sigma_\eta^2$ gives the system

    $\alpha + \beta = \dfrac{\gamma_2}{\gamma_1},$   (3.16)

    $\beta^{-1}(1 - \alpha\beta) = \dfrac{(\alpha+\beta)\gamma_1 - \gamma_0}{\gamma_1 - (\alpha+\beta)\gamma_0} = \dfrac{\gamma_2 - \gamma_0}{\gamma_1 - (\alpha+\beta)\gamma_0}.$   (3.17)

Using the empirical moments $\hat\mu = \frac{1}{n}\sum_{t=1}^{n} X_t^2$ and $\hat\gamma_h = \frac{1}{n}\sum_{t=1}^{n-h} (X_t^2 - \hat\mu)(X_{t+h}^2 - \hat\mu)$, we get the following YW type estimators:

    $\widehat{(\alpha+\beta)}_n = \dfrac{\hat\gamma_2}{\hat\gamma_1},$   (3.18)

    $\widehat{(\beta^{-1}(1-\alpha\beta))}_n = \dfrac{\hat\gamma_2 - \hat\gamma_0}{\hat\gamma_1 - \widehat{(\alpha+\beta)}_n\,\hat\gamma_0},$   (3.19)

    $\hat\omega_n = \hat\mu\,\big(1 - \widehat{(\alpha+\beta)}_n\big).$   (3.20)
In order to derive estimators $\hat\alpha_n$ and $\hat\beta_n$ of $\alpha$ and $\beta$, we set

    $\hat\beta_n^{-1} + \hat\beta_n = \widehat{(\alpha+\beta)}_n + \widehat{(\beta^{-1}(1-\alpha\beta))}_n.$   (3.21)

Denoting the right-hand side of (3.21) by $\hat c_n$, we obtain $\hat\beta_n^2 - \hat c_n \hat\beta_n + 1 = 0$. Hence, if $\hat c_n \ge 2$, we set

    $\hat\beta_n = \hat c_n/2 - \sqrt{\hat c_n^2/4 - 1},$

so that $0 < \hat\beta_n \le 1$, and $\hat\beta_n < 1$ if $\hat c_n > 2$. In practice it might happen that $\hat c_n < 2$; in that case set $\hat\beta_n = 0$. But by construction and the ergodic theorem $\hat c_n$ is a consistent estimate of $\beta^{-1} + \beta$, and therefore, almost surely, $\hat c_n > 2$ for sufficiently large $n$. Finally we define

    $\hat\alpha_n = \widehat{(\alpha+\beta)}_n - \hat\beta_n.$

Again, by the ergodic theorem, $\hat\theta_n = (\hat\alpha_n, \hat\beta_n, \hat\omega_n)'$ is a consistent estimator of $\theta = (\alpha, \beta, \omega)'$.
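The estimation steps (3.18)-(3.21) can be sketched compactly in code. Function and variable names below are our own illustration, not the authors' implementation; the usage part simulates a path under the Section 5 parameter values with standard normal innovations.

```python
import numpy as np

# Sketch of the Yule-Walker type estimators: alpha+beta from gamma_2/gamma_1
# (3.18), the auxiliary quantity (1-alpha*beta)/beta (3.19), beta from the
# quadratic in (3.21), then alpha and omega (3.20).
def yule_walker_garch11(x):
    x2 = x ** 2
    mu_hat = x2.mean()
    c = x2 - mu_hat
    n = len(x)
    g0 = np.mean(c * c)                        # gamma-hat_0
    g1 = np.sum(c[:-1] * c[1:]) / n            # gamma-hat_1
    g2 = np.sum(c[:-2] * c[2:]) / n            # gamma-hat_2
    ab = g2 / g1                               # (alpha+beta)-hat, (3.18)
    bb = (g2 - g0) / (g1 - ab * g0)            # estimates (1-alpha*beta)/beta, (3.19)
    c_n = ab + bb                              # estimates beta + 1/beta, cf. (3.21)
    beta = 0.0 if c_n < 2 else c_n / 2 - np.sqrt(c_n ** 2 / 4 - 1)
    return ab - beta, beta, mu_hat * (1 - ab)  # (alpha-hat, beta-hat, omega-hat)

# usage on a simulated GARCH(1,1) path with standard normal innovations
rng = np.random.default_rng(2)
omega0, alpha0, beta0 = 0.1, 0.1, 0.4
n = 200_000
eps = rng.standard_normal(n)
x = np.empty(n)
s2 = omega0 / (1 - alpha0 - beta0)
for t in range(n):
    x[t] = np.sqrt(s2) * eps[t]
    s2 = omega0 + alpha0 * x[t] ** 2 + beta0 * s2
print(yule_walker_garch11(x))   # approaches (alpha0, beta0, omega0) as n grows
```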
Next, under additional moment assumptions, the asymptotic normality of $\hat\theta_n$ will be shown. We may use this to construct confidence sets for $\theta$. However, the normal distribution is only an approximation to the exact distribution of $\hat\theta_n$. An alternative approach, which often yields a better approximation, is the bootstrap. Bootstrap confidence intervals are obtained by replacing the unknown distribution with its bootstrap estimator. We shall introduce a bootstrap method and study the consistency of this procedure.

Before doing so it should be mentioned that, in GARCH as well as in ARMA models, the YW estimators are less efficient than maximum likelihood (ML) estimators or, depending on the error distribution, quasi-maximum likelihood (QML) estimators, which are commonly used in practice. Indeed, simulation experiments underline this fact. Under normality of the innovations, ML estimators outperform the YW estimators. This is not necessarily true for QML estimators in the case of non-normal innovation distributions.

QML estimates are found by an iterative procedure in which the YW estimates may be used as initial estimates. In contrast, in the YW type estimation procedure the observed data are used in a more direct way. This estimation procedure will be imitated by an appropriate bootstrap technique. A bootstrap method for QML estimators is discussed in Maercker (1997, 1998).
For the derivation of the asymptotic normality of $\hat\theta_n$ we shall make use of a central limit theorem (CLT) for strongly mixing processes.
Theorem 3.1
Let Assumption S hold and suppose $E|X_0|^{8+\delta} < \infty$ for some $\delta > 0$. Then

    $\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{\;D\;} N\big((0,0,0)',\, \Sigma\big)$   (3.22)

if $\Sigma$ is positive definite, where $\Sigma = D_2 D_1 \tilde\Sigma D_1' D_2'$,

    $D_1 = \begin{pmatrix} 1 & 0 & 0 \\ -\dfrac{(1-\alpha\beta)\gamma_0}{\beta^2\sigma_\eta^2} & 1 & 0 \\ -\mu & 0 & 1 \end{pmatrix}, \qquad D_2 = \begin{pmatrix} 1-\kappa & -\kappa & 0 \\ \kappa & \kappa & 0 \\ 0 & 0 & 1 \end{pmatrix},$

    $\kappa = \dfrac{1}{2} - \dfrac{\beta^{-1}+\beta}{4\sqrt{(\beta^{-1}+\beta)^2/4 - 1}},$

and $\tilde\Sigma = E[\eta_0^2 Z_0 Z_0']$ with

    $Z_{t1} = \gamma_1^{-1}\big[-\beta X_{t-1}^2 + X_{t-2}^2 - (1-\beta)\mu\big],$

    $Z_{t2} = \big(\gamma_1 - (\alpha+\beta)\gamma_0\big)^{-1}\big[\beta(1-\beta^2)\eta_{t-1} + \big(\beta^2(\alpha+\beta) + \alpha - \beta - \beta^{-1}\big)X_{t-1}^2 + X_{t-2}^2 + \beta^{-1}(1-\alpha\beta)(1-\beta)\mu + \beta\omega(1+\beta)\big],$

    $Z_{t3} = 1 - \beta.$
Remark 3.2
As $\hat\theta_n$ is based on the empirical autocovariances $\hat\gamma_h$, i.e. on lagged empirical fourth moments of $X_t$, the need for the rather stringent moment condition $E|X_0|^{8+\delta} < \infty$ in Theorem 3.1 becomes obvious. Under Assumption S expansion (2.9) holds and, by Minkowski's inequality, for any $p > 0$ a sufficient condition for $E|X_0|^{2p} < \infty$ is given by $E(\beta + \alpha\varepsilon_0^2)^p < 1$. This condition is also necessary, see Nelson (1990). For the case of normally distributed innovations $\varepsilon_t$, the restrictions on the parameter space implied by $EX_0^8 < \infty$ and $EX_0^{10} < \infty$ are, among others, illustrated in Figure 3.1 of Bollerslev (1986).

4 The bootstrap procedure
We now discuss a bootstrap method for estimating the distribution of $\sqrt{n}(\hat\theta_n - \theta)$. We use a residual bootstrap to construct bootstrap estimators. It will be shown that the (conditional) distribution of these bootstrap estimators converges in probability to the same asymptotic distribution as given in Theorem 3.1 for the original estimators; that is, the bootstrap procedure is (weakly) consistent.
Given a sample $X_1, \dots, X_n$, the bootstrap process $\{X_t^*\}$ will be of the form

    $X_t^* = \sigma_t^* \varepsilon_t^*, \qquad \sigma_t^{*2} = \hat\omega + \hat\alpha X_{t-1}^{*2} + \hat\beta \sigma_{t-1}^{*2}, \qquad t \in \mathbb{Z},$   (4.23)

    $\{\varepsilon_t^*\}$ i.i.d. $\sim G^*$, $\quad E^*\varepsilon_0^* = 0, \quad E^*\varepsilon_0^{*2} = 1,$   (4.24)

where the distribution $G^*$ of $\varepsilon_0^*$ is an estimate of the distribution $G$ of $\varepsilon_0$ and $E^*$ denotes the conditional expectation $E[\,\cdot \mid X_1, \dots, X_n]$. The distribution $\mathcal{L}(\sqrt{n}(\hat\theta - \theta))$ will then be approximated by the (conditional) distribution $\mathcal{L}^*(\sqrt{n}(\hat\theta^* - \hat\theta))$, where $\hat\theta^* = (\hat\alpha^*, \hat\beta^*, \hat\omega^*)'$ is calculated in the same way as $\hat\theta$, with $X_1, \dots, X_n$ replaced by $X_1^*, \dots, X_n^*$. For notational simplicity, here and later the index $n$ indicating the dependence of the estimators and the bootstrap process on the number of observations will be omitted.
In detail, the construction of $\{X_t^*\}$ consists of the following steps. Compute the Yule-Walker estimate $\hat\theta$ as described in Section 3. Set $\hat\sigma_0^2 = \hat\mu$ and define

    $\hat\sigma_t^2 = \hat\omega + \hat\alpha X_{t-1}^2 + \hat\beta \hat\sigma_{t-1}^2, \qquad t = 1, \dots, n,$

or equivalently,

    $\hat\sigma_t^2 = \hat\omega \sum_{k=0}^{t-1} \hat\beta^k + \hat\alpha \sum_{k=0}^{t-1} \hat\beta^k X_{t-1-k}^2 + \hat\beta^t \hat\mu, \qquad t = 1, \dots, n.$   (4.25)

Calculate empirical residuals

    $\hat\varepsilon_t = X_t/\hat\sigma_t, \qquad t = 1, \dots, n,$

and let $\hat G(x) = \frac{1}{n}\sum_{t=1}^{n} \mathbf{1}\{\hat\varepsilon_t \le x\}$ denote their empirical distribution. In view of Assumption S, smooth $\hat G$ by convolution and set $\tilde G = \hat G * N(0, h^2)$, where $h = n^{-1/5}$. Define the distribution $G^*$ of $\varepsilon_0^*$ as the standardized form of $\tilde G$, i.e. $G^*(x) = \tilde G(\sigma_{\tilde G}\, x + \mu_{\tilde G})$, where $\mu_{\tilde G} = \frac{1}{n}\sum_{t=1}^{n} \hat\varepsilon_t$ and $\sigma_{\tilde G}^2 = \frac{1}{n}\sum_{t=1}^{n} (\hat\varepsilon_t - \mu_{\tilde G})^2 + h^2$ are the mean and variance of $\tilde G$, respectively. As $\hat\theta \to \theta$ almost surely, we may assume

    $\hat\omega, \hat\alpha, \hat\beta > 0 \quad$ and $\quad \hat\alpha + \hat\beta < 1.$   (4.26)

Finally, define the bootstrap GARCH process $\{X_t^*\}$ as the stationary solution of (4.23).
In particular, we have, as in (2.9),

    $\sigma_t^{*2} = \hat\omega \sum_{k=0}^{\infty} \prod_{i=1}^{k} \big(\hat\beta + \hat\alpha\, \varepsilon_{t-i}^{*2}\big), \qquad t \in \mathbb{Z}.$   (4.27)

Furthermore, by construction and (4.26), $\{X_t^*\}$ fulfills Assumption S. Hence, the conclusions of Lemma 2.1 apply and $Y_t^* = (X_t^*, \sigma_t^*)'$ is a geometrically ergodic and absolutely regular process with exponential decay of the mixing coefficients.
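The construction above can be sketched in code. The following is a minimal illustration under the assumptions that the supplied fitted parameters already satisfy (4.26) and that $\hat\theta = (\hat\alpha, \hat\beta, \hat\omega)$ comes from some estimator of Section 3; sampling from the convolution $\tilde G = \hat G * N(0, h^2)$ is implemented by adding $h \cdot N(0,1)$ noise to resampled residuals, which is equivalent in distribution. All names are ours, not the paper's.

```python
import numpy as np

# Sketch of the residual bootstrap: fitted volatilities (4.25), standardized
# residuals, resampling from the smoothed law G-tilde, standardization to G*,
# and regeneration of GARCH(1,1) paths from the fitted model (4.23).
def bootstrap_paths(x, theta_hat, n_boot, rng):
    alpha, beta, omega = theta_hat            # assumed to satisfy (4.26)
    n = len(x)
    sig2 = np.empty(n)
    sig2[0] = np.mean(x ** 2)                 # sigma-hat^2_0 = mu-hat
    for t in range(1, n):
        sig2[t] = omega + alpha * x[t - 1] ** 2 + beta * sig2[t - 1]
    resid = x / np.sqrt(sig2)                 # empirical residuals
    h = n ** (-1 / 5)                         # bandwidth
    m = resid.mean()
    s = np.sqrt(np.mean((resid - m) ** 2) + h ** 2)  # mean/std of G-tilde
    paths = np.empty((n_boot, n))
    for b in range(n_boot):
        draw = rng.choice(resid, size=n) + h * rng.standard_normal(n)
        eps_star = (draw - m) / s             # E* eps* = 0, E* eps*^2 = 1
        s2 = omega / (1 - alpha - beta)       # start at fitted stationary variance
        for t in range(n):
            paths[b, t] = np.sqrt(s2) * eps_star[t]
            s2 = omega + alpha * paths[b, t] ** 2 + beta * s2
    return paths

# usage: bootstrap five paths from a simulated sample; for brevity the true
# parameters stand in for a fitted theta-hat
rng = np.random.default_rng(4)
n = 2000
omega0, alpha0, beta0 = 0.1, 0.1, 0.4
eps = rng.standard_normal(n)
x = np.empty(n)
s2 = omega0 / (1 - alpha0 - beta0)
for t in range(n):
    x[t] = np.sqrt(s2) * eps[t]
    s2 = omega0 + alpha0 * x[t] ** 2 + beta0 * s2
paths = bootstrap_paths(x, (alpha0, beta0, omega0), n_boot=5, rng=rng)
print(paths.shape)   # (5, 2000)
```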
The density $g^*$ of the bootstrap innovations is given by

    $g^*(x) = \dfrac{\sigma_{\tilde G}}{n h} \sum_{t=1}^{n} \varphi\!\left(\dfrac{\sigma_{\tilde G}\, x + \mu_{\tilde G} - \hat\varepsilon_t}{h}\right),$

where $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. Hence, $g^*$ may be understood as a standardized kernel estimate of $g$ with kernel $\varphi$ and bandwidth $h$. We have chosen $h = n^{-1/5}$ as the rate common in kernel smoothing.
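As a quick numerical sanity check of this formula, one can evaluate $g^*$ on a grid and verify that it integrates to one and is centred, as the standardization requires. The residuals below are simulated stand-ins for the empirical residuals $\hat\varepsilon_t$ of this section.

```python
import numpy as np

# Evaluate the standardized kernel density g* on a grid and check that it has
# total mass ~1 and mean ~0 (the standardization enforces E* eps* = 0).
rng = np.random.default_rng(6)
n = 500
resid = rng.standard_normal(n)            # placeholder for eps-hat_1, ..., eps-hat_n
h = n ** (-1 / 5)                         # bandwidth as chosen above
m_g = resid.mean()                        # mean of G-tilde
s_g = np.sqrt(np.mean((resid - m_g) ** 2) + h ** 2)   # std of G-tilde

def g_star(x):
    """g*(x) = sigma/(n h) * sum_t phi((sigma*x + mu - resid_t)/h)."""
    u = (s_g * x[None, :] + m_g - resid[:, None]) / h
    return s_g / (n * h) * (np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)).sum(axis=0)

grid = np.linspace(-8.0, 8.0, 4001)
dx = grid[1] - grid[0]
mass = g_star(grid).sum() * dx
mean = (grid * g_star(grid)).sum() * dx
print(mass, mean)   # close to 1 and 0
```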
For the consistency proof of the bootstrap proposal we need the fact that, at least on average, $\hat\sigma_t^2$ is a good estimate of $\sigma_t^2$, as well as the consistency of $g^*$ and of the moments of the bootstrap process. Let $\|f\|_\infty = \sup_{x \in \mathbb{R}} |f(x)|$ for $f : \mathbb{R} \to \mathbb{R}$.
Lemma 4.1
Let Assumption S hold and suppose $E|X_0|^{2p} < \infty$ for some $p > 4$.

(a) For any $r \ge 1$,

    $\dfrac{1}{n} \sum_{t=1}^{n} \big|\hat\sigma_t^2 - \sigma_t^2\big|^r = O_P\big(n^{-(p \wedge r)/2}\big).$

(b) $\mu_{\tilde G} = O_P(n^{-1/2})$, $\quad \sigma_{\tilde G}^2 - 1 = O_P(n^{-1/2})$.

(c) If $g$ is uniformly continuous, then $\|g^* - g\|_\infty = o_P(1)$.

(d) For $q \in (0, 2p)$, $k_i \in \{0, 2\}$ and $t_i \in \mathbb{Z}$, $i = 1, \dots, 4$,

    $E^*|\varepsilon_0^*|^q \to E|\varepsilon_0|^q, \qquad E^*\prod_{i=1}^{4} \sigma_{t_i}^{*k_i} \to E\prod_{i=1}^{4} \sigma_{t_i}^{k_i}, \qquad E^*\prod_{i=1}^{4} X_{t_i}^{*k_i} \to E\prod_{i=1}^{4} X_{t_i}^{k_i}$

in probability. Furthermore, there is a constant $c > 0$ such that for any subsequence $(k) \subset \mathbb{N}$ there exists a subsequence $(k_\ell) \subset (k)$ such that almost surely

    $\limsup_{\ell \to \infty} E^*\sigma_0^{*q} \le c, \qquad \limsup_{\ell \to \infty} E^*|X_0^*|^q \le c.$

In particular,

    $E^*\sigma_0^{*q} = O_P(1), \qquad E^*|X_0^*|^q = O_P(1).$
As already observed, the Markov process $\{Y_t^*\}$ is $\beta$-mixing with exponential decay of the mixing coefficients. The parameters $\hat\theta$ and $g^*$ determining the process $\{Y_t^*\}$ converge in probability to $\theta$ and $g$, respectively, and so even more can be said about the $\beta$-mixing coefficients $\beta_n^*(j) = \beta^*(j)$, $j \in \mathbb{N}$, of $\{Y_t^*\}$. This is done in the following theorem, where we find it convenient to phrase arguments concerning convergence in probability in terms of almost sure convergence along subsequences.
Theorem 4.2
Let Assumption S hold. Suppose that $EX_0^4 < \infty$ and that $g$ is uniformly continuous. Then for any subsequence $(k) \subset \mathbb{N}$ there exist a subsequence $(k_\ell) \subset (k)$ and constants $C_b > 0$ and $\rho_b > 1$ such that almost surely

    $\beta_{k_\ell}^*(j) \le C_b\, \rho_b^{-j} \qquad$ for all $\ell, j \in \mathbb{N}$.
As a corollary we obtain the consistency of the bootstrap estimators in the sense of, for example, $P^*(|\hat\mu^* - \mu| > \epsilon) = o_P(1)$ for all $\epsilon > 0$. This is not immediate from the ergodic theorem applied to the bootstrap process $\{Y_t^*\}$, as for each sample $X_1, \dots, X_n$ there is a different process $\{Y_t^*\}$ under consideration.
Corollary 4.3
Let Assumption S hold and suppose $E|X_0|^{8+\delta} < \infty$ for some $\delta > 0$. Then $\hat\mu^*$, $\hat\gamma_h^*$, $h \ge 0$, and $\hat\theta^*$ are consistent for $\mu$, $\gamma_h$, $h \ge 0$, and $\theta$ in the sense mentioned above.

Now we are ready for the main result of this section, which states the consistency of the proposed bootstrap procedure.
Theorem 4.4
Let Assumption S hold. Suppose that $E|X_0|^{8+\delta} < \infty$ for some $\delta > 0$ and that $g$ is uniformly continuous. Then

    $\mathcal{L}^*\big(\sqrt{n}\,(\hat\theta^* - \hat\theta)\big) \xrightarrow{\;D\;} N\big((0,0,0)',\, \Sigma\big) \quad$ in probability,   (4.28)

if $\Sigma$, as defined in Theorem 3.1, is positive definite.
5 Simulations
In order to illustrate the performance of the bootstrap procedure described in the preceding section, we show some results of simulation experiments. We simulate GARCH(1,1) processes of length $n = 1000$ with standard normal error distribution and with parameter $\theta = (\alpha, \beta, \omega)'$. The parameter is estimated by the Yule-Walker type estimator. We repeat this procedure to estimate the distribution of the YW estimator. More specifically, the distribution of the standardized estimator is approximated by the estimated density calculated from 2500 Monte Carlo replications. Then the bootstrap approximation of this distribution is calculated. To this aim we calculate the YW estimator $\hat\theta_n$ from one simulated GARCH(1,1) process of length $n = 1000$. Based on $\hat\theta_n$ we generate 2500 bootstrap processes and calculate the bootstrap YW estimator for each bootstrap sample of length $n = 1000$. The estimated density for the standardized bootstrap estimator calculated from the 2500 bootstrap replications is plotted against the distribution of the original YW estimator.

It should be remarked that the YW estimation procedure is not very stable if $(\alpha, \beta)$ are chosen near the boundary of the admissible parameter space with respect to the moment condition.

We show the results for $\theta = (0.1, 0.4, 0.1)$. Figure 1 compares the distribution of $\sqrt{n}(\hat\alpha_n - \alpha)$ with the bootstrap distribution of $\sqrt{n}(\hat\alpha_n^* - \hat\alpha_n)$. Note that the bootstrap procedure is based on only one (randomly chosen) sample of the underlying GARCH process. Figure 2 and Figure 3 show the results for the parameters $\beta$ and $\omega$, respectively.
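The experiment can be sketched at a much smaller scale than the paper's $n = 1000$ and 2500 replications. The sketch below is our own compact variant: it omits the kernel smoothing step of Section 4 (plain resampling of the standardized residuals) and re-draws the reference sample if the fit leaves the stability region, reflecting the instability noted above; all names are ours.

```python
import numpy as np

# Compare the Monte Carlo spread of the Yule-Walker estimate of alpha+beta
# with the spread of its bootstrap analogue, both on the sqrt(n) scale.
rng = np.random.default_rng(5)
omega, alpha, beta = 0.1, 0.1, 0.4
n, reps = 5000, 60

def sim_path(n, om, a, b, eps):
    x = np.empty(n)
    s2 = om / (1 - a - b)
    for t in range(n):
        x[t] = np.sqrt(s2) * eps[t]
        s2 = om + a * x[t] ** 2 + b * s2
    return x

def yw(x):
    """Yule-Walker estimates (alpha, beta, omega) via (3.18)-(3.21)."""
    c = x ** 2 - np.mean(x ** 2)
    n = len(x)
    g0 = np.mean(c * c)
    g1 = np.sum(c[:-1] * c[1:]) / n
    g2 = np.sum(c[:-2] * c[2:]) / n
    ab = g2 / g1
    cn = ab + (g2 - g0) / (g1 - ab * g0)
    b = 0.0 if cn < 2 else cn / 2 - np.sqrt(cn ** 2 / 4 - 1)
    return ab - b, b, np.mean(x ** 2) * (1 - ab)

# Monte Carlo distribution of sqrt(n) * ((alpha+beta)-hat - (alpha+beta))
mc = []
for _ in range(reps):
    a_, b_, _ = yw(sim_path(n, omega, alpha, beta, rng.standard_normal(n)))
    mc.append(np.sqrt(n) * (a_ + b_ - (alpha + beta)))

# bootstrap distribution based on one sample (re-drawn if the fit is unstable)
while True:
    x0 = sim_path(n, omega, alpha, beta, rng.standard_normal(n))
    a0, b0, w0 = yw(x0)
    if min(a0, b0, w0) > 0 and a0 + b0 < 0.95:
        break
s2 = np.empty(n)
s2[0] = np.mean(x0 ** 2)
for t in range(1, n):
    s2[t] = w0 + a0 * x0[t - 1] ** 2 + b0 * s2[t - 1]
res = x0 / np.sqrt(s2)
res = (res - res.mean()) / res.std()      # standardized residuals
boot = []
for _ in range(reps):
    xs = sim_path(n, w0, a0, b0, rng.choice(res, size=n))
    a_, b_, _ = yw(xs)
    boot.append(np.sqrt(n) * (a_ + b_ - (a0 + b0)))
print(np.std(mc), np.std(boot))   # comparable spreads
```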
A Appendix
A.1 Proofs
Proof of Lemma 2.1. We will show that $\{Y_t\}$ is $\varphi$-irreducible, with $\varphi$ being the Lebesgue measure restricted to $\mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$, and aperiodic, and that compact sets are small. Then we shall apply the drift criterion given in Theorem A.4 to obtain geometric ergodicity.

In order to avoid a parameter-dependent state space, and with an eye to situations where $\{Y_t\}$ may be started non-stationarily at time $t = 0$, we will formulate the proof for the state space $\mathbb{R} \times \mathbb{R}_+$.

Let $A \subset \mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$ be measurable with $\lambda(A) > 0$, $\lambda$ denoting the Lebesgue measure. Without loss of generality we may assume $A \subset \mathbb{R} \times [\sqrt{\omega/(1-\beta) + \delta}, \infty)$ for some $\delta > 0$.

We will show that, given $y \in \mathbb{R} \times \mathbb{R}_+$, there exists an $m \in \mathbb{N}$ such that for all measurable sets $A' \subset [\sqrt{\omega/(1-\beta) + \delta}, \infty)$ with positive Lebesgue measure we have $P(\sigma_m \in A' \mid Y_0 = y) > 0$. This implies $P(Y_m \in A \mid Y_0 = y) > 0$, since $Y_m = \sigma_m (\varepsilon_m, 1)'$, where $\varepsilon_m$ is independent of $\sigma_m$ and has a positive Lebesgue density.

Let $\psi_m(x,s) = \omega \sum_{k=0}^{m-1} \beta^k + \beta^{m-1}(\beta s^2 + \alpha x^2)$. Then, given $Y_0 = y \in \mathbb{R} \times \mathbb{R}_+$, we can find positive functions $h_i(\varepsilon_{i+1}, \dots, \varepsilon_{m-1})$, $i = 1, \dots, m-1$, such that (2.7) takes the form

    $\sigma_m^2 = \psi_m(y) + \sum_{i=1}^{m-1} \varepsilon_i^2\, h_i(\varepsilon_{i+1}, \dots, \varepsilon_{m-1}).$

Let $A'' = \{r^2 \mid r \in A'\}$. Choosing $m$ large enough, we have $\psi_m(y) - \omega/(1-\beta) < \delta$ and hence

    $P(\sigma_m \in A' \mid Y_0 = y) = P(\sigma_m^2 \in A'' \mid Y_0 = y)$
Fig. 1: Distribution of $\sqrt{n}(\hat\alpha_n - \alpha)$ with bootstrap approximation, for simulated GARCH(1,1) processes with $(\alpha, \beta, \omega) = (0.1, 0.4, 0.1)$ and sample size $n = 1000$. All plots are based on 2500 Monte Carlo replications.
Fig. 2: Distribution of $\sqrt{n}(\hat\beta_n - \beta)$ with bootstrap approximation. Parameters as in Figure 1.
Fig. 3: Distribution of $\sqrt{n}(\hat\omega_n - \omega)$ with bootstrap approximation. Parameters as in Figure 1.
    $= \displaystyle\int \cdots \int \mathbf{1}\left\{\sum_{i=1}^{m-1} z_i^2\, h_i(z_{i+1}, \dots, z_{m-1}) \in A'' - \psi_m(y)\right\} \prod_{i=1}^{m-1} g(z_i)\, dz_1 \cdots dz_{m-1} > 0,$

as $g > 0$ and $A'' - \psi_m(y) \subset \mathbb{R}_+$. Thus $\{Y_t\}$ is $\varphi$-irreducible, with $\varphi$ being the Lebesgue measure restricted to $\mathbb{R} \times [\sqrt{\omega/(1-\beta)}, \infty)$.
Using Lemma A.3, the aperiodicity follows similarly. Choose the compact set $A = [0,1] \times [\sqrt{\omega/(1-\beta)}, \sqrt{\omega/(1-\beta)} + 1]$, say, and $m_1 \in \mathbb{N}$ such that

    $\max_{y \in A}\left(\psi_m(y) - \dfrac{\omega}{1-\beta}\right) < \delta \qquad$ for all $m \ge m_1$.

Then, with the same arguments as above, for all $B \subset A$ with $\lambda(B) > 0$ we obtain

    $P(Y_m \in B \mid Y_0 = y) > 0 \quad$ and $\quad P(Y_{m+1} \in B \mid Y_0 = y) > 0 \qquad$ for all $y \in B$.
In order to prove that compact sets are small, consider the 2-step transition probability and let $C = [-M, M] \times (0, M]$ for some $M > 0$. We obtain for any Borel set $A \subset \mathbb{R} \times \mathbb{R}_+$ and any $y \in C$

    $P(Y_2 \in A \mid Y_0 = y) = \displaystyle\int\!\!\int \mathbf{1}\left\{\sqrt{\omega + (\alpha z_1^2 + \beta)\,\varphi^2(y)}\,(z_2, 1)' \in A\right\} g(z_1) g(z_2)\, dz_1\, dz_2$

    $\ge \displaystyle\int\!\!\int \mathbf{1}\{u\,(z_2, 1)' \in A\}\, g_C(u)\, g(z_2)\, du\, dz_2,$   (1.29)

where we substituted $u = \sqrt{\omega + (\alpha z_1^2 + \beta)\,\varphi^2(y)}$, used $\omega \le \varphi^2(y) \le (\alpha+\beta)M^2 + \omega$ and, if $z_1 = 0$, $u \le u_0 := \big(\beta(\alpha+\beta)M^2 + (1+\beta)\omega\big)^{1/2}$, and put

    $g_C(u) = \mathbf{1}_{[u_0, \infty)}(u)\; \dfrac{u}{(\alpha+\beta)M^2 + \omega}\; \sqrt{\dfrac{\omega}{\alpha(u^2 - \omega)}}\; \inf_{y \in C}\, g\!\left(\sqrt{\dfrac{u^2 - \omega - \beta\varphi^2(y)}{\alpha\varphi^2(y)}}\right).$

As $g$ is positive and continuous, $g_C$ is positive on $[u_0, \infty)$, and

    $\nu_C(A) = \displaystyle\int\!\!\int \mathbf{1}\{u\,(z, 1)' \in A\}\, g_C(u)\, g(z)\, du\, dz$

defines a non-trivial measure on $\mathbb{R} \times \mathbb{R}_+$ with

    $P(Y_2 \in A \mid Y_0 = y) \ge \nu_C(A) \qquad$ for all $y \in C$ and all Borel sets $A \subset \mathbb{R} \times \mathbb{R}_+$.

Thus we have shown that $C$ is small.
Now we are ready to apply the drift criterion given in Theorem A.4. Set

    $d = \dfrac{1}{2}\left(\dfrac{1}{\beta} + \dfrac{1}{1-\alpha}\right) - 1$

and define the test function $V(y) = 1 + d x^2 + s^2$ for $y = (x, s)' \in \mathbb{R} \times \mathbb{R}_+$. Then we have $d > 0$ and, as $E\varepsilon_1^2 = 1$,

    $\Delta V(y) = E\big[1 + d X_1^2 + \sigma_1^2 \mid Y_0 = y\big] - (1 + d x^2 + s^2)$
    $\qquad = (1+d)(\omega + \alpha x^2 + \beta s^2) - (d x^2 + s^2)$
    $\qquad = (1-\alpha)\,x^2\left(\dfrac{\alpha}{1-\alpha} - d\right) + s^2\big((1+d)\beta - 1\big) + (1+d)\omega$
    $\qquad = -\big[(1-\alpha)\,x^2 + \beta s^2\big]\, \dfrac{1}{2}\left(\dfrac{1}{\beta} - \dfrac{1}{1-\alpha}\right) + (1+d)\omega.$

Now $\alpha + \beta < 1$ implies $\beta < 1-\alpha$ and $\dfrac{1}{\beta} - \dfrac{1}{1-\alpha} > 0$. Defining the constants

    $\lambda = \dfrac{\beta}{4\max(1,d)}\left(\dfrac{1}{\beta} - \dfrac{1}{1-\alpha}\right) = \dfrac{1-(\alpha+\beta)}{4\max(1,d)(1-\alpha)}$

and $b = (1+d)\omega + 2\lambda$, and the compact set $C = [-\sqrt{b/\lambda}, \sqrt{b/\lambda}\,] \times (0, \sqrt{b/\lambda}\,]$, we thus have $\lambda \in (0, \tfrac{1}{2})$ and

    $\Delta V(y) \le -2\lambda V(y) + (1+d)\omega + 2\lambda \le -\lambda V(y) + b\,\mathbf{1}_C(y) \qquad$ for all $y \in \mathbb{R} \times \mathbb{R}_+$.   (1.30)

Thus the drift criterion holds. As $EV(Y_0) = 1 + (d+1)\mu < \infty$, this concludes the proof of the lemma.
Proof of Theorem 3.1. In order to avoid cumbersome notation, we will not distinguish between, say, $\sum_{t=1}^{n} x_t$ and $\sum_{t=h+1}^{n} x_t$. So, for instance, we will write $\hat\gamma_h = \frac{1}{n}\sum X_t^2 X_{t-h}^2 - \hat\mu^2$, neglecting terms of order $O_P(n^{-1})$.

In a first step we will prove joint asymptotic normality of

    $\widehat{(\alpha+\beta)}_n = \dfrac{\hat\gamma_2}{\hat\gamma_1}, \qquad \widetilde{(\beta^{-1}(1-\alpha\beta))}_n = \dfrac{\hat\gamma_2 - \hat\gamma_0}{\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0}, \qquad \tilde\omega_n = \hat\mu\,\big(1 - (\alpha+\beta)\big).$

By (3.12) we have

    $\sqrt{n}\,(\tilde\omega_n - \omega) = \sqrt{n}\,(\hat\mu - \mu)\big(1 - (\alpha+\beta)\big) = (1-\beta)\, n^{-1/2} \sum \eta_t.$   (1.31)

Furthermore,

    $\sqrt{n}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta)\right) = \hat\gamma_1^{-1}\,\sqrt{n}\left(\hat\gamma_2 - (\alpha+\beta)\hat\gamma_1\right)$
    $\qquad = \hat\gamma_1^{-1}\,\dfrac{1}{\sqrt{n}} \sum X_{t-2}^2 \big[X_t^2 - \hat\mu - (\alpha+\beta)(X_{t-1}^2 - \hat\mu)\big]$
    $\qquad = \hat\gamma_1^{-1}\,\dfrac{1}{\sqrt{n}} \sum X_{t-2}^2 \big(\eta_t - \beta\eta_{t-1} + (1-(\alpha+\beta))(\mu - \hat\mu)\big)$
    $\qquad = \hat\gamma_1^{-1}\,\dfrac{1}{\sqrt{n}} \sum \eta_t \big({-\beta} X_{t-1}^2 + X_{t-2}^2 - (1-\beta)\hat\mu\big),$

using (1.31). In a similar way we get

    $\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0 = \dfrac{1}{n} \sum \eta_t \big({-\beta} X_t^2 + X_{t-1}^2 - (1-\beta)\hat\mu\big),$

    $\hat\gamma_2 - \hat\gamma_0 = \dfrac{1}{n} \sum \big[(\alpha+\beta)X_{t-1}^2 - \beta\eta_{t-1} + \eta_t\big]\big(X_{t-2}^2 - X_t^2\big) = \dfrac{1}{n} \sum \eta_t \big(\beta X_{t+1}^2 - X_t^2 - \beta X_{t-1}^2 + X_{t-2}^2\big),$

and therefore

    $\hat\gamma_2 - \hat\gamma_0 - \beta^{-1}(1-\alpha\beta)\big(\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0\big) = \dfrac{1}{n} \sum \eta_t \Big(\beta X_{t+1}^2 - \alpha\beta X_t^2 - \big(\beta - \alpha + \beta^{-1}\big) X_{t-1}^2 + X_{t-2}^2 + \beta^{-1}(1-\alpha\beta)(1-\beta)\hat\mu\Big).$

Observing that

    $X_{t+1}^2 - \alpha X_t^2 = \omega + \beta X_t^2 + \eta_{t+1} - \beta\eta_t = \omega(1+\beta) + \beta(\alpha+\beta)X_{t-1}^2 - \beta^2 \eta_{t-1} + \eta_{t+1},$

we conclude that

    $\sqrt{n}\left(\widetilde{(\beta^{-1}(1-\alpha\beta))}_n - \beta^{-1}(1-\alpha\beta)\right) = \big(\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0\big)^{-1} \dfrac{1}{\sqrt{n}} \sum \eta_t \Big(\beta(1-\beta^2)\eta_{t-1} + \big(\beta^2(\alpha+\beta) + \alpha - \beta - \beta^{-1}\big) X_{t-1}^2 + X_{t-2}^2 + \beta^{-1}(1-\alpha\beta)(1-\beta)\hat\mu + \beta\omega(1+\beta)\Big).$

By the ergodic theorem we have $\hat\mu \to \mu$ and $\hat\gamma_h \to \gamma_h$ almost surely. In order to prove

    $\sqrt{n}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta),\ \widetilde{(\beta^{-1}(1-\alpha\beta))}_n - \beta^{-1}(1-\alpha\beta),\ \tilde\omega_n - \omega\right)' \xrightarrow{\;D\;} N\big(0,\, \tilde\Sigma\big)$   (1.32)

it is therefore sufficient to show

    $\dfrac{1}{\sqrt{n}} \sum \eta_t\, c'Z_t \xrightarrow{\;D\;} N\big(0,\, c'\tilde\Sigma c\big), \qquad c \in \mathbb{R}^3,$   (1.33)

using the Cramer-Wold device. As $Z_t$ is $\mathcal{F}_{t-1}$-measurable and $E[\eta_t \mid \mathcal{F}_{t-1}] = 0$, we observe that $\{\eta_t\, c'Z_t\}$ is a martingale difference sequence and hence

    $\sum_{t=-\infty}^{\infty} \mathrm{Cov}\big(\eta_t\, c'Z_t,\ \eta_0\, c'Z_0\big) = c' E\big[\eta_0^2 Z_0 Z_0'\big] c.$

An application of the CLT for strongly mixing sequences, see Appendix A.2, now gives (1.33).

For the next step we note

    $\widehat{(\beta^{-1}(1-\alpha\beta))}_n - \widetilde{(\beta^{-1}(1-\alpha\beta))}_n = \dfrac{\widehat{(\beta^{-1}(1-\alpha\beta))}_n\, \hat\gamma_0}{\hat\gamma_1 - (\alpha+\beta)\hat\gamma_0}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta)\right),$

    $\hat\omega_n - \tilde\omega_n = -\hat\mu\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta)\right).$

Hence (1.32), the relationship $\gamma_1 - (\alpha+\beta)\gamma_0 = -\beta\sigma_\eta^2$, and the ergodic theorem imply

    $\sqrt{n}\left(\widehat{(\alpha+\beta)}_n - (\alpha+\beta),\ \widehat{(\beta^{-1}(1-\alpha\beta))}_n - \beta^{-1}(1-\alpha\beta),\ \hat\omega_n - \omega\right)' \xrightarrow{\;D\;} N\big(0,\, D_1 \tilde\Sigma D_1'\big).$

An application of the delta method, using the function

    $T(x,y,z) = \begin{pmatrix} x - (x+y)/2 + \sqrt{(x+y)^2/4 - 1} \\ (x+y)/2 - \sqrt{(x+y)^2/4 - 1} \\ z \end{pmatrix}$   (1.34)

with derivative $\nabla T\big(\alpha+\beta,\ \beta^{-1}(1-\alpha\beta),\ \omega\big) = D_2$, concludes the proof of the theorem.
Proof of Lemma 4.1. (a) From (2.6) and (4.25) we have

    $\hat\sigma_t^2 - \sigma_t^2 = \sum_{k=0}^{t-1} \Big[\hat\omega\hat\beta^k - \omega\beta^k + \big(\hat\alpha\hat\beta^k - \alpha\beta^k\big) X_{t-1-k}^2\Big] + \hat\beta^t \hat\mu - \beta^t \sigma_0^2.$

Choose $b \in (\beta, 1)$ and set $B_n = \{\hat\beta \le b\}$. If $1 < r \le p$, an application of the Hölder inequality with $r$ and $q = (1 - 1/r)^{-1}$ gives

    $\left(\sum_{k=0}^{t-1} b^k X_{t-k-1}^2\right)^r \le \left(\sum_{k=0}^{t-1} b^{kq/2}\right)^{r/q} \left(\sum_{k=0}^{t-1} b^{kr/2} X_{t-k-1}^{2r}\right)$

and hence

    $E\left[\dfrac{1}{n} \sum_{t=1}^{n} \left(\sum_{k=0}^{t-1} b^k X_{t-k-1}^2\right)^r\right] \le \big(1 - b^{q/2}\big)^{-r/q} \big(1 - b^{r/2}\big)^{-1} E X_0^{2r}.$