QML estimation of a class of multivariate GARCH models without moment conditions on the observed process


Francq, Christian and Zakoian, Jean-Michel

Université Lille 3 GREMARS-EQUIPPE, CREST

February 2010

Online at https://mpra.ub.uni-muenchen.de/20779/

MPRA Paper No. 20779, posted 19 Feb 2010 23:38 UTC


By Christian Francq

University Lille 3, EQUIPPE-GREMARS and

By Jean-Michel Zakoïan

CREST and University Lille 3

We establish the strong consistency and asymptotic normality of the quasi-maximum likelihood estimator of the parameters of a class of multivariate GARCH processes. The conditions are mild and coincide with the minimal ones in the univariate case. In particular, contrary to the current literature on the estimation of multivariate GARCH models, no moment assumption is made on the observed process. Instead, we require strict stationarity, for which a necessary and sufficient condition is established.

1. Introduction. Since the inception of the univariate ARCH and GARCH models by Engle (1982) and Bollerslev (1986), a wide variety of multivariate extensions have been proposed. Recent reviews of the rapidly changing literature on multivariate GARCH models include Bauwens, Laurent and Rombouts (2006) and Silvennoinen and Teräsvirta (2009).

Although the asymptotic theory for multivariate GARCH has been less investigated than for univariate models, several papers have established asymptotic results for different specifications. Jeantheau (1998) gave general conditions for the strong consistency of the Quasi-Maximum Likelihood Estimator (QMLE) for multivariate GARCH models. Comte and Lieberman (2003) showed the consistency and the asymptotic normality of the QMLE for the BEKK formulation. Asymptotic results were established by Ling and McAleer (2003) for the CCC formulation of an ARMA-GARCH, and by Hafner and Preminger (2009a) for the VEC model.

In all these references, moment assumptions are made on the observed process. Given that the existence of such moments is doubtful for many financial series, such conditions can be restrictive. To our knowledge, consistency and asymptotic normality results for multivariate GARCH without moment restrictions have only been established by Hafner and Preminger (2009b), for a factor model of the form FF-GARCH. However, their model is a first-order model (it reduces to the standard GARCH(1,1) when the dimension is one). For univariate GARCH(p, q), it took almost twenty years to reach minimal assumptions for the strong consistency (SC) and the asymptotic normality (AN) of the QMLE. The most significant breakthrough in this direction was the paper by Berkes, Horváth and Kokoszka (2003), although slightly weaker conditions can be found in Francq and Zakoian (2004).

The main contribution of this article is to provide asymptotic results for the Constant Conditional Correlation (CCC) GARCH(p, q) under conditions which parallel those used in the univariate setting. The CCC-GARCH(p, q), introduced by Bollerslev (1990) and generalized by Jeantheau (1998), is undoubtedly one of the most popular multivariate GARCH models. The attractiveness of this class follows from its tractability: i) the number of unknown coefficients is smaller than in other specifications; ii) the conditions ensuring positive definiteness of the conditional variance are simple and explicit. Moreover, as we will see, the conditions ensuring the existence of strictly stationary solutions are explicit. Of course, more sophisticated classes of models can be seen as more realistic. This is in particular the case of the Dynamic Conditional Correlation (DCC) model introduced by Engle (2002), and studied by Engle and Sheppard (2001) and Nakatani and Teräsvirta (2009), among others. For such models, however, establishing a sound asymptotic theory of estimation seems a formidable task. We view the results of this paper as a first step in this direction.

The paper is organized as follows. In Section 2, we discuss the model assumptions and establish the strict stationarity condition. In Section 3, our main results concerning the asymptotic properties of the QMLE are stated. Proofs are relegated to Section 4.

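2. Model and strict stationarity condition. Let $(\epsilon_t)$ denote a vector process of dimension $m\times 1$. The process $(\epsilon_t)$ is called a CCC-GARCH(p, q) if it satisfies

$$
\begin{cases}
\epsilon_t = H_t^{1/2}\eta_t,\\[2pt]
H_t = D_t R D_t, \qquad D_t^2 = \mathrm{diag}(\underline{h}_t),\\[2pt]
\underline{h}_t = \omega + \displaystyle\sum_{i=1}^{q} A_i\,\underline{\epsilon}_{t-i} + \sum_{j=1}^{p} B_j\,\underline{h}_{t-j}, \qquad \underline{\epsilon}_t = \left(\epsilon_{1t}^2,\ldots,\epsilon_{mt}^2\right)',
\end{cases}
\tag{2.1}
$$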

where $R$ is a correlation matrix, $\omega$ is a vector of size $m\times 1$ with strictly positive coefficients, the $A_i$ and $B_j$ are matrices of size $m\times m$ with nonnegative coefficients, and $(\eta_t)$ is an iid sequence of centered variables on $\mathbb{R}^m$ with identity covariance matrix.

The CCC model was introduced by Bollerslev (1990) in its simplest version, assuming that the matrices $A_i$ and $B_j$ are diagonal. By contrast, in (2.1) the conditional variance $h_{kk,t}$ of the $k$-th component of $\epsilon_t$ depends not only on its own past values but also on the past values of the other components. For this reason, Model (2.1) is referred to as the Extended CCC model by He and Teräsvirta (2004).

In the latter reference, a sufficient condition for second-order and strict stationarity of a CCC-GARCH(1,1) is given. A sufficient condition for strict stationarity and the existence of fourth-order moments of the CCC-GARCH(p, q) is established in Aue, Hörmann, Horváth, and Reimherr (2009). Our first result provides a necessary and sufficient strict stationarity condition for the same model.
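To make the recursion in (2.1) concrete, the following Python sketch simulates a CCC-GARCH(1,1) path. It is only an illustration under stated assumptions: the innovations are taken to be Gaussian, $R^{1/2}$ is taken to be the Cholesky factor of $R$, the recursion is started at $\underline{h}=\omega$ with a burn-in period, and the function name `simulate_ccc_garch11` is hypothetical.

```python
import numpy as np

def simulate_ccc_garch11(omega, A, B, R, n, burn=500, seed=0):
    """Simulate n observations of a CCC-GARCH(1,1) as in (2.1).

    Assumptions of this sketch: Gaussian eta_t, Cholesky factor as R^{1/2},
    and an arbitrary positive starting value h = omega.
    """
    rng = np.random.default_rng(seed)
    m = len(omega)
    L = np.linalg.cholesky(R)                    # one valid choice of R^{1/2}
    h = np.array(omega, dtype=float)             # starting conditional variances
    eps = np.zeros((n + burn, m))
    for t in range(n + burn):
        eta_tilde = L @ rng.standard_normal(m)   # covariance matrix R
        eps[t] = np.sqrt(h) * eta_tilde          # eps_t = D_t * eta_tilde_t
        h = omega + A @ eps[t] ** 2 + B @ h      # variances for the next date
    return eps[burn:]

omega = np.array([0.1, 0.2])
A = np.array([[0.05, 0.02], [0.01, 0.08]])       # non-diagonal: "extended" CCC
B = np.array([[0.80, 0.00], [0.05, 0.75]])
R = np.array([[1.0, 0.3], [0.3, 1.0]])
eps = simulate_ccc_garch11(omega, A, B, R, n=200)
```

The off-diagonal entries of `A` and `B` are what distinguish the extended model from the diagonal specification of Bollerslev (1990).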

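Write

$$
\epsilon_t = D_t\tilde{\eta}_t, \qquad\text{where}\quad \tilde{\eta}_t = R^{1/2}\eta_t \tag{2.2}
$$

is a centered vector with covariance matrix $R$. Thus

$$
\underline{\epsilon}_t = \Upsilon_t\,\underline{h}_t, \qquad\text{where}\quad
\Upsilon_t = \begin{pmatrix}
\tilde{\eta}_{1t}^2 & 0 & \cdots & 0\\
0 & \ddots & & \vdots\\
\vdots & & \ddots & 0\\
0 & \cdots & 0 & \tilde{\eta}_{mt}^2
\end{pmatrix}.
$$

Define the $(p+q)m\times(p+q)m$ matrix

$$
C_t = \begin{pmatrix}
\Upsilon_t A_1 & \cdots & \Upsilon_t A_{q-1} & \Upsilon_t A_q & \Upsilon_t B_1 & \cdots & \Upsilon_t B_{p-1} & \Upsilon_t B_p\\
I_m & \cdots & 0 & 0 & 0 & \cdots & 0 & 0\\
\vdots & \ddots & \vdots & \vdots & \vdots & & \vdots & \vdots\\
0 & \cdots & I_m & 0 & 0 & \cdots & 0 & 0\\
A_1 & \cdots & A_{q-1} & A_q & B_1 & \cdots & B_{p-1} & B_p\\
0 & \cdots & 0 & 0 & I_m & \cdots & 0 & 0\\
\vdots & & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & \cdots & 0 & 0 & 0 & \cdots & I_m & 0
\end{pmatrix}.
\tag{2.3}
$$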

We are now in a position to state the following result.

Theorem 2.1. A necessary and sufficient condition for the existence of a strictly stationary and nonanticipative solution process to Model (2.1) is $\gamma(\mathbf{C}_0)<0$, where $\gamma(\mathbf{C}_0)$ is the top Lyapunov exponent of the sequence $\mathbf{C}_0=\{C_t,\ t\in\mathbb{Z}\}$ defined in (2.3). When $\gamma(\mathbf{C}_0)<0$, this stationary and nonanticipative solution is unique and ergodic.
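The top Lyapunov exponent in Theorem 2.1 is rarely available in closed form, but it can be approximated by simulation. The sketch below does this for $p=q=1$, where $C_t$ reduces to the $2m\times 2m$ block matrix with rows $(\Upsilon_tA_1,\ \Upsilon_tB_1)$ and $(A_1,\ B_1)$. It is a Monte Carlo illustration rather than the authors' procedure: it assumes Gaussian $\eta_t$, uses the Cholesky factor as $R^{1/2}$, assumes $A_1$ and $B_1$ are not both zero, and the function name `top_lyapunov_ccc11` is hypothetical.

```python
import numpy as np

def top_lyapunov_ccc11(A, B, R, n_iter=20000, seed=0):
    """Monte Carlo estimate of gamma(C_0) for a CCC-GARCH(1,1).

    The running vector is renormalized at every step so that the product
    of random matrices never overflows or underflows.
    """
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    L = np.linalg.cholesky(R)
    x = np.ones(2 * m) / np.sqrt(2 * m)           # positive unit start vector
    log_growth = 0.0
    for _ in range(n_iter):
        ups = (L @ rng.standard_normal(m)) ** 2   # diagonal of Upsilon_t
        C = np.block([[ups[:, None] * A, ups[:, None] * B],
                      [A, B]])
        x = C @ x
        norm = np.linalg.norm(x)
        log_growth += np.log(norm)
        x /= norm
    return log_growth / n_iter

R = np.array([[1.0, 0.3], [0.3, 1.0]])
gamma_small = top_lyapunov_ccc11(0.05 * np.ones((2, 2)), 0.1 * np.eye(2), R, n_iter=2000)
gamma_large = top_lyapunov_ccc11(3.0 * np.eye(2), 3.0 * np.eye(2), R, n_iter=2000)
```

A negative estimate, as for `gamma_small`, is consistent with strict stationarity. A positive one, as for `gamma_large`, indicates that no strictly stationary nonanticipative solution exists.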

The following result provides a necessary strict stationarity condition which is simple to check. Denote by $\det(A)$ or $|A|$ the determinant of a square matrix $A$.

Corollary 2.1. Define the matrix polynomial $\mathcal{B}(z) = I_m - zB_1 - \cdots - z^pB_p$, $z\in\mathbb{C}$, and let

$$
\mathbb{B} = \begin{pmatrix}
B_1 & B_2 & \cdots & B_p\\
I_m & 0 & \cdots & 0\\
0 & I_m & \cdots & 0\\
\vdots & \ddots & \ddots & \vdots\\
0 & \cdots & I_m & 0
\end{pmatrix}.
$$

Then, if $\gamma(\mathbf{C}_0)<0$, the following equivalent properties hold:

1. The roots of $\det\mathcal{B}(z)$ are outside the unit disk.

2. $\rho(\mathbb{B})<1$.
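The necessary condition of Corollary 2.1 is easy to check numerically. The following sketch builds the companion matrix $\mathbb{B}$ from $B_1,\ldots,B_p$ and computes its spectral radius. The helper names `companion` and `spectral_radius` and the numerical values are illustrative assumptions.

```python
import numpy as np

def companion(B_list):
    """Stack B_1, ..., B_p into the mp x mp companion matrix of Corollary 2.1."""
    p = len(B_list)
    m = B_list[0].shape[0]
    top = np.hstack(B_list)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(m * (p - 1)), np.zeros((m * (p - 1), m))])
    return np.vstack([top, bottom])

def spectral_radius(M):
    """Largest modulus of the eigenvalues of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

B1 = np.array([[0.3, 0.1], [0.0, 0.2]])
B2 = np.array([[0.1, 0.0], [0.1, 0.2]])
comp = companion([B1, B2])
rho = spectral_radius(comp)   # must be < 1 for gamma(C_0) < 0 to be possible
```

Since the condition is only necessary, `rho < 1` does not by itself guarantee strict stationarity. It rules out parameter values for which $\gamma(\mathbf{C}_0)<0$ cannot hold.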

The following result will be extremely useful for proving the consistency and asymptotic normality (CAN) of the QMLE under minimal conditions.

Corollary 2.2. Suppose $\gamma(\mathbf{C}_0)<0$. Let $\epsilon_t$ be the strictly stationary and nonanticipative solution of Model (2.1). There exists $s>0$ such that $E\|\underline{h}_t\|^s<\infty$ and $E\|\epsilon_t\|^{2s}<\infty$.

3. QML estimation. The parameters consist of the coefficients of the vector $\omega$ and of the matrices $A_i$ and $B_j$, and the coefficients of the lower triangular part (excluding the diagonal) of the correlation matrix $R=(\rho_{ij})$. The number of unknown parameters is thus

$$
s_0 = m + m^2(p+q) + \frac{m(m-1)}{2}.
$$

The parameter vector is denoted

$$
\theta = (\theta_1,\ldots,\theta_{s_0})' = (\omega',\alpha_1',\ldots,\alpha_q',\beta_1',\ldots,\beta_p',\rho')' := (\omega',\alpha',\beta',\rho')',
$$

where $\rho' = (\rho_{21},\ldots,\rho_{m1},\rho_{32},\ldots,\rho_{m2},\ldots,\rho_{m,m-1})$, $\alpha_i=\mathrm{vec}(A_i)$, $i=1,\ldots,q$, and $\beta_j=\mathrm{vec}(B_j)$, $j=1,\ldots,p$. The parameter space is a subspace $\Theta$ of

$$
(0,+\infty)^{m}\times[0,\infty)^{m^2(p+q)}\times(-1,1)^{m(m-1)/2}.
$$

The true parameter value is denoted

$$
\theta_0 = (\omega_0',\alpha_{01}',\ldots,\alpha_{0q}',\beta_{01}',\ldots,\beta_{0p}',\rho_0')' = (\omega_0',\alpha_0',\beta_0',\rho_0')'.
$$

Before detailing the estimation procedure and its properties, we discuss conditions to impose on the matrices $A_i$ and $B_j$ in order to ensure the uniqueness of the parameterization.

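3.1. Identifiability Conditions. Let $\mathcal{A}_\theta(z)=\sum_{i=1}^{q}A_iz^i$ and $\mathcal{B}_\theta(z)=I_m-\sum_{j=1}^{p}B_jz^j$. By convention, $\mathcal{A}_\theta(z)=0$ if $q=0$ and $\mathcal{B}_\theta(z)=I_m$ if $p=0$.

If the roots of $\det(\mathcal{B}_\theta(z))=0$ are outside the unit disk, we deduce from $\mathcal{B}_\theta(B)\underline{h}_t=\omega+\mathcal{A}_\theta(B)\underline{\epsilon}_t$, where $B$ denotes the backshift operator, the representation

$$
\underline{h}_t=\mathcal{B}_\theta(1)^{-1}\omega+\mathcal{B}_\theta(B)^{-1}\mathcal{A}_\theta(B)\underline{\epsilon}_t. \tag{3.1}
$$

In the vector case, assuming that the polynomials $\mathcal{A}_{\theta_0}$ and $\mathcal{B}_{\theta_0}$ have no common root does not suffice to ensure that there exists no other pair $(\mathcal{A}_\theta,\mathcal{B}_\theta)$, with the same degrees $(p,q)$, such that

$$
\mathcal{B}_\theta(B)^{-1}\mathcal{A}_\theta(B)=\mathcal{B}_{\theta_0}(B)^{-1}\mathcal{A}_{\theta_0}(B). \tag{3.2}
$$

This condition is equivalent to the existence of an operator $U(B)$ such that

$$
\mathcal{A}_\theta(B)=U(B)\mathcal{A}_{\theta_0}(B)\quad\text{and}\quad\mathcal{B}_\theta(B)=U(B)\mathcal{B}_{\theta_0}(B),
$$

this common factor vanishing in $\mathcal{B}_\theta(B)^{-1}\mathcal{A}_\theta(B)$.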

The polynomial $U(B)$ is called unimodular if $\det\{U(B)\}$ is a nonzero constant. When the only common factors of the polynomials $P(B)$ and $Q(B)$ are unimodular, that is when

$$
P(B)=U(B)P_1(B),\quad Q(B)=U(B)Q_1(B)\ \Longrightarrow\ \det\{U(B)\}=\text{constant},
$$

$P(B)$ and $Q(B)$ are called left coprime.

The following example shows that, in the vector case, assuming that $\mathcal{A}_{\theta_0}(B)$ and $\mathcal{B}_{\theta_0}(B)$ are left coprime is not sufficient to ensure that (3.2) has no solution $\theta\neq\theta_0$ (in the univariate case this is sufficient because the condition $\mathcal{B}_\theta(0)=\mathcal{B}_{\theta_0}(0)=1$ imposes $U(B)=U(0)=1$).

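Example 3.1 (Non identifiable bivariate model). For $m=2$, let

$$
\mathcal{A}_{\theta_0}(B)=\begin{pmatrix}a_{11}(B)&a_{12}(B)\\a_{21}(B)&a_{22}(B)\end{pmatrix},\quad
\mathcal{B}_{\theta_0}(B)=\begin{pmatrix}b_{11}(B)&b_{12}(B)\\b_{21}(B)&b_{22}(B)\end{pmatrix},\quad
U(B)=\begin{pmatrix}1&0\\B&1\end{pmatrix},
$$

with $\deg(a_{21})=\deg(a_{22})=q$, $\deg(a_{11})<q$, $\deg(a_{12})<q$ and $\deg(b_{21})=\deg(b_{22})=p$, $\deg(b_{11})<p$, $\deg(b_{12})<p$. The polynomial $\mathcal{A}(B)=U(B)\mathcal{A}_{\theta_0}(B)$ has the same degree $q$ as $\mathcal{A}_{\theta_0}(B)$, and $\mathcal{B}(B)=U(B)\mathcal{B}_{\theta_0}(B)$ is a polynomial of the same degree $p$ as $\mathcal{B}_{\theta_0}(B)$. On the other hand, $U(B)$ has a nonzero determinant which is independent of $B$, hence it is unimodular. Moreover, $\mathcal{B}(0)=\mathcal{B}_{\theta_0}(0)=I_m$ and $\mathcal{A}(0)=\mathcal{A}_{\theta_0}(0)=0$. It is thus possible to find $\theta$ such that $\mathcal{B}(B)=\mathcal{B}_\theta(B)$, $\mathcal{A}(B)=\mathcal{A}_\theta(B)$ and $\omega=U(1)\omega_0$. The model is thus non identifiable, $\theta$ and $\theta_0$ corresponding to the same representation (3.1).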
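The mechanism of Example 3.1 can be checked numerically on a small instance with $p=q=2$. In the sketch below, matrix polynomials are stored as arrays of coefficient matrices (index $k$ holds the coefficient of $B^k$), and the numerical values and helper names (`polymat_mul`, `polymat_eval`) are illustrative assumptions. The sketch only verifies the algebra (unchanged degrees, unchanged values at 0, and identical transfer function). It does not check whether the transformed coefficients remain in the parameter space.

```python
import numpy as np

def polymat_mul(U, P):
    """Product of two matrix polynomials given as coefficient stacks."""
    out = np.zeros((U.shape[0] + P.shape[0] - 1,) + P.shape[1:])
    for i, Ui in enumerate(U):
        for j, Pj in enumerate(P):
            out[i + j] += Ui @ Pj
    return out

def polymat_eval(P, z):
    """Evaluate the matrix polynomial at the scalar z."""
    return sum(Pk * z**k for k, Pk in enumerate(P))

I2 = np.eye(2)
A1 = np.array([[0.10, 0.20], [0.05, 0.10]])
A2 = np.array([[0.00, 0.00], [0.10, 0.05]])   # first row zero: deg(a11), deg(a12) < q
B1 = np.array([[0.20, 0.10], [0.10, 0.20]])
B2 = np.array([[0.00, 0.00], [0.10, 0.10]])   # first row zero: deg(b11), deg(b12) < p

A0 = np.stack([np.zeros((2, 2)), A1, A2])      # A(z) = A1 z + A2 z^2
B0 = np.stack([I2, -B1, -B2])                  # B(z) = I - B1 z - B2 z^2
U = np.stack([I2, np.array([[0.0, 0.0], [1.0, 0.0]])])   # U(z) = [[1, 0], [z, 1]]

A_new = polymat_mul(U, A0)
B_new = polymat_mul(U, B0)

z = 0.3
lhs = np.linalg.solve(polymat_eval(B_new, z), polymat_eval(A_new, z))
rhs = np.linalg.solve(polymat_eval(B0, z), polymat_eval(A0, z))
```

Here `A_new[3]` and `B_new[3]` vanish, so the degrees are unchanged, while `lhs` and `rhs` coincide although `A_new` differs from `A0`.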

Identifiability can be ensured by several types of conditions (see for instance Reinsel, 1997, p. 37-40). To obtain a mild condition define, for any column $i$ of the matrix operators $\mathcal{A}_\theta(B)$ and $\mathcal{B}_\theta(B)$, the maximal degrees $q_i(\theta)$ and $p_i(\theta)$, respectively. Suppose that maximal values are imposed for these orders, that is

$$
\forall\theta\in\Theta,\ \forall i=1,\ldots,m,\qquad q_i(\theta)\le q_i\quad\text{and}\quad p_i(\theta)\le p_i, \tag{3.3}
$$

where $q_i\le q$ and $p_i\le p$ are fixed integers. Denote by $a_{q_i}(i)$ (resp. $b_{p_i}(i)$) the column vector of the coefficients of $B^{q_i}$ (resp. $B^{p_i}$) in the $i$-th column of $\mathcal{A}_{\theta_0}(B)$ (resp. $\mathcal{B}_{\theta_0}(B)$).

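Example 3.2 (Illustration of the notations on an example). For

$$
\mathcal{A}_{\theta_0}(B)=\begin{pmatrix}1+a_{11}B^2&a_{12}B\\a_{21}B^2+a_{21}^{*}B&1+a_{22}B\end{pmatrix},\qquad
\mathcal{B}_{\theta_0}(B)=\begin{pmatrix}1+b_{11}B^4&b_{12}B\\b_{21}B^4&1+b_{22}B\end{pmatrix},
$$

with $a_{11}a_{21}a_{12}a_{22}b_{11}b_{21}b_{12}b_{22}\neq0$, we have

$$
q_1(\theta_0)=2,\quad q_2(\theta_0)=1,\quad p_1(\theta_0)=4,\quad p_2(\theta_0)=1
$$

and

$$
a_2(1)=\begin{pmatrix}a_{11}\\a_{21}\end{pmatrix},\quad
a_1(2)=\begin{pmatrix}a_{12}\\a_{22}\end{pmatrix},\quad
b_4(1)=\begin{pmatrix}b_{11}\\b_{21}\end{pmatrix},\quad
b_1(2)=\begin{pmatrix}b_{12}\\b_{22}\end{pmatrix}.
$$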

Proposition 3.1 (A simple identifiability condition). If the matrix

$$
M(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})=\left[a_{q_1}(1)\ \cdots\ a_{q_m}(m)\ \ b_{p_1}(1)\ \cdots\ b_{p_m}(m)\right] \tag{3.4}
$$

has full rank $m$, the parameters $\alpha_0$ and $\beta_0$ are identified by the constraints (3.3) with $q_i=q_i(\theta_0)$ and $p_i=p_i(\theta_0)$ for any value of $i$.

Proof. Let $U(B)=U_0+U_1B+\cdots+U_kB^k$. Since the term of highest degree (column by column) of $\mathcal{A}_{\theta_0}(B)$ is $[a_{q_1}(1)B^{q_1}\ \cdots\ a_{q_m}(m)B^{q_m}]$, the $i$-th column of $\mathcal{A}_\theta(B)=U(B)\mathcal{A}_{\theta_0}(B)$ is a polynomial in $B$ of degree at most $q_i$ if and only if $U_ja_{q_i}(i)=0$ for $j=1,\ldots,k$. Similarly we must have $U_jb_{p_i}(i)=0$ for $j=1,\ldots,k$ and $i=1,\ldots,m$. It follows that $U_jM(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})=0$, which implies $U_j=0$ for $j=1,\ldots,k$ because the matrix in (3.4) has full rank. Consequently $U(B)=U_0$ and, since $\mathcal{B}_\theta(0)=I_m$ for all $\theta$, we have $U(B)=I_m$.

Example 3.3 (Illustration of the identifiability condition). In Example 3.1,

$$
M(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})=\left[a_q(1)\ a_q(2)\ b_p(1)\ b_p(2)\right]=\begin{bmatrix}0&0&0&0\\\times&\times&\times&\times\end{bmatrix}
$$

is not a full-rank matrix. Hence, the identifiability condition of Proposition 3.1 is not satisfied. Indeed, the model is not identifiable.

A simpler, but more restrictive, condition is obtained by imposing that

$$
M_1(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})=[A_q\ \ B_p]
$$

has full rank $m$. This entails uniqueness under the constraint that the degrees of $\mathcal{A}_\theta$ and $\mathcal{B}_\theta$ are at most $q$ and $p$, respectively.

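Example 3.4 (Another illustration of the identifiability condition). Turning again to Example 3.2 with $a_{12}b_{21}=a_{22}b_{11}$ and, for instance, $a_{21}=0$ and $a_{22}\neq0$, observe that the matrix

$$
M_1(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})=\begin{bmatrix}0&a_{12}&b_{11}&0\\0&a_{22}&b_{21}&0\end{bmatrix}
$$

does not have full rank, but the matrix

$$
M(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})=\begin{bmatrix}a_{11}&a_{12}&b_{11}&b_{12}\\0&a_{22}&b_{21}&b_{22}\end{bmatrix}
$$

has full rank.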
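The rank conditions on $M$ and $M_1$ are straightforward to verify numerically. The sketch below plugs hypothetical numerical values, chosen so that $a_{12}b_{21}=a_{22}b_{11}$ and $a_{21}=0$, into the two matrices displayed in Example 3.4 and computes their ranks.

```python
import numpy as np

# Hypothetical coefficient values satisfying a12 * b21 == a22 * b11 and a21 == 0
a11, a12, a22 = 0.5, 1.0, 2.0
b11, b21, b12, b22 = 1.0, 2.0, 0.3, 0.4

M1 = np.array([[0.0, a12, b11, 0.0],
               [0.0, a22, b21, 0.0]])
M = np.array([[a11, a12, b11, b12],
              [0.0, a22, b21, b22]])

rank_M1 = np.linalg.matrix_rank(M1)   # nonzero columns are proportional
rank_M = np.linalg.matrix_rank(M)     # first column breaks the proportionality
```

With these values `rank_M1` is 1 and `rank_M` is 2, so only the condition of Proposition 3.1 holds.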

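3.2. Asymptotic Properties of the QML Estimator of the CCC-GARCH. Let $(\epsilon_1,\ldots,\epsilon_n)$ be an observation of length $n$ of the unique nonanticipative and strictly stationary solution $(\epsilon_t)$ of Model (2.1). Conditionally on nonnegative initial values $\underline{\epsilon}_0,\ldots,\underline{\epsilon}_{1-q},\tilde{\underline{h}}_0,\ldots,\tilde{\underline{h}}_{1-p}$, the Gaussian quasi-likelihood is written as

$$
L_n(\theta)=L_n(\theta;\epsilon_1,\ldots,\epsilon_n)=\prod_{t=1}^{n}\frac{1}{(2\pi)^{m/2}|\tilde{H}_t|^{1/2}}\exp\left(-\frac{1}{2}\epsilon_t'\tilde{H}_t^{-1}\epsilon_t\right),
$$

where the $\tilde{H}_t$ are recursively defined, for $t\ge1$, by

$$
\begin{cases}
\tilde{H}_t=\tilde{D}_tR\tilde{D}_t,\qquad \tilde{D}_t=\{\mathrm{diag}(\tilde{\underline{h}}_t)\}^{1/2},\\[2pt]
\tilde{\underline{h}}_t=\tilde{\underline{h}}_t(\theta)=\omega+\displaystyle\sum_{i=1}^{q}A_i\underline{\epsilon}_{t-i}+\sum_{j=1}^{p}B_j\tilde{\underline{h}}_{t-j}.
\end{cases}
$$

A QML estimator of $\theta$ is defined as any measurable solution $\hat{\theta}_n$ of

$$
\hat{\theta}_n=\arg\max_{\theta\in\Theta}L_n(\theta)=\arg\min_{\theta\in\Theta}\tilde{l}_n(\theta), \tag{3.5}
$$

where

$$
\tilde{l}_n(\theta)=n^{-1}\sum_{t=1}^{n}\tilde{\ell}_t,\qquad\text{and}\qquad \tilde{\ell}_t=\tilde{\ell}_t(\theta)=\epsilon_t'\tilde{H}_t^{-1}\epsilon_t+\log|\tilde{H}_t|.
$$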
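The criterion in (3.5) can be evaluated with a short loop. The sketch below does so for $p=q=1$, using $|\tilde{H}_t|=|R|\prod_i\tilde{h}_{ii,t}$ and $\epsilon_t'\tilde{H}_t^{-1}\epsilon_t=(\tilde{D}_t^{-1}\epsilon_t)'R^{-1}(\tilde{D}_t^{-1}\epsilon_t)$. The choice of initial values (squared $\epsilon_0$ set to zero and $\tilde{\underline{h}}_0=\omega$ by default) and the function name `qml_criterion` are assumptions of this illustration, not prescriptions of the paper.

```python
import numpy as np

def qml_criterion(eps, omega, A, B, R, h0=None):
    """Return the averaged Gaussian quasi-likelihood criterion for p = q = 1.

    eps has shape (n, m); the initial values are assumptions of this sketch.
    """
    n, m = eps.shape
    R_inv = np.linalg.inv(R)
    _, logdet_R = np.linalg.slogdet(R)
    h = np.array(omega if h0 is None else h0, dtype=float)   # h~_0
    eps2_prev = np.zeros(m)                                  # squared eps_0
    total = 0.0
    for t in range(n):
        h = omega + A @ eps2_prev + B @ h                    # h~_t
        u = eps[t] / np.sqrt(h)                              # D~_t^{-1} eps_t
        total += u @ R_inv @ u + np.sum(np.log(h)) + logdet_R
        eps2_prev = eps[t] ** 2
    return total / n

eps = np.array([[0.5, -0.2], [0.1, 0.3]])
omega = np.array([0.1, 0.2])
A = np.array([[0.10, 0.05], [0.00, 0.20]])
B = np.array([[0.30, 0.00], [0.10, 0.40]])
R = np.array([[1.0, 0.5], [0.5, 1.0]])
value = qml_criterion(eps, omega, A, B, R)
```

In practice this function would be passed to a constrained numerical optimizer over $\Theta$. That step is omitted here.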

The following assumptions will be used to establish the strong consistency of the QML estimator.

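A1: $\theta_0\in\Theta$ and $\Theta$ is compact.

A2: $\gamma(\mathbf{C}_0)<0$ and $\forall\theta\in\Theta$, $|\mathcal{B}_\theta(z)|=0\Rightarrow|z|>1$.

A3: The components of $\eta_t$ are independent and their squares have nondegenerate distributions.

A4: If $p>0$, $\mathcal{A}_{\theta_0}(z)$ and $\mathcal{B}_{\theta_0}(z)$ are left coprime and $M_1(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})$ has full rank $m$.

A5: $R$ is a positive-definite correlation matrix for all $\theta\in\Theta$.

If the space $\Theta$ is constrained by (3.3), that is if maximal orders are imposed for each component of $\underline{\epsilon}_t$ and $\underline{h}_t$ in each equation, Assumption A4 can be replaced by the more general condition:

A4': If $p>0$, $\mathcal{A}_{\theta_0}(z)$ and $\mathcal{B}_{\theta_0}(z)$ are left coprime and $M(\mathcal{A}_{\theta_0},\mathcal{B}_{\theta_0})$ has full rank $m$.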

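It will be useful to approximate the sequence $(\tilde{\ell}_t(\theta))$ by an ergodic and stationary sequence. Assumption A2 implies that there exists a strictly stationary, nonanticipative and ergodic solution $(\underline{h}_t)_t=\{\underline{h}_t(\theta)\}_t$ of

$$
\underline{h}_t=\omega+\sum_{i=1}^{q}A_i\underline{\epsilon}_{t-i}+\sum_{j=1}^{p}B_j\underline{h}_{t-j},\qquad\forall t. \tag{3.6}
$$

Now, letting $D_t=\{\mathrm{diag}(\underline{h}_t)\}^{1/2}$ and $H_t=D_tRD_t$, we define

$$
l_n(\theta)=l_n(\theta;\epsilon_n,\epsilon_{n-1},\ldots)=n^{-1}\sum_{t=1}^{n}\ell_t,\qquad \ell_t=\ell_t(\theta)=\epsilon_t'H_t^{-1}\epsilon_t+\log|H_t|.
$$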

We are now in a position to state the following consistency theorem.

Theorem 3.1 (Strong consistency). Let $(\hat{\theta}_n)$ be a sequence of QML estimators satisfying (3.5). Then, under A1-A5 (or A1-A3, A4' and A5),

$$
\hat{\theta}_n\to\theta_0,\quad\text{almost surely as } n\to\infty.
$$

To establish the asymptotic normality we require the following additional assumptions.

A6: $\theta_0\in\mathring{\Theta}$, where $\mathring{\Theta}$ is the interior of $\Theta$.

A7: $E\|\eta_t\eta_t'\|^2<\infty$.

Theorem 3.2 (Asymptotic normality). Under the assumptions of Theorem 3.1 and A6-A7, $\sqrt{n}(\hat{\theta}_n-\theta_0)$ converges in distribution to $\mathcal{N}(0,J^{-1}IJ^{-1})$, where $J$ is a positive-definite matrix and $I$ is a positive-semidefinite matrix, defined by

$$
I=E\left[\frac{\partial\ell_t(\theta_0)}{\partial\theta}\frac{\partial\ell_t(\theta_0)}{\partial\theta'}\right],\qquad
J=E\left[\frac{\partial^2\ell_t(\theta_0)}{\partial\theta\,\partial\theta'}\right].
$$

It is worth noting that the conditions ensuring consistency and asymptotic normality are mild. When $m=1$, they reduce to the minimal ones in the univariate setting. In particular, no assumption is made concerning the existence of moments of the observed process.
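In applications, the sandwich form $J^{-1}IJ^{-1}$ of Theorem 3.2 is typically estimated by replacing the expectations with sample averages. The following sketch assumes that per-observation scores and Hessians of $\ell_t$ at $\hat{\theta}_n$ are already available as arrays (how they are obtained is left open). The function name `sandwich_covariance` is hypothetical.

```python
import numpy as np

def sandwich_covariance(scores, hessians):
    """Empirical J^{-1} I J^{-1}.

    scores:   array of shape (n, d), row t is the gradient of l_t
    hessians: array of shape (n, d, d), slice t is the Hessian of l_t
    """
    n = scores.shape[0]
    I_hat = scores.T @ scores / n          # average outer product of scores
    J_hat = hessians.mean(axis=0)          # average Hessian
    J_inv = np.linalg.inv(J_hat)
    return J_inv @ I_hat @ J_inv

# Tiny artificial input, used only to show the shapes involved
scores = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 0.0], [0.0, -2.0]])
hessians = np.repeat(2.0 * np.eye(2)[None, :, :], 4, axis=0)
cov = sandwich_covariance(scores, hessians)
```

Dividing the diagonal of `cov` by $n$ and taking square roots would give approximate standard errors for the components of $\hat{\theta}_n$.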

4. Proofs.

4.1. Proof of Theorem 2.1. The proof is similar to that given by Bougerol and Picard (1992) for univariate GARCH(p, q) models. Since the variables $\eta_t$ admit a variance, the condition $E\log^+\|C_t\|<\infty$ is satisfied. It follows that when $\gamma(\mathbf{C}_0)<0$ the series

$$
\tilde{\underline{z}}_t=\underline{b}_t+\sum_{n=0}^{\infty}C_tC_{t-1}\cdots C_{t-n}\underline{b}_{t-n-1} \tag{4.1}
$$

converges almost surely for all $t$. A strictly stationary solution to Model (2.1) is obtained as $\epsilon_t=\{\mathrm{diag}(\tilde{\underline{z}}_{q+1,t})\}^{1/2}R^{1/2}\eta_t$, where $\tilde{\underline{z}}_{q+1,t}$ denotes the $(q+1)$-th sub-vector of size $m$ of $\tilde{\underline{z}}_t$. This solution is thus nonanticipative and ergodic. The proof of the uniqueness is exactly the same as in the univariate case.

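The proof of the necessary part can also be easily adapted. From Bougerol and Picard (1992), Lemma 3.4, it is sufficient to prove that $\lim_{t\to\infty}\|C_0\cdots C_{-t}\|=0$. It suffices to show that, for $1\le i\le p+q$,

$$
\lim_{t\to\infty}C_0\cdots C_{-t}\,\underline{e}_i=0,\quad\text{a.s.} \tag{4.2}
$$

where $\underline{e}_i=e_i\otimes I_m$ and $e_i$ is the $i$-th element of the canonical basis of $\mathbb{R}^{p+q}$, since any vector $x$ of $\mathbb{R}^{m(p+q)}$ can be decomposed, in a unique way, as $x=\sum_{i=1}^{p+q}\underline{e}_ix_i$ where $x_i\in\mathbb{R}^m$. As in the univariate case, the existence of a strictly stationary solution implies that $C_0\cdots C_{-k}\underline{b}_{-k-1}$ tends to 0, almost surely, as $k\to\infty$. It follows that, using the relation $\underline{b}_{-k-1}=\underline{e}_1\Upsilon_{-k-1}\omega+\underline{e}_{q+1}\omega$, we have

$$
\lim_{k\to\infty}C_0\cdots C_{-k}\,\underline{e}_1\Upsilon_{-k-1}\omega=0,\qquad
\lim_{k\to\infty}C_0\cdots C_{-k}\,\underline{e}_{q+1}\omega=0,\quad\text{a.s.} \tag{4.3}
$$

Since the components of $\omega$ are strictly positive, (4.2) thus holds for $i=q+1$. Using

$$
C_{-k}\,\underline{e}_{q+i}=\underline{e}_1\Upsilon_{-k}B_i+\underline{e}_{q+1}B_i+\underline{e}_{q+i+1},\qquad i=1,\ldots,p, \tag{4.4}
$$

with, by convention, $\underline{e}_{p+q+1}=0$, for $i=1$ we obtain

$$
0=\lim_{k\to\infty}C_0\cdots C_{-k}\,\underline{e}_{q+1}\ \ge\ \lim_{k\to\infty}C_0\cdots C_{-k+1}\,\underline{e}_{q+2}\ \ge\ 0,
$$

where the inequalities are taken componentwise. Therefore, (4.2) holds true for $i=q+2$ and, by induction, for $i=q+j$, $j=1,\ldots,p$, in view of (4.4). Moreover, since $C_{-k}\,\underline{e}_q=\underline{e}_1\Upsilon_{-k}A_q+\underline{e}_{q+1}A_q$, (4.2) holds for $i=q$. We conclude for the other values of $i$ using an ascending recursion, as in the univariate case.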

4.2. Proof of Corollary 2.1. Because all the entries of the matrices $C_t$ are nonnegative, it is clear that $\gamma(\mathbf{C}_0)$ is at least as large as the top Lyapunov exponent of the sequence $(C_t^*)$ obtained by replacing the matrices $A_i$ by 0 in $C_t$. It is easily seen that the top Lyapunov exponent of $(C_t^*)$ coincides with that of the constant sequence equal to $\mathbb{B}$, that is with $\log\rho(\mathbb{B})$. It follows that $\gamma(\mathbf{C}_0)\ge\log\rho(\mathbb{B})$. Hence $\gamma(\mathbf{C}_0)<0$ entails that all the eigenvalues of $\mathbb{B}$ are strictly inside the unit disk. Finally, the equivalence between the two properties follows from

$$
\det(\mathbb{B}-\lambda I_{mp})=(-1)^{mp}\det\left\{\lambda^pI_m-\lambda^{p-1}B_1-\cdots-\lambda B_{p-1}-B_p\right\}=(-\lambda)^{mp}\det\mathcal{B}\left(\frac{1}{\lambda}\right),\qquad\lambda\neq0.
$$

4.3. Proof of Corollary 2.2. It follows from the proof of Lemma 2.3 in Berkes, Horváth and Kokoszka (2003) that the strictly stationary solution defined by (4.1) satisfies $E\|\tilde{\underline{z}}_t\|^s<\infty$ for some $s>0$. The conclusion follows from $\|\underline{\epsilon}_t\|\le\|\tilde{\underline{z}}_t\|$ and $\|\underline{h}_t\|\le\|\tilde{\underline{z}}_t\|$.

4.4. Proof of the Consistency and the Asymptotic Normality of the QML. The proof follows the lines of that of Theorems 2.1 and 2.2 in Francq and Zakoian (2004) for the univariate case.

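We shall use the multiplicative norm defined by

$$
\|A\|:=\sup_{\|x\|\le1}\|Ax\|=\rho^{1/2}(A'A), \tag{4.5}
$$

where $A$ is a $d_1\times d_2$ matrix, $\|x\|$ is the Euclidean norm of the vector $x\in\mathbb{R}^{d_2}$, and $\rho(\cdot)$ denotes the spectral radius. This norm satisfies, for any $d_2\times d_1$ matrix $B$,

$$
\|A\|^2\le\sum_{i,j}a_{i,j}^2=\mathrm{Tr}(A'A)\le d_2\|A\|^2,\qquad |A'A|\le\|A\|^{2d_2}, \tag{4.6}
$$

$$
|\mathrm{Tr}(AB)|\le\left(\sum_{i,j}a_{i,j}^2\right)^{1/2}\left(\sum_{i,j}b_{i,j}^2\right)^{1/2}\le\{d_2d_1\}^{1/2}\|A\|\,\|B\|. \tag{4.7}
$$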
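As a sanity check, the inequalities (4.5)-(4.7) can be verified numerically on random matrices. The sketch below is only an illustration. The dimensions, the random seed and the helper name `op_norm` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2 = 3, 4
A = rng.standard_normal((d1, d2))
B = rng.standard_normal((d2, d1))

def op_norm(M):
    """Spectral norm as in (4.5): square root of the largest eigenvalue of M'M."""
    return np.sqrt(np.max(np.linalg.eigvalsh(M.T @ M)))

norm_A, norm_B = op_norm(A), op_norm(B)
frob2_A = np.sum(A ** 2)                       # equals Tr(A'A)
frob2_B = np.sum(B ** 2)

check_45 = np.isclose(norm_A, np.linalg.norm(A, 2))
check_46a = norm_A ** 2 <= frob2_A + 1e-12 and frob2_A <= d2 * norm_A ** 2 + 1e-12
check_46b = np.linalg.det(A.T @ A) <= norm_A ** (2 * d2) + 1e-12
check_47 = (abs(np.trace(A @ B)) <= np.sqrt(frob2_A * frob2_B) + 1e-12
            and np.sqrt(frob2_A * frob2_B) <= np.sqrt(d1 * d2) * norm_A * norm_B + 1e-12)
```

All four flags evaluate to `True` for this draw. A numerical check of this kind illustrates the inequalities but of course does not prove them.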

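4.4.1. Proof of Theorem 3.1. Rewrite (3.6) in matrix form as

$$
\underline{H}_t=\underline{c}_t+\mathbb{B}\,\underline{H}_{t-1}, \tag{4.8}
$$

where $\mathbb{B}$ is defined in Corollary 2.1 and

$$
\underline{H}_t=\begin{pmatrix}\underline{h}_t\\\underline{h}_{t-1}\\\vdots\\\underline{h}_{t-p+1}\end{pmatrix},\qquad
\underline{c}_t=\begin{pmatrix}\omega+\displaystyle\sum_{i=1}^{q}A_i\underline{\epsilon}_{t-i}\\0\\\vdots\\0\end{pmatrix}. \tag{4.9}
$$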

We will establish the following intermediate results.

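i) $\lim_{n\to\infty}\sup_{\theta\in\Theta}|l_n(\theta)-\tilde{l}_n(\theta)|=0$, a.s.

ii) $\left(\exists t\in\mathbb{Z}\ \text{such that}\ \underline{h}_t(\theta)=\underline{h}_t(\theta_0)\ P_{\theta_0}\text{-a.s. and}\ R(\theta)=R(\theta_0)\right)\Longrightarrow\theta=\theta_0$,

iii) $E_{\theta_0}|\ell_t(\theta_0)|<\infty$, and if $\theta\neq\theta_0$, $E_{\theta_0}\ell_t(\theta)>E_{\theta_0}\ell_t(\theta_0)$,

iv) for any $\theta\neq\theta_0$ there exists a neighborhood $V(\theta)$ such that

$$
\liminf_{n\to\infty}\inf_{\theta^*\in V(\theta)}\tilde{l}_n(\theta^*)>E_{\theta_0}\ell_1(\theta_0),\quad\text{a.s.}
$$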

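Proof of i). In view of Assumption A2 and Corollary 2.1, we have $\rho(\mathbb{B})<1$. By the compactness of $\Theta$ we even have

$$
\sup_{\theta\in\Theta}\rho(\mathbb{B})<1. \tag{4.10}
$$

Using Equation (4.8) iteratively, we deduce that, almost surely,

$$
\sup_{\theta\in\Theta}\|\underline{H}_t-\tilde{\underline{H}}_t\|\le K\rho^t,\qquad\forall t, \tag{4.11}
$$

where $\tilde{\underline{H}}_t$ denotes the vector obtained by replacing the variables $\underline{h}_{t-i}$ by $\tilde{\underline{h}}_{t-i}$ in $\underline{H}_t$. Observe that $K$ is a random variable which depends on the past values $\{\epsilon_t,\ t\le0\}$. Since $K$ does not depend on $n$, it can be treated as a constant, like $\rho$. From (4.11) we deduce that, almost surely,

$$
\sup_{\theta\in\Theta}\|H_t-\tilde{H}_t\|\le K\rho^t,\qquad\forall t. \tag{4.12}
$$

Noting that $\|R^{-1}\|$ is the inverse of the smallest eigenvalue of $R$, and that $\|\tilde{D}_t^{-1}\|=\{\min_i(h_{ii,t})\}^{-1}$, we have

$$
\sup_{\theta\in\Theta}\|\tilde{H}_t^{-1}\|\le\sup_{\theta\in\Theta}\|\tilde{D}_t^{-1}\|^2\|R^{-1}\|\le\sup_{\theta\in\Theta}\{\min_i\omega(i)\}^{-2}\|R^{-1}\|\le K, \tag{4.13}
$$

using A5, the compactness of $\Theta$ and the strict positivity of the components of $\omega$. Similarly we have

$$
\sup_{\theta\in\Theta}\|H_t^{-1}\|\le K. \tag{4.14}
$$

Now

$$
\sup_{\theta\in\Theta}|l_n(\theta)-\tilde{l}_n(\theta)|\le n^{-1}\sum_{t=1}^{n}\sup_{\theta\in\Theta}\left|\epsilon_t'(H_t^{-1}-\tilde{H}_t^{-1})\epsilon_t\right|+n^{-1}\sum_{t=1}^{n}\sup_{\theta\in\Theta}\left|\log|H_t|-\log|\tilde{H}_t|\right|. \tag{4.15}
$$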

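The first sum can be written as

$$
\begin{aligned}
n^{-1}\sum_{t=1}^{n}\sup_{\theta\in\Theta}\left|\epsilon_t'\tilde{H}_t^{-1}(H_t-\tilde{H}_t)H_t^{-1}\epsilon_t\right|
&=n^{-1}\sum_{t=1}^{n}\sup_{\theta\in\Theta}\left|\mathrm{Tr}\left\{\epsilon_t'\tilde{H}_t^{-1}(H_t-\tilde{H}_t)H_t^{-1}\epsilon_t\right\}\right|\\
&=n^{-1}\sum_{t=1}^{n}\sup_{\theta\in\Theta}\left|\mathrm{Tr}\left\{\tilde{H}_t^{-1}(H_t-\tilde{H}_t)H_t^{-1}\epsilon_t\epsilon_t'\right\}\right|\\
&\le Kn^{-1}\sum_{t=1}^{n}\sup_{\theta\in\Theta}\|\tilde{H}_t^{-1}\|\,\|H_t-\tilde{H}_t\|\,\|H_t^{-1}\|\,\|\epsilon_t\epsilon_t'\|\\
&\le Kn^{-1}\sum_{t=1}^{n}\rho^t\|\epsilon_t\epsilon_t'\|\to0
\end{aligned}
$$

as $n\to\infty$, using (4.7), (4.12), (4.13), (4.14), the Cesàro lemma and the fact that $\rho^t\|\epsilon_t\epsilon_t'\|=\rho^t\epsilon_t'\epsilon_t\to0$ a.s. The latter statement can be shown by using the Borel-Cantelli lemma, the Markov inequality and Corollary 2.2: for any $\varepsilon>0$,

$$
\sum_{t=1}^{\infty}P(\rho^t\epsilon_t'\epsilon_t>\varepsilon)\le\sum_{t=1}^{\infty}\frac{\rho^{st}E(\epsilon_t'\epsilon_t)^s}{\varepsilon^s}=\sum_{t=1}^{\infty}\frac{\rho^{st}E\|\epsilon_t\|^{2s}}{\varepsilon^s}<\infty.
$$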

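Now, using (4.6), the triangle inequality and the inequality $\log(1+x)\le x$ for $x>-1$, we have

$$
\begin{aligned}
\log|H_t|-\log|\tilde{H}_t|&=\log\left|I_m+(H_t-\tilde{H}_t)\tilde{H}_t^{-1}\right|\\
&\le m\log\left\|I_m+(H_t-\tilde{H}_t)\tilde{H}_t^{-1}\right\|\\
&\le m\log\left(\|I_m\|+\|(H_t-\tilde{H}_t)\tilde{H}_t^{-1}\|\right)\\
&\le m\log\left(1+\|(H_t-\tilde{H}_t)\tilde{H}_t^{-1}\|\right)\\
&\le m\|H_t-\tilde{H}_t\|\,\|\tilde{H}_t^{-1}\|,
\end{aligned}
$$

and, by symmetry,

$$
\log|\tilde{H}_t|-\log|H_t|\le m\|H_t-\tilde{H}_t\|\,\|H_t^{-1}\|.
$$

Using again (4.12), (4.13) and (4.14) we deduce that, in (4.15), the second sum tends to 0. We have thus shown i).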

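Proof of ii). Suppose that for some $\theta\neq\theta_0$, the following holds:

$$
\underline{h}_t(\theta)=\underline{h}_t(\theta_0),\ P_{\theta_0}\text{-a.s.}\quad\text{and}\quad R(\theta)=R(\theta_0).
$$

Then, it readily follows that $\rho=\rho_0$ and, using the invertibility of the polynomial $\mathcal{B}_\theta(B)$ under Assumption A2, by (3.1),

$$
\mathcal{B}_\theta(1)^{-1}\omega+\mathcal{B}_\theta(B)^{-1}\mathcal{A}_\theta(B)\underline{\epsilon}_t=\mathcal{B}_{\theta_0}(1)^{-1}\omega_0+\mathcal{B}_{\theta_0}(B)^{-1}\mathcal{A}_{\theta_0}(B)\underline{\epsilon}_t,
$$

that is

$$
\mathcal{B}_\theta(1)^{-1}\omega-\mathcal{B}_{\theta_0}(1)^{-1}\omega_0=\left\{\mathcal{B}_{\theta_0}(B)^{-1}\mathcal{A}_{\theta_0}(B)-\mathcal{B}_\theta(B)^{-1}\mathcal{A}_\theta(B)\right\}\underline{\epsilon}_t:=\mathcal{P}(B)\underline{\epsilon}_t\quad\text{a.s.}\ \forall t.
$$

Write $\mathcal{P}(B)=\sum_{i=0}^{\infty}P_iB^i$. Noting that $P_0=\mathcal{P}(0)=0$ and isolating the terms that are functions of the components of $\eta_{t-1}$, we obtain

$$
P_1\left(h_{11,t-1}\eta_{1,t-1}^2,\ldots,h_{mm,t-1}\eta_{m,t-1}^2\right)'=Z_{t-2},\quad\text{a.s.},
$$

where $Z_{t-2}$ belongs to the σ-field generated by $\{\eta_{t-2},\eta_{t-3},\ldots\}$. Since $\eta_{t-1}$ is independent of this σ-field, the latter equality contradicts A3 unless, for $i,j=1,\ldots,m$, $p_{ij}h_{jj,t}=0$ a.s., where the $p_{ij}$ are the entries of $P_1$. Because $h_{jj,t}>0$ for all $j$, we thus have $P_1=0$. Similarly, we show that $\mathcal{P}(B)=0$ by successively considering the past values of $\eta_{t-1}$. Therefore, in view of A4 (or A4'), we have $\alpha=\alpha_0$ and $\beta=\beta_0$ (see Section 3.1). It readily follows that $\omega=\omega_0$. Hence $\theta=\theta_0$. We have thus established ii).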

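Proof of iii). We first show that $E_{\theta_0}\ell_t(\theta)$ is well defined in $\mathbb{R}\cup\{+\infty\}$ for all $\theta$, and in $\mathbb{R}$ for $\theta=\theta_0$. We have

$$
E_{\theta_0}\ell_t^-(\theta)\le E_{\theta_0}\log^-|H_t|\le\max\{0,-\log(|R|\min_i\omega(i)^m)\}<\infty.
$$

At $\theta_0$, Jensen's inequality, the second inequality in (4.6) and Corollary 2.2 entail

$$
\begin{aligned}
E_{\theta_0}\log|H_t(\theta_0)|&=E_{\theta_0}\frac{m}{s}\log|H_t(\theta_0)|^{s/m}\\
&\le\frac{m}{s}\log E_{\theta_0}\|H_t(\theta_0)\|^s\le\frac{m}{s}\log E_{\theta_0}\|R\|^s\|D_t(\theta_0)\|^{2s}\quad\text{(Pb?)}\\
&\le K+\frac{m}{s}\log E_{\theta_0}\|D_t(\theta_0)\|^{2s}=K+\frac{m}{s}\log E_{\theta_0}\left(\max_ih_{ii,t}(\theta_0)\right)^s\\
&\le K+\frac{m}{s}\log E_{\theta_0}\left\{\sum_ih_{ii,t}^2(\theta_0)\right\}^{s/2}=K+\frac{m}{s}\log E_{\theta_0}\|\underline{h}_t(\theta_0)\|^s<\infty.
\end{aligned}
$$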

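It follows that

$$
\begin{aligned}
E_{\theta_0}\ell_t(\theta_0)&=E_{\theta_0}\left\{\eta_t'H_t(\theta_0)^{1/2\prime}H_t(\theta_0)^{-1}H_t(\theta_0)^{1/2}\eta_t+\log|H_t(\theta_0)|\right\}\\
&=m+E_{\theta_0}\log|H_t(\theta_0)|<\infty.
\end{aligned}
$$

Because $E_{\theta_0}\ell_t^-(\theta_0)<\infty$, the existence of $E_{\theta_0}\ell_t(\theta_0)$ in $\mathbb{R}$ holds. It is thus not restrictive to study the minimum of $E_{\theta_0}\ell_t(\theta)$ for the values of $\theta$ such that $E_{\theta_0}|\ell_t(\theta)|<\infty$. Denoting by $\lambda_{it}$ the positive eigenvalues of $H_t(\theta_0)H_t^{-1}(\theta)$, we have

$$
\begin{aligned}
E_{\theta_0}\ell_t(\theta)-E_{\theta_0}\ell_t(\theta_0)
&=E_{\theta_0}\log\frac{|H_t(\theta)|}{|H_t(\theta_0)|}+E_{\theta_0}\left\{\eta_t'\left[H_t^{1/2}(\theta_0)'H_t^{-1}(\theta)H_t^{1/2}(\theta_0)-I_m\right]\eta_t\right\}\\
&=E_{\theta_0}\log\left\{|H_t(\theta)H_t^{-1}(\theta_0)|\right\}+\mathrm{Tr}\left(E_{\theta_0}\left\{H_t^{1/2}(\theta_0)'H_t^{-1}(\theta)H_t^{1/2}(\theta_0)-I_m\right\}E(\eta_t\eta_t')\right)\\
&=E_{\theta_0}\log\left\{|H_t(\theta)H_t^{-1}(\theta_0)|\right\}+E_{\theta_0}\mathrm{Tr}\left[H_t(\theta_0)H_t^{-1}(\theta)-I_m\right]\\
&=E_{\theta_0}\left\{\sum_{i=1}^{m}(\lambda_{it}-1-\log\lambda_{it})\right\}\ge0
\end{aligned}
$$

because $\log x\le x-1$, $\forall x>0$. Since $\log x=x-1$ if and only if $x=1$, the inequality is strict unless, for all $i$, $\lambda_{it}=1$ $P_{\theta_0}$-a.s., that is if $H_t(\theta)=H_t(\theta_0)$, $P_{\theta_0}$-a.s. This equality is equivalent to

$$
\underline{h}_t(\theta)=\underline{h}_t(\theta_0),\ P_{\theta_0}\text{-a.s.}\quad\text{and}\quad R(\theta)=R(\theta_0)
$$

and thus to $\theta=\theta_0$, from ii).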
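The key identity in the proof of iii) is that, for positive-definite matrices, $\log|HH_0^{-1}|+\mathrm{Tr}(H_0H^{-1}-I_m)=\sum_i(\lambda_i-1-\log\lambda_i)\ge0$, where the $\lambda_i$ are the eigenvalues of $H_0H^{-1}$. The sketch below checks it numerically on randomly generated matrices. The function names and the way the matrices are drawn are assumptions made for illustration.

```python
import numpy as np

def trace_logdet_gap(H, H0):
    """log|H H0^{-1}| + Tr(H0 H^{-1} - I), the quantity inside the expectation."""
    m = H.shape[0]
    _, logdet_H = np.linalg.slogdet(H)
    _, logdet_H0 = np.linalg.slogdet(H0)
    return logdet_H - logdet_H0 + np.trace(H0 @ np.linalg.inv(H)) - m

def eigenvalue_form(H, H0):
    """Sum of lambda_i - 1 - log(lambda_i) over eigenvalues of H0 H^{-1}."""
    lam = np.linalg.eigvals(H0 @ np.linalg.inv(H)).real   # real and positive in theory
    return np.sum(lam - 1.0 - np.log(lam))

rng = np.random.default_rng(2)
M0 = rng.standard_normal((3, 3))
M1 = rng.standard_normal((3, 3))
H0 = M0 @ M0.T + np.eye(3)     # positive definite by construction
H = M1 @ M1.T + np.eye(3)

gap = trace_logdet_gap(H, H0)
gap_eig = eigenvalue_form(H, H0)
gap_at_truth = trace_logdet_gap(H0, H0)
```

The two forms agree, the gap is nonnegative, and it vanishes when `H` equals `H0`, which mirrors the identification argument.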

Proof of iv). The last part of the proof of the consistency uses the compactness of $\Theta$ and the ergodicity of $(\ell_t(\theta))$, as in the univariate case. It is therefore omitted.

Theorem 3.1 is thus established.

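4.4.2. Proof of Theorem 3.2. We start by stating a few elementary results on the differentiation of expressions involving matrices. If $f(A)$ is a real-valued function of a matrix $A$ whose entries $a_{ij}$ are functions of some variable $x$, the chain rule for differentiation of composed functions entails

$$
\frac{\partial f(A)}{\partial x}=\sum_{i,j}\frac{\partial f(A)}{\partial a_{ij}}\frac{\partial a_{ij}}{\partial x}=\mathrm{Tr}\left\{\frac{\partial f(A)}{\partial A'}\frac{\partial A}{\partial x}\right\}. \tag{4.16}
$$

Moreover, for $A$ invertible we have

$$
\frac{\partial c'Ac}{\partial A'}=cc' \tag{4.17}
$$

$$
\frac{\partial\,\mathrm{Tr}(CA'BA')}{\partial A'}=C'AB'+B'AC' \tag{4.18}
$$

$$
\frac{\partial\log|\det(A)|}{\partial A'}=A^{-1} \tag{4.19}
$$

$$
\frac{\partial A^{-1}}{\partial x}=-A^{-1}\frac{\partial A}{\partial x}A^{-1} \tag{4.20}
$$

$$
\frac{\partial\,\mathrm{Tr}(CA^{-1}B)}{\partial A'}=-A^{-1}BCA^{-1} \tag{4.21}
$$

$$
\frac{\partial\,\mathrm{Tr}(CAB)}{\partial A'}=BC \tag{4.22}
$$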
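Identities such as (4.19), (4.21) and (4.22) can be checked by finite differences, which also makes the convention explicit: $\partial f/\partial A'$ is the matrix whose $(i,j)$ entry is $\partial f/\partial a_{ji}$. The sketch below is an illustrative check. The helper `num_grad`, the step size and the random test matrices are assumptions.

```python
import numpy as np

def num_grad(f, A, step=1e-6):
    """G[i, j] approximates d f / d a_ij by central differences."""
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = step
            G[i, j] = (f(A + E) - f(A - E)) / (2 * step)
    return G

rng = np.random.default_rng(1)
m = 3
A = 0.3 * rng.standard_normal((m, m)) + 3.0 * np.eye(m)   # well-conditioned test matrix
B = rng.standard_normal((m, m))
C = rng.standard_normal((m, m))
A_inv = np.linalg.inv(A)

# d f / d A' is the transpose of the entrywise gradient G
d_logdet = num_grad(lambda X: np.log(abs(np.linalg.det(X))), A).T
d_tr_inv = num_grad(lambda X: np.trace(C @ np.linalg.inv(X) @ B), A).T
d_tr_lin = num_grad(lambda X: np.trace(C @ X @ B), A).T
```

Up to discretization error, `d_logdet` matches `A_inv`, `d_tr_inv` matches `-A_inv @ B @ C @ A_inv`, and `d_tr_lin` matches `B @ C`.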

The proof is divided into several steps.

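a) First derivative of the criterion. Applying (4.16) and (4.17), then (4.18), (4.19) and (4.20), we obtain

$$
\begin{aligned}
\frac{\partial\ell_t(\theta)}{\partial\theta_i}
&=\mathrm{Tr}\left(\epsilon_t\epsilon_t'\frac{\partial D_t^{-1}R^{-1}D_t^{-1}}{\partial\theta_i}\right)+2\frac{\partial\log|\det D_t|}{\partial\theta_i}\\
&=-\mathrm{Tr}\left\{\left(\epsilon_t\epsilon_t'D_t^{-1}R^{-1}+R^{-1}D_t^{-1}\epsilon_t\epsilon_t'\right)D_t^{-1}\frac{\partial D_t}{\partial\theta_i}D_t^{-1}\right\}+2\,\mathrm{Tr}\left(D_t^{-1}\frac{\partial D_t}{\partial\theta_i}\right)
\end{aligned}
\tag{4.23}
$$

for $i=1,\ldots,s_1=m+(p+q)m^2$, and using (4.21)

$$
\frac{\partial\ell_t(\theta)}{\partial\theta_i}=-\mathrm{Tr}\left(R^{-1}D_t^{-1}\epsilon_t\epsilon_t'D_t^{-1}R^{-1}\frac{\partial R}{\partial\theta_i}\right)+\mathrm{Tr}\left(R^{-1}\frac{\partial R}{\partial\theta_i}\right) \tag{4.24}
$$

for $i=s_1+1,\ldots,s_0$. Letting $D_{0t}=D_t(\theta_0)$, $R_0=R(\theta_0)$,

$$
D_{0t}^{(i)}=\frac{\partial D_t}{\partial\theta_i}(\theta_0),\quad R_0^{(i)}=\frac{\partial R}{\partial\theta_i}(\theta_0),\quad D_{0t}^{(i,j)}=\frac{\partial^2D_t}{\partial\theta_i\partial\theta_j}(\theta_0),\quad R_0^{(i,j)}=\frac{\partial^2R}{\partial\theta_i\partial\theta_j}(\theta_0),
$$

and $\tilde{\eta}_t=R^{1/2}\eta_t$, the score vector is given by

$$
\frac{\partial\ell_t(\theta_0)}{\partial\theta_i}=\mathrm{Tr}\left\{\left(I_m-R_0^{-1}\tilde{\eta}_t\tilde{\eta}_t'\right)D_{0t}^{(i)}D_{0t}^{-1}+\left(I_m-\tilde{\eta}_t\tilde{\eta}_t'R_0^{-1}\right)D_{0t}^{-1}D_{0t}^{(i)}\right\}, \tag{4.25}
$$

for $i=1,\ldots,s_1$, and

$$
\frac{\partial\ell_t(\theta_0)}{\partial\theta_i}=\mathrm{Tr}\left\{\left(I_m-R_0^{-1}\tilde{\eta}_t\tilde{\eta}_t'\right)R_0^{-1}R_0^{(i)}\right\}, \tag{4.26}
$$

for $i=s_1+1,\ldots,s_0$.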

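b) Existence of moments of any order for the score. In view of (4.7) and the Cauchy-Schwarz inequality, we obtain

$$
E\left|\frac{\partial\ell_t(\theta_0)}{\partial\theta_i}\frac{\partial\ell_t(\theta_0)}{\partial\theta_j}\right|\le K\left\{E\left\|D_{0t}^{-1}D_{0t}^{(i)}\right\|^2E\left\|D_{0t}^{-1}D_{0t}^{(j)}\right\|^2\right\}^{1/2},
$$

for $i,j=1,\ldots,s_1$,

$$
E\left|\frac{\partial\ell_t(\theta_0)}{\partial\theta_i}\frac{\partial\ell_t(\theta_0)}{\partial\theta_j}\right|<KE\left\|D_{0t}^{-1}D_{0t}^{(i)}\right\|,
$$

for $i=1,\ldots,s_1$ and $j=s_1+1,\ldots,s_0$, and

$$
E\left|\frac{\partial\ell_t(\theta_0)}{\partial\theta_i}\frac{\partial\ell_t(\theta_0)}{\partial\theta_j}\right|<K
$$

for $i,j=s_1+1,\ldots,s_0$. Note also that

$$
D_{0t}^{(i)}=\frac{1}{2}D_{0t}^{-1}\,\mathrm{diag}\left\{\frac{\partial\underline{h}_t}{\partial\theta_i}(\theta_0)\right\}.
$$

To show that the score admits a second-order moment, it is thus sufficient to prove that

$$
E\left|\frac{1}{\underline{h}_t(i_1)}\frac{\partial\underline{h}_t(i_1)}{\partial\theta_i}(\theta_0)\right|^{r_0}<\infty
$$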

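for all $i_1=1,\ldots,m$, all $i=1,\ldots,s_1$ and $r_0=2$. By (4.8) and (4.10),

$$
\sup_{\theta\in\Theta}\left\|\frac{\partial\underline{H}_t}{\partial\theta_i}\right\|<\infty,\qquad i=1,\ldots,m,
$$

and, setting $s_2=m+qm^2$,

$$
\frac{\partial\underline{H}_t}{\partial\theta_i}\le\epsilon^2_{t-j(i)}\inf_{m<k\le s_2}\theta_k,\qquad i=m+1,\ldots,s_2,\quad\text{(???)}
$$

where $j(i)\in\{1,\ldots,q\}$. On the other hand we have

$$
\frac{\partial\underline{H}_t}{\partial\theta_i}=\sum_{k=1}^{\infty}\left\{\sum_{j=1}^{k}\mathbb{B}^{j-1}\mathbb{B}^{(i)}\mathbb{B}^{k-j}\right\}\underline{c}_{t-k},\qquad i=s_2+1,\ldots,s_1,
$$

where $\mathbb{B}^{(i)}=\partial\mathbb{B}/\partial\theta_i$ is a matrix whose entries are all 0, apart from a 1 located at the same place as $\theta_i$ in $\mathbb{B}$. By abuse of notation, we denote by $\underline{H}_t(i_1)$ and $\underline{h}_{0t}(i_1)$ the $i_1$-th components of $\underline{H}_t$ and $\underline{h}_t(\theta_0)$. With arguments similar to those used in the univariate case, that is the inequality $x/(1+x)\le x^s$ for all $x\ge0$ and $s\in[0,1]$, and the inequalities

$$
\theta_i\frac{\partial\underline{H}_t}{\partial\theta_i}\le\sum_{k=1}^{\infty}k\,\mathbb{B}^k\underline{c}_{t-k},\qquad
\theta_i\frac{\partial\underline{H}_t(i_1)}{\partial\theta_i}\le\sum_{k=1}^{\infty}k\sum_{j_1=1}^{m}\mathbb{B}^k(i_1,j_1)\,\underline{c}_{t-k}(j_1)
$$

and, setting $\omega=\inf_{1\le i\le m}\omega(i)$,

$$
\underline{H}_t(i_1)\ge\omega+\sum_{j_1=1}^{m}\mathbb{B}^k(i_1,j_1)\,\underline{c}_{t-k}(j_1),\qquad\forall k,
$$

we obtain

$$
\frac{\theta_i}{\underline{H}_t(i_1)}\frac{\partial\underline{H}_t(i_1)}{\partial\theta_i}\le\sum_{j_1=1}^{m}\sum_{k=1}^{\infty}k\left\{\frac{\mathbb{B}^k(i_1,j_1)\,\underline{c}_{t-k}(j_1)}{\omega}\right\}^{s/r_0}\le K\sum_{j_1=1}^{m}\sum_{k=1}^{\infty}k\,\rho_{j_1}^k\,\underline{c}_{t-k}^{\,s/r_0}(j_1),
$$

where the constants $\rho_{j_1}$ (which also depend on $i_1$, $s$ and $r_0$) belong to the interval $[0,1)$. Noting that these inequalities are uniform on a neighborhood of $\theta_0\in\mathring{\Theta}$, that they can be extended to higher-order derivatives, as in the univariate case, and that Corollary 2.2 implies $E\|\underline{c}_t\|^s<\infty$, we can show a stronger result than the one announced: for all $i_1=1,\ldots,m$, all $i,j,k=1,\ldots,s_1$ and all $r_0\ge0$, there exists a neighborhood $V(\theta_0)$ of $\theta_0$ such that

$$
E\sup_{\theta\in V(\theta_0)}\left|\frac{1}{\underline{h}_t(i_1)}\frac{\partial\underline{h}_t(i_1)}{\partial\theta_i}(\theta)\right|^{r_0}<\infty, \tag{4.27}
$$

$$
E\sup_{\theta\in V(\theta_0)}\left|\frac{1}{\underline{h}_t(i_1)}\frac{\partial^2\underline{h}_t(i_1)}{\partial\theta_i\partial\theta_j}(\theta)\right|^{r_0}<\infty \tag{4.28}
$$

and

$$
E\sup_{\theta\in V(\theta_0)}\left|\frac{1}{\underline{h}_t(i_1)}\frac{\partial^3\underline{h}_t(i_1)}{\partial\theta_i\partial\theta_j\partial\theta_k}(\theta)\right|^{r_0}<\infty. \tag{4.29}
$$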

c) Asymptotic normality of the score vector. Clearly, $\{\partial\ell_t(\theta_0)/\partial\theta\}_t$ is stationary and $\partial\ell_t(\theta_0)/\partial\theta$ is measurable with respect to the σ-field $\mathcal{F}_t$ generated by $\{\eta_u,\ u\le t\}$. From (4.25) and (4.26) we have $E\{\partial\ell_t(\theta_0)/\partial\theta\mid\mathcal{F}_{t-1}\}=0$. Property b), and in particular (4.27), ensures the existence of the matrix $I$ in Theorem 3.2. It follows that, for all $\lambda\in\mathbb{R}^{s_0}$, the sequence

$\left\{\lambda'\frac{\partial}{\partial\theta}\ell_t(\theta_0),\mathcal{F}_t\right\}_t$ is an ergodic, stationary and square integrable martingale difference. The central limit theorem of